2025-07-23 23:23:35 +01:00

DS Task: Tag Scan System

Project Overview

DS Task Tag Scan is a clothing tag identification and similarity search system. It analyzes clothing tag images, identifies the brand with a vision model, and finds similar tags in a database. Matching combines CLIP image embeddings with TF-IDF text similarity to return similar tags along with their metadata.

Features

  • Tag Identification: Uses computer vision to identify clothing tag brands from images
  • Text-Based Matching: Implements TF-IDF and cosine similarity for tag name matching
  • Image Similarity Search: Uses CLIP embeddings to find visually similar tag images
  • LLM Enhancement: Optional LLM analysis for improved similarity filtering
  • Metadata Extraction: Provides appraisal values, years, and status information for similar tags

Tech Stack

  • Computer Vision: CLIP or ViT models (free from Hugging Face)
  • Text Processing: TF-IDF vectorization and cosine similarity
  • LLM Enhancement: Llama 4 Scout/Maverick served through Groq (optional)
  • Backend: Flask
  • Image Processing: Pillow, OpenCV
  • Data Processing: Pandas, NumPy, scikit-learn

File Structure

ds_task/
│-- backend/
│   │-- app.py  # Main Flask application
│   │-- tag_identification.py  # Tag identification using vision models
│   │-- tag_match.py  # Text-based tag matching using TF-IDF
│   │-- image_similarity.py  # CLIP-based image similarity search
│   │-- config.py  # Configuration settings
│   │-- requirements.txt  # Dependencies
│
│-- data/
│   │-- tag_guides_clean.json  # Tag database with historical images
│   │-- expert_data.csv  # Expert dataset with tag images and metadata
│   │-- community_data.csv  # Community dataset with tag images and metadata
│
│-- docs/
│   │-- README.md  # Project documentation
│   │-- API_Documentation.md  # API details
│
│-- .env  # Environment variables
│-- .gitignore  # Git ignore file
│-- LICENSE  # License information

Setup & Installation

1. Clone the Repository

git clone <repository-url>
cd ds_task

2. Set Up the Backend

cd backend
pip install -r requirements.txt
python app.py

Tag Scan Workflow

The system processes clothing tag images through several stages:

  1. Image Input: Receive the tag image URL and download the image
  2. Tag Identification: Use vision model to identify brand/tag
  3. Text Matching: Match identified tag against database using TF-IDF
  4. Similarity Search: Find similar images using CLIP embeddings
  5. Result Aggregation: Return similar images with metadata
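
The similarity search in step 4 can be sketched as a cosine-similarity ranking over precomputed embeddings. This is a minimal illustration, not the project's implementation: the function name `top_k_similar` is hypothetical, and the toy 3-dimensional vectors stand in for real CLIP embeddings.

```python
import numpy as np

def top_k_similar(query, database, k=3):
    """Return (index, score) pairs for the k database rows most similar to query."""
    # L2-normalize so that a dot product equals cosine similarity
    query = query / np.linalg.norm(query)
    database = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = database @ query
    # argsort is ascending, so reverse it and keep the first k indices
    top = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in top]

# Toy vectors standing in for CLIP image embeddings (assumption)
database = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
])
query = np.array([1.0, 0.0, 0.0])
matches = top_k_similar(query, database, k=2)
print(matches)
```

A brute-force scan like this is usually adequate for a few thousand images; a larger database would likely call for an approximate nearest-neighbour index.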

Example Tag Identification Code (Python)

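from io import BytesIO

import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load the model once at import time rather than on every call
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

def load_image(image_url):
    # Download the image and decode it as RGB
    response = requests.get(image_url, timeout=10)
    response.raise_for_status()
    return Image.open(BytesIO(response.content)).convert("RGB")

def identify_tag(image_url):
    # Process image and get embeddings
    image = load_image(image_url)
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        image_features = model.get_image_features(**inputs)

    # Compare with tag database; find_best_match is a placeholder
    # for the lookup you implement against the tag data
    return find_best_match(image_features)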

API Endpoints

  • POST /get_tag: Submit a tag image URL and receive similar tag images with metadata
  • GET /status: Health check endpoint
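
The two endpoints might be wired up in Flask roughly as follows. This is a sketch under stated assumptions: `run_tag_scan` is a hypothetical stand-in for the identification, matching, and search stages, and the `/status` body and the 400 error message are illustrative choices rather than a documented contract.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def run_tag_scan(image_url):
    # Hypothetical placeholder for the real pipeline; returns no results
    return []

@app.route("/status", methods=["GET"])
def status():
    # Health check; the response body here is an assumption
    return jsonify({"status": "ok"})

@app.route("/get_tag", methods=["POST"])
def get_tag():
    # silent=True yields None instead of raising on a missing or invalid JSON body
    payload = request.get_json(silent=True)
    image_url = payload.get("image_url") if isinstance(payload, dict) else None
    if not isinstance(image_url, str) or not image_url.startswith(("http://", "https://")):
        return jsonify({"error": "image_url must be an http(s) URL"}), 400
    return jsonify({"results": run_tag_scan(image_url)})

# Exercise the routes with Flask's built-in test client (no server needed)
client = app.test_client()
ok = client.post("/get_tag", json={"image_url": "https://example.com/tag_image.jpg"})
bad = client.post("/get_tag", json={})
health = client.get("/status")
print(ok.status_code, bad.status_code, health.status_code)
```

Validating the URL before any download keeps malformed requests from reaching the model code.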

Example Request

{
  "image_url": "https://example.com/tag_image.jpg"
}

Example Response

{
  "results": [
    {
      "tag": "Jerzees T-Shirt Tags",
      "year_start": "1985",
      "year_end": "1998"
    },
    {
      "similar_images": ["url1", "url2"],
      "appraisal_value": [150.0, 75.0],
      "years": ["1998", "1997"],
      "status": ["public", "private"]
    }
  ]
}
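
One way to assemble a response of this shape is sketched below. The helper `build_response` and the per-image record keys (`url`, `year`) are hypothetical, and the sketch assumes the four lists in the second entry are parallel, so that position i in each list describes the same similar image.

```python
import json

def build_response(tag_match, similar_items):
    """Assemble the two-element `results` list shown in the example response."""
    tag_entry = {
        "tag": tag_match["tag"],
        "year_start": tag_match["year_start"],
        "year_end": tag_match["year_end"],
    }
    # Parallel lists: index i in each list refers to the same similar image
    similar_entry = {
        "similar_images": [item["url"] for item in similar_items],
        "appraisal_value": [item["appraisal_value"] for item in similar_items],
        "years": [item["year"] for item in similar_items],
        "status": [item["status"] for item in similar_items],
    }
    return {"results": [tag_entry, similar_entry]}

tag_match = {"tag": "Jerzees T-Shirt Tags", "year_start": "1985", "year_end": "1998"}
similar_items = [
    {"url": "url1", "appraisal_value": 150.0, "year": "1998", "status": "public"},
    {"url": "url2", "appraisal_value": 75.0, "year": "1997", "status": "private"},
]
response = build_response(tag_match, similar_items)
print(json.dumps(response, indent=2))
```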

Key Implementation Tasks

  1. Tag Identification: Implement vision-based tag recognition using free models
  2. Text Matching: Use TF-IDF and cosine similarity for tag matching
  3. Image Similarity: Implement CLIP-based image embedding and search
  4. LLM Enhancement: Add optional LLM analysis for better results
  5. Data Processing: Handle image downloads and metadata extraction
  6. API Design: Create clean Flask endpoints with proper error handling
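
Task 2 could be approached with scikit-learn along these lines. This is a minimal sketch, not a prescribed solution: `match_tag` is a hypothetical name, the tag names other than "Jerzees T-Shirt Tags" are made up for illustration, and the character n-gram analyzer is one possible choice (it tends to tolerate OCR noise and partial brand names) rather than a requirement.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def match_tag(query, tag_names):
    """Return (best_name, score) for the tag name most similar to query."""
    # Character n-grams within word boundaries; input is lowercased by default
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    tag_matrix = vectorizer.fit_transform(tag_names)
    query_vector = vectorizer.transform([query])
    # One row of similarities: query against every tag name
    scores = cosine_similarity(query_vector, tag_matrix)[0]
    best = int(scores.argmax())
    return tag_names[best], float(scores[best])

# Illustrative tag names; only the first appears in the example response
tag_names = ["Jerzees T-Shirt Tags", "Hanes Beefy-T Tags", "Fruit of the Loom Tags"]
name, score = match_tag("jerzees", tag_names)
print(name, round(score, 3))
```

In practice the vectorizer would be fitted once on the full tag list at startup and reused for each query, and a minimum score threshold could reject weak matches.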

Data Sources

  • tag_guides_clean.json: Contains tag information with historical images and year ranges
  • expert_data.csv: Contains tag images with appraisal values, status, and metadata
  • community_data.csv: Contains tag images with appraisal values, status, and metadata
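
The two CSV datasets could be combined into a single lookup with pandas, as sketched below. The column names (`image_url`, `appraisal_value`, `year`, `status`) are assumptions inferred from the example response, not the actual file schema, and the inline strings stand in for the real files.

```python
import io

import pandas as pd

# Inline stand-ins for expert_data.csv and community_data.csv (assumed columns)
EXPERT_CSV = """image_url,appraisal_value,year,status
url1,150.0,1998,public
"""
COMMUNITY_CSV = """image_url,appraisal_value,year,status
url2,75.0,1997,private
"""

def load_metadata(expert_file, community_file):
    """Combine both datasets into one frame, recording each row's source file."""
    # Read years as strings to match the string years in the example response
    expert = pd.read_csv(expert_file, dtype={"year": str}).assign(source="expert")
    community = pd.read_csv(community_file, dtype={"year": str}).assign(source="community")
    return pd.concat([expert, community], ignore_index=True)

metadata = load_metadata(io.StringIO(EXPERT_CSV), io.StringIO(COMMUNITY_CSV))
# Map each image URL to its metadata for constant-time lookup after the search
lookup = metadata.set_index("image_url").to_dict(orient="index")
print(lookup["url1"])
```

With the real files, the `io.StringIO` arguments would be replaced by paths such as `data/expert_data.csv`, and the column names adjusted to whatever the files actually contain.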

Vision Model Options

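You can use any of these free options. CLIP, ViT, and ResNet checkpoints are available from Hugging Face; EasyOCR is a separate open-source Python package: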

  • CLIP (image-text matching)
  • ViT (image classification)
  • EasyOCR (text extraction)
  • ResNet (image classification)

What Success Looks Like

  • Can extract text from tag images
  • Can match brands using TF-IDF similarity
  • Can find visually similar images
  • Optional: LLM enhances similarity analysis
  • Returns properly formatted JSON response
  • Response time under 60 seconds