2025-07-03 15:27:59 +01:00

Smart Farm Photo Keyword Tagging AI

Project Overview

This project aims to use AI to automatically generate high-quality, agriculture-specific keyword tags for stock photos of farming and ranching. The system will replace the current manual keyword tagging process, saving significant time and improving consistency.

What is Expected

  • AI Model: A model trained to generate 5-10 relevant, high-quality keywords per image, with a focus on agricultural context and subtle distinctions (e.g., farmer vs. rancher, male vs. female farmer).
  • Title Generation: Optionally generate a descriptive product title for each photo (e.g., "Farmer and son walking in cornfield").
  • Location Extraction: If location metadata is present in the image, extract and use it as a keyword (e.g., "Iowa").
  • CSV Output: For each photo, output a CSV row with:
    • Photo file name
    • Human-entered keywords (for comparison)
    • AI-generated keywords
    • AI-generated title (if available)
    • Location (if available)
  • Training: The system should be trainable on a dataset of ~30,000 photos that already have human-entered keywords.
  • Scalability: The system should handle at least 1,000 photos per month, processed in batches of 500, with volume potentially doubling within 3 years.
  • Quality: Keywords and titles must be accurate, relevant, and reflect subtle ag-specific concepts.
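
As a rough illustration of the CSV output described above, the sketch below writes one row per photo using only the Python standard library. The column names, the PhotoRecord class, the write_records function, and the "; " keyword separator are all assumptions made for this example; the requirements only list what each row should contain, not how the columns are named or how multiple keywords share a cell.

```python
from __future__ import annotations

import csv
import io
from dataclasses import dataclass

# Assumed column names; the requirements only describe the row contents.
FIELDNAMES = ["file_name", "human_keywords", "ai_keywords", "ai_title", "location"]


@dataclass
class PhotoRecord:
    """Hypothetical container for one photo's tagging results."""
    file_name: str
    human_keywords: list[str]
    ai_keywords: list[str]
    ai_title: str = ""   # left empty when no title was generated
    location: str = ""   # left empty when the image has no location metadata


def write_records(records, out_file, keyword_sep="; "):
    """Write a header plus one CSV row per photo.

    Keyword lists are joined into a single cell with keyword_sep
    (an assumed convention, not specified in the requirements).
    """
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    for record in records:
        writer.writerow({
            "file_name": record.file_name,
            "human_keywords": keyword_sep.join(record.human_keywords),
            "ai_keywords": keyword_sep.join(record.ai_keywords),
            "ai_title": record.ai_title,
            "location": record.location,
        })


if __name__ == "__main__":
    example = PhotoRecord(
        file_name="IMG_0001.jpg",  # made-up file name for illustration
        human_keywords=["farmer", "corn"],
        ai_keywords=["farmer", "son", "cornfield"],
        ai_title="Farmer and son walking in cornfield",
        location="Iowa",
    )
    buffer = io.StringIO()
    write_records([example], buffer)
    print(buffer.getvalue())
```

Keeping the human-entered and AI-generated keywords in adjacent columns makes the side-by-side comparison straightforward in a spreadsheet. When writing to a real file rather than a StringIO buffer, open it with newline="" as the csv module documentation recommends.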

Folder Structure

.
├── data/         # Datasets: training, validation, test images, and CSVs
│   ├── raw/      # Raw, unprocessed images and metadata
│   ├── processed/# Preprocessed data ready for modeling
│   └── ...
├── notebooks/    # Jupyter notebooks for EDA, prototyping, and experiments
├── src/          # Source code
│   ├── data/     # Data loading, preprocessing scripts
│   ├── model/    # Model architecture, training, inference code
│   ├── utils/    # Utility functions
│   └── main.py   # Main entry point for training/inference
├── outputs/      # Generated outputs (CSVs, predictions, logs)
├── docs.txt      # Project requirements and notes
├── README.md     # Project overview and instructions
└── .gitignore    # Files and folders to ignore in git

Directory Details

  • data/: All datasets. Use raw/ for original files, processed/ for cleaned/ready-to-use data.
  • notebooks/: Jupyter notebooks for data exploration, prototyping, and model development.
  • src/: All source code, organized by function (data, model, utils). main.py is the main script.
  • outputs/: All generated outputs, including CSVs with AI-generated tags/titles, logs, and model predictions.
  • docs.txt: The original requirements and project notes.
  • README.md: This file.
  • .gitignore: Keeps unnecessary files out of version control.
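
To show how this layout might be used together with the 500-photo batches mentioned under Scalability, here is a minimal sketch that collects images from data/raw/ and splits them into batches. The function names list_images and batches, and the set of accepted file extensions, are assumptions for illustration; the actual code in src/ may be organized differently.

```python
from pathlib import Path

# Assumed image formats; adjust to whatever the photo library actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def list_images(raw_dir):
    """Return all image files under raw_dir (searched recursively), sorted by path."""
    return sorted(
        path
        for path in Path(raw_dir).rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def batches(items, size=500):
    """Yield consecutive slices of items, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    images = list_images("data/raw")
    for index, batch in enumerate(batches(images, size=500), start=1):
        # A real pipeline would run inference here and write a CSV to outputs/.
        print(f"batch {index}: {len(batch)} images")
```

Sorting the paths keeps batch membership stable between runs, which helps when a batch has to be re-processed.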

Deliverables

  • Well-documented code in src/
  • At least one Jupyter notebook showing EDA and model prototyping
  • Example CSV output as described above
  • Instructions for running the system
  • (Optional) Trained model weights

Deadline

All deliverables are expected within 3 days of project start.
