Smart Farm Photo Keyword Tagging AI
Project Overview
This project aims to automate keyword tagging for agricultural stock photos using AI, producing high-quality, agriculture-specific tags for each image. The system will replace the current manual tagging process, saving time and improving consistency.
What is Expected
- AI Model: A model trained to generate 5–10 relevant, high-quality keywords per image, with a focus on agricultural context and subtle distinctions (e.g., farmer vs. rancher, male vs. female farmer).
- Title Generation: Optionally generate a descriptive product title for each photo (e.g., "Farmer and son walking in cornfield").
- Location Extraction: If location metadata is present in the image, extract and use it as a keyword (e.g., "Iowa").
- CSV Output: For each photo, output a CSV row with:
  - Photo file name
  - Human-entered keywords (for comparison)
  - AI-generated keywords
  - AI-generated title (if available)
  - Location (if available)
- Training: The system should be trainable on a dataset of ~30,000 photos that already have human-entered keywords.
- Scalability: Should handle at least 1,000 photos/month (in batches of 500), with volume potentially doubling within 3 years.
- Quality: Keywords and titles must be accurate, relevant, and reflect subtle ag-specific concepts.
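The CSV output and batch size described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the column names, the `PhotoResult` class, the `batched` and `write_results` helpers, and the `"; "` keyword separator are all hypothetical and not specified by the requirements.

```python
import csv
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, Iterator, Optional, Sequence

# Assumed column names; the requirements only list what each row contains.
CSV_COLUMNS = ["file_name", "human_keywords", "ai_keywords", "ai_title", "location"]


@dataclass
class PhotoResult:
    """One output row (hypothetical container, not part of the spec)."""
    file_name: str
    human_keywords: Sequence[str]
    ai_keywords: Sequence[str]
    ai_title: Optional[str] = None   # optional per the requirements
    location: Optional[str] = None   # only set when image metadata has it


def batched(items: Sequence[str], size: int = 500) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items (500 per the spec)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_results(results: Iterable[PhotoResult], path: Path) -> None:
    """Write a header plus one CSV row per photo."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(CSV_COLUMNS)
        for r in results:
            writer.writerow([
                r.file_name,
                "; ".join(r.human_keywords),  # assumed separator inside a cell
                "; ".join(r.ai_keywords),
                r.ai_title or "",             # empty cell when not available
                r.location or "",
            ])
```

Keeping the human-entered and AI-generated keywords in adjacent columns makes the side-by-side comparison straightforward in a spreadsheet.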
Folder Structure
.
├── data/ # Datasets: training, validation, test images, and CSVs
│ ├── raw/ # Raw, unprocessed images and metadata
│ ├── processed/ # Preprocessed data ready for modeling
│ └── ...
├── notebooks/ # Jupyter notebooks for EDA, prototyping, and experiments
├── src/ # Source code
│ ├── data/ # Data loading, preprocessing scripts
│ ├── model/ # Model architecture, training, inference code
│ ├── utils/ # Utility functions
│ └── main.py # Main entry point for training/inference
├── outputs/ # Generated outputs (CSVs, predictions, logs)
├── docs.txt # Project requirements and notes
├── README.md # Project overview and instructions
└── .gitignore # Files and folders to ignore in git
Directory Details
- data/: All datasets. Use raw/ for original files and processed/ for cleaned, ready-to-use data.
- notebooks/: Jupyter notebooks for data exploration, prototyping, and model development.
- src/: All source code, organized by function (data, model, utils). main.py is the main script.
- outputs/: All generated outputs, including CSVs with AI-generated tags/titles, logs, and model predictions.
- docs.txt: The original requirements and project notes.
- README.md: This file.
- .gitignore: Keeps unnecessary files out of version control.
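Since main.py is described as the single entry point for both training and inference, one possible shape for it is a small command-line interface with two subcommands. The sketch below is an assumption, not the project's actual interface: the `train` and `infer` subcommand names, the flags, and the default paths are hypothetical, and the bodies only print what they would do.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical `train` and `infer` subcommands."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Train or run the keyword tagging model.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    train = sub.add_parser("train", help="Train on already-tagged photos")
    train.add_argument("--data-dir", default="data/processed")

    infer = sub.add_parser("infer", help="Generate keywords for new photos")
    infer.add_argument("--input-dir", default="data/raw")
    infer.add_argument("--output-csv", default="outputs/predictions.csv")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Dispatch on the chosen subcommand; real work is left as a placeholder."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        print(f"Would train on {args.data_dir}")
    else:
        print(f"Would tag photos in {args.input_dir} and write {args.output_csv}")
    return 0
```

Calling `main(["infer", "--input-dir", "data/raw"])` exercises the inference path; a real script would invoke `main()` under the usual `if __name__ == "__main__":` guard.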
Deliverables
- Well-documented code in src/
- At least one Jupyter notebook showing EDA and model prototyping
- Example CSV output as described above
- Instructions for running the system
- (Optional) Trained model weights
Deadline
All deliverables are expected within 3 days of project start.