setup repo structure
+41
@@ -0,0 +1,41 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# Jupyter Notebook checkpoints
.ipynb_checkpoints

# PyCharm
.idea/

# VS Code
.vscode/

# Data and outputs (leading slash anchors these to the repo root, so src/data/ stays tracked)
/data/
/outputs/

# OS files
.DS_Store
@@ -0,0 +1,56 @@
# Smart Farm Photo Keyword Tagging AI

## Project Overview
This project aims to automate the generation of high-quality, agriculture-relevant keyword tags for agricultural stock photos using AI. The system will replace the current manual keyword tagging process, which takes an assistant about 10 hours/month, and improve tagging consistency.

## What is Expected
- **AI Model**: A model trained to generate 5–10 relevant, high-quality keywords per image, with a focus on agricultural context and subtle distinctions (e.g., farmer vs. rancher, male vs. female farmer).
- **Title Generation**: Optionally generate a descriptive product title for each photo (e.g., "Farmer and son walking in cornfield").
- **Location Extraction**: If location metadata is present in the image file, extract it and use it as a keyword (e.g., "Iowa").
- **CSV Output**: For each photo, output a CSV row with:
  - Photo file name
  - Human-entered keywords (for comparison)
  - AI-generated keywords
  - AI-generated title (if available)
  - Location (if available)
- **Training**: The system should be trainable on a dataset of ~30,000 already keyword-tagged photos.
- **Scalability**: The system should handle at least 1,000 photos/month (in batches of 500), a volume that may double within 3 years.
- **Quality**: Keywords and titles must be accurate, relevant, and reflect subtle ag-specific concepts.
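
The CSV row described in the list above could be produced along the following lines. This is a minimal sketch rather than the project's implementation: the column headers, the `; ` keyword separator, and the names `PhotoRecord`, `write_rows`, and `render_csv` are all assumptions made for illustration, since the requirements name the fields but not their exact format.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, TextIO

# Hypothetical column headers; the requirements list the fields but not their names.
FIELDNAMES = ["file_name", "human_keywords", "ai_keywords", "ai_title", "location"]


@dataclass
class PhotoRecord:
    """One output row per photo (illustrative structure, not taken from the project)."""
    file_name: str
    human_keywords: List[str] = field(default_factory=list)
    ai_keywords: List[str] = field(default_factory=list)
    ai_title: Optional[str] = None
    location: Optional[str] = None


def write_rows(records: Iterable[PhotoRecord], out: TextIO) -> None:
    """Write a header plus one CSV row per record; optional fields become empty cells."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    for rec in records:
        writer.writerow({
            "file_name": rec.file_name,
            # Assumption: keywords are joined with "; " so each list fits in one cell.
            "human_keywords": "; ".join(rec.human_keywords),
            "ai_keywords": "; ".join(rec.ai_keywords),
            "ai_title": rec.ai_title or "",
            "location": rec.location or "",
        })


def render_csv(records: Iterable[PhotoRecord]) -> str:
    """Convenience wrapper that returns the CSV text instead of writing to a file."""
    buf = io.StringIO(newline="")
    write_rows(records, buf)
    return buf.getvalue()
```

Keeping the human-entered and AI-generated keywords in adjacent columns makes the side-by-side comparison mentioned above straightforward in a spreadsheet. To write to disk instead, `write_rows` can be given a file opened with `newline=""`, as the `csv` module documentation recommends.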

## Folder Structure
```
.
├── data/           # Datasets: training, validation, test images, and CSVs
│   ├── raw/        # Raw, unprocessed images and metadata
│   ├── processed/  # Preprocessed data ready for modeling
│   └── ...
├── notebooks/      # Jupyter notebooks for EDA, prototyping, and experiments
├── src/            # Source code
│   ├── data/       # Data loading, preprocessing scripts
│   ├── model/      # Model architecture, training, inference code
│   ├── utils/      # Utility functions
│   └── main.py     # Main entry point for training/inference
├── outputs/        # Generated outputs (CSVs, predictions, logs)
├── docs.txt        # Project requirements and notes
├── README.md       # Project overview and instructions
└── .gitignore      # Files and folders to ignore in git
```

### Directory Details
- **data/**: All datasets. Use `raw/` for original files, `processed/` for cleaned/ready-to-use data.
- **notebooks/**: Jupyter notebooks for data exploration, prototyping, and model development.
- **src/**: All source code, organized by function (data, model, utils). `main.py` is the main script.
- **outputs/**: All generated outputs, including CSVs with AI-generated tags/titles, logs, and model predictions.
- **docs.txt**: The original requirements and project notes.
- **README.md**: This file.
- **.gitignore**: Keeps unnecessary files out of version control.
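
Because `main.py` is described as the single entry point for both training and inference, one possible shape for it is a small `argparse` command-line interface with two subcommands. The sketch below is only an illustration under stated assumptions: the subcommand names (`train`, `infer`), the flags, and their defaults are hypothetical, and the bodies are placeholders that print what they would do. The default batch size of 500 mirrors the batch size given in the requirements.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI with hypothetical `train` and `infer` subcommands."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Keyword tagging: train a model or tag a folder of photos.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    train = sub.add_parser("train", help="Train on the keyword-tagged dataset")
    # Assumed default: the processed data folder from the layout above.
    train.add_argument("--data-dir", default="data/processed")

    infer = sub.add_parser("infer", help="Generate keywords for new photos")
    infer.add_argument("--images", required=True, help="Folder of photos to tag")
    infer.add_argument("--out", default="outputs/tags.csv", help="CSV output path")
    infer.add_argument("--batch-size", type=int, default=500)
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Parse arguments and dispatch; the real training/inference calls are omitted."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        print(f"Would train on {args.data_dir}")
    else:
        print(f"Would tag {args.images} in batches of {args.batch_size} -> {args.out}")
    return 0
```

In a real script, `main()` would be called under the usual `if __name__ == "__main__":` guard, so that an invocation such as `python src/main.py infer --images data/raw` dispatches to the inference path.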

## Deliverables
- Well-documented code in `src/`
- At least one Jupyter notebook showing EDA and model prototyping
- Example CSV output as described above
- Instructions for running the system
- (Optional) Trained model weights

## Deadline
**All deliverables are expected within 3 days of project start.**
@@ -0,0 +1,33 @@
You want to build a custom AI-powered system to automatically generate keyword tags for agricultural stock photos.

You want the system to help eliminate your current manual keyword tagging process, which is currently handled by an assistant and takes about 10 hours/month.

You need to process 1,000 photos per month, in batches of 500, and this number may scale up over time (possibly doubling in 3 years).

You want the system to generate 5 to 10 high-quality keywords per image, with a focus on agricultural relevance.

You want to be able to train the AI using your current keyword-tagged photo dataset, which contains about 30,000 photos.

The system must differentiate subtle ag-specific concepts, such as:

Farmer vs. rancher

Dairy farmer vs. rancher

Chicken farmer (not rancher)

Male vs. female farmers (for diversity tagging)

You want the system to optionally generate a descriptive product title like: “Farmer and son walking in cornfield.”

If location metadata is available in the image file, you want the system to extract and use that data as a keyword (e.g., “Iowa”).

You want the final output in CSV format, with each photo’s file name matched to its:

Human-entered keywords (for comparison, if needed)

AI-generated keywords

AI-generated title (if available)

Location (if available)