updated readme

OwusuBlessing
2025-08-06 22:49:29 +01:00
parent fef3f5ae35
commit fd54d4be39
+46 -46
@@ -2,7 +2,7 @@
A comprehensive framework for fine-tuning NLP models with organized YAML configurations, supporting multiple tasks (classification, completion, styling, matching).
-## 🎯 Supported Tasks
+## Supported Tasks
This framework supports multiple NLP tasks with organized configurations:
@@ -13,60 +13,60 @@ This framework supports multiple NLP tasks with organized configurations:
### Current Implementation Status
-- **Classification**: Fully implemented with emotion classification example
-- 🔄 **Completion**: Planned for future updates
-- 🔄 **Styling**: Planned for future updates
-- 🔄 **Matching**: Planned for future updates
+- **Classification**: Fully implemented with emotion classification example
+- **Completion**: Planned for future updates
+- **Styling**: Planned for future updates
+- **Matching**: Planned for future updates
**Note**: Currently only classification task is supported. Other tasks (completion, styling, matching) are planned for future updates.
-## 🏗️ Project Structure
+## Project Structure
```
fine-tune-task/
├── configs/ # YAML configuration files
-│ ├── classification/ # Implemented
+│ ├── classification/ # Implemented
│ │ ├── emotion.yaml # Emotion classification
│ │ └── custom.yaml # Custom dataset
-│ ├── completion/ # 🔄 Planned for future updates
-│ ├── styling/ # 🔄 Planned for future updates
-│ └── matching/ # 🔄 Planned for future updates
+│ ├── completion/ # Planned for future updates
+│ ├── styling/ # Planned for future updates
+│ └── matching/ # Planned for future updates
├── data/ # Data directories
│ ├── raw/ # Raw input data
-│ │ ├── classification/ # Implemented
-│ │ ├── completion/ # 🔄 Planned for future updates
-│ │ ├── styling/ # 🔄 Planned for future updates
-│ │ └── matching/ # 🔄 Planned for future updates
+│ │ ├── classification/ # Implemented
+│ │ ├── completion/ # Planned for future updates
+│ │ ├── styling/ # Planned for future updates
+│ │ └── matching/ # Planned for future updates
│ └── processed/ # Processed data
-│ ├── classification/ # Implemented
-│ ├── completion/ # 🔄 Planned for future updates
-│ ├── styling/ # 🔄 Planned for future updates
-│ └── matching/ # 🔄 Planned for future updates
+│ ├── classification/ # Implemented
+│ ├── completion/ # Planned for future updates
+│ ├── styling/ # Planned for future updates
+│ └── matching/ # Planned for future updates
├── pipelines/ # Core pipeline scripts
-│ ├── classification/ # Implemented
+│ ├── classification/ # Implemented
│ │ ├── data_processor.py # Data processing
│ │ ├── train.py # Training
│ │ └── inference.py # Inference
-│ ├── completion/ # 🔄 Framework ready
-│ ├── styling/ # 🔄 Framework ready
-│ └── matching/ # 🔄 Framework ready
+│ ├── completion/ # Planned for future updates
+│ ├── styling/ # Planned for future updates
+│ └── matching/ # Planned for future updates
├── scripts/ # User-friendly scripts
-│ ├── classification/ # Implemented
+│ ├── classification/ # Implemented
│ │ ├── data_processor.py # Data processing script
│ │ ├── trainer.py # Training script
│ │ └── inference.py # Inference script
-│ ├── completion/ # 🔄 Framework ready
-│ ├── styling/ # 🔄 Framework ready
-│ └── matching/ # 🔄 Framework ready
+│ ├── completion/ # Planned for future updates
+│ ├── styling/ # Planned for future updates
+│ └── matching/ # Planned for future updates
├── results/ # Model outputs
-│ ├── classification/ # Implemented
-│ ├── completion/ # 🔄 Ready
-│ ├── styling/ # 🔄 Ready
-│ └── matching/ # 🔄 Ready
+│ ├── classification/ # Implemented
+│ ├── completion/ # Planned for future updates
+│ ├── styling/ # Planned for future updates
+│ └── matching/ # Planned for future updates
└── utils/ # Shared utility modules
```
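The layout above repeats one task name under every top-level directory, so the directories for a task can be derived from its name alone. A minimal sketch, assuming the tree shown in this hunk; `task_paths` is a hypothetical helper for illustration and is not part of the repository:

```python
from pathlib import Path

# Task names taken from the README; only "classification" is implemented.
TASKS = ("classification", "completion", "styling", "matching")


def task_paths(task: str, root: str = ".") -> dict:
    """Return the per-task directories implied by the project tree (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    base = Path(root)
    return {
        "configs": base / "configs" / task,
        "raw": base / "data" / "raw" / task,
        "processed": base / "data" / "processed" / task,
        "pipelines": base / "pipelines" / task,
        "scripts": base / "scripts" / task,
        "results": base / "results" / task,
    }


paths = task_paths("classification")
for name, path in paths.items():
    print(f"{name}: {path.as_posix()}")
```

Rejecting unknown task names early keeps a typo from silently creating a new directory family.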
-## 🚀 Quick Start (Classification Task)
+## Quick Start (Classification Task)
### 1. Setup Environment
@@ -93,7 +93,7 @@ ls -la ./data/processed/classification/emotion/classification/
**Expected Output:**
```
-Data processing completed successfully!
+Data processing completed successfully!
Data source: huggingface
Dataset: dair-ai/emotion
Total samples: 2999
@@ -117,7 +117,7 @@ ls -la ./results/classification/emotion_model/
**Expected Output:**
```
-Training completed successfully!
+Training completed successfully!
Model: bert-base-uncased
Data directory: ./data/processed/classification/emotion
Training for 3 epochs with batch size 16
@@ -136,7 +136,7 @@ python scripts/classification/inference.py --config configs/classification/emoti
**Expected Output:**
```
-Inference completed successfully!
+Inference completed successfully!
Loading model from: ./results/classification/emotion_model
Predicted label: joy
Confidence: 0.8542
@@ -146,7 +146,7 @@ python scripts/classification/inference.py --config configs/classification/emoti
- surprise: 0.0224
```
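The label, confidence, and per-class scores in the expected output are consistent with a softmax over the model's per-class logits, with the largest probability reported as the confidence. A minimal sketch under that assumption: the logits are made-up illustration values, the label order follows the `dair-ai/emotion` dataset, and `predict` is a hypothetical function rather than the repository's inference code.

```python
import math

# Label order of the dair-ai/emotion dataset (indices 0..5).
LABELS = ["sadness", "joy", "love", "anger", "fear", "surprise"]


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return (label, confidence, per-label scores) for one example (hypothetical helper)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best], dict(zip(LABELS, probs))


# Made-up logits for illustration only.
label, confidence, scores = predict([0.1, 3.2, 0.9, -0.5, -0.8, -0.4])
print(f"Predicted label: {label}")
print(f"Confidence: {confidence:.4f}")
for name, score in scores.items():
    print(f"- {name}: {score:.4f}")
```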
-## 🔧 Adding New Tasks
+## Adding New Tasks
To add a new task (e.g., completion, styling, matching), follow these steps:
@@ -288,7 +288,7 @@ python scripts/completion/trainer.py --config configs/completion/text_generation
python scripts/completion/inference.py --config configs/completion/text_generation.yaml --input-text "Once upon a time"
```
-## 📋 YAML Configuration Guide
+## YAML Configuration Guide
### Configuration Structure
@@ -335,7 +335,7 @@ inference:
- `configs/classification/emotion.yaml` - Emotion classification with HuggingFace dataset
- `configs/classification/custom.yaml` - Custom dataset processing
-## 🔧 Usage Examples
+## Usage Examples
### Data Processing Examples
@@ -385,7 +385,7 @@ python scripts/classification/inference.py --config configs/classification/emoti
python scripts/classification/inference.py examples
```
-## 🐛 Troubleshooting Common Errors
+## Troubleshooting Common Errors
### 1. ModuleNotFoundError: No module named 'utils'
@@ -405,7 +405,7 @@ python scripts/classification/data_processor.py --config configs/classification/
**Error:**
```
-Model path not found: ./results/classification/emotion_model
+Model path not found: ./results/classification/emotion_model
```
**Solution:**
@@ -421,7 +421,7 @@ python scripts/classification/inference.py --config configs/classification/emoti
**Error:**
```
-Data directory not found: ./data/processed/classification/emotion
+Data directory not found: ./data/processed/classification/emotion
```
**Solution:**
@@ -480,7 +480,7 @@ python scripts/classification/trainer.py --config configs/classification/emotion
python scripts/classification/trainer.py --config configs/classification/emotion.yaml --device cpu
```
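The `--device cpu` override above forces training onto the CPU. How the trainer resolves the device when no override is given is not shown in this diff; one plausible approach, sketched here as an assumption, is to honor an explicit request and otherwise fall back to CPU when CUDA is unavailable. `resolve_device` is a hypothetical helper, and the sketch assumes PyTorch as the backend.

```python
def resolve_device(requested=None):
    """Pick a device string: an explicit request wins, else CUDA if available, else CPU."""
    if requested:
        return requested
    try:
        import torch  # assumed backend; treated as optional in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device("cpu"))  # explicit override, mirrors --device cpu
print(resolve_device())       # automatic choice on this machine
```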
-## 📊 Monitoring and Logs
+## Monitoring and Logs
### Check Processing Status
@@ -510,7 +510,7 @@ tail -f logs/training.log
└── label_info.json # Label mappings
```
-## 🔄 Workflow Summary
+## Workflow Summary
1. **Setup**: Install dependencies and set PYTHONPATH
2. **Data Processing**: Process raw data into organized splits
@@ -518,7 +518,7 @@ tail -f logs/training.log
4. **Inference**: Use trained model for predictions
5. **Monitoring**: Check logs and outputs for errors
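The processing, training, and inference steps of this workflow can be chained from one driver script. A minimal sketch, assuming the script paths and the `--config` flag shown elsewhere in this README; the `--input-text` flag appears here only for the completion script, so using it for classification is an assumption, and `workflow_commands` is a hypothetical helper.

```python
import shlex
import sys

CONFIG = "configs/classification/emotion.yaml"


def workflow_commands(config=CONFIG, text="I am so happy today!"):
    """Build the three pipeline commands in run order (hypothetical helper)."""
    py = sys.executable
    return [
        [py, "scripts/classification/data_processor.py", "--config", config],
        [py, "scripts/classification/trainer.py", "--config", config],
        # --input-text is assumed to work for classification as it does for completion.
        [py, "scripts/classification/inference.py", "--config", config, "--input-text", text],
    ]


for cmd in workflow_commands():
    print(shlex.join(cmd))
    # To execute each step and stop at the first failure, uncomment:
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the commands as argument lists rather than shell strings avoids quoting problems when the input text contains spaces or punctuation.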
-## 📝 Creating Custom Configurations
+## Creating Custom Configurations
### For New Datasets
@@ -560,7 +560,7 @@ data:
python scripts/classification/data_processor.py --config configs/classification/custom.yaml
```
-## 🎯 Best Practices
+## Best Practices
1. **Always check output directories** before running next step
2. **Use small datasets for testing** before full runs
@@ -569,7 +569,7 @@ python scripts/classification/data_processor.py --config configs/classification/
5. **Use version control** for YAML files
6. **Test with CLI overrides** for quick experiments
-## 📞 Support
+## Support
For issues and questions:
1. Check the troubleshooting section above
@@ -579,4 +579,4 @@ For issues and questions:
---
-**Happy fine-tuning! 🚀**
+**Happy fine-tuning!**