updated readme

2025-08-06 22:49:29 +01:00
parent fef3f5ae35
commit fd54d4be39
1 changed files with 46 additions and 46 deletions
@@ -2,7 +2,7 @@

 A comprehensive framework for fine-tuning NLP models with organized YAML configurations, supporting multiple tasks (classification, completion, styling, matching).

-## 🎯 Supported Tasks
+## Supported Tasks

 This framework supports multiple NLP tasks with organized configurations:

@@ -13,60 +13,60 @@ This framework supports multiple NLP tasks with organized configurations:

 ### Current Implementation Status

- ✅ **Classification**: Fully implemented with emotion classification example
- 🔄 **Completion**: Planned for future updates
- 🔄 **Styling**: Planned for future updates
- 🔄 **Matching**: Planned for future updates
+- **Classification**: Fully implemented with emotion classification example
+- **Completion**: Planned for future updates
+- **Styling**: Planned for future updates
+- **Matching**: Planned for future updates

 **Note**: Currently, only the classification task is supported. The other tasks (completion, styling, matching) are planned for future updates.

-## 🏗️ Project Structure
+## Project Structure

 ```
 fine-tune-task/
 ├── configs/                    # YAML configuration files
-│   ├── classification/         # ✅ Implemented
+│   ├── classification/         # Implemented
 │   │   ├── emotion.yaml       # Emotion classification
 │   │   └── custom.yaml        # Custom dataset
-│   ├── completion/             # 🔄 Planned for future updates
-│   ├── styling/               # 🔄 Planned for future updates
-│   └── matching/              # 🔄 Planned for future updates
+│   ├── completion/             # Planned for future updates
+│   ├── styling/               # Planned for future updates
+│   └── matching/              # Planned for future updates
 ├── data/                       # Data directories
 │   ├── raw/                    # Raw input data
-│   │   ├── classification/     # ✅ Implemented
-│   │   ├── completion/         # 🔄 Planned for future updates
-│   │   ├── styling/           # 🔄 Planned for future updates
-│   │   └── matching/          # 🔄 Planned for future updates
+│   │   ├── classification/     # Implemented
+│   │   ├── completion/         # Planned for future updates
+│   │   ├── styling/           # Planned for future updates
+│   │   └── matching/          # Planned for future updates
 │   └── processed/              # Processed data
-│       ├── classification/     # ✅ Implemented
-│       ├── completion/         # 🔄 Planned for future updates
-│       ├── styling/           # 🔄 Planned for future updates
-│       └── matching/          # 🔄 Planned for future updates
+│       ├── classification/     # Implemented
+│       ├── completion/         # Planned for future updates
+│       ├── styling/           # Planned for future updates
+│       └── matching/          # Planned for future updates
 ├── pipelines/                  # Core pipeline scripts
-│   ├── classification/         # ✅ Implemented
+│   ├── classification/         # Implemented
 │   │   ├── data_processor.py  # Data processing
 │   │   ├── train.py          # Training
 │   │   └── inference.py      # Inference
-│   ├── completion/            # 🔄 Framework ready
-│   ├── styling/              # 🔄 Framework ready
-│   └── matching/             # 🔄 Framework ready
+│   ├── completion/            # Planned for future updates
+│   ├── styling/              # Planned for future updates
+│   └── matching/             # Planned for future updates
 ├── scripts/                    # User-friendly scripts
-│   ├── classification/         # ✅ Implemented
+│   ├── classification/         # Implemented
 │   │   ├── data_processor.py  # Data processing script
 │   │   ├── trainer.py        # Training script
 │   │   └── inference.py      # Inference script
-│   ├── completion/            # 🔄 Framework ready
-│   ├── styling/              # 🔄 Framework ready
-│   └── matching/             # 🔄 Framework ready
+│   ├── completion/            # Planned for future updates
+│   ├── styling/              # Planned for future updates
+│   └── matching/             # Planned for future updates
 ├── results/                    # Model outputs
-│   ├── classification/         # ✅ Implemented
-│   ├── completion/            # 🔄 Ready
-│   ├── styling/              # 🔄 Ready
-│   └── matching/             # 🔄 Ready
+│   ├── classification/         # Implemented
+│   ├── completion/            # Planned for future updates
+│   ├── styling/              # Planned for future updates
+│   └── matching/             # Planned for future updates
 └── utils/                      # Shared utility modules
 ```

-## 🚀 Quick Start (Classification Task)
+## Quick Start (Classification Task)

 ### 1. Setup Environment

@@ -93,7 +93,7 @@ ls -la ./data/processed/classification/emotion/classification/

 **Expected Output:**
 ```
-✅ Data processing completed successfully!
+Data processing completed successfully!
  Data source: huggingface
  Dataset: dair-ai/emotion
  Total samples: 2999
@@ -117,7 +117,7 @@ ls -la ./results/classification/emotion_model/

 **Expected Output:**
 ```
-✅ Training completed successfully!
+Training completed successfully!
  Model: bert-base-uncased
  Data directory: ./data/processed/classification/emotion
  Training for 3 epochs with batch size 16
@@ -136,7 +136,7 @@ python scripts/classification/inference.py --config configs/classification/emoti

 **Expected Output:**
 ```
-✅ Inference completed successfully!
+Inference completed successfully!
  Loading model from: ./results/classification/emotion_model
  Predicted label: joy
  Confidence: 0.8542
@@ -146,7 +146,7 @@ python scripts/classification/inference.py --config configs/classification/emoti
    - surprise: 0.0224
 ```
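A confidence value and per-label scores like those in the expected output are commonly obtained by applying a softmax to the classifier's raw logits. The sketch below illustrates that idea only; it is not taken from `scripts/classification/inference.py`, whose internals are not shown here. The function names and the logit values are hypothetical, and the label order is assumed to follow the `dair-ai/emotion` dataset card.

```python
import math

# Assumed label order (from the dair-ai/emotion dataset card); logits below are made up.
LABELS = ["sadness", "joy", "love", "anger", "fear", "surprise"]


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(logits, labels=LABELS):
    """Return (predicted label, confidence, per-label scores sorted high to low)."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    label, confidence = ranked[0]
    return label, confidence, ranked


if __name__ == "__main__":
    # Hypothetical logits for a single input sentence.
    label, confidence, ranked = summarize([0.1, 3.2, 1.0, -0.5, -0.8, -0.4])
    print(f"Predicted label: {label}")
    print(f"Confidence: {confidence:.4f}")
    for name, score in ranked:
        print(f"  - {name}: {score:.4f}")
```

Because the scores are probabilities, they sum to 1, which gives a quick sanity check when reading the inference output.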

-## 🔧 Adding New Tasks
+## Adding New Tasks

 To add a new task (e.g., completion, styling, matching), follow these steps:

@@ -288,7 +288,7 @@ python scripts/completion/trainer.py --config configs/completion/text_generation
 python scripts/completion/inference.py --config configs/completion/text_generation.yaml --input-text "Once upon a time"
 ```

-## 📋 YAML Configuration Guide
+## YAML Configuration Guide

 ### Configuration Structure

@@ -335,7 +335,7 @@ inference:
 - `configs/classification/emotion.yaml` - Emotion classification with HuggingFace dataset
 - `configs/classification/custom.yaml` - Custom dataset processing

-## 🔧 Usage Examples
+## Usage Examples

 ### Data Processing Examples

@@ -385,7 +385,7 @@ python scripts/classification/inference.py --config configs/classification/emoti
 python scripts/classification/inference.py examples
 ```

-## 🐛 Troubleshooting Common Errors
+## Troubleshooting Common Errors

 ### 1. ModuleNotFoundError: No module named 'utils'

@@ -405,7 +405,7 @@ python scripts/classification/data_processor.py --config configs/classification/

 **Error:**
 ```
-❌ Model path not found: ./results/classification/emotion_model
+Model path not found: ./results/classification/emotion_model
 ```

 **Solution:**
@@ -421,7 +421,7 @@ python scripts/classification/inference.py --config configs/classification/emoti

 **Error:**
 ```
-❌ Data directory not found: ./data/processed/classification/emotion
+Data directory not found: ./data/processed/classification/emotion
 ```

 **Solution:**
@@ -480,7 +480,7 @@ python scripts/classification/trainer.py --config configs/classification/emotion
 python scripts/classification/trainer.py --config configs/classification/emotion.yaml --device cpu
 ```

-## 📊 Monitoring and Logs
+## Monitoring and Logs

 ### Check Processing Status

@@ -510,7 +510,7 @@ tail -f logs/training.log
 └── label_info.json   # Label mappings
 ```

-## 🔄 Workflow Summary
+## Workflow Summary

 1. **Setup**: Install dependencies and set PYTHONPATH
 2. **Data Processing**: Process raw data into organized splits
@@ -518,7 +518,7 @@ tail -f logs/training.log
 4. **Inference**: Use trained model for predictions
 5. **Monitoring**: Check logs and outputs for errors
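
The data processing, training, and inference steps above can be chained from a small driver script. The following is a minimal sketch, not part of the repository: `build_pipeline` and `run_pipeline` are hypothetical helpers, and the only flag assumed is the `--config` option that the scripts accept in the examples in this README. The inference script may need further arguments that are not modeled here.

```python
import subprocess
import sys

# Script names as listed under scripts/classification/ in the project structure.
STEPS = ["data_processor.py", "trainer.py", "inference.py"]


def build_pipeline(config_path, task="classification"):
    """Return one command per workflow step, in execution order."""
    return [
        [sys.executable, f"scripts/{task}/{name}", "--config", config_path]
        for name in STEPS
    ]


def run_pipeline(config_path, dry_run=True):
    """Print each command; run it too unless dry_run is set."""
    for cmd in build_pipeline(config_path):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError, so a failed step stops the chain.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline("configs/classification/emotion.yaml", dry_run=True)
```

Running with `dry_run=True` only prints the commands, which makes it easy to confirm paths before starting a long training run.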

-## 📝 Creating Custom Configurations
+## Creating Custom Configurations

 ### For New Datasets

@@ -560,7 +560,7 @@ data:
 python scripts/classification/data_processor.py --config configs/classification/custom.yaml
 ```

-## 🎯 Best Practices
+## Best Practices

 1. **Always check output directories** before running the next step
 2. **Use small datasets for testing** before full runs
@@ -569,7 +569,7 @@ python scripts/classification/data_processor.py --config configs/classification/
 5. **Use version control** for YAML files
 6. **Test with CLI overrides** for quick experiments
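
The first practice, checking output directories before the next step, can be automated with a short pre-flight check. This is a sketch under stated assumptions: `missing_paths`, `preflight`, and the `REQUIRED` mapping are hypothetical names, and the two directories are the ones that appear in the troubleshooting error messages for the emotion example.

```python
from pathlib import Path

# Directories each step is assumed to depend on (taken from the emotion example).
REQUIRED = {
    "train": ["./data/processed/classification/emotion"],
    "inference": ["./results/classification/emotion_model"],
}


def missing_paths(paths):
    """Return the paths from the given list that do not exist on disk."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


def preflight(step):
    """Report missing inputs for a step; return True when everything is present."""
    missing = missing_paths(REQUIRED[step])
    for path in missing:
        print(f"Missing before '{step}': {path}")
    return not missing


if __name__ == "__main__":
    for step in REQUIRED:
        status = "ok" if preflight(step) else "blocked"
        print(f"{step}: {status}")
```

A `blocked` result for `train` suggests rerunning data processing, and for `inference` it suggests that training has not produced a model yet.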

-## 📞 Support
+## Support

 For issues and questions:
 1. Check the troubleshooting section above
@@ -579,4 +579,4 @@ For issues and questions:

 ---

-**Happy fine-tuning! 🚀**
+**Happy fine-tuning!**