# Adriana James Marketing Assistant AI

This project fine-tunes a language model to generate marketing content in the voice and style of Adriana James, based on her book content, past campaigns, and style guidelines.

## Project Structure

- `generate_dataset.py`: Script to generate fine-tuning datasets from book content, past campaigns, and style guidelines
- `finetune_model.py`: Script to fine-tune the model using the generated datasets
- `data/`: Directory containing source data
  - `book.pdf`: Adriana James' book content
  - `past_campaigns/`: Directory containing past marketing campaigns
  - `style_guidelines/`: Directory containing brand style guidelines
- `datasets/`: Directory containing generated fine-tuning datasets
- `adriana_model/`: Directory containing the fine-tuned model

## Setup

1. Install the required dependencies:

   ```
   pip install -r requirements.txt
   ```

2. Generate the fine-tuning datasets:

   ```
   python generate_dataset.py
   ```

   This will create the following datasets in the `datasets/` directory:

   - `stage1_book_content.json`: Dataset for fine-tuning on book content
   - `stage2_marketing_content.json`: Dataset for fine-tuning on marketing content
   - `stage3_style_alignment.json`: Dataset for fine-tuning on style alignment
   - `combined_dataset.json`: Combined dataset for all stages

## Fine-tuning the Model

The fine-tuning process follows a progressive approach with three stages:

1. **Stage 1**: Fine-tune on book content to establish Adriana James' core voice
2. **Stage 2**: Fine-tune on marketing content to adapt to marketing formats
3. **Stage 3**: Fine-tune on style alignment to ensure style consistency

### Running the Fine-tuning Script

To run the complete progressive fine-tuning process:

```
python finetune_model.py --stage all
```

To run a specific stage:

```
python finetune_model.py --stage 1  # Fine-tune on book content only
python finetune_model.py --stage 2  # Fine-tune on marketing content only
python finetune_model.py --stage 3  # Fine-tune on style alignment only
```

### Command-line Arguments

- `--model_name`: Base model to fine-tune (default: "mistralai/Mistral-7B-v0.1")
- `--output_dir`: Directory to save the fine-tuned model (default: "adriana_model")
- `--stage`: Fine-tuning stage (choices: "1", "2", "3", "all", default: "all")
- `--num_epochs`: Number of epochs for each stage (default: 3)
- `--seed`: Random seed for reproducibility (default: 42)

## Model Selection

The default base model is Mistral-7B-v0.1, which offers a good balance between performance and resource requirements. For better results, you can use larger models such as:

- `meta-llama/Llama-2-13b-hf` (requires access)
- `tiiuae/falcon-40b` (larger model with good performance)
- `google/flan-t5-xxl` (good for instruction following; note that this is an encoder-decoder model, so it loads with `AutoModelForSeq2SeqLM` rather than the `AutoModelForCausalLM` class used in the usage example below)

To use a different model, specify it with the `--model_name` argument:

```
python finetune_model.py --model_name tiiuae/falcon-40b
```

## Hardware Requirements

Fine-tuning large language models requires significant computational resources:

- **Minimum**: 16GB GPU RAM (for 7B parameter models)
- **Recommended**: 24GB+ GPU RAM (for 13B+ parameter models)
- **Optimal**: Multiple GPUs or a high-end GPU with 40GB+ RAM

For models larger than 7B parameters, you may need to use techniques like:

- 8-bit quantization (already enabled in the script)
- Gradient checkpointing
- LoRA or QLoRA fine-tuning

## Using the Fine-tuned Model

After fine-tuning, the model will be saved in the `adriana_model/final` directory. You can load and use it with the Transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the output directory
model_path = "adriana_model/final"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Generate content
prompt = "Write a marketing email for a professional development workshop."
inputs = tokenizer(prompt, return_tensors="pt")

# max_length counts the prompt tokens as well as the generated ones
outputs = model.generate(**inputs, max_length=200, num_return_sequences=1)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```

## License

This project is licensed under the MIT License. See the LICENSE file for details.
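
## Appendix: Sketch of the Command-line Interface

The options listed under Command-line Arguments above could be declared roughly as follows. This is a minimal sketch that does not reproduce the actual contents of `finetune_model.py`: the flag names, choices, and defaults come from this README, while `STAGE_DATASETS`, `build_parser`, `stages_to_run`, and `plan` are hypothetical names introduced for illustration, and the pairing of each stage with one dataset file is an assumption based on the file names listed under Setup.

```python
import argparse
from typing import List, Optional

# Assumed pairing of stages with the dataset files named in the Setup section.
STAGE_DATASETS = {
    "1": "stage1_book_content.json",
    "2": "stage2_marketing_content.json",
    "3": "stage3_style_alignment.json",
}


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(description="Progressive fine-tuning (sketch)")
    parser.add_argument("--model_name", default="mistralai/Mistral-7B-v0.1",
                        help="Base model to fine-tune")
    parser.add_argument("--output_dir", default="adriana_model",
                        help="Directory to save the fine-tuned model")
    parser.add_argument("--stage", choices=["1", "2", "3", "all"], default="all",
                        help="Fine-tuning stage to run")
    parser.add_argument("--num_epochs", type=int, default=3,
                        help="Number of epochs for each stage")
    parser.add_argument("--seed", type=int, default=42,
                        help="Random seed for reproducibility")
    return parser


def stages_to_run(stage: str) -> List[str]:
    """Expand "all" into the three stages in order; otherwise return one stage."""
    return sorted(STAGE_DATASETS) if stage == "all" else [stage]


def plan(argv: Optional[List[str]] = None) -> List[str]:
    """Describe which dataset each selected stage would train on."""
    args = build_parser().parse_args(argv)
    return [
        f"stage {s}: datasets/{STAGE_DATASETS[s]} for {args.num_epochs} epochs"
        for s in stages_to_run(args.stage)
    ]


if __name__ == "__main__":
    # Explicit argument list so the demo does not depend on sys.argv.
    for line in plan(["--stage", "all"]):
        print(line)
```

Passing an explicit list to `plan`, for example `plan(["--stage", "2", "--num_epochs", "5"])`, makes it easy to check how a given command line would be interpreted before starting a long training run.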