enrichment of questions

multiple area tags
fix
2025-07-16 12:46:14 +01:00 · 2025-07-08 15:20:10 +01:00 · 2025-06-13 19:32:33 +01:00 · 2025-06-13 19:12:08 +01:00 · 2025-06-13 19:09:44 +01:00 · 2025-06-13 19:06:18 +01:00
21 changed files with 1476 additions and 229 deletions
@@ -0,0 +1,301 @@
+# Enrich Questions API Documentation
+
+## Overview
+
+The `enrich-questions` endpoint is a reverse API that takes existing questions and assigns them to specific areas and members. This endpoint returns the exact same response structure as `generate_questions_from_sop_v3`. Each question is assigned to the most relevant `area_tag` using OpenAI analysis and to one member from the supplied list (currently the first one).
+
+## Endpoint
+
+```
+POST /api/v1/common/enrich-questions
+```
+
+## Authentication
+
+Requires Bearer token authentication:
+
+```
+Authorization: Bearer <your-api-key>
+```
+
+## Request Format
+
+### Headers
+
+```
+Content-Type: application/json
+Authorization: Bearer <your-api-key>
+```
+
+### Request Body
+
+The request body should be a JSON array of question objects. Each question object must contain:
+
+- `question` (string): The question text
+- `role` (string): The role associated with the question
+- `position_id` (integer): The position ID (used as role ID in response)
+- `area_tags` (array): Array of area tag objects with `name` and `id` (OpenAI selects the most relevant one)
+- `members` (array): Array of member objects with `id` (one member is selected; currently the first in the array)
+
+### Example Request
+
+```json
+[
+  {
+    "question": "Is the system monitoring working properly?",
+    "role": "IT Expert",
+    "position_id": 522,
+    "area_tags": [
+      {
+        "name": "IT Operations",
+        "id": 1276
+      },
+      {
+        "name": "Communication & Coordination",
+        "id": 1426
+      },
+      {
+        "name": "Quality Assurance",
+        "id": 1427
+      }
+    ],
+    "members": [
+      {
+        "id": 159
+      }
+    ]
+  },
+  {
+    "question": "Are safety protocols being followed?",
+    "role": "IT Expert",
+    "position_id": 522,
+    "area_tags": [
+      {
+        "name": "IT Operations",
+        "id": 1276
+      },
+      {
+        "name": "Safety Protocols",
+        "id": 1436
+      }
+    ],
+    "members": [
+      {
+        "id": 159
+      }
+    ]
+  }
+]
+```
+
+## Response Format
+
+### Success Response (200 OK)
+
+The response structure is identical to `generate_questions_from_sop_v3`. Each question is assigned to ONE area_tag and ONE member:
+
+```json
+{
+  "questions": {
+    "items": [
+      {
+        "area_tag": 1276,
+        "area_name": "IT Operations",
+        "assigned_to": 159,
+        "questions": "Is the system monitoring working properly?",
+        "role": 522
+      },
+      {
+        "area_tag": 1436,
+        "area_name": "Safety Protocols",
+        "assigned_to": 159,
+        "questions": "Are safety protocols being followed?",
+        "role": 522
+      }
+    ]
+  }
+}
+```
+
+### Response Structure Explanation
+
+- Each question creates exactly ONE item in the response
+- OpenAI analyzes the question content and selects the most relevant `area_tag` from available options
+- One `member` is selected from the available members (currently the first one in the array)
+- `area_tag`: The OpenAI-selected area tag ID
+- `area_name`: The OpenAI-selected area tag name
+- `assigned_to`: The selected member ID
+- `questions`: The question text
+- `role`: The position_id from the request (used as role identifier)
+
+## AI-Powered Assignment Algorithm
+
+### OpenAI Area Tag Selection
+
+The system uses OpenAI's GPT-4o-mini model to intelligently analyze each question and select the most relevant area tag:
+
+1. **Content Analysis**: OpenAI analyzes the question content, context, and meaning
+2. **Domain Matching**: Determines which area/domain the question is actually testing or assessing
+3. **Relevance Scoring**: Considers the purpose and intent of the question
+4. **Smart Selection**: Chooses the most specific and primary area tag from available options
+5. **Fallback**: If OpenAI analysis fails, defaults to the first available area tag
+
+**OpenAI Prompt Guidelines:**
+
+- Analyze question content and context
+- Match questions to appropriate area tags based on meaning and purpose
+- Consider what domain/area the question is actually testing
+- Choose only ONE area tag per question - the most relevant one
+- If multiple areas seem relevant, choose the most specific or primary one
+
+### Member Selection
+
+Currently uses a simple selection algorithm (first member), but can be enhanced to consider:
+
+- Member skills and expertise
+- Current workload distribution
+- Availability and capacity
+- Historical performance
+
+### Error Responses
+
+#### 400 Bad Request - Invalid Input Format
+
+```json
+{
+  "error": "Invalid input",
+  "message": "Input data must be in JSON format."
+}
+```
+
+#### 400 Bad Request - Missing Required Fields
+
+```json
+{
+  "error": "Invalid data",
+  "message": "Question object at index 0 is missing required field 'question'."
+}
+```
+
+#### 400 Bad Request - Invalid Array Structure
+
+```json
+{
+  "error": "Invalid data",
+  "message": "Input data must be an array of question objects."
+}
+```
+
+#### 401 Unauthorized
+
+```json
+{
+  "error": "Unauthorized",
+  "message": "API key is missing or invalid."
+}
+```
+
+#### 500 Internal Server Error
+
+```json
+{
+  "error": "Internal Server Error",
+  "message": "An unexpected error occurred."
+}
+```
+
+## Usage Examples
+
+### Basic Usage
+
+```bash
+curl -X POST "http://localhost:5402/api/v1/common/enrich-questions" \
+  -H "Authorization: Bearer your-api-key" \
+  -H "Content-Type: application/json" \
+  -d '[
+    {
+      "question": "Is the system performance being monitored?",
+      "role": "Developer",
+      "position_id": 123,
+      "area_tags": [
+        {"name": "Development", "id": 1},
+        {"name": "Performance Monitoring", "id": 2}
+      ],
+      "members": [
+        {"id": 456}
+      ]
+    }
+  ]'
+```
+
+### Python Example
+
+```python
+import requests
+
+url = "http://localhost:5402/api/v1/common/enrich-questions"
+headers = {
+    "Authorization": "Bearer your-api-key",
+    "Content-Type": "application/json"
+}
+
+payload = [
+    {
+        "question": "Is the system performance being monitored?",
+        "role": "Developer",
+        "position_id": 123,
+        "area_tags": [
+            {"name": "Development", "id": 1},
+            {"name": "Performance Monitoring", "id": 2}
+        ],
+        "members": [
+            {"id": 456}
+        ]
+    }
+]
+
+response = requests.post(url, json=payload, headers=headers)
+result = response.json()
+print(result)
+```
+
+## Validation Rules
+
+1. **Input must be a JSON array** of question objects
+2. **Each question object must contain all required fields**:
+   - `question`: Non-empty string
+   - `role`: Non-empty string
+   - `position_id`: Integer
+   - `area_tags`: Array of objects with `name` and `id`
+   - `members`: Array of objects with `id`
+3. **Area tags must be valid objects** with both `name` (string) and `id` (integer/string)
+4. **Members must be valid objects** with `id` (integer/string)
+5. **Arrays can be empty** but must be present
+
+## Response Logic
+
+The endpoint uses AI to intelligently assign each question to the most relevant area and member:
+
+- **Input**: 2 questions with multiple area_tags and members each
+- **Output**: 2 items (one per question) with the best area_tag and member selected for each
+- **AI Analysis**: OpenAI analyzes question content and meaning to find the most relevant area_tag
+- **Smart Assignment**: Uses natural language understanding to make intelligent assignments
+- **No Cartesian Product**: Each question gets exactly one area assignment and one member assignment
+
+## Performance Considerations
+
+- **Batch Processing**: OpenAI analysis is performed in batches for efficiency
+- **Caching**: Consider implementing caching for frequently assigned questions
+- **Fallback**: Robust fallback mechanisms ensure the endpoint always returns valid assignments
+- **Error Handling**: Comprehensive error handling for OpenAI API failures
+
+## Integration with Existing System
+
+This endpoint complements the existing question generation APIs:
+
+- `POST /api/v1/qs/generate_questions_from_sop` - Generates questions from SOPs
+- `POST /api/v1/qs/generate_questions_from_sop-latest` - Enhanced question generation
+- `POST /api/v1/common/enrich-questions` - Enriches existing questions (NEW)
+
+The enrich-questions endpoint returns the **exact same structure** as `generate_questions_from_sop_v3`, with AI-powered intelligent assignment of questions to the most relevant areas and members, making it seamlessly interchangeable in your application workflow.
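The assignment logic described in the document above (one area tag chosen by the model, fallback to the first tag, first member selected) can be sketched as follows. This is a minimal illustration, not the service's actual code: `pick_area_tag`, `enrich_one`, and the `classify` callable (a stand-in for the OpenAI call) are hypothetical names, and returning `None` for empty arrays is an assumption, since the document does not say what happens in that case.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical stand-in for the OpenAI call: receives the question text and the
# candidate tags, returns the id of the chosen tag.
Classifier = Callable[[str, List[Dict[str, Any]]], Any]


def pick_area_tag(question: str,
                  area_tags: List[Dict[str, Any]],
                  classify: Optional[Classifier] = None) -> Optional[Dict[str, Any]]:
    """Return exactly one area tag, falling back to the first available tag."""
    if not area_tags:
        return None  # assumption: empty input yields no assignment
    if classify is not None:
        try:
            chosen_id = classify(question, area_tags)
            for tag in area_tags:
                if tag["id"] == chosen_id:
                    return tag
        except Exception:
            pass  # any classifier failure triggers the documented fallback
    return area_tags[0]


def enrich_one(item: Dict[str, Any],
               classify: Optional[Classifier] = None) -> Dict[str, Any]:
    """Build one response item in the documented shape for one question."""
    tag = pick_area_tag(item["question"], item["area_tags"], classify)
    members = item["members"]
    return {
        "area_tag": tag["id"] if tag else None,
        "area_name": tag["name"] if tag else None,
        "assigned_to": members[0]["id"] if members else None,  # first member
        "questions": item["question"],
        "role": item["position_id"],  # position_id doubles as the role id
    }
```

Passing the classifier in as a parameter keeps the fallback path testable without network access.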
@@ -4,3 +4,4 @@ langchain-openai
 pydantic
 flask
 python-dotenv
+reportlab
@@ -3,6 +3,7 @@ from flask import Flask
 from src.api.routes.sops import sops_bp
 from src.api.routes.questions import qs_b
 from src.api.routes.chatbot import bot
+from src.api.routes.common import common_bp

 def create_app():
    app = Flask(__name__)
@@ -11,6 +12,7 @@ def create_app():
    app.register_blueprint(sops_bp, url_prefix='/api/v1/sop')
    app.register_blueprint(qs_b,url_prefix='/api/v1/qs')
    app.register_blueprint(bot,url_prefix='/api/v1/bot')
+    app.register_blueprint(common_bp, url_prefix='/api/v1/common')

    # Set up the upload folder configuration inside the src directory
    UPLOAD_FOLDER = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../../uploads')
@@ -0,0 +1,53 @@
+import os
+from flask import Blueprint, request, jsonify
+from src.utils.auth import auth_check
+from src.services.question_enrichment import QuestionEnrichmentService
+import json
+
+# Initialize the Blueprint
+common_bp = Blueprint('common', __name__)
+
+@common_bp.route('/enrich-questions', methods=['POST'])
+@auth_check
+def enrich_questions():
+    """
+    Reverse API endpoint that takes questions and assigns them to areas and members.
+    Returns the exact same structure as generate_questions_from_sop_v3.
+    Expected payload: Array of question objects with question, role, position_id, area_tags, and members.
+    
+    Example payload:
+    [
+        {
+            "question": "Minor",
+            "role": "IT Expert",
+            "position_id": 522,
+            "area_tags": [
+                {"name": "IT Operations", "id": 1276},
+                {"name": "Communication & Coordination", "id": 1426}
+            ],
+            "members": [
+                {"id": 159}
+            ]
+        }
+    ]
+    """
+    if not request.is_json:
+        return jsonify({"error": "Invalid input", "message": "Input data must be in JSON format."}), 400
+
+    input_data = request.get_json()
+
+    try:
+        # Initialize the question enrichment service
+        enrichment_service = QuestionEnrichmentService()
+        
+        # Enrich the questions
+        result = enrichment_service.enrich_questions(input_data)
+        
+        if not result['success']:
+            return jsonify({"error": "Invalid data", "message": result['error']}), 400
+        
+        # Return the exact same structure as generate_questions_from_sop_v3
+        return jsonify({"questions": result['questions']}), 200
+
+    except Exception as e:
+        return jsonify({"error": "Internal Server Error", "message": str(e)}), 500 
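The route above depends on `QuestionEnrichmentService.enrich_questions` returning a dict with `success` and `error` keys, but the service is not part of this excerpt. A minimal sketch of the validation half of that contract, assuming the rules and messages listed in the API document; `validate_questions` and `REQUIRED_FIELDS` are hypothetical names, and the non-object message is invented for illustration:

```python
from typing import Any, Dict

# Required keys, in the order the API document lists them.
REQUIRED_FIELDS = ("question", "role", "position_id", "area_tags", "members")


def validate_questions(payload: Any) -> Dict[str, Any]:
    """Return {'success': bool, 'error': str | None} for a request payload."""
    if not isinstance(payload, list):
        return {"success": False,
                "error": "Input data must be an array of question objects."}
    for index, item in enumerate(payload):
        if not isinstance(item, dict):
            # Assumed message: the document gives no wording for this case.
            return {"success": False,
                    "error": f"Question object at index {index} must be a JSON object."}
        for field in REQUIRED_FIELDS:
            if field not in item:
                return {"success": False,
                        "error": (f"Question object at index {index} is missing "
                                  f"required field '{field}'.")}
    return {"success": True, "error": None}
```

With this shape, every validation failure reaches the client through the route's single `"Invalid data"` branch.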
@@ -81,6 +81,10 @@ def generate_questions_from_sop_v3():
            "duration": input_data['duration']
        }
        
+        # Add area_tags if provided in the payload (optional)
+        if 'area_tags' in input_data:
+            generator_input['area_tags'] = input_data['area_tags']
+
        # Generate questions using the QuestionGenerator
        generator = QuestionsGeneratorV2()
        questions_response = generator.generate_questions_for_all(generator_input)
@@ -7,7 +7,11 @@ from src.utils.auth import auth_check

 from src.utils.utils import delete_all_files_in_directory
 from src.utils.document_loader import load_document  
-from flask import Blueprint, jsonify, request, make_response
+import os
+import tempfile
+import datetime
+from flask import Blueprint, jsonify, request, make_response, send_file, after_this_request
 import json
 # Initialize the Blueprint
 sops_bp = Blueprint('sops', __name__)
@@ -29,6 +33,7 @@ def get_roles():
        return jsonify({"error": "No file part", "message": "Please upload a file with the key 'document'."}), 400

    file = request.files['document']
+    role_slug = request.form.get('role_slug')

    # If the user does not select a file, the browser may also submit an empty part without filename
    if file.filename == '':
@@ -48,7 +53,7 @@ def get_roles():
            
            # Generate roles from the docs
            parser = DocumentParser()
-            roles = parser.get_roles(docs)["roles"]
+            roles = parser.get_roles_using_slug(docs,role_slug)["roles"]
            
            # Cleanup: Delete all files in the upload directory after processing
            delete_all_files_in_directory(upload_folder)
@@ -70,6 +75,7 @@ def get_roles():
 def get_roles_questionnaire():
    # Check if the post request has the file part
    questionnaire_data = request.json
+    role_slug = questionnaire_data.get("role_slug")

    # Validate the required fields in the questionnaire data
    if not questionnaire_data.get('questionnaire_response'):
@@ -80,7 +86,7 @@ def get_roles_questionnaire():
    
    generator = SopPersonalAssessment()

-    roles = generator.generate_roles_from_questionnaire(questionnaire_data)
+    roles = generator.generate_roles_from_questionnaire(questionnaire_data,role_slug)

    if not roles:
        return jsonify({"error": "No roles found", "message": "No roles were extracted from the questionnaire."}), 404
@@ -88,8 +94,6 @@ def get_roles_questionnaire():
    return jsonify({"roles": roles, "message": "Roles successfully extracted from the questionnaire."}), 200


-
-
@sops_bp.route('/personal_assessment/generate_sops_from_doc', methods=['POST'])
@auth_check
 def generate_sops():
@@ -225,6 +229,62 @@ def generate_sops_by_roles_and_areas():



+@sops_bp.route('/general/generate_sops_pdf', methods=['POST']) 
+@auth_check 
+def generate_sops_pdf(): 
+    """ 
+    Generate a PDF file of SOPs based on the SOP JSON data provided in the request body.
+    Returns the PDF as a downloadable file and then deletes the temporary file.
+    """ 
+    try: 
+        # Get the SOP JSON data from the request body 
+        sop_data = request.json
+        
+        # Validate the presence of SOP data
+        if not sop_data or not isinstance(sop_data, dict) or 'sop_details' not in sop_data: 
+            return make_response(jsonify({
+                "error": "Invalid input", 
+                "message": "The request body should contain valid SOP data with a 'sop_details' field."
+            }), 400)
+        
+        # Create a unique temporary filename for the PDF
+        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
+        temp_dir = tempfile.gettempdir()
+        pdf_filename = f"sop_document_{timestamp}.pdf"
+        pdf_path = os.path.join(temp_dir, pdf_filename)
+        
+        # Import the PDF conversion function
+        from ...utils.sop_pdf_creator import convert_sop_to_pdf
+        
+        # Generate the PDF file
+        convert_sop_to_pdf(sop_data, pdf_path, "")
+        
+        # Send the file as a download attachment
+        @after_this_request
+        def remove_file(response):
+            try:
+                # Delete the temporary file after the response is sent
+                if os.path.exists(pdf_path):
+                    os.remove(pdf_path)
+            except Exception as e:
+                print(f"Error removing temporary PDF file: {str(e)}")
+            return response
+        
+        # Return the file as a downloadable attachment
+        return send_file(
+            pdf_path,
+            as_attachment=True,
+            download_name=f"SOP_Document_{timestamp}.pdf",
+            mimetype='application/pdf'
+        )
+ 
+    except Exception as e: 
+        print(f"Error generating PDF: {str(e)}")
+        return make_response(jsonify({ 
+            "error": "Processing error", 
+            "message": f"An error occurred while generating the SOP PDF: {str(e)}" 
+        }), 500)
+    

@sops_bp.route('/executive/generate_sop_mission_from_vision', methods=['POST'])
@auth_check
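One note on `generate_sops_pdf`: the timestamp in the temporary filename has one-second resolution, so two requests arriving in the same second would write to the same path. A possible alternative, sketched with the standard library's `tempfile.mkstemp`, which asks the OS for a unique file; `make_temp_pdf_path` is a hypothetical helper name, not something in this PR:

```python
import os
import tempfile


def make_temp_pdf_path(prefix: str = "sop_document_") -> str:
    """Create an empty, uniquely named .pdf file and return its path."""
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".pdf")
    os.close(fd)  # close the descriptor; the PDF writer reopens the path itself
    return path
```

The returned path could be passed to `convert_sop_to_pdf` in place of `pdf_path`, with the existing `after_this_request` cleanup left unchanged.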
@@ -312,11 +372,9 @@ def generate_executive_goals_from_doc():
    if 'document' not in request.files:
        return jsonify({"error": "No file part", "message": "Please upload a file with the key 'document'."}), 400
    
-    if 'departments' not in request.form:
-        return jsonify({"error": "department missing", "message": "Please provide departments'."}), 400
   
    try:
-        departments = request.form.get('departments')
+        departments = request.form.get('departments','[]')
        # Manually load roles from the string to JSON
        departments = json.loads(departments)
        
@@ -351,17 +409,8 @@ def generate_executive_goals_from_doc():
            if not vision_mission:
                return jsonify({"error": "Vision and Mission generation error ", "message": "Error in generating mssion and viso."}), 400
            
-            # Check if both vision and mission are empty
-            if not vision_mission.get('vision') and not vision_mission.get('mission'):
-                # Cleanup: Delete all files in the upload directory if parsing fails
-                delete_all_files_in_directory(upload_folder)
-                return jsonify({"vision": [], "mission": [], "message": "The document does not contain mission and vision."}), 200
-
-            print(f"Vision and mission: {vision_mission}")
-            vission = vision_mission.get('vision')
-            mission = vision_mission.get('mission')
-            
-            return jsonify({"mission": mission, "vission": vission, "message": "vision and mission generated successfully"}), 200
+            response = json.loads(vision_mission.get("response"))
+            return jsonify({"data": response, "message": "vision and mission generated successfully"}), 200

        except Exception as e:
            # Cleanup: Delete all files in the upload directory if an error occurs
@@ -371,8 +420,6 @@ def generate_executive_goals_from_doc():
    return jsonify({"error": "File type not allowed", "message": "The uploaded file type is not allowed. Please upload a PDF, DOC, or DOCX file."}), 400


-
-
@sops_bp.route('/executive/generate_sop_managers_doc', methods=['POST'])
@auth_check
 def generate_sop_managers_doc():
@@ -466,10 +513,9 @@ def generate_vision_goals_quest():
        
        sop_generator = SopGeneratorExecutive()
        vision_mission = sop_generator.generate_vision_mission_from_questionnaire(questionnaire_data)
-        vission = vision_mission.get('vision')
-        mission = vision_mission.get('mission')
+        data = json.loads(vision_mission.get('response'))
        
-        return jsonify({"mission": mission, "vission": vission, "message": "vision and mission generated successfully"}), 200
+        return jsonify({"data": data, "message": "vision and mission generated successfully"}), 200

    except Exception as e:
        return make_response(jsonify({"error": "Processing error", "message": f"An error occurred while processing the request: {str(e)}"}), 500)
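Both vision/mission routes now call `json.loads(vision_mission.get("response"))`. If the key is missing, `json.loads(None)` raises `TypeError`, and malformed model output raises `json.JSONDecodeError`; either would surface as a 500. A defensive parse could look like the sketch below, where `parse_model_response` is a hypothetical helper and returning `None` on failure is an assumed convention:

```python
import json
from typing import Any, Dict, Optional


def parse_model_response(vision_mission: Optional[Dict[str, Any]]) -> Optional[Any]:
    """Decode the JSON string under 'response'; return None if absent or invalid."""
    raw = (vision_mission or {}).get("response")
    if not isinstance(raw, str) or not raw.strip():
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None
```

A caller could then return a 4xx response with a clear message when the result is `None`, rather than relying on the generic exception handler.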
@@ -1,17 +1,3 @@
-'''from pydantic import BaseModel
-from typing import List, Dict
-
-class Question(BaseModel):
-    assigned_to: str
-    role: str
-    questions: str
-    area_tag:str
-
-class AssementQuestion(BaseModel):
-    number: int
-    questions: List[Question]'''
-
-
 from pydantic import BaseModel
 from typing import List, Dict
 from typing import Optional
@@ -33,5 +19,8 @@ class Questions(BaseModel):
 class AssessmentQuestions(BaseModel):
    questions: Questions

+class AllQuestionsItems(BaseModel):
+    items: List[Question]
+
 class AllQuestions(BaseModel):
-    questions : List[Question]
+    questions: AllQuestionsItems
@@ -10,6 +10,7 @@ class Categories(BaseModel):

 class RoleSops(BaseModel):
    role:str
+    narrative:str
    sops:Categories

 class RoleSopssLists(BaseModel):
@@ -35,9 +36,17 @@ class DeptMisiion(BaseModel):
    departments: str
    goals: List[str]

-class VisionMissionResponse2(BaseModel):
-    vision: List[str]
+class VisionMissionResponse3(BaseModel):
    mission: List[str]
+    vision: List[str]
+    
+class VisionMissionResponse2(BaseModel):
+    response: List[str]
+
+class VisionMissionResponseV3(BaseModel):
+    response: str
+
+

 class VisionMissionResponse(BaseModel):
    vision: List[str]
@@ -236,8 +236,25 @@ def get_questions_prompt_v5():
     NOTE: !!! MAKE SURE YOU CORRECTLY ATTACH "assigned_to" AS THE ID OF THE MEMBER OF THE ROLE AS STATED IN THE SOP. CHECK MEMBERS UNDER THE ROLE IN THE PROVIDED SOP AND USE THE CORRECT ID OF THE MEMBER, DO NOT USE MEMBER iD THAT IS NOT PROVIDED AS "assigned_to" pls FOLLOW THIS STRICTLY!!!
     NOTE: CHECK THE "role_id" UNDER THE PROVIDED SOP AND USE THE CORRECT ID OF THE ROLE, DO NOT USE OR FORMULATE "role_id" THAT IS NOT PROVIDED AS "role" pls FOLLOW THIS STRICTLY!!!

-     NOTE: IF area tags is not provided for specicfic role SOPS, kindly formulate an area name based on the sop and make the area_tag null but forumlate rea name, only do this if area tags info is not provided for specific role sops
-     NOTE: Use exactly the area names provided if available unless area tags is missing and you need to forumalate one
+     NOTE: IF area tags is not provided for specific role SOPS, kindly formulate an area name based on the sop and make the area_tag null but formulate area name, only do this if area tags info is not provided for specific role sops
+     NOTE: Use exactly the area names provided if available unless area tags is missing and you need to formulate one
+     
+     CRITICAL AREA DIVERSIFICATION REQUIREMENT:
+     When area_tags are not provided or the area_tags array is empty, you MUST generate questions across AT LEAST 2-3 DIFFERENT area_names. Do NOT generate all questions under a single area_name. Analyze the SOPs and create diverse area_names such as:
+     - "Communication & Coordination"
+     - "Process Compliance"  
+     - "Quality Assurance"
+     - "Timeline Management"
+     - "Technical Execution"
+     - "Safety Protocols"
+     - "Documentation & Reporting"
+     - "Resource Management"
+     - "Performance Monitoring"
+     - "Risk Assessment"
+     - "Training & Development"
+     - "Customer Relations"
+     
+     MANDATORY: Distribute your questions across these different area_names to ensure comprehensive coverage of the organizational processes. Each area_name should have multiple questions assigned to it. Do not concentrate all questions in one area - this defeats the purpose of comprehensive assessment.

    """
    return prompt
@@ -15,19 +15,35 @@ def get_sop_extraction_from_doc():


 def get_roles_extraction_from_questionnaire():
-    return '''Your task is to extract the "Roles" from the provided questionnaire responses.
-    You must identify and categorize the roles based on the information provided.
+    return '''
+
+    You are a specialized role extractor for company documents. Your task is to identify and extract job roles/positions mentioned in the provided questionnaire data.
+
+    TASK:
+    1. Extract ALL job roles/positions mentioned in the questionnaire response data as a list.
+    2. Filter the extracted roles based on the provided role_slug.
+    3. Return the filtered roles as a JSON list.
+
+    RULES:
+    - Return an empty list if no matching roles are found.
+    - The role_slug is a keyword or category used to filter relevant roles.
+    - Only include roles that semantically relate to the role_slug.
+    - Be precise in extracting official job titles rather than general descriptions.
+
+    EXAMPLES:
+
+    Example 1:
+    Text: "Our company is looking to hire a Senior Data Scientist, Junior Data Analyst, and Database Administrator for the Analytics department. We also have openings for Financial Manager and Customer Support Manager."
+    Role_slug: "data"
+    Expected output: ["Senior Data Scientist", "Junior Data Analyst", "Database Administrator"]
+
+    Example 2:
+    Text: "The restructuring process will affect several departments including the Financial Analysis team, Customer Relations department, and Sales Management. We are currently seeking a Regional Sales Manager, Sales Team Supervisor, and Customer Support Manager."
+    Role_slug: "manager"
+    Expected output: ["Regional Sales Manager", "Customer Support Manager"]


-    Instructions:
-    1. **Roles**: Extract the roles mentioned in the questionnaire.
-    2. **Vision**: If applicable, extract the vision of the company or organization as it relates to the roles.
-    3. **Mission**: If applicable, extract the mission of the company or organization as it relates to the roles.
-    4. **Role-specific SOPs**: 
-        - Identify any role-specific Standard Operating Procedures (SOPs) mentioned in the questionnaire.
-        - If SOPs for the role are not explicitly stated, infer them from the context, but only if there is clear evidence within the questionnaire. Do not generate or assume SOPs that are not directly supported by the information provided.
-        - If no roles or SOPs are found, return an empty list for each category.
-    Provide the extracted roles and any relevant sections exactly as they appear in the questionnaire.'''
+    '''


 def get_sop_personalassessment_from_questionnaire():
@@ -89,6 +105,7 @@ def get_sop_personalassessment_from_area_rolev2():
            NOTE: IF AREAS ARE NOT PROVIDED (AREA IS "NOT PROVIDED"), INTUITIVELY PROVIDE THE SOP BASED ON THE ROLE NAME.
            NOTE: MAKE SURE SOPS ARE NOT MISSING FOR THE PROVIDED TYPES.
            NOTE: FOR SOP TYPES NOT SELECTED RETURN AN EMPTY LIST and not "null"  E.G IF "SHALL" AND "WILL" ARE SELECTED BUT "MUST" IS NOT AMONG, MUST WILL BE AN EMPTY LIST
+            NOTE: FOR EACH ROLE, ALSO ADD A "narrative", WHICH IS A SHORT DESCRIPTION OF THE ROLE IN QUESTION.

           NOTE !!!: IF A ROLE POINTS TO A SPECIFIC SOP TYPE (E.G., "SHALL" AND "MUST"), THESE TWO MUST NEVER BE EMPTY FOR THAT ROLE.
                   : FORMAT: SOPS SHOULD BE CLEAR, DIRECT, AND CONCISE. EACH ROLE SHOULD HAVE 5-7 BULLET POINTS PER SOP TYPE ("WILL," "SHALL," OR "MUST"). FOR COMPLEX ROLES, EACH SOP TYPE MAY HAVE A MAXIMUM OF 7-10 BULLET POINTS, NOT TOTAL ACROSS ALL TYPES, BUT PER SOP TYPE.
@@ -153,9 +170,52 @@ def get_vision_mission_extraction_from_doc2():
            8. If vision and mission is found in the document , extract them as it is ,no changes
        NOTED: if the goal(mission) and vision cant not be found at all, make it empty please
        NOTE: MAKE SURE YOU EXTRACT EVERY INFORMATION FOUND FOR VISION AND GOALS FROM THE DOCUMENT.DO NOT OMIT ANY
+**You must return the response in the exact HTML `<p>` and `<b>` format shown below, including the numbering, lettered sub-points, `<br>` tags for line breaks, and the double `<br><br>` between departments. Adhere to this format precisely.**
+
+**Instructions:**
+- If **no departments are explicitly mentioned**, assume all content applies to all departments mentioned in the document.
+- Group goals by department when possible.
+- Each goal should have a **title** and **a short description**.
+- Format your output with:
+    - An **HTML section** for rendering on the frontend.
+    - A **structured JSON section** with department-wise goal breakdown.
+
+**Example Output Format:**
+
+{
+  "html": [
+        "<p>Vision: To create safe, broadly beneficial AI systems that ensure the advantages of artificial general intelligence (AGI) are shared equitably across society. Our vision emphasizes the importance of safety and alignment, ensuring that AI systems align with human values and remain under human control. We aspire to collaborate with other research and policy institutions to address global challenges associated with advanced AI, prioritizing the safe development of increasingly capable systems that act in the best interests of humanity.</p>",
+        "<p>Company Goals:</p><p> <b> 1. Audit Department: </b> <br>    a. Enhance Risk Management and Internal Controls: Strengthen the organization’s risk posture by identifying, assessing, and recommending improvements to internal controls.<br>    b. Ensure Regulatory and Policy Compliance: Monitor adherence to relevant laws, regulations, standards, and internal policies.<br>    c. Support Organizational Governance: Provide independent assurance to senior management and the board on the effectiveness of governance processes.<br>    d. Drive Operational Efficiency: Identify inefficiencies and areas for process improvement during audits.<br>    e. Leverage Technology and Data Analytics: Use automated tools and analytics for continuous auditing and real-time risk monitoring.<br>    f. Develop Audit Talent and Capabilities: Invest in training and upskilling for the audit team.<br>    g. Enhance Stakeholder Communication: Improve reporting clarity, timeliness, and relevance to stakeholders.<br><br> <b> 2. Finance Department: </b> <br>    a. Ensure Financial Stability and Sustainability: Maintain strong cash flow and optimize working capital.<br>    b. Improve Financial Planning and Analysis (FP&A): Provide accurate forecasting and budgeting to support strategic decision-making.<br>    c. Enhance Cost Management and Operational Efficiency: Identify cost-saving opportunities and drive efficient use of financial resources.<br>    d. Ensure Regulatory and Compliance Integrity: Maintain full compliance with financial regulations and internal controls.<br>    e. Enable Strategic Investment and Capital Allocation: Evaluate and fund initiatives that align with the organization’s long-term growth strategy.<br>    f. Improve Financial Reporting and Transparency: Deliver timely and accurate financial reports for stakeholders.<br>    g. Leverage Financial Technology and Automation: Implement tools to streamline financial operations.<br>    h. Develop Finance Talent and Leadership: Upskill finance staff and promote cross-functional knowledge.<br><br> <b> 3. Account Department:  </b> <br>    a. Maintain Accurate and Timely Financial Records: Ensure all transactions are recorded properly.<br>    b. Ensure Regulatory and Tax Compliance: Comply with all financial regulations and timely tax filings.<br>    c. Streamline Month-End and Year-End Closing Processes: Reduce the time and complexity of closing periods.<br>    d. Support Strategic Financial Planning: Provide reliable data for budgeting and forecasting.<br>    e. Implement Automation and Digital Tools: Use accounting software and RPA to increase efficiency.<br>    f. Strengthen Internal Controls and Risk Management: Ensure robust controls over financial transactions.<br>    g. Improve Financial Transparency and Reporting Quality: Produce clear and consistent reports for stakeholders.<br>    h. Develop Accounting Team Skills and Expertise: Provide continuous learning and leadership development for accounting staff.<br></p>"
+    ],
+  
+  "department_goals": [
+    {
+      "department": "Account Management",
+      "goals": [
+        {
+          "goal_title": "Customer Satisfaction",
+          "goal_description": "Manage accounts effectively to enhance customer satisfaction and retention."
+        }
+      ]
+    },
+    {
+      "department": "Finance",
+      "goals": [
+        {
+          "goal_title": "Financial Stability",
+          "goal_description": "Finance the company to ensure long-term sustainability and growth."
+        }
+      ]
+    }
+  ]
+}
+
+NOTE, VERY CRITICAL: FOLLOW THIS RESPONSE FORMAT STRICTLY , NOTHING BEFORE OR AFTER PLEASE !!!
+
    """


+        
 ''' def get_sop_executive_for_managers():
    return Your task is to extract the "Vision", "Mission", and executive-generated Standard Operating Procedures (SOPs) specifically for managers from the provided document.
    
@@ -175,7 +235,7 @@ def get_vision_mission_extraction_from_doc2():
 def get_vision_mission_extraction_from_questionnaire_executive():
    return """
    
-    You are provided with an organization's response from a questionnaire, and your role is to extract the vision and mission (also called goals) from the questionnaire response:
+    You are provided with an organization's response from a questionnaire, and your role is to extract the vision and mission (also called goals) for each of the departments found in the questionnaire response:
    
    - Generate the vision (at least one paragraph) of the organization based on the questionnaire, and
    - Generate the goals (mission) of the company based on the provided departmental goals and overall questionnaire response
@@ -191,6 +251,49 @@ def get_vision_mission_extraction_from_questionnaire_executive():
    
    NOTE: If the goal and mission of a can not be gotten from the questionaire response, make it empty.
    NOTE: Ensure you extract every piece of information found for the vision and goals from the questionnaire. DO NOT OMIT ANYTHING.
+   
+    NOTE: Group the goals based on the departments found in the questions; see the example response below, pointing to sales, marketing and product development.
+    NOTE: ADHERE STRICTLY TO THIS OUTPUT FORMAT , DO NOT CHANGE IT PLEASE
+ **Example Output Format:** Two texts (one for the vision and one for the goals)
+
+- Format your output with:
+    - An **HTML section** for rendering on the frontend.
+    - A **structured JSON section** with department-wise goal breakdown.
+
+**Example Output Format:**
+
+{
+  "html": [
+        "<p>Vision: To create safe, broadly beneficial AI systems that ensure the advantages of artificial general intelligence (AGI) are shared equitably across society. Our vision emphasizes the importance of safety and alignment, ensuring that AI systems align with human values and remain under human control. We aspire to collaborate with other research and policy institutions to address global challenges associated with advanced AI, prioritizing the safe development of increasingly capable systems that act in the best interests of humanity.</p>",
+        "<p>Company Goals:</p><p> <b> 1. Audit Department: </b> <br>    a. Enhance Risk Management and Internal Controls: Strengthen the organization’s risk posture by identifying, assessing, and recommending improvements to internal controls.<br>    b. Ensure Regulatory and Policy Compliance: Monitor adherence to relevant laws, regulations, standards, and internal policies.<br>    c. Support Organizational Governance: Provide independent assurance to senior management and the board on the effectiveness of governance processes.<br>    d. Drive Operational Efficiency: Identify inefficiencies and areas for process improvement during audits.<br>    e. Leverage Technology and Data Analytics: Use automated tools and analytics for continuous auditing and real-time risk monitoring.<br>    f. Develop Audit Talent and Capabilities: Invest in training and upskilling for the audit team.<br>    g. Enhance Stakeholder Communication: Improve reporting clarity, timeliness, and relevance to stakeholders.<br><br> <b> 2. Finance Department: </b> <br>    a. Ensure Financial Stability and Sustainability: Maintain strong cash flow and optimize working capital.<br>    b. Improve Financial Planning and Analysis (FP&A): Provide accurate forecasting and budgeting to support strategic decision-making.<br>    c. Enhance Cost Management and Operational Efficiency: Identify cost-saving opportunities and drive efficient use of financial resources.<br>    d. Ensure Regulatory and Compliance Integrity: Maintain full compliance with financial regulations and internal controls.<br>    e. Enable Strategic Investment and Capital Allocation: Evaluate and fund initiatives that align with the organization’s long-term growth strategy.<br>    f. Improve Financial Reporting and Transparency: Deliver timely and accurate financial reports for stakeholders.<br>    g. Leverage Financial Technology and Automation: Implement tools to streamline financial operations.<br>    h. 
Develop Finance Talent and Leadership: Upskill finance staff and promote cross-functional knowledge.<br><br> <b> 3. Account Department: </b> <br>    a. Maintain Accurate and Timely Financial Records: Ensure all transactions are recorded properly.<br>    b. Ensure Regulatory and Tax Compliance: Comply with all financial regulations and timely tax filings.<br>    c. Streamline Month-End and Year-End Closing Processes: Reduce the time and complexity of closing periods.<br>    d. Support Strategic Financial Planning: Provide reliable data for budgeting and forecasting.<br>    e. Implement Automation and Digital Tools: Use accounting software and RPA to increase efficiency.<br>    f. Strengthen Internal Controls and Risk Management: Ensure robust controls over financial transactions.<br>    g. Improve Financial Transparency and Reporting Quality: Produce clear and consistent reports for stakeholders.<br>    h. Develop Accounting Team Skills and Expertise: Provide continuous learning and leadership development for accounting staff.<br></p>"
+    ],
+    
+  "vision": "To create safe, broadly beneficial AI systems that ensure the advantages of artificial general intelligence (AGI) are shared equitably across society. Our vision emphasizes the importance of safety and alignment, ensuring that AI systems align with human values and remain under human control. We aspire to collaborate with other research and policy institutions to address global challenges associated with advanced AI, prioritizing the safe development of increasingly capable systems that act in the best interests of humanity.",
+  
+  "department_goals": [
+    {
+      "department": "Account Management",
+      "goals": [
+        {
+          "goal_title": "Customer Satisfaction",
+          "goal_description": "Manage accounts effectively to enhance customer satisfaction and retention."
+        }
+      ]
+    },
+    {
+      "department": "Finance",
+      "goals": [
+        {
+          "goal_title": "Financial Stability",
+          "goal_description": "Finance the company to ensure long-term sustainability and growth."
+        }
+      ]
+    }
+  ]
+}
+
+NOTE: FOLLOW THIS RESPONSE FORMAT STRICTLY , NOTHING BEFORE OR AFTER PLEASE !!!
+
    """


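The example output format above pairs an `html` list with a `vision` string and a `department_goals` array. A minimal sketch of how a caller might check that shape after parsing the model's reply, using only the standard library. The function name `check_vision_goals_payload` is hypothetical and not part of this change; the real code appears to rely on a Pydantic model (`VisionMissionResponseV3`) whose definition is not shown here.

```python
import json
from typing import Any, Dict, List


def check_vision_goals_payload(raw: str) -> List[str]:
    """Return a list of problems found in the reply; an empty list means the shape looks right."""
    problems: List[str] = []
    try:
        data: Dict[str, Any] = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    if not isinstance(data.get("html"), list):
        problems.append("'html' must be a list of HTML strings")
    if not isinstance(data.get("vision"), str):
        problems.append("'vision' must be a string")
    departments = data.get("department_goals")
    if not isinstance(departments, list):
        problems.append("'department_goals' must be a list")
        return problems
    for i, dept in enumerate(departments):
        if not isinstance(dept, dict) or "department" not in dept:
            problems.append(f"department_goals[{i}] needs a 'department' name")
            continue
        goals = dept.get("goals")
        if not isinstance(goals, list):
            problems.append(f"department_goals[{i}].goals must be a list")
            continue
        for j, goal in enumerate(goals):
            # Each goal is expected to carry both a title and a description
            if not isinstance(goal, dict) or not {"goal_title", "goal_description"} <= goal.keys():
                problems.append(f"department_goals[{i}].goals[{j}] needs goal_title and goal_description")
    return problems
```

Returning a list of problems rather than raising makes it easy to log every mismatch at once when the model drifts from the requested format.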
@@ -1,5 +1,6 @@
 import os
 import json
+import re
 from openai import OpenAI
 from pydantic import BaseModel, Field
 from typing import List, Dict, Optional
@@ -80,9 +81,14 @@ class Chatbot:
        self.client = OpenAI(api_key=self.api_key)
        self.model = "gpt-4o-mini"
        
+    def clean_text(self, text):
+        # Remove all surrogate characters
+        return re.sub(r'[\uD800-\uDFFF]', '', text)
+
    def _extract_text_from_docs(self, docs):
        """Extract text content from document objects."""
-        return [doc.page_content for doc in docs]
+        print(docs)
+        return [self.clean_text(doc.page_content) for doc in docs]
    # Existing methods...

    def validate_worker(self, question, docs) -> VisionMissionResponse:
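The `clean_text` helper in the hunk above strips lone UTF-16 surrogate code points, which cannot be encoded as UTF-8 and can therefore break serialization of extracted document text. A small standalone sketch of the same regular expression; the module-level name `strip_surrogates` is made up for illustration.

```python
import re

# Same character class as in clean_text: the UTF-16 surrogate range
_SURROGATES = re.compile(r'[\uD800-\uDFFF]')


def strip_surrogates(text: str) -> str:
    """Remove lone surrogate code points, leaving all other characters intact."""
    return _SURROGATES.sub('', text)


dirty = "Quarterly\ud83d report"

# A lone surrogate cannot be encoded as UTF-8
try:
    dirty.encode("utf-8")
    encodable_before = True
except UnicodeEncodeError:
    encodable_before = False

clean = strip_surrogates(dirty)
clean.encode("utf-8")  # succeeds once the surrogate is gone
```

Properly formed characters outside the Basic Multilingual Plane (for example emoji) are single code points in a Python `str`, so they are not affected by this pattern.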
@@ -0,0 +1,271 @@
+import os
+from typing import List, Dict, Any
+from datetime import datetime
+import json
+import random
+from openai import OpenAI
+from dotenv import load_dotenv
+
+load_dotenv()
+
+class QuestionEnrichmentService:
+    """
+    Service class to handle question enrichment with area and member assignments.
+    This is the reverse of question generation - it takes existing questions and assigns them to areas and members.
+    """
+    
+    def __init__(self):
+        self.api_key = os.getenv("OPENAI_API_KEY")
+        self.client = OpenAI(api_key=self.api_key)
+        self.model = "gpt-4o-mini"
+    
+    def validate_question_object(self, question_obj: Dict[str, Any], index: int) -> Dict[str, Any]:
+        """
+        Validate a single question object structure.
+        
+        Args:
+            question_obj: The question object to validate
+            index: The index of the question object in the array (for error messages)
+            
+        Returns:
+            Dict with 'valid' boolean and 'error' message if invalid
+        """
+        required_fields = ['question', 'role', 'position_id', 'area_tags', 'members']
+        
+        for field in required_fields:
+            if field not in question_obj:
+                return {
+                    'valid': False,
+                    'error': f"Question object at index {index} is missing required field '{field}'."
+                }
+        
+        # Validate area_tags structure
+        if not isinstance(question_obj['area_tags'], list):
+            return {
+                'valid': False,
+                'error': f"Question object at index {index}: 'area_tags' must be an array."
+            }
+        
+        for area_idx, area_tag in enumerate(question_obj['area_tags']):
+            if not isinstance(area_tag, dict) or 'name' not in area_tag or 'id' not in area_tag:
+                return {
+                    'valid': False,
+                    'error': f"Question object at index {index}: area_tag at index {area_idx} must have 'name' and 'id' fields."
+                }
+        
+        # Validate members structure
+        if not isinstance(question_obj['members'], list):
+            return {
+                'valid': False,
+                'error': f"Question object at index {index}: 'members' must be an array."
+            }
+        
+        for member_idx, member in enumerate(question_obj['members']):
+            if not isinstance(member, dict) or 'id' not in member:
+                return {
+                    'valid': False,
+                    'error': f"Question object at index {index}: member at index {member_idx} must have 'id' field."
+                }
+        
+        return {'valid': True}
+    
+    def _get_question_area_assignment_prompt(self):
+        """
+        Get the prompt for OpenAI to assign questions to the most relevant area tags.
+        """
+        return """
+        You are an expert at analyzing questions and determining which area/domain they belong to.
+        
+        Your task is to analyze each question and assign it to the most relevant area tag from the provided list.
+        
+        Guidelines:
+        1. Analyze the question content and context
+        2. Match the question to the most appropriate area tag based on its meaning and purpose
+        3. Consider what domain/area the question is actually testing or assessing
+        4. Choose only ONE area tag per question - the most relevant one
+        5. If multiple areas seem relevant, choose the most specific or primary one
+        
+        Return your response as a JSON object with the question text as key and the selected area tag ID as value.
+        
+        Example format:
+        {
+            "Is the system monitoring working properly?": 1276,
+            "Are safety protocols being followed?": 1436
+        }
+        """
+    
+    def _use_openai_for_area_assignment(self, questions_data: List[Dict[str, Any]]) -> Dict[str, int]:
+        """
+        Use OpenAI to intelligently assign questions to the most relevant area tags.
+        
+        Args:
+            questions_data: List of question objects
+            
+        Returns:
+            Dict mapping question text to selected area tag ID
+        """
+        try:
+            # Prepare the data for OpenAI
+            questions_info = []
+            all_area_tags = {}
+            
+            for question_obj in questions_data:
+                question_text = question_obj['question']
+                area_tags = question_obj['area_tags']
+                
+                questions_info.append({
+                    "question": question_text,
+                    "available_area_tags": area_tags
+                })
+                
+                # Collect all unique area tags
+                for area_tag in area_tags:
+                    all_area_tags[area_tag['id']] = area_tag['name']
+            
+            # Create the prompt content
+            prompt_content = f"""
+            Questions to analyze and assign:
+            {json.dumps(questions_info, indent=2)}
+            
+            Available area tags:
+            {json.dumps(all_area_tags, indent=2)}
+            
+            For each question, select the most relevant area tag ID from its available_area_tags list.
+            """
+            
+            response = self.client.chat.completions.create(
+                model=self.model,
+                messages=[
+                    {"role": "system", "content": self._get_question_area_assignment_prompt()},
+                    {"role": "user", "content": prompt_content}
+                ],
+                temperature=0.1,
+                max_tokens=1000
+            )
+            
+            # Parse the response
+            response_content = response.choices[0].message.content
+            
+            # Try to extract JSON from the response
+            try:
+                # Look for the outermost JSON object in the response
+                start_idx = response_content.find('{')
+                end_idx = response_content.rfind('}')
+                # Both braces must be present and in the right order
+                if start_idx != -1 and end_idx > start_idx:
+                    json_str = response_content[start_idx:end_idx + 1]
+                    assignments = json.loads(json_str)
+                    return assignments
+            except json.JSONDecodeError:
+                # Malformed JSON: fall through to the empty-dict fallback
+                pass
+            
+            # Fallback: return empty dict if parsing fails
+            return {}
+            
+        except Exception as e:
+            print(f"Error in OpenAI area assignment: {e}")
+            return {}
+    
+    def _find_best_area_tag_for_question(self, question_text: str, area_tags: List[Dict], openai_assignments: Dict[str, int]) -> Dict:
+        """
+        Find the most relevant area tag for a given question using OpenAI assignments.
+        
+        Args:
+            question_text: The question text to match
+            area_tags: List of available area tags
+            openai_assignments: OpenAI assignments from batch processing
+            
+        Returns:
+            The most relevant area tag
+        """
+        # First try to use OpenAI assignment
+        if question_text in openai_assignments:
+            selected_area_id = openai_assignments[question_text]
+            for area_tag in area_tags:
+                if area_tag['id'] == selected_area_id:
+                    return area_tag
+        
+        # Fallback to first area tag if OpenAI assignment not found
+        return area_tags[0] if area_tags else None
+    
+    def _select_member_for_question(self, question_text: str, members: List[Dict]) -> Dict:
+        """
+        Select the most appropriate member for a given question.
+        For now, this is a simple selection, but could be enhanced with more logic.
+        
+        Args:
+            question_text: The question text
+            members: List of available members
+            
+        Returns:
+            Selected member
+        """
+        # For now, just select the first member
+        # In a real implementation, this could consider member skills, workload, etc.
+        return members[0] if members else None
+    
+    def enrich_questions(self, questions_data: List[Dict[str, Any]]) -> Dict[str, Any]:
+        """
+        Enrich multiple questions with area and member assignments.
+        Each question gets assigned to ONE area_tag and ONE member based on OpenAI analysis.
+        Returns the exact same structure as generate_questions_from_sop_v3.
+        
+        Args:
+            questions_data: List of question objects to enrich
+            
+        Returns:
+            Dict in the same format as AllQuestions model
+        """
+        # Validate input is a list
+        if not isinstance(questions_data, list):
+            return {
+                'success': False,
+                'error': "Input data must be an array of question objects."
+            }
+        
+        # Validate each question object
+        for idx, question_obj in enumerate(questions_data):
+            validation_result = self.validate_question_object(question_obj, idx)
+            if not validation_result['valid']:
+                return {
+                    'success': False,
+                    'error': validation_result['error']
+                }
+        
+        # Use OpenAI to get intelligent area assignments
+        openai_assignments = self._use_openai_for_area_assignment(questions_data)
+        
+        # Process the enriched questions - each question gets ONE area_tag and ONE member
+        enriched_items = []
+        
+        for question_obj in questions_data:
+            # Find the best area tag for this question using OpenAI
+            best_area_tag = self._find_best_area_tag_for_question(
+                question_obj['question'], 
+                question_obj['area_tags'],
+                openai_assignments
+            )
+            
+            # Select the best member for this question
+            selected_member = self._select_member_for_question(
+                question_obj['question'],
+                question_obj['members']
+            )
+            
+            # Create a single item for this question
+            if best_area_tag and selected_member:
+                item = {
+                    "area_tag": best_area_tag['id'],
+                    "area_name": best_area_tag['name'],
+                    "assigned_to": selected_member['id'],
+                    "questions": question_obj['question'],
+                    "role": question_obj['position_id']  # Using position_id as role ID
+                }
+                enriched_items.append(item)
+        
+        # Return in the exact same format as generate_questions_from_sop_v3
+        return {
+            'success': True,
+            'questions': {
+                'items': enriched_items
+            }
+        } 
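The area assignment step in this service depends on pulling a JSON object out of free-form model output. A minimal standalone sketch of that extraction, with an explicit check that both braces were found and appear in the right order; `extract_json_object` is an illustrative name, not a function in this change.

```python
import json
from typing import Any, Dict


def extract_json_object(text: str) -> Dict[str, Any]:
    """Return the outermost JSON object embedded in text, or {} if none can be parsed."""
    start = text.find('{')
    end = text.rfind('}')
    # str.find / str.rfind return -1 when the brace is absent
    if start == -1 or end <= start:
        return {}
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


reply = 'Here you go:\n{"Is the system monitoring working properly?": 1276}\nDone.'
assignments = extract_json_object(reply)
```

Returning an empty dict on any failure matches the fallback path in the service, where a missing assignment causes the first area tag to be used.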
@@ -16,7 +16,6 @@ load_dotenv()
 class QuestionsGenerator:
    def __init__(self):
        self.api_key = os.getenv("OPENAI_API_KEY")
-        self.client = OpenAI(api_key=self.api_key)
        self.model = "gpt-4o-mini"
    
    def generate_questions(self, input_data: Dict) -> AssessmentQuestions:
@@ -204,6 +203,7 @@ class QuestionsGeneratorV2:
            sops = input_data['sops']
            assessment_type = input_data['assessment_type']
            total_duration = input_data['duration']
+            area_tags = input_data.get('area_tags', [])  # Get area_tags if provided, default to empty list
            
            
            # Chunk the SOPs into smaller pieces
@@ -211,16 +211,24 @@ class QuestionsGeneratorV2:
            docs_text = [sops[i:i + chunk_size] for i in range(0, len(sops), chunk_size)]
            docs = [{"type": "text", "text": text} for text in docs_text]

-
-            response = self.client.beta.chat.completions.parse(
-                model=self.model,
-                messages=[
+            # Prepare messages for the API call
+            messages = [
                {"role": "system", "content": get_questions_prompt_v5()},
                {"role": "user", "content": f"The SOPs are provided below."},
                {"role": "user", "content": json.dumps(docs)},
                {"role": "user", "content": f"Assessment Type: {assessment_type}"},
                {"role": "user", "content": f"Duration: {total_duration}"}
-                ],
+            ]
+            
+            # Add area_tags information if provided
+            if area_tags:
+                messages.append({"role": "user", "content": f"Available Area Tags: {json.dumps(area_tags)}"})
+            else:
+                messages.append({"role": "user", "content": "Area Tags: Not provided (empty array) - Please generate questions across multiple diverse area_names as instructed."})
+
+            response = self.client.beta.chat.completions.parse(
+                model=self.model,
+                messages=messages,
                temperature=0.1,
                response_format=AllQuestions,  # Ensure you specify the correct format
                max_tokens=6000
@@ -106,10 +106,11 @@ class DocumentParser:
        """

        try:
+                MODEL = "gpt-4o"
                docs_text = self._extract_text_from_docs(docs)
                prompt = get_vision_mission_extraction_from_doc2()
                response = self.client.beta.chat.completions.parse(
-                    model=self.model,
+                    model=MODEL,
                    messages=[
                        {
                            "role": "system",
@@ -117,15 +118,15 @@ class DocumentParser:
                        },
                         {
                            "role": "user",
-                            "content": f"Department to consider for the company goals generation{departments}"
+                            "content": f"Department to consider for the company goals generation: {departments}"
                        },
                        {
                            "role": "user",
                            "content": [{"type": "text", "text": text} for text in docs_text],
                        }
                    ],
-                    response_format=VisionMissionResponse2,
-                    max_tokens=10000,
+                    response_format=VisionMissionResponseV3,
+                    max_tokens=8000,
                    temperature=0.1
                )

@@ -200,6 +201,58 @@ class DocumentParser:
        temperature=0.1
        )
    
+        return json.loads(response.choices[0].message.content)
+
+    def get_roles_using_slug(self, docs, role_slug):
+        # Extract the text content from the Document objects
+        docs_text = [doc.page_content for doc in docs]
+        response = self.client.beta.chat.completions.parse(
+        model=self.model,
+        messages=[
+            {
+                "role": "system",
+                "content": f'''You are a specialized role extractor for company documents. Your task is to identify and extract job roles/positions mentioned in the provided text.
+
+    TASK:
+    1. Extract ALL job roles/positions mentioned in the text as a list.
+    2. Filter the extracted roles based on the provided role_slug: "{role_slug}".
+    3. Return the filtered roles as a JSON list.
+
+    RULES:
+    - Return an empty list if no matching roles are found.
+    - The role_slug is a keyword or category used to filter relevant roles.
+    - Only include roles that semantically relate to the role_slug.
+    - Be precise in extracting official job titles rather than general descriptions.
+
+    EXAMPLES:
+
+    Example 1:
+    Text: "Our company is looking to hire a Senior Data Scientist, Junior Data Analyst, and Database Administrator for the Analytics department. We also have openings for Financial Manager and Customer Support Manager."
+    Role_slug: "data"
+    Expected output: ["Senior Data Scientist", "Junior Data Analyst", "Database Administrator"]
+
+    Example 2:
+    Text: "The restructuring process will affect several departments including the Financial Analysis team, Customer Relations department, and Sales Management. We are currently seeking a Regional Sales Manager, Sales Team Supervisor, and Customer Support Manager."
+    Role_slug: "manager"
+    Expected output: ["Regional Sales Manager", "Customer Support Manager"]
+
+    Provide the result as a valid JSON array of strings.
+    ''',
+            },
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "text",
+                        "text": text
+                    } for text in docs_text
+                ]
+            }
+        ],
+        response_format=Roles_response,
+        max_tokens=4096,
+        temperature=0.1
+        )
+
        return json.loads(response.choices[0].message.content)
    
    '''def extract_departments_and_managers(self, docs):
@@ -123,7 +123,7 @@ class SopPersonalAssessment:
            return False

    
-    def generate_roles_from_questionnaire(self, questionnaire_data: List[dict]) -> Roles_response:
+    def generate_roles_from_questionnaire(self, questionnaire_data: List[dict], role_slug: str) -> Roles_response:

        try:
            # List of areas: ["communication", "development", etc.]
@@ -139,10 +139,14 @@ class SopPersonalAssessment:
                    {
                        "role": "user",
                        "content": f'''Questionairre data : {questionnaire_data}''',
+                    },
+                    {
+                        "role": "user",
+                        "content": f'''Role slug to consider : {role_slug}''',
                    }
                ],
                response_format=Roles_response,
-                max_tokens=16000,
+                max_tokens=4096,
                temperature=0.1
            )   
            extracted_text = json.loads(response.choices[0].message.content)
@@ -303,7 +307,7 @@ class SopGeneratorExecutive:
                    {"role": "system", "content": prompt},
                    {"role": "user", "content": f"questionnaire response:\n{user_content}"}
                ],
-                response_format=VisionMissionResponse2,
+                response_format=VisionMissionResponseV3,
                max_tokens=16000,
                temperature=0.1
            )
@@ -1,32 +1,55 @@
 import os
-from spire.doc import Document, FileFormat
-from langchain_community.document_loaders import PyPDFLoader
+from docx import Document as DocxDocument
+from reportlab.lib.pagesizes import letter
+from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
+from reportlab.lib.styles import getSampleStyleSheet
+from langchain_community.document_loaders import PyPDFLoader, UnstructuredWordDocumentLoader
+import pdfplumber
+from langchain_core.documents import Document
+
+def load_pdf_with_pdfplumber(file_path):
+    docs = []
+    with pdfplumber.open(file_path) as pdf:
+        for i, page in enumerate(pdf.pages):
+            # Fall back to an empty string if no text is extracted from the page
+            text = page.extract_text() or ""
+            docs.append(Document(page_content=text, metadata={"page": i}))
+    return docs

 def convert_word_to_pdf(doc_path: str) -> str:
    """
-    Convert a .doc or .docx file to PDF using Spire.Doc.
+    Convert a .docx file to PDF using python-docx and reportlab.
    
    Args:
-        doc_path (str): The path to the .doc or .docx file.
+        doc_path (str): The path to the .docx file.

    Returns:
        str: The path to the converted PDF file.
    """
    pdf_path = os.path.splitext(doc_path)[0] + '.pdf'
    
-    # Create a Document object
-    document = Document()
    # Load the Word document
-    document.LoadFromFile(doc_path)
-    # Save as PDF
-    document.SaveToFile(pdf_path, FileFormat.PDF)
-    document.Close()
+    doc = DocxDocument(doc_path)
+    
+    # Create a PDF
+    pdf = SimpleDocTemplate(pdf_path, pagesize=letter)
+    styles = getSampleStyleSheet()
+    flowables = []
+    
+    # Extract text from paragraphs and add to PDF
+    for para in doc.paragraphs:
+        if para.text:
+            p = Paragraph(para.text, styles['Normal'])
+            flowables.append(p)
+            flowables.append(Spacer(1, 12))
+    
+    # Build the PDF
+    pdf.build(flowables)
    
    return pdf_path

 def load_document(file_path: str):
    """
-    Utility function to load a PDF, DOCX, or DOC file by first converting it to PDF.
+    Utility function to load a PDF, DOCX, or DOC file.

    Args:
        file_path (str): The path to the file to load.
@@ -34,15 +57,24 @@ def load_document(file_path: str):
    Returns:
        List[Document]: A list of Document objects representing the contents of the file.
    """
+    
+    try:
        extension = os.path.splitext(file_path)[1].lower()
        
-    if extension in ['.doc', '.docx']:
-        # Convert .doc or .docx to PDF first
+        if extension == '.docx':
+            # For .docx files, use UnstructuredWordDocumentLoader directly
+            loader = UnstructuredWordDocumentLoader(file_path)
+            return loader.load()
+        elif extension == '.doc':
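+            # Convert .doc to .pdf first (note: convert_word_to_pdf relies on python-docx,
+            # which reads .docx only, so legacy .doc input may fail here)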
            pdf_path = convert_word_to_pdf(file_path)
            loader = PyPDFLoader(pdf_path)
+            return loader.load()
        elif extension == '.pdf':
-        loader = PyPDFLoader(file_path)
+            return load_pdf_with_pdfplumber(file_path)
        else:
            raise ValueError(f"Unsupported file type: {extension}. Only .pdf, .docx, and .doc are supported.")
        
-    return loader.load()
+    except Exception as e:
+        print(f"Error loading document: {str(e)}")
+        return None
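`convert_word_to_pdf` passes paragraph text straight to reportlab's `Paragraph`, which interprets its input as XML-like inline markup, so characters such as `&` and `<` in a source document may be misread or rejected. One possible guard, sketched with the standard library only, is to escape the text first. The helper name `to_paragraph_markup` is hypothetical and not part of this change.

```python
from xml.sax.saxutils import escape


def to_paragraph_markup(text: str) -> str:
    """Escape &, < and > so the text is treated as literal characters, not markup."""
    return escape(text)


markup = to_paragraph_markup("R&D budget < 5% of revenue")
# markup could then be passed on, e.g. Paragraph(markup, styles['Normal'])
```

This keeps the conversion loop unchanged apart from wrapping `para.text` in the helper, at the cost of losing any intentional inline markup in the source text.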
@@ -0,0 +1,244 @@
+import json
+import datetime
+from reportlab.lib.pagesizes import letter
+from reportlab.lib import colors
+from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, PageBreak, Image
+from reportlab.platypus import Frame, PageTemplate, NextPageTemplate
+from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
+from reportlab.lib.enums import TA_CENTER, TA_JUSTIFY, TA_LEFT, TA_RIGHT
+from reportlab.lib.units import inch, cm
+from reportlab.pdfgen import canvas
+
+def header_footer(canvas, doc):
+    """Add the header and footer to each page"""
+    canvas.saveState()
+    
+    # Header
+    header_text = "Standard Operating Procedures"
+    canvas.setFont('Helvetica-Bold', 10)
+    canvas.drawString(72, letter[1] - 40, header_text)
+    
+    # Add a line below the header
+    canvas.setStrokeColor(colors.lightgrey)
+    canvas.line(72, letter[1] - 50, letter[0] - 72, letter[1] - 50)
+    
+    # Footer with page number and date
+    current_date = datetime.datetime.now().strftime("%B %d, %Y")
+    page_num = f"Page {doc.page} | {current_date}"
+    canvas.setFont('Helvetica', 8)
+    canvas.drawString(letter[0] - 150, 40, page_num)
+    
+    # Add a line above the footer
+    canvas.line(72, 50, letter[0] - 72, 50)
+    
+    canvas.restoreState()
+
+def convert_sop_to_pdf(sop_data, output_pdf="sop_document.pdf", company_name="Company Name"):
+    """
+    Convert SOP data to a well-formatted PDF document
+    
+    Args:
+        sop_data (dict or str): SOP data in dictionary format or JSON string
+        output_pdf (str): Output PDF filename
+        company_name (str): Name of the company to display on the cover page
+    """
+    # Parse JSON string if needed
+    if isinstance(sop_data, str):
+        sop_data = json.loads(sop_data)
+    
+    # Extract SOP details
+    sop_details = sop_data.get("sop_details", [])
+    
+    # Create PDF document
+    doc = SimpleDocTemplate(output_pdf, pagesize=letter, 
+                           rightMargin=72, leftMargin=72,
+                           topMargin=90, bottomMargin=72)
+    
+    # Container for PDF elements
+    elements = []
+    
+    # Get styles
+    styles = getSampleStyleSheet()
+    
+    # Define custom styles
+    title_style = ParagraphStyle(
+        'TitleStyle',
+        parent=styles['Heading1'],
+        fontSize=24,
+        alignment=TA_CENTER,
+        spaceAfter=12,
+        fontName='Helvetica-Bold',
+        textColor=colors.darkblue
+    )
+    
+    subtitle_style = ParagraphStyle(
+        'SubtitleStyle',
+        parent=styles['Heading2'],
+        fontSize=16,
+        alignment=TA_CENTER,
+        spaceAfter=36,
+        fontName='Helvetica-Bold',
+        textColor=colors.darkblue
+    )
+    
+    heading2_style = ParagraphStyle(
+        'Heading2Style',
+        parent=styles['Heading2'],
+        fontSize=16,
+        spaceBefore=12,
+        spaceAfter=6,
+        fontName='Helvetica-Bold',
+        textColor=colors.darkblue,
+        borderWidth=0,
+        borderColor=colors.lightgrey,
+        borderPadding=5,
+        borderRadius=5
+    )
+    
+    heading3_style = ParagraphStyle(
+        'Heading3Style',
+        parent=styles['Heading3'],
+        fontSize=12,
+        spaceBefore=6,
+        spaceAfter=3,
+        fontName='Helvetica-Bold',
+        textColor=colors.darkslategray
+    )
+    
+    normal_style = ParagraphStyle(
+        'NormalStyle',
+        parent=styles['Normal'],
+        fontSize=10,
+        alignment=TA_JUSTIFY,
+        leading=14,
+        fontName='Helvetica'
+    )
+    
+    bullet_style = ParagraphStyle(
+        'BulletStyle',
+        parent=normal_style,
+        leftIndent=20,
+        firstLineIndent=-15,
+        spaceBefore=2,
+        spaceAfter=2
+    )
+    
+    # Create cover page
+    elements.append(Spacer(1, 2*inch))
+    elements.append(Paragraph("Standard Operating Procedures", title_style))
+    elements.append(Spacer(1, 0.5*inch))
+    #elements.append(Paragraph(company_name, subtitle_style))
+    elements.append(Spacer(1, 2*inch))
+    
+    # Add current date
+    current_date = datetime.datetime.now().strftime("%B %d, %Y")
+    date_style = ParagraphStyle(
+        'DateStyle',
+        parent=styles['Normal'],
+        fontSize=12,
+        alignment=TA_CENTER,
+        fontName='Helvetica'
+    )
+    elements.append(Paragraph(f"Generated on: {current_date}", date_style))
+    
+    # Add a page break after the cover page
+    elements.append(PageBreak())
+    
+
+    # Process each role
+    for role_data in sop_details:
+        role_name = role_data.get("role", "Unnamed Role")
+        sops = role_data.get("sops", {})
+        narrative = role_data.get("narrative")
+        
+        # Add role heading
+        elements.append(Paragraph(role_name, heading2_style))
+        
+        # Add a small gap after the heading
+        elements.append(Spacer(1, 0.05*inch))
+        
+        # Add narrative unless it is empty or the literal "Narrative" string
+        if narrative and narrative != "Narrative":
+            elements.append(Paragraph("Narrative:", heading3_style))
+            elements.append(Paragraph(narrative, normal_style))
+            elements.append(Spacer(1, 0.2*inch))
+        
+        # Process SOPs
+        for sop_type in ["must", "shall", "will"]:
+            sop_items = sops.get(sop_type, [])
+            
+            if sop_items:
+                # Capitalize the first letter of SOP type and make it bold
+                sop_type_title = sop_type.capitalize()
+                elements.append(Paragraph(f"{sop_type_title}:", heading3_style))
+                
+                # Render each SOP item as a bullet paragraph
+                for item in sop_items:
+                    elements.append(Paragraph(f"• {item}", bullet_style))
+                
+                elements.append(Spacer(1, 0.15*inch))
+        
+        # Add a page break between roles for cleaner separation
+        elements.append(PageBreak())
+    
+    # Build the PDF document with header and footer
+    doc.build(elements, onFirstPage=header_footer, onLaterPages=header_footer)
+    
+    return output_pdf
+
+def main():
+    # Example usage
+    sop_json = """
+    {
+        "sop_details": [
+            {
+                "role": "Sales Manager",
+                "role_id": 140,
+                "sops": {
+                    "must": [],
+                    "shall": [
+                        "Shall develop and implement sales strategies to achieve company targets.",
+                        "Shall conduct regular performance reviews with the sales team to ensure targets are met.",
+                        "Shall provide training and support to sales staff to enhance their skills and performance.",
+                        "Shall analyze market trends and adjust sales strategies accordingly.",
+                        "Shall maintain accurate records of sales activities and customer interactions."
+                    ],
+                    "will": [
+                        "Will lead the sales team to achieve monthly and quarterly sales goals.",
+                        "Will collaborate with marketing to align sales strategies with promotional campaigns.",
+                        "Will report sales performance to upper management on a regular basis.",
+                        "Will identify potential new markets and customer segments for growth.",
+                        "Will foster a positive team environment to motivate sales staff."
+                    ]
+                },
+                "areas": [],
+                "narrative": "The Sales Manager is responsible for leading and developing the sales team to achieve business targets and growth objectives. They will implement effective sales strategies, provide coaching to team members, and maintain strong customer relationships while ensuring all sales activities align with the company's overall goals."
+            },
+            {
+                "role": "Campaign Manager",
+                "role_id": 141,
+                "sops": {
+                    "must": [],
+                    "shall": [],
+                    "will": [
+                        "Will develop and execute marketing campaigns to promote products and services.",
+                        "Will analyze campaign performance metrics to optimize future campaigns.",
+                        "Will collaborate with cross-functional teams to ensure campaign alignment with business objectives.",
+                        "Will manage campaign budgets and ensure effective allocation of resources.",
+                        "Will stay updated on industry trends to inform campaign strategies."
+                    ]
+                },
+                "areas": [],
+                "narrative": "The Campaign Manager oversees the planning, execution, and analysis of marketing campaigns across various channels to drive brand awareness and customer acquisition. They work closely with creative teams, external vendors, and stakeholders to ensure campaigns are effective, on-brand, and deliver measurable results."
+            }
+        ]
+    }
+    """
+    
+    # company_name is accepted but not currently rendered (the cover-page subtitle is commented out)
+    output_file = convert_sop_to_pdf(sop_json, company_name="Strategic Business Solutions, Inc.")
+    print(f"PDF successfully generated: {output_file}")
+
+if __name__ == "__main__":
+    main()
+
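Review note on the bullet and narrative paragraphs above: ReportLab's `Paragraph` treats its text as XML-like inline markup, so SOP items or narratives containing `&`, `<`, or `>` may be misread or rejected. The sketch below shows one way to escape such text before it reaches `Paragraph`. The helper name `to_bullet_markup` is hypothetical and not part of this change; it uses only the standard library.

```python
from xml.sax.saxutils import escape


def to_bullet_markup(item: str) -> str:
    """Escape &, < and > so they render as literal characters, then prefix a bullet.

    Hypothetical helper: the returned string is meant to be passed to
    Paragraph(...) in place of f"• {item}".
    """
    return f"• {escape(item)}"


if __name__ == "__main__":
    print(to_bullet_markup("Shall escalate when margin < 20% & log the exception."))
```

If SOP text is ever expected to carry intentional inline markup (for example `<b>` tags), escaping everything would disable it, so this is a trade-off rather than a drop-in fix.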
@@ -1,140 +1,244 @@
-import requests
-import pandas as pd
-from dotenv import load_dotenv
-load_dotenv()
-import os
+import json
+import datetime
+from reportlab.lib.pagesizes import letter
+from reportlab.lib import colors
+from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, PageBreak, Image
+from reportlab.platypus import Frame, PageTemplate, NextPageTemplate
+from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
+from reportlab.lib.enums import TA_CENTER, TA_JUSTIFY, TA_LEFT, TA_RIGHT
+from reportlab.lib.units import inch, cm
+from reportlab.pdfgen import canvas

-DATA_KEY = os.getenv("AI_DATA_KEY")
-# Constants for API requests
-URL = "https://erpai.mkdlabs.com//v3/api/custom/erpai/common/get-data-ai"
-HEADERS = {
-    "x-project": DATA_KEY # Replace with your actual key
-}
+def header_footer(canvas, doc):
+    """Add the header and footer to each page"""
+    canvas.saveState()
    
-# JSON bodies for API requests
-def create_json_body(area_type, company_id):
-    return {
-        "type": area_type,
-        "options": {
-            "company_id": company_id
+    # Header
+    header_text = "Standard Operating Procedures"
+    canvas.setFont('Helvetica-Bold', 10)
+    canvas.drawString(72, letter[1] - 40, header_text)
+    
+    # Add a line below the header
+    canvas.setStrokeColor(colors.lightgrey)
+    canvas.line(72, letter[1] - 50, letter[0] - 72, letter[1] - 50)
+    
+    # Footer with page number and date
+    current_date = datetime.datetime.now().strftime("%B %d, %Y")
+    page_num = f"Page {doc.page} | {current_date}"
+    canvas.setFont('Helvetica', 8)
+    canvas.drawString(letter[0] - 150, 40, page_num)
+    
+    # Add a line above the footer
+    canvas.line(72, 50, letter[0] - 72, 50)
+    
+    canvas.restoreState()
+
+def convert_sop_to_pdf(sop_data, output_pdf="sop_document.pdf", company_name="Company Name"):
+    """
+    Convert SOP data to a well-formatted PDF document
+    
+    Args:
+        sop_data (dict or str): SOP data in dictionary format or JSON string
+        output_pdf (str): Output PDF filename
+        company_name (str): Company name intended for the cover page (currently unused: the subtitle line is commented out)
+    """
+    # Parse JSON string if needed
+    if isinstance(sop_data, str):
+        sop_data = json.loads(sop_data)
+    
+    # Extract SOP details
+    sop_details = sop_data.get("sop_details", [])
+    
+    # Create PDF document
+    doc = SimpleDocTemplate(output_pdf, pagesize=letter, 
+                           rightMargin=72, leftMargin=72,
+                           topMargin=90, bottomMargin=72)
+    
+    # Container for PDF elements
+    elements = []
+    
+    # Get styles
+    styles = getSampleStyleSheet()
+    
+    # Define custom styles
+    title_style = ParagraphStyle(
+        'TitleStyle',
+        parent=styles['Heading1'],
+        fontSize=24,
+        alignment=TA_CENTER,
+        spaceAfter=12,
+        fontName='Helvetica-Bold',
+        textColor=colors.darkblue
+    )
+    
+    subtitle_style = ParagraphStyle(
+        'SubtitleStyle',
+        parent=styles['Heading2'],
+        fontSize=16,
+        alignment=TA_CENTER,
+        spaceAfter=36,
+        fontName='Helvetica-Bold',
+        textColor=colors.darkblue
+    )
+    
+    heading2_style = ParagraphStyle(
+        'Heading2Style',
+        parent=styles['Heading2'],
+        fontSize=16,
+        spaceBefore=12,
+        spaceAfter=6,
+        fontName='Helvetica-Bold',
+        textColor=colors.darkblue,
+        borderWidth=0,
+        borderColor=colors.lightgrey,
+        borderPadding=5,
+        borderRadius=5
+    )
+    
+    heading3_style = ParagraphStyle(
+        'Heading3Style',
+        parent=styles['Heading3'],
+        fontSize=12,
+        spaceBefore=6,
+        spaceAfter=3,
+        fontName='Helvetica-Bold',
+        textColor=colors.darkslategray
+    )
+    
+    normal_style = ParagraphStyle(
+        'NormalStyle',
+        parent=styles['Normal'],
+        fontSize=10,
+        alignment=TA_JUSTIFY,
+        leading=14,
+        fontName='Helvetica'
+    )
+    
+    bullet_style = ParagraphStyle(
+        'BulletStyle',
+        parent=normal_style,
+        leftIndent=20,
+        firstLineIndent=-15,
+        spaceBefore=2,
+        spaceAfter=2
+    )
+    
+    # Create cover page
+    elements.append(Spacer(1, 2*inch))
+    elements.append(Paragraph("Standard Operating Procedures", title_style))
+    elements.append(Spacer(1, 0.5*inch))
+    #elements.append(Paragraph(company_name, subtitle_style))
+    elements.append(Spacer(1, 2*inch))
+    
+    # Add current date
+    current_date = datetime.datetime.now().strftime("%B %d, %Y")
+    date_style = ParagraphStyle(
+        'DateStyle',
+        parent=styles['Normal'],
+        fontSize=12,
+        alignment=TA_CENTER,
+        fontName='Helvetica'
+    )
+    elements.append(Paragraph(f"Generated on: {current_date}", date_style))
+    
+    # Add a page break after the cover page
+    elements.append(PageBreak())
+    
+
+    # Process each role
+    for role_data in sop_details:
+        role_name = role_data.get("role", "Unnamed Role")
+        sops = role_data.get("sops", {})
+        narrative = role_data.get("narrative")
+        
+        # Add role heading
+        elements.append(Paragraph(role_name, heading2_style))
+        
+        # Add a small gap after the heading
+        elements.append(Spacer(1, 0.05*inch))
+        
+        # Add narrative unless it is empty or the literal "Narrative" string
+        if narrative and narrative != "Narrative":
+            elements.append(Paragraph("Narrative:", heading3_style))
+            elements.append(Paragraph(narrative, normal_style))
+            elements.append(Spacer(1, 0.2*inch))
+        
+        # Process SOPs
+        for sop_type in ["must", "shall", "will"]:
+            sop_items = sops.get(sop_type, [])
+            
+            if sop_items:
+                # Capitalize the first letter of SOP type and make it bold
+                sop_type_title = sop_type.capitalize()
+                elements.append(Paragraph(f"{sop_type_title}:", heading3_style))
+                
+                # Render each SOP item as a bullet paragraph
+                for item in sop_items:
+                    elements.append(Paragraph(f"• {item}", bullet_style))
+                
+                elements.append(Spacer(1, 0.15*inch))
+        
+        # Add a page break between roles for cleaner separation
+        elements.append(PageBreak())
+    
+    # Build the PDF document with header and footer
+    doc.build(elements, onFirstPage=header_footer, onLaterPages=header_footer)
+    
+    return output_pdf
+
+def main():
+    # Example usage
+    sop_json = """
+    {
+        "sop_details": [
+            {
+                "role": "Sales Manager",
+                "role_id": 140,
+                "sops": {
+                    "must": [],
+                    "shall": [
+                        "Shall develop and implement sales strategies to achieve company targets.",
+                        "Shall conduct regular performance reviews with the sales team to ensure targets are met.",
+                        "Shall provide training and support to sales staff to enhance their skills and performance.",
+                        "Shall analyze market trends and adjust sales strategies accordingly.",
+                        "Shall maintain accurate records of sales activities and customer interactions."
+                    ],
+                    "will": [
+                        "Will lead the sales team to achieve monthly and quarterly sales goals.",
+                        "Will collaborate with marketing to align sales strategies with promotional campaigns.",
+                        "Will report sales performance to upper management on a regular basis.",
+                        "Will identify potential new markets and customer segments for growth.",
+                        "Will foster a positive team environment to motivate sales staff."
+                    ]
+                },
+                "areas": [],
+                "narrative": "The Sales Manager is responsible for leading and developing the sales team to achieve business targets and growth objectives. They will implement effective sales strategies, provide coaching to team members, and maintain strong customer relationships while ensuring all sales activities align with the company's overall goals."
+            },
+            {
+                "role": "Campaign Manager",
+                "role_id": 141,
+                "sops": {
+                    "must": [],
+                    "shall": [],
+                    "will": [
+                        "Will develop and execute marketing campaigns to promote products and services.",
+                        "Will analyze campaign performance metrics to optimize future campaigns.",
+                        "Will collaborate with cross-functional teams to ensure campaign alignment with business objectives.",
+                        "Will manage campaign budgets and ensure effective allocation of resources.",
+                        "Will stay updated on industry trends to inform campaign strategies."
+                    ]
+                },
+                "areas": [],
+                "narrative": "The Campaign Manager oversees the planning, execution, and analysis of marketing campaigns across various channels to drive brand awareness and customer acquisition. They work closely with creative teams, external vendors, and stakeholders to ensure campaigns are effective, on-brand, and deliver measurable results."
            }
+        ]
    }
+    """
    
-# Function to fetch data from the API
-def fetch_data(json_body):
-    json_body["options"]["company_id"] = json_body["options"].get("company_id")  # Ensure company_id is included
-    response = requests.post(URL, headers=HEADERS, json=json_body)
-    response.raise_for_status()  # Raise an error for bad responses
-    return response.json()
-
-
-
-def convert_assessment_data_to_dataframe(assessment_data):
-    df_assessment = []
-    for assessment in assessment_data.get("data", []):
-        assessment_id = assessment["assessment_id"]
-        assessment_name = assessment["assessment_name"]
-        start_date = assessment["start_date"]
-        open_items = assessment["open_items"]
-        completed_items = assessment["completed_items"]
-        total_assigned_items = assessment["total_assigned_items"]
-        red_flags = assessment["red_flags"]
-
-        for user in assessment.get("user_details", []):
-            user_name = user["name"]
-            user_total_items = user["total_assigned_items"]
-            user_completed_items = user["completed_items"]
-            
-            for area in user.get("area_list", []):
-                df_assessment.append({
-                    "assessment_id": assessment_id,
-                    "assessment_name": assessment_name,
-                    "start_date": start_date,
-                    "open_items_overall": open_items,
-                    "completed_items_overall": completed_items,
-                    "total_assigned_items_overall": total_assigned_items,
-                    "user_name": user_name,
-                    "user_total_assigned_items": user_total_items,
-                    "user_completed_items": user_completed_items,
-                    "area": area,
-                    "red_flags": red_flags
-                })
-    return pd.DataFrame(df_assessment)
-
-# Convert to DataFrame
-
-
-# Summary statistics for overall assessment level
-def generate_summary_statistics(df):
-    total_assessments = df['assessment_id'].nunique()
-    avg_open_items = df.groupby('assessment_id')['open_items_overall'].mean().mean()
-    avg_completed_items = df.groupby('assessment_id')['completed_items_overall'].mean().mean()
-    avg_total_assigned_items = df.groupby('assessment_id')['total_assigned_items_overall'].mean().mean()
-    avg_red_flags = df['red_flags'].mean()
-
-    total_users = df['user_name'].nunique()
-    avg_user_total_items = df.groupby('user_name')['user_total_assigned_items'].mean().mean()
-    avg_user_completed_items = df.groupby('user_name')['user_completed_items'].mean().mean()
-    completion_rate_per_user = (df['user_completed_items'].sum() / df['user_total_assigned_items'].sum()) * 100 if df['user_total_assigned_items'].sum() > 0 else 0
-    
-    area_summary = df['area'].value_counts()
-
-    return {
-        "total_assessments": total_assessments,
-        "avg_open_items_per_assessment": avg_open_items,
-        "avg_completed_items_per_assessment": avg_completed_items,
-        "avg_total_assigned_items_per_assessment": avg_total_assigned_items,
-        "avg_red_flags": avg_red_flags,
-        "total_users": total_users,
-        "avg_user_total_assigned_items": avg_user_total_items,
-        "avg_user_completed_items": avg_user_completed_items,
-        "completion_rate_per_user": completion_rate_per_user,
-        "area_summary": area_summary.to_dict()
-    }
-
-# Additional statistics for efficiency and areas
-def generate_extended_statistics(df):
-    df['user_completion_rate'] = (df['user_completed_items'] / df['user_total_assigned_items']).fillna(0) * 100
-
-    top_5_efficient_users = df.groupby('user_name')['user_completion_rate'].mean().nlargest(5).to_dict()
-    bottom_5_least_efficient_users = df.groupby('user_name')['user_completion_rate'].mean().nsmallest(5).to_dict()
-
-    df['uncompleted_items'] = df['user_total_assigned_items'] - df['user_completed_items']
-    areas_with_most_uncompleted_items = df.groupby('area')['uncompleted_items'].sum().nlargest(5).to_dict()
-
-    return {
-        "top_5_efficient_users": top_5_efficient_users,
-        "bottom_5_least_efficient_users": bottom_5_least_efficient_users,
-        "areas_with_most_uncompleted_items": areas_with_most_uncompleted_items
-    }
-
-# Generate statistics for problematic areas
-def generate_problematic_area_statistics(df):
-    total_open_items = df.groupby('name')['open_items'].sum().sort_values(ascending=False)
-    total_red_flags = df.groupby('name')['red_flags'].sum().sort_values(ascending=False)
-
-    return pd.DataFrame({
-        "total_open_items": total_open_items,
-        "total_red_flags": total_red_flags
-    }).fillna(0)
-
-def generate_summary_stats(assessment_data, area_data):
-    assessment_df = convert_assessment_data_to_dataframe(assessment_data)
-    problematic_area_df = pd.DataFrame(area_data.get("data", []))
-    
-    summary_stats = generate_summary_statistics(assessment_df)
-    extended_stats = generate_extended_statistics(assessment_df)
-    summary_stats["users(Workers) based stats"] = extended_stats
-    
-    problematic_stats = generate_problematic_area_statistics(problematic_area_df)
-    summary_stats["Area based stats"] = problematic_stats.to_dict(orient='index')
-
-    return summary_stats
-
+    # company_name is accepted but not currently rendered (the cover-page subtitle is commented out)
+    output_file = convert_sop_to_pdf(sop_json, company_name="Strategic Business Solutions, Inc.")
+    print(f"PDF successfully generated: {output_file}")

 if __name__ == "__main__":
-    from src.services.chatbot import Chatbot
-    bot = Chatbot()
-    res = bot.predict_next_n_assessment(companyid=12,N=3)
+    main()

-    print(res)
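Review note on the role/SOP loop in `convert_sop_to_pdf`: the traversal (roles in order, then `must`, `shall`, `will`, skipping empty lists) can be checked without ReportLab. The sketch below mirrors that loop as a generator. The name `iter_sop_sections` and the sample data are assumptions made for illustration, not part of this change.

```python
import json

# Same order the PDF generator iterates over.
SOP_TYPES = ("must", "shall", "will")


def iter_sop_sections(sop_data):
    """Yield (role, section_title, items) in the order the PDF would render them.

    Hypothetical helper: accepts a dict or a JSON string, like convert_sop_to_pdf.
    """
    if isinstance(sop_data, str):
        sop_data = json.loads(sop_data)
    for role_data in sop_data.get("sop_details", []):
        role = role_data.get("role", "Unnamed Role")
        sops = role_data.get("sops", {})
        for sop_type in SOP_TYPES:
            items = sops.get(sop_type, [])
            if items:  # empty sections are skipped, as in the PDF loop
                yield role, sop_type.capitalize(), list(items)


if __name__ == "__main__":
    sample = {
        "sop_details": [
            {"role": "Sales Manager",
             "sops": {"must": [], "shall": ["Shall review targets."], "will": ["Will report monthly."]}},
            {"sops": {"will": ["Will plan campaigns."]}},
        ]
    }
    for role, title, items in iter_sop_sections(sample):
        print(f"{role} / {title}: {len(items)} item(s)")
```

Keeping the traversal separate from the drawing code would make the ordering and the empty-section rule testable on their own. Whether that split is worth it for a script of this size is a judgment call.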
Author	SHA1	Message	Date
teslim	ec8ec6f190	enrichment of questions	2025-07-16 12:46:14 +01:00
teslim	d654707751	multiple area tags	2025-07-08 15:20:10 +01:00
teslim	31d85a83e7	fix	2025-06-13 19:32:33 +01:00
teslim	bfcdcf4786	fix	2025-06-13 19:12:08 +01:00
teslim	2180cd856a	fix	2025-06-13 19:09:44 +01:00
teslim	dd616913e0	fix	2025-06-13 19:06:18 +01:00
teslim	76e0fe1cae	fix	2025-06-13 18:50:24 +01:00
teslim	3ad557edf9	fix	2025-06-10 17:50:52 +01:00
teslim	ff0882ca9f	fixes	2025-05-21 16:15:56 +01:00
kowshik	ae25b7cf0c	updated mission and vsion responses	2025-05-13 23:22:07 +00:00
kowshik	d005200145	added sop pdf generator	2025-05-12 21:55:50 +00:00
kowshik	dcae438e64	updated roles getting using slug	2025-05-05 21:13:58 +00:00
kowshik	2398cb867b	updated sops from questionairre prompt	2025-05-02 16:26:40 +00:00
kowshik	4c612fd0e6	updated visiona and mission responses	2025-05-02 09:57:41 +00:00
kowshik	a522141cc6	updated missiona ndvsion generation from doc	2025-04-29 15:09:05 +00:00