Compare commits

15 Commits

Author SHA1 Message Date
teslim ec8ec6f190 enrichment of questions 2025-07-16 12:46:14 +01:00
teslim d654707751 multiple area tags 2025-07-08 15:20:10 +01:00
teslim 31d85a83e7 fix 2025-06-13 19:32:33 +01:00
teslim bfcdcf4786 fix 2025-06-13 19:12:08 +01:00
teslim 2180cd856a fix 2025-06-13 19:09:44 +01:00
teslim dd616913e0 fix 2025-06-13 19:06:18 +01:00
teslim 76e0fe1cae fix 2025-06-13 18:50:24 +01:00
teslim 3ad557edf9 fix 2025-06-10 17:50:52 +01:00
teslim ff0882ca9f fixes 2025-05-21 16:15:56 +01:00
kowshik ae25b7cf0c updated mission and vsion responses 2025-05-13 23:22:07 +00:00
kowshik d005200145 added sop pdf generator 2025-05-12 21:55:50 +00:00
kowshik dcae438e64 updated roles getting using slug 2025-05-05 21:13:58 +00:00
kowshik 2398cb867b updated sops from questionairre prompt 2025-05-02 16:26:40 +00:00
kowshik 4c612fd0e6 updated visiona and mission responses 2025-05-02 09:57:41 +00:00
kowshik a522141cc6 updated missiona ndvsion generation from doc 2025-04-29 15:09:05 +00:00
21 changed files with 1476 additions and 229 deletions
+301
@@ -0,0 +1,301 @@
# Enrich Questions API Documentation
## Overview
The `enrich-questions` endpoint is a reverse API: it takes existing questions and assigns each one to a single area tag and a single member. It returns the same response structure as `generate_questions_from_sop_v3`. The area tag is chosen by OpenAI analysis of the question text; the member is currently chosen by a simple rule (the first member in the list).
## Endpoint
```
POST /api/v1/common/enrich-questions
```
## Authentication
Requires Bearer token authentication:
```
Authorization: Bearer <your-api-key>
```
## Request Format
### Headers
```
Content-Type: application/json
Authorization: Bearer <your-api-key>
```
### Request Body
The request body should be a JSON array of question objects. Each question object must contain:
- `question` (string): The question text
- `role` (string): The role associated with the question
- `position_id` (integer): The position ID (used as role ID in response)
- `area_tags` (array): Array of area tag objects with `name` and `id` (OpenAI selects the most relevant one)
- `members` (array): Array of member objects with `id` (one member is selected per question)
### Example Request
```json
[
  {
    "question": "Is the system monitoring working properly?",
    "role": "IT Expert",
    "position_id": 522,
    "area_tags": [
      {
        "name": "IT Operations",
        "id": 1276
      },
      {
        "name": "Communication & Coordination",
        "id": 1426
      },
      {
        "name": "Quality Assurance",
        "id": 1427
      }
    ],
    "members": [
      {
        "id": 159
      }
    ]
  },
  {
    "question": "Are safety protocols being followed?",
    "role": "IT Expert",
    "position_id": 522,
    "area_tags": [
      {
        "name": "IT Operations",
        "id": 1276
      },
      {
        "name": "Safety Protocols",
        "id": 1436
      }
    ],
    "members": [
      {
        "id": 159
      }
    ]
  }
]
```
## Response Format
### Success Response (200 OK)
The response structure is identical to `generate_questions_from_sop_v3`. Each question is assigned to ONE area_tag and ONE member:
```json
{
  "questions": {
    "items": [
      {
        "area_tag": 1276,
        "area_name": "IT Operations",
        "assigned_to": 159,
        "questions": "Is the system monitoring working properly?",
        "role": 522
      },
      {
        "area_tag": 1436,
        "area_name": "Safety Protocols",
        "assigned_to": 159,
        "questions": "Are safety protocols being followed?",
        "role": 522
      }
    ]
  }
}
```
### Response Structure Explanation
- Each question creates exactly ONE item in the response
- OpenAI analyzes the question content and selects the most relevant `area_tag` from available options
- One `member` is selected from the available members (currently the first one in the array)
- `area_tag`: The OpenAI-selected area tag ID
- `area_name`: The OpenAI-selected area tag name
- `assigned_to`: The selected member ID
- `questions`: The question text
- `role`: The position_id from the request (used as role identifier)
## AI-Powered Assignment Algorithm
### OpenAI Area Tag Selection
The system uses OpenAI's GPT-4o-mini model to intelligently analyze each question and select the most relevant area tag:
1. **Content Analysis**: OpenAI analyzes the question content, context, and meaning
2. **Domain Matching**: Determines which area/domain the question is actually testing or assessing
3. **Relevance Scoring**: Considers the purpose and intent of the question
4. **Smart Selection**: Chooses the most specific and primary area tag from available options
5. **Fallback**: If OpenAI analysis fails, defaults to the first available area tag
**OpenAI Prompt Guidelines:**
- Analyze question content and context
- Match questions to appropriate area tags based on meaning and purpose
- Consider what domain/area the question is actually testing
- Choose only ONE area tag per question - the most relevant one
- If multiple areas seem relevant, choose the most specific or primary one
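The selection and fallback steps above can be sketched as follows. This is a minimal illustration, not the service's actual code: `select_area_tag` and the injected `classify` callable are hypothetical names, and `classify` stands in for the GPT-4o-mini call so the example runs without network access. Falling back when the model returns a name that was not offered is an added assumption; the documentation only specifies a fallback when the analysis fails.

```python
from typing import Callable, Dict, List, Optional


def select_area_tag(
    question: str,
    area_tags: List[Dict],
    classify: Callable[[str, List[str]], Optional[str]],
) -> Optional[Dict]:
    """Pick one area tag for a question, falling back to the first tag."""
    if not area_tags:
        return None  # nothing to choose from

    names = [tag["name"] for tag in area_tags]
    try:
        # `classify` is a hypothetical stand-in for the model call;
        # it is expected to return exactly one of the offered names.
        chosen_name = classify(question, names)
    except Exception:
        chosen_name = None  # treat any model failure as "no answer"

    for tag in area_tags:
        if tag["name"] == chosen_name:
            return tag

    # Fallback: the model failed or returned a name that was not offered.
    return area_tags[0]
```

Injecting the classifier keeps the fallback logic testable on its own, independent of the model provider.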
### Member Selection
Member selection currently uses a simple rule: the first member in the `members` array is chosen. It could be extended to consider:
- Member skills and expertise
- Current workload distribution
- Availability and capacity
- Historical performance
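As a rough sketch of the rule described here, the function below returns the first member by default. The optional `open_assignments` mapping (member ID to number of open items) is purely hypothetical and only illustrates how a workload-based enhancement might slot in; nothing in the service is known to provide such data.

```python
from typing import Dict, List, Optional


def select_member(
    members: List[dict],
    open_assignments: Optional[Dict[int, int]] = None,
) -> Optional[int]:
    """Return the ID of the member a question should be assigned to."""
    if not members:
        return None
    if not open_assignments:
        # Documented behaviour: take the first member.
        return members[0]["id"]
    # Hypothetical enhancement: prefer the member with the fewest open items.
    # min() returns the earliest minimum, so list order breaks ties.
    return min(members, key=lambda m: open_assignments.get(m["id"], 0))["id"]
```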
### Error Responses
#### 400 Bad Request - Invalid Input Format
```json
{
  "error": "Invalid input",
  "message": "Input data must be in JSON format."
}
```
#### 400 Bad Request - Missing Required Fields
```json
{
  "error": "Invalid data",
  "message": "Question object at index 0 is missing required field 'question'."
}
```
#### 400 Bad Request - Invalid Array Structure
```json
{
  "error": "Invalid input",
  "message": "Input data must be an array of question objects."
}
```
#### 401 Unauthorized
```json
{
  "error": "Unauthorized",
  "message": "API key is missing or invalid."
}
```
#### 500 Internal Server Error
```json
{
  "error": "Internal Server Error",
  "message": "An unexpected error occurred."
}
```
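On the client side, the error bodies above share the same two keys, so they can be handled uniformly. The helper below is a hypothetical sketch (the name `describe_response` is not part of the API) that turns a status code and a decoded JSON body into a one-line summary.

```python
def describe_response(status_code: int, body: dict) -> str:
    """Summarise an enrich-questions response in one line."""
    if status_code == 200:
        items = body.get("questions", {}).get("items", [])
        return f"ok: {len(items)} item(s)"
    # Error bodies carry an "error" label and a human-readable "message".
    error = body.get("error", "Unknown error")
    message = body.get("message", "")
    if message:
        return f"{status_code} {error}: {message}"
    return f"{status_code} {error}"
```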
## Usage Examples
### Basic Usage
```bash
curl -X POST "http://localhost:5402/api/v1/common/enrich-questions" \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '[
    {
      "question": "Is the system performance being monitored?",
      "role": "Developer",
      "position_id": 123,
      "area_tags": [
        {"name": "Development", "id": 1},
        {"name": "Performance Monitoring", "id": 2}
      ],
      "members": [
        {"id": 456}
      ]
    }
  ]'
```
### Python Example
```python
import requests
import json

url = "http://localhost:5402/api/v1/common/enrich-questions"
headers = {
    "Authorization": "Bearer your-api-key",
    "Content-Type": "application/json"
}
payload = [
    {
        "question": "Is the system performance being monitored?",
        "role": "Developer",
        "position_id": 123,
        "area_tags": [
            {"name": "Development", "id": 1},
            {"name": "Performance Monitoring", "id": 2}
        ],
        "members": [
            {"id": 456}
        ]
    }
]

response = requests.post(url, json=payload, headers=headers)
result = response.json()
print(result)
```
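Once the response has been decoded, a common next step is to group the returned items by area. The sketch below assumes the success response shape documented earlier (`questions.items`, with the question text stored under the key `questions`); `group_questions_by_area` is an illustrative name, not part of the API.

```python
from collections import defaultdict
from typing import Dict, List


def group_questions_by_area(result: dict) -> Dict[str, List[str]]:
    """Map each area_name to the question texts assigned to it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in result.get("questions", {}).get("items", []):
        # The response stores the question text under the key "questions".
        grouped[item["area_name"]].append(item["questions"])
    return dict(grouped)
```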
## Validation Rules
1. **Input must be a JSON array** of question objects
2. **Each question object must contain all required fields**:
- `question`: Non-empty string
- `role`: Non-empty string
- `position_id`: Integer
- `area_tags`: Array of objects with `name` and `id`
- `members`: Array of objects with `id`
3. **Area tags must be valid objects** with both `name` (string) and `id` (integer/string)
4. **Members must be valid objects** with `id` (integer/string)
5. **Arrays can be empty** but must be present
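The rules above can be approximated with a small validator. This is a sketch under assumptions, not the service's implementation: `validate_payload` is a hypothetical name, only the "array of question objects" and "missing required field" messages are taken from the documented error responses, and the remaining messages are invented for illustration. It checks that `id` and `name` keys are present but does not enforce their types.

```python
from typing import Any, Optional

REQUIRED_FIELDS = ("question", "role", "position_id", "area_tags", "members")


def validate_payload(payload: Any) -> Optional[str]:
    """Return an error message for an invalid payload, or None if valid."""
    if not isinstance(payload, list):
        return "Input data must be an array of question objects."
    for index, item in enumerate(payload):
        if not isinstance(item, dict):
            return f"Question object at index {index} must be a JSON object."
        for field in REQUIRED_FIELDS:
            if field not in item:
                return (
                    f"Question object at index {index} "
                    f"is missing required field '{field}'."
                )
        for field in ("question", "role"):
            if not isinstance(item[field], str) or not item[field].strip():
                return f"Field '{field}' at index {index} must be a non-empty string."
        position_id = item["position_id"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(position_id, bool) or not isinstance(position_id, int):
            return f"Field 'position_id' at index {index} must be an integer."
        for field, keys in (("area_tags", ("name", "id")), ("members", ("id",))):
            entries = item[field]
            if not isinstance(entries, list):
                return f"Field '{field}' at index {index} must be an array."
            for entry in entries:  # an empty array is allowed
                if not isinstance(entry, dict) or any(k not in entry for k in keys):
                    return (
                        f"Each entry in '{field}' at index {index} "
                        f"needs: {', '.join(keys)}."
                    )
    return None
```

Returning the first error as a string mirrors the single `message` field in the 400 responses.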
## Response Logic
The endpoint assigns each question to exactly one area tag and one member. For the example request above:
- **Input**: 2 questions, each with several `area_tags` and a `members` array
- **Output**: 2 items (one per question), each with one selected area tag and one selected member
- **AI Analysis**: OpenAI analyzes the content and meaning of each question to choose the most relevant area tag
- **Member Assignment**: The member is currently chosen by a simple rule (the first member in the array)
- **No Cartesian Product**: Each question gets exactly one area assignment and one member assignment
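To make the one-item-per-question rule concrete, here is a minimal sketch that builds the documented response shape. `build_response` and the injected `pick_tag` callable are hypothetical; `pick_tag` stands in for the AI-based area tag choice, and emitting `None` when an array is empty is an assumption, since the documentation does not say what is returned in that case.

```python
from typing import Callable, List, Optional


def build_response(
    questions: List[dict],
    pick_tag: Callable[[dict], Optional[dict]],
) -> dict:
    """Build one response item per question (no tag x member product)."""
    items = []
    for q in questions:
        tag = pick_tag(q)  # stand-in for the AI-based area tag choice
        members = q.get("members", [])
        items.append(
            {
                "area_tag": tag["id"] if tag else None,
                "area_name": tag["name"] if tag else None,
                # First-member rule, as described under "Member Selection".
                "assigned_to": members[0]["id"] if members else None,
                "questions": q["question"],
                "role": q["position_id"],
            }
        )
    return {"questions": {"items": items}}
```

With two questions offering three and two area tags respectively, this yields two items rather than five.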
## Performance Considerations
- **Batch Processing**: OpenAI analysis is performed in batches for efficiency
- **Caching**: Consider implementing caching for frequently assigned questions
- **Fallback**: If OpenAI analysis fails, the first available area tag is used, so the question still receives an assignment
- **Error Handling**: Comprehensive error handling for OpenAI API failures
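The caching suggestion could look roughly like the sketch below. It is only one possible design: `AreaTagCache` and its `choose` callable are hypothetical names, the cache is an in-process dictionary with no eviction, and it assumes the same question with the same offered tags should always receive the same area tag.

```python
from typing import Callable, Dict, List, Tuple

CacheKey = Tuple[str, Tuple[str, ...]]


class AreaTagCache:
    """Memoise area tag choices per (question text, offered tag IDs)."""

    def __init__(self, choose: Callable[[str, List[dict]], dict]) -> None:
        self._choose = choose  # stand-in for the expensive model call
        self._cache: Dict[CacheKey, dict] = {}
        self.misses = 0

    def get(self, question: str, area_tags: List[dict]) -> dict:
        # Normalise the text and sort the IDs (as strings, since IDs may be
        # integers or strings) so equivalent requests share one key.
        key = (
            question.strip().lower(),
            tuple(sorted(str(tag["id"]) for tag in area_tags)),
        )
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._choose(question, area_tags)
        return self._cache[key]
```

A production cache would likely need a size limit or expiry, and a shared store if the service runs as several processes.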
## Integration with Existing System
This endpoint complements the existing question generation APIs:
- `POST /api/v1/qs/generate_questions_from_sop` - Generates questions from SOPs
- `POST /api/v1/qs/generate_questions_from_sop-latest` - Enhanced question generation
- `POST /api/v1/common/enrich-questions` - Enriches existing questions (NEW)
The `enrich-questions` endpoint returns the **same structure** as `generate_questions_from_sop_v3`, so its output can be consumed by the same application code that handles generated questions. Each question arrives already assigned to one area and one member.
+1
@@ -4,3 +4,4 @@ langchain-openai
pydantic
flask
python-dotenv
reportlab
+2
@@ -3,6 +3,7 @@ from flask import Flask
from src.api.routes.sops import sops_bp
from src.api.routes.questions import qs_b
from src.api.routes.chatbot import bot
from src.api.routes.common import common_bp
def create_app():
app = Flask(__name__)
@@ -11,6 +12,7 @@ def create_app():
app.register_blueprint(sops_bp, url_prefix='/api/v1/sop')
app.register_blueprint(qs_b,url_prefix='/api/v1/qs')
app.register_blueprint(bot,url_prefix='/api/v1/bot')
app.register_blueprint(common_bp, url_prefix='/api/v1/common')
# Set up the upload folder configuration inside the src directory
UPLOAD_FOLDER = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../../uploads')
+53
@@ -0,0 +1,53 @@
import os
from flask import Blueprint, request, jsonify
from src.utils.auth import auth_check
from src.services.question_enrichment import QuestionEnrichmentService
import json

# Initialize the Blueprint
common_bp = Blueprint('common', __name__)


@common_bp.route('/enrich-questions', methods=['POST'])
@auth_check
def enrich_questions():
    """
    Reverse API endpoint that takes questions and assigns them to areas and members.
    Returns the exact same structure as generate_questions_from_sop_v3.
    Expected payload: Array of question objects with question, role, position_id, area_tags, and members.
    Example payload:
    [
        {
            "question": "Minor",
            "role": "IT Expert",
            "position_id": 522,
            "area_tags": [
                {"name": "IT Operations", "id": 1276},
                {"name": "Communication & Coordination", "id": 1426}
            ],
            "members": [
                {"id": 159}
            ]
        }
    ]
    """
    if not request.is_json:
        return jsonify({"error": "Invalid input", "message": "Input data must be in JSON format."}), 400

    input_data = request.get_json()

    try:
        # Initialize the question enrichment service
        enrichment_service = QuestionEnrichmentService()
        # Enrich the questions
        result = enrichment_service.enrich_questions(input_data)
        if not result['success']:
            return jsonify({"error": "Invalid data", "message": result['error']}), 400
        # Return the exact same structure as generate_questions_from_sop_v3
        return jsonify({"questions": result['questions']}), 200
    except Exception as e:
        return jsonify({"error": "Internal Server Error", "message": str(e)}), 500
+4
@@ -81,6 +81,10 @@ def generate_questions_from_sop_v3():
"duration": input_data['duration']
}
# Add area_tags if provided in the payload (optional)
if 'area_tags' in input_data:
generator_input['area_tags'] = input_data['area_tags']
# Generate questions using the QuestionGenerator
generator = QuestionsGeneratorV2()
questions_response = generator.generate_questions_for_all(generator_input)
+70 -24
@@ -7,7 +7,11 @@ from src.utils.auth import auth_check
from src.utils.utils import delete_all_files_in_directory
from src.utils.document_loader import load_document
from flask import Blueprint, jsonify, request, make_response
import os
import tempfile
import datetime
from flask import send_file
from flask import Blueprint, jsonify, request, make_response,after_this_request
import json
# Initialize the Blueprint
sops_bp = Blueprint('sops', __name__)
@@ -29,6 +33,7 @@ def get_roles():
return jsonify({"error": "No file part", "message": "Please upload a file with the key 'document'."}), 400
file = request.files['document']
role_slug = request.form.get('role_slug')
# If the user does not select a file, the browser may also submit an empty part without filename
if file.filename == '':
@@ -48,7 +53,7 @@ def get_roles():
# Generate roles from the docs
parser = DocumentParser()
roles = parser.get_roles(docs)["roles"]
roles = parser.get_roles_using_slug(docs,role_slug)["roles"]
# Cleanup: Delete all files in the upload directory after processing
delete_all_files_in_directory(upload_folder)
@@ -70,6 +75,7 @@ def get_roles():
def get_roles_questionnaire():
# Check if the post request has the file part
questionnaire_data = request.json
role_slug = questionnaire_data.get("role_slug")
# Validate the required fields in the questionnaire data
if not questionnaire_data.get('questionnaire_response'):
@@ -80,7 +86,7 @@ def get_roles_questionnaire():
generator = SopPersonalAssessment()
roles = generator.generate_roles_from_questionnaire(questionnaire_data)
roles = generator.generate_roles_from_questionnaire(questionnaire_data,role_slug)
if not roles:
return jsonify({"error": "No roles found", "message": "No roles were extracted from the questionnaire."}), 404
@@ -88,8 +94,6 @@ def get_roles_questionnaire():
return jsonify({"roles": roles, "message": "Roles successfully extracted from the questionnaire."}), 200
@sops_bp.route('/personal_assessment/generate_sops_from_doc', methods=['POST'])
@auth_check
def generate_sops():
@@ -225,6 +229,62 @@ def generate_sops_by_roles_and_areas():
@sops_bp.route('/general/generate_sops_pdf', methods=['POST'])
@auth_check
def generate_sops_pdf():
    """
    Generate a PDF file of SOPs based on the SOP JSON data provided in the request body.
    Returns the PDF as a downloadable file and then deletes the temporary file.
    """
    try:
        # Get the SOP JSON data from the request body
        sop_data = request.json

        # Validate the presence of SOP data
        if not sop_data or not isinstance(sop_data, dict) or 'sop_details' not in sop_data:
            return make_response(jsonify({
                "error": "Invalid input",
                "message": "The request body should contain valid SOP data with a 'sop_details' field."
            }), 400)

        # Create a unique temporary filename for the PDF
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        temp_dir = tempfile.gettempdir()
        pdf_filename = f"sop_document_{timestamp}.pdf"
        pdf_path = os.path.join(temp_dir, pdf_filename)

        # Import the PDF conversion function
        from ...utils.sop_pdf_creator import convert_sop_to_pdf

        # Generate the PDF file
        convert_sop_to_pdf(sop_data, pdf_path, "")

        # Send the file as a download attachment
        @after_this_request
        def remove_file(response):
            try:
                # Delete the temporary file after the response is sent
                if os.path.exists(pdf_path):
                    os.remove(pdf_path)
            except Exception as e:
                print(f"Error removing temporary PDF file: {str(e)}")
            return response

        # Return the file as a downloadable attachment
        return send_file(
            pdf_path,
            as_attachment=True,
            download_name=f"SOP_Document_{timestamp}.pdf",
            mimetype='application/pdf'
        )
    except Exception as e:
        print(f"Error generating PDF: {str(e)}")
        return make_response(jsonify({
            "error": "Processing error",
            "message": f"An error occurred while generating the SOP PDF: {str(e)}"
        }), 500)
@sops_bp.route('/executive/generate_sop_mission_from_vision', methods=['POST'])
@auth_check
@@ -312,11 +372,9 @@ def generate_executive_goals_from_doc():
if 'document' not in request.files:
return jsonify({"error": "No file part", "message": "Please upload a file with the key 'document'."}), 400
if 'departments' not in request.form:
return jsonify({"error": "department missing", "message": "Please provide departments'."}), 400
try:
departments = request.form.get('departments')
departments = request.form.get('departments','[]')
# Manually load roles from the string to JSON
departments = json.loads(departments)
@@ -351,17 +409,8 @@ def generate_executive_goals_from_doc():
if not vision_mission:
return jsonify({"error": "Vision and Mission generation error", "message": "Error in generating mission and vision."}), 400
# Check if both vision and mission are empty
if not vision_mission.get('vision') and not vision_mission.get('mission'):
# Cleanup: Delete all files in the upload directory if parsing fails
delete_all_files_in_directory(upload_folder)
return jsonify({"vision": [], "mission": [], "message": "The document does not contain mission and vision."}), 200
print(f"Vision and mission: {vision_mission}")
vission = vision_mission.get('vision')
mission = vision_mission.get('mission')
return jsonify({"mission": mission, "vission": vission, "message": "vision and mission generated successfully"}), 200
response = json.loads(vision_mission.get("response"))
return jsonify({"data": response, "message": "vision and mission generated successfully"}), 200
except Exception as e:
# Cleanup: Delete all files in the upload directory if an error occurs
@@ -371,8 +420,6 @@ def generate_executive_goals_from_doc():
return jsonify({"error": "File type not allowed", "message": "The uploaded file type is not allowed. Please upload a PDF, DOC, or DOCX file."}), 400
@sops_bp.route('/executive/generate_sop_managers_doc', methods=['POST'])
@auth_check
def generate_sop_managers_doc():
@@ -466,10 +513,9 @@ def generate_vision_goals_quest():
sop_generator = SopGeneratorExecutive()
vision_mission = sop_generator.generate_vision_mission_from_questionnaire(questionnaire_data)
vission = vision_mission.get('vision')
mission = vision_mission.get('mission')
data = json.loads(vision_mission.get('response'))
return jsonify({"mission": mission, "vission": vission, "message": "vision and mission generated successfully"}), 200
return jsonify({"data": data, "message": "vision and mission generated successfully"}), 200
except Exception as e:
return make_response(jsonify({"error": "Processing error", "message": f"An error occurred while processing the request: {str(e)}"}), 500)
+4 -15
@@ -1,17 +1,3 @@
'''from pydantic import BaseModel
from typing import List, Dict
class Question(BaseModel):
assigned_to: str
role: str
questions: str
area_tag:str
class AssementQuestion(BaseModel):
number: int
questions: List[Question]'''
from pydantic import BaseModel
from typing import List, Dict
from typing import Optional
@@ -33,5 +19,8 @@ class Questions(BaseModel):
class AssessmentQuestions(BaseModel):
questions: Questions
class AllQuestionsItems(BaseModel):
items: List[Question]
class AllQuestions(BaseModel):
questions : List[Question]
questions: AllQuestionsItems
+11 -2
@@ -10,6 +10,7 @@ class Categories(BaseModel):
class RoleSops(BaseModel):
role:str
narrative:str
sops:Categories
class RoleSopssLists(BaseModel):
@@ -35,9 +36,17 @@ class DeptMisiion(BaseModel):
departments: str
goals: List[str]
class VisionMissionResponse2(BaseModel):
vision: List[str]
class VisionMissionResponse3(BaseModel):
mission: List[str]
vision: List[str]
class VisionMissionResponse2(BaseModel):
response: List[str]
class VisionMissionResponseV3(BaseModel):
response: str
class VisionMissionResponse(BaseModel):
vision: List[str]
+19 -2
@@ -236,8 +236,25 @@ def get_questions_prompt_v5():
NOTE: !!! MAKE SURE YOU CORRECTLY ATTACH "assigned_to" AS THE ID OF THE MEMBER OF THE ROLE AS STATED IN THE SOP. CHECK MEMBERS UNDER THE ROLE IN THE PROVIDED SOP AND USE THE CORRECT ID OF THE MEMBER, DO NOT USE MEMBER iD THAT IS NOT PROVIDED AS "assigned_to" pls FOLLOW THIS STRICTLY!!!
NOTE: CHECK THE "role_id" UNDER THE PROVIDED SOP AND USE THE CORRECT ID OF THE ROLE, DO NOT USE OR FORMULATE "role_id" THAT IS NOT PROVIDED AS "role" pls FOLLOW THIS STRICTLY!!!
NOTE: IF area tags is not provided for specicfic role SOPS, kindly formulate an area name based on the sop and make the area_tag null but forumlate rea name, only do this if area tags info is not provided for specific role sops
NOTE: Use exactly the area names provided if available unless area tags is missing and you need to forumalate one
NOTE: IF area tags is not provided for specific role SOPS, kindly formulate an area name based on the sop and make the area_tag null but formulate area name, only do this if area tags info is not provided for specific role sops
NOTE: Use exactly the area names provided if available unless area tags is missing and you need to formulate one
CRITICAL AREA DIVERSIFICATION REQUIREMENT:
When area_tags are not provided or the area_tags array is empty, you MUST generate questions across AT LEAST 2-3 DIFFERENT area_names. Do NOT generate all questions under a single area_name. Analyze the SOPs and create diverse area_names such as:
- "Communication & Coordination"
- "Process Compliance"
- "Quality Assurance"
- "Timeline Management"
- "Technical Execution"
- "Safety Protocols"
- "Documentation & Reporting"
- "Resource Management"
- "Performance Monitoring"
- "Risk Assessment"
- "Training & Development"
- "Customer Relations"
MANDATORY: Distribute your questions across these different area_names to ensure comprehensive coverage of the organizational processes. Each area_name should have multiple questions assigned to it. Do not concentrate all questions in one area - this defeats the purpose of comprehensive assessment.
"""
return prompt
+115 -12
@@ -15,19 +15,35 @@ def get_sop_extraction_from_doc():
def get_roles_extraction_from_questionnaire():
return '''Your task is to extract the "Roles" from the provided questionnaire responses.
You must identify and categorize the roles based on the information provided.
return '''
You are a specialized role extractor for company documents. Your task is to identify and extract job roles/positions mentioned in the provided questionnaire data.
TASK:
1. Extract ALL job roles/positions mentioned in the questionnaire response data as a list.
2. Filter the extracted roles based on the provided role_slug.
3. Return the filtered roles as a JSON list.
RULES:
- Return an empty list if no matching roles are found.
- The role_slug is a keyword or category used to filter relevant roles.
- Only include roles that semantically relate to the role_slug.
- Be precise in extracting official job titles rather than general descriptions.
EXAMPLES:
Example 1:
Text: "Our company is looking to hire a Senior Data Scientist, Junior Data Analyst, and Database Administrator for the Analytics department. We also have openings for Financial Manager and Customer Support Manager."
Role_slug: "data"
Expected output: ["Senior Data Scientist", "Junior Data Analyst", "Database Administrator"]
Example 2:
Text: "The restructuring process will affect several departments including the Financial Analysis team, Customer Relations department, and Sales Management. We are currently seeking a Regional Sales Manager, Sales Team Supervisor, and Customer Support Manager."
Role_slug: "manager"
Expected output: ["Regional Sales Manager", "Customer Support Manager"]
Instructions:
1. **Roles**: Extract the roles mentioned in the questionnaire.
2. **Vision**: If applicable, extract the vision of the company or organization as it relates to the roles.
3. **Mission**: If applicable, extract the mission of the company or organization as it relates to the roles.
4. **Role-specific SOPs**:
- Identify any role-specific Standard Operating Procedures (SOPs) mentioned in the questionnaire.
- If SOPs for the role are not explicitly stated, infer them from the context, but only if there is clear evidence within the questionnaire. Do not generate or assume SOPs that are not directly supported by the information provided.
- If no roles or SOPs are found, return an empty list for each category.
Provide the extracted roles and any relevant sections exactly as they appear in the questionnaire.'''
'''
def get_sop_personalassessment_from_questionnaire():
@@ -89,6 +105,7 @@ def get_sop_personalassessment_from_area_rolev2():
NOTE: IF AREAS ARE NOT PROVIDED (AREA IS "NOT PROVIDED"), INTUITIVELY PROVIDE THE SOP BASED ON THE ROLE NAME.
NOTE: MAKE SURE SOPS ARE NOT MISSING FOR THE PROVIDED TYPES.
NOTE: FOR SOP TYPES NOT SELECTED RETURN AN EMPTY LIST and not "null" E.G IF "SHALL" AND "WILL" ARE SELECTED BUT "MUST" IS NOT AMONG, MUST WILL BE AN EMPTY LIST
NOTE: Also, for each role, add a "narrative", which is a short description of the role in question.
NOTE !!!: IF A ROLE POINTS TO A SPECIFIC SOP TYPE (E.G., "SHALL" AND "MUST"), THESE TWO MUST NEVER BE EMPTY FOR THAT ROLE.
: FORMAT: SOPS SHOULD BE CLEAR, DIRECT, AND CONCISE. EACH ROLE SHOULD HAVE 5-7 BULLET POINTS PER SOP TYPE ("WILL," "SHALL," OR "MUST"). FOR COMPLEX ROLES, EACH SOP TYPE MAY HAVE A MAXIMUM OF 7-10 BULLET POINTS, NOT TOTAL ACROSS ALL TYPES, BUT PER SOP TYPE.
@@ -153,9 +170,52 @@ def get_vision_mission_extraction_from_doc2():
8. If the vision and mission are found in the document, extract them as they are, with no changes.
NOTE: If the goal (mission) and vision cannot be found at all, leave them empty.
NOTE: MAKE SURE YOU EXTRACT ALL INFORMATION FOUND FOR VISION AND GOALS FROM THE DOCUMENT. DO NOT OMIT ANY.
**You must return the response in the exact HTML `<p>` and `<b>` format shown below, including the numbering, lettered sub-points, `<br>` tags for line breaks, and the double `<br><br>` between departments. Adhere to this format precisely.**
**Instructions:**
- If **no departments are explicitly mentioned**, assume all content applies to all departments mentioned in the document.
- Group goals by department when possible.
- Each goal should have a **title** and **a short description**.
- Format your output with:
- An **HTML section** for rendering on the frontend.
- A **structured JSON section** with department-wise goal breakdown.
**Example Output Format:**
{
"html": [
"<p>Vision: To create safe, broadly beneficial AI systems that ensure the advantages of artificial general intelligence (AGI) are shared equitably across society. Our vision emphasizes the importance of safety and alignment, ensuring that AI systems align with human values and remain under human control. We aspire to collaborate with other research and policy institutions to address global challenges associated with advanced AI, prioritizing the safe development of increasingly capable systems that act in the best interests of humanity.</p>",
"<p>Company Goals:</p><p> <b> 1. Audit Department: </b> <br> a. Enhance Risk Management and Internal Controls: Strengthen the organizations risk posture by identifying, assessing, and recommending improvements to internal controls.<br> b. Ensure Regulatory and Policy Compliance: Monitor adherence to relevant laws, regulations, standards, and internal policies.<br> c. Support Organizational Governance: Provide independent assurance to senior management and the board on the effectiveness of governance processes.<br> d. Drive Operational Efficiency: Identify inefficiencies and areas for process improvement during audits.<br> e. Leverage Technology and Data Analytics: Use automated tools and analytics for continuous auditing and real-time risk monitoring.<br> f. Develop Audit Talent and Capabilities: Invest in training and upskilling for the audit team.<br> g. Enhance Stakeholder Communication: Improve reporting clarity, timeliness, and relevance to stakeholders.<br><br> <b> 2. Finance Department: </b> <br> a. Ensure Financial Stability and Sustainability: Maintain strong cash flow and optimize working capital.<br> b. Improve Financial Planning and Analysis (FP&A): Provide accurate forecasting and budgeting to support strategic decision-making.<br> c. Enhance Cost Management and Operational Efficiency: Identify cost-saving opportunities and drive efficient use of financial resources.<br> d. Ensure Regulatory and Compliance Integrity: Maintain full compliance with financial regulations and internal controls.<br> e. Enable Strategic Investment and Capital Allocation: Evaluate and fund initiatives that align with the organizations long-term growth strategy.<br> f. Improve Financial Reporting and Transparency: Deliver timely and accurate financial reports for stakeholders.<br> g. Leverage Financial Technology and Automation: Implement tools to streamline financial operations.<br> h. 
Develop Finance Talent and Leadership: Upskill finance staff and promote cross-functional knowledge.<br><br> <b> 3. Account Department: </b> <br> a. Maintain Accurate and Timely Financial Records: Ensure all transactions are recorded properly.<br> b. Ensure Regulatory and Tax Compliance: Comply with all financial regulations and timely tax filings.<br> c. Streamline Month-End and Year-End Closing Processes: Reduce the time and complexity of closing periods.<br> d. Support Strategic Financial Planning: Provide reliable data for budgeting and forecasting.<br> e. Implement Automation and Digital Tools: Use accounting software and RPA to increase efficiency.<br> f. Strengthen Internal Controls and Risk Management: Ensure robust controls over financial transactions.<br> g. Improve Financial Transparency and Reporting Quality: Produce clear and consistent reports for stakeholders.<br> h. Develop Accounting Team Skills and Expertise: Provide continuous learning and leadership development for accounting staff.<br></p>"
],
"department_goals": [
{
"department": "Account Management",
"goals": [
{
"goal_title": "Customer Satisfaction",
"goal_description": "Manage accounts effectively to enhance customer satisfaction and retention."
}
]
},
{
"department": "Finance",
"goals": [
{
"goal_title": "Financial Stability",
"goal_description": "Finance the company to ensure long-term sustainability and growth."
}
]
}
]
}
NOTE, VERY CRITICAL: FOLLOW THIS RESPONSE FORMAT STRICTLY , NOTHING BEFORE OR AFTER PLEASE !!!
"""
''' def get_sop_executive_for_managers():
return Your task is to extract the "Vision", "Mission", and executive-generated Standard Operating Procedures (SOPs) specifically for managers from the provided document.
@@ -175,7 +235,7 @@ def get_vision_mission_extraction_from_doc2():
def get_vision_mission_extraction_from_questionnaire_executive():
return """
You are provided with an organization's response from a questionnaire, and your role is to extract the vision and mission (also called goals) from the questionnaire response:
You are provided with an organization's response from a questionnaire, and your role is to extract the vision and mission (also called goals) from the questionnaire response for each of the departments found:
- Generate the vision(at least one paragraph)of the organization based on the questionnaire and
- Generate the goals (mission) of the company based on the provided departmental goals and overall questionairre response
@@ -191,6 +251,49 @@ def get_vision_mission_extraction_from_questionnaire_executive():
NOTE: If the goal and mission of a can not be gotten from the questionaire response, make it empty.
NOTE: Ensure you extract every piece of information found for the vision and goals from the questionnaire. DO NOT OMIT ANYTHING.
NOTE: Group the goals based on the departments found in the questions; see the example response below pointing to sales, marketing and product development
NOTE: ADHERE STRICTLY TO THIS OUTPUT FORMAT , DO NOT CHANGE IT PLEASE
**Example Output Format:** Two texts (one for vision and for goal)
- Format your output with:
- An **HTML section** for rendering on the frontend.
- A **structured JSON section** with department-wise goal breakdown.
**Example Output Format:**
{
"html": [
"<p>Vision: To create safe, broadly beneficial AI systems that ensure the advantages of artificial general intelligence (AGI) are shared equitably across society. Our vision emphasizes the importance of safety and alignment, ensuring that AI systems align with human values and remain under human control. We aspire to collaborate with other research and policy institutions to address global challenges associated with advanced AI, prioritizing the safe development of increasingly capable systems that act in the best interests of humanity.</p>",
"<p>Company Goals:</p><p> <b> 1. Audit Department: </b> <br> a. Enhance Risk Management and Internal Controls: Strengthen the organizations risk posture by identifying, assessing, and recommending improvements to internal controls.<br> b. Ensure Regulatory and Policy Compliance: Monitor adherence to relevant laws, regulations, standards, and internal policies.<br> c. Support Organizational Governance: Provide independent assurance to senior management and the board on the effectiveness of governance processes.<br> d. Drive Operational Efficiency: Identify inefficiencies and areas for process improvement during audits.<br> e. Leverage Technology and Data Analytics: Use automated tools and analytics for continuous auditing and real-time risk monitoring.<br> f. Develop Audit Talent and Capabilities: Invest in training and upskilling for the audit team.<br> g. Enhance Stakeholder Communication: Improve reporting clarity, timeliness, and relevance to stakeholders.<br><br> <b> 2. Finance Department: </b> <br> a. Ensure Financial Stability and Sustainability: Maintain strong cash flow and optimize working capital.<br> b. Improve Financial Planning and Analysis (FP&A): Provide accurate forecasting and budgeting to support strategic decision-making.<br> c. Enhance Cost Management and Operational Efficiency: Identify cost-saving opportunities and drive efficient use of financial resources.<br> d. Ensure Regulatory and Compliance Integrity: Maintain full compliance with financial regulations and internal controls.<br> e. Enable Strategic Investment and Capital Allocation: Evaluate and fund initiatives that align with the organizations long-term growth strategy.<br> f. Improve Financial Reporting and Transparency: Deliver timely and accurate financial reports for stakeholders.<br> g. Leverage Financial Technology and Automation: Implement tools to streamline financial operations.<br> h. 
Develop Finance Talent and Leadership: Upskill finance staff and promote cross-functional knowledge.<br><br> <b> 3. Account Department: </b> <br> a. Maintain Accurate and Timely Financial Records: Ensure all transactions are recorded properly.<br> b. Ensure Regulatory and Tax Compliance: Comply with all financial regulations and timely tax filings.<br> c. Streamline Month-End and Year-End Closing Processes: Reduce the time and complexity of closing periods.<br> d. Support Strategic Financial Planning: Provide reliable data for budgeting and forecasting.<br> e. Implement Automation and Digital Tools: Use accounting software and RPA to increase efficiency.<br> f. Strengthen Internal Controls and Risk Management: Ensure robust controls over financial transactions.<br> g. Improve Financial Transparency and Reporting Quality: Produce clear and consistent reports for stakeholders.<br> h. Develop Accounting Team Skills and Expertise: Provide continuous learning and leadership development for accounting staff.<br></p>"
],
"vision": "To create safe, broadly beneficial AI systems that ensure the advantages of artificial general intelligence (AGI) are shared equitably across society. Our vision emphasizes the importance of safety and alignment, ensuring that AI systems align with human values and remain under human control. We aspire to collaborate with other research and policy institutions to address global challenges associated with advanced AI, prioritizing the safe development of increasingly capable systems that act in the best interests of humanity.",
"department_goals": [
{
"department": "Account Management",
"goals": [
{
"goal_title": "Customer Satisfaction",
"goal_description": "Manage accounts effectively to enhance customer satisfaction and retention."
}
]
},
{
"department": "Finance",
"goals": [
{
"goal_title": "Financial Stability",
"goal_description": "Finance the company to ensure long-term sustainability and growth."
}
]
}
]
}
NOTE: FOLLOW THIS RESPONSE FORMAT STRICTLY , NOTHING BEFORE OR AFTER PLEASE !!!
"""
+7 -1
View File
@@ -1,5 +1,6 @@
import os
import json
import re
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List, Dict, Optional
@@ -80,9 +81,14 @@ class Chatbot:
self.client = OpenAI(api_key=self.api_key)
self.model = "gpt-4o-mini"
def clean_text(self, text):
# Remove all surrogate characters
return re.sub(r'[\uD800-\uDFFF]', '', text)
def _extract_text_from_docs(self, docs):
"""Extract text content from document objects."""
return [doc.page_content for doc in docs]
print(docs)
return [self.clean_text(doc.page_content) for doc in docs]
# Existing methods...
def validate_worker(self, question, docs) -> VisionMissionResponse:
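The `clean_text` helper in this hunk strips lone UTF-16 surrogate code points. These typically show up when a text extractor emits half of a surrogate pair, and they make `str.encode("utf-8")` fail. A minimal standalone sketch of the same regex follows; the function name mirrors the diff, while the sample string and the `SURROGATES` constant are made up for illustration.

```python
import re

# Same character class as in the diff: the UTF-16 surrogate range.
SURROGATES = re.compile(r'[\uD800-\uDFFF]')

def clean_text(text: str) -> str:
    """Remove lone surrogate code points so the text can be UTF-8 encoded."""
    return SURROGATES.sub('', text)

# Hypothetical extractor output containing a lone high surrogate.
raw = "Vision\ud83d statement"

try:
    raw.encode("utf-8")
    encodable_before = True
except UnicodeEncodeError:
    encodable_before = False

cleaned = clean_text(raw)
cleaned.encode("utf-8")  # succeeds once the surrogate is gone
```

Cleaning before the API call matters because the request body has to be serialized as UTF-8, which a lone surrogate would prevent.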
+271
View File
@@ -0,0 +1,271 @@
import os
from typing import List, Dict, Any
from datetime import datetime
import json
import random
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
class QuestionEnrichmentService:
"""
Service class to handle question enrichment with area and member assignments.
This is the reverse of question generation - it takes existing questions and assigns them to areas and members.
"""
def __init__(self):
self.api_key = os.getenv("OPENAI_API_KEY")
self.client = OpenAI(api_key=self.api_key)
self.model = "gpt-4o-mini"
def validate_question_object(self, question_obj: Dict[str, Any], index: int) -> Dict[str, Any]:
"""
Validate a single question object structure.
Args:
question_obj: The question object to validate
index: The index of the question object in the array (for error messages)
Returns:
Dict with 'valid' boolean and 'error' message if invalid
"""
required_fields = ['question', 'role', 'position_id', 'area_tags', 'members']
for field in required_fields:
if field not in question_obj:
return {
'valid': False,
'error': f"Question object at index {index} is missing required field '{field}'."
}
# Validate area_tags structure
if not isinstance(question_obj['area_tags'], list):
return {
'valid': False,
'error': f"Question object at index {index}: 'area_tags' must be an array."
}
for area_idx, area_tag in enumerate(question_obj['area_tags']):
if not isinstance(area_tag, dict) or 'name' not in area_tag or 'id' not in area_tag:
return {
'valid': False,
'error': f"Question object at index {index}: area_tag at index {area_idx} must have 'name' and 'id' fields."
}
# Validate members structure
if not isinstance(question_obj['members'], list):
return {
'valid': False,
'error': f"Question object at index {index}: 'members' must be an array."
}
for member_idx, member in enumerate(question_obj['members']):
if not isinstance(member, dict) or 'id' not in member:
return {
'valid': False,
'error': f"Question object at index {index}: member at index {member_idx} must have 'id' field."
}
return {'valid': True}
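For reference, a question object that passes the checks above looks roughly like the sample below. The field names come from `required_fields`; the IDs, the role name and the helper `missing_fields` are illustrative assumptions, not part of the service.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ['question', 'role', 'position_id', 'area_tags', 'members']

def missing_fields(question_obj: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the required fields absent from a question object."""
    return [field for field in REQUIRED_FIELDS if field not in question_obj]

# A complete object: every area tag has 'id' and 'name', every member has 'id'.
sample = {
    "question": "Are safety protocols being followed?",
    "role": "Site Supervisor",   # made-up role name
    "position_id": 42,           # made-up ID
    "area_tags": [{"id": 1436, "name": "Safety"}],
    "members": [{"id": 7}],
}

# An incomplete object: only the question text is present.
incomplete = {"question": "Is the system monitoring working properly?"}

print(missing_fields(sample))
print(missing_fields(incomplete))
```

The real validator stops at the first missing field and reports its index; the sketch collects all of them, which can be handier when debugging a payload.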
def _get_question_area_assignment_prompt(self):
"""
Get the prompt for OpenAI to assign questions to the most relevant area tags.
"""
return """
You are an expert at analyzing questions and determining which area/domain they belong to.
Your task is to analyze each question and assign it to the most relevant area tag from the provided list.
Guidelines:
1. Analyze the question content and context
2. Match the question to the most appropriate area tag based on its meaning and purpose
3. Consider what domain/area the question is actually testing or assessing
4. Choose only ONE area tag per question - the most relevant one
5. If multiple areas seem relevant, choose the most specific or primary one
Return your response as a JSON object with the question text as key and the selected area tag ID as value.
Example format:
{
"Is the system monitoring working properly?": 1276,
"Are safety protocols being followed?": 1436
}
"""
def _use_openai_for_area_assignment(self, questions_data: List[Dict[str, Any]]) -> Dict[str, int]:
"""
Use OpenAI to intelligently assign questions to the most relevant area tags.
Args:
questions_data: List of question objects
Returns:
Dict mapping question text to selected area tag ID
"""
try:
# Prepare the data for OpenAI
questions_info = []
all_area_tags = {}
for question_obj in questions_data:
question_text = question_obj['question']
area_tags = question_obj['area_tags']
questions_info.append({
"question": question_text,
"available_area_tags": area_tags
})
# Collect all unique area tags
for area_tag in area_tags:
all_area_tags[area_tag['id']] = area_tag['name']
# Create the prompt content
prompt_content = f"""
Questions to analyze and assign:
{json.dumps(questions_info, indent=2)}
Available area tags:
{json.dumps(all_area_tags, indent=2)}
For each question, select the most relevant area tag ID from its available_area_tags list.
"""
response = self.client.chat.completions.create(
model=self.model,
messages=[
{"role": "system", "content": self._get_question_area_assignment_prompt()},
{"role": "user", "content": prompt_content}
],
temperature=0.1,
max_tokens=1000
)
# Parse the response
response_content = response.choices[0].message.content
# Try to extract JSON from the response
try:
# Look for JSON in the response
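start_idx = response_content.find('{')
end_idx = response_content.rfind('}') + 1
# rfind() returns -1 when no '}' exists, so end_idx would be 0, never -1
if start_idx != -1 and end_idx > start_idx:
json_str = response_content[start_idx:end_idx]
assignments = json.loads(json_str)
return assignments
except (ValueError, AttributeError):
# Malformed JSON or an empty response: fall through to the empty-dict fallback
pass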
# Fallback: return empty dict if parsing fails
return {}
except Exception as e:
print(f"Error in OpenAI area assignment: {e}")
return {}
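The extraction step above (slice from the first `{` to the last `}` and parse) can be isolated into a small function that is easy to test without calling the model. This is a sketch under that assumption; `extract_json_object` and the sample reply are hypothetical names and data.

```python
import json
from typing import Any, Dict

def extract_json_object(reply: str) -> Dict[str, Any]:
    """Return the outermost {...} object found in reply, or {} if none parses."""
    start = reply.find('{')
    end = reply.rfind('}')
    # No braces at all, or a closing brace before the first opening brace.
    if start == -1 or end < start:
        return {}
    try:
        parsed = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}

# Hypothetical model reply with chatter around the JSON payload.
reply = 'Here are the assignments:\n{"Are safety protocols being followed?": 1436}\nDone.'
assignments = extract_json_object(reply)
```

Returning an empty dict on any failure matches the service's behavior: downstream code then falls back to the first area tag of each question.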
def _find_best_area_tag_for_question(self, question_text: str, area_tags: List[Dict], openai_assignments: Dict[str, int]) -> Dict:
"""
Find the most relevant area tag for a given question using OpenAI assignments.
Args:
question_text: The question text to match
area_tags: List of available area tags
openai_assignments: OpenAI assignments from batch processing
Returns:
The most relevant area tag
"""
# First try to use OpenAI assignment
if question_text in openai_assignments:
selected_area_id = openai_assignments[question_text]
for area_tag in area_tags:
if area_tag['id'] == selected_area_id:
return area_tag
# Fallback to first area tag if OpenAI assignment not found
return area_tags[0] if area_tags else None
def _select_member_for_question(self, question_text: str, members: List[Dict]) -> Dict:
"""
Select the most appropriate member for a given question.
For now, this is a simple selection, but could be enhanced with more logic.
Args:
question_text: The question text
members: List of available members
Returns:
Selected member
"""
# For now, just select the first member
# In a real implementation, this could consider member skills, workload, etc.
return members[0] if members else None
def enrich_questions(self, questions_data: List[Dict[str, Any]]) -> Dict[str, Any]:
"""
Enrich multiple questions with area and member assignments.
Each question gets assigned to ONE area_tag and ONE member based on OpenAI analysis.
Returns the exact same structure as generate_questions_from_sop_v3.
Args:
questions_data: List of question objects to enrich
Returns:
Dict in the same format as AllQuestions model
"""
# Validate input is a list
if not isinstance(questions_data, list):
return {
'success': False,
'error': "Input data must be an array of question objects."
}
# Validate each question object
for idx, question_obj in enumerate(questions_data):
validation_result = self.validate_question_object(question_obj, idx)
if not validation_result['valid']:
return {
'success': False,
'error': validation_result['error']
}
# Use OpenAI to get intelligent area assignments
openai_assignments = self._use_openai_for_area_assignment(questions_data)
# Process the enriched questions - each question gets ONE area_tag and ONE member
enriched_items = []
for question_obj in questions_data:
# Find the best area tag for this question using OpenAI
best_area_tag = self._find_best_area_tag_for_question(
question_obj['question'],
question_obj['area_tags'],
openai_assignments
)
# Select the best member for this question
selected_member = self._select_member_for_question(
question_obj['question'],
question_obj['members']
)
# Create a single item for this question
if best_area_tag and selected_member:
item = {
"area_tag": best_area_tag['id'],
"area_name": best_area_tag['name'],
"assigned_to": selected_member['id'],
"questions": question_obj['question'],
"role": question_obj['position_id'] # Using position_id as role ID
}
enriched_items.append(item)
# Return in the exact same format as generate_questions_from_sop_v3
return {
'success': True,
'questions': {
'items': enriched_items
}
}
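The assembly logic of `enrich_questions` can be exercised without the OpenAI call by passing a hand-written assignment map. The sketch below is a simplified stand-in, not the service itself: `pick_area_tag`, `build_response` and all sample data are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

def pick_area_tag(question: str, area_tags: List[Dict[str, Any]],
                  assignments: Dict[str, int]) -> Optional[Dict[str, Any]]:
    """Use the assigned tag ID if it is among the question's tags, else the first tag."""
    wanted = assignments.get(question)
    for tag in area_tags:
        if tag["id"] == wanted:
            return tag
    return area_tags[0] if area_tags else None

def build_response(questions: List[Dict[str, Any]],
                   assignments: Dict[str, int]) -> Dict[str, Any]:
    """Build one item per question that has both an area tag and a member."""
    items = []
    for q in questions:
        tag = pick_area_tag(q["question"], q["area_tags"], assignments)
        member = q["members"][0] if q["members"] else None
        if tag and member:
            items.append({
                "area_tag": tag["id"],
                "area_name": tag["name"],
                "assigned_to": member["id"],
                "questions": q["question"],
                "role": q["position_id"],
            })
    return {"success": True, "questions": {"items": items}}

questions = [
    {"question": "Are safety protocols being followed?", "role": "Supervisor", "position_id": 42,
     "area_tags": [{"id": 1276, "name": "Monitoring"}, {"id": 1436, "name": "Safety"}],
     "members": [{"id": 7}, {"id": 8}]},
    {"question": "Is the backup schedule documented?", "role": "Supervisor", "position_id": 42,
     "area_tags": [{"id": 1276, "name": "Monitoring"}],
     "members": []},
]
assignments = {"Are safety protocols being followed?": 1436}
result = build_response(questions, assignments)
```

Note that the second question is dropped because its `members` list is empty. The service behaves the same way: a question without a usable area tag or member produces no item and no error.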
+14 -6
View File
@@ -16,7 +16,6 @@ load_dotenv()
class QuestionsGenerator:
def __init__(self):
self.api_key = os.getenv("OPENAI_API_KEY")
self.client = OpenAI(api_key=self.api_key)
self.model = "gpt-4o-mini"
def generate_questions(self, input_data: Dict) -> AssessmentQuestions:
@@ -204,6 +203,7 @@ class QuestionsGeneratorV2:
sops = input_data['sops']
assessment_type = input_data['assessment_type']
total_duration = input_data['duration']
area_tags = input_data.get('area_tags', []) # Get area_tags if provided, default to empty list
# Chunk the SOPs into smaller pieces
@@ -211,16 +211,24 @@ class QuestionsGeneratorV2:
docs_text = [sops[i:i + chunk_size] for i in range(0, len(sops), chunk_size)]
docs = [{"type": "text", "text": text} for text in docs_text]
response = self.client.beta.chat.completions.parse(
model=self.model,
messages=[
# Prepare messages for the API call
messages = [
{"role": "system", "content": get_questions_prompt_v5()},
{"role": "user", "content": f"The SOPs are provided below."},
{"role": "user", "content": json.dumps(docs)},
{"role": "user", "content": f"Assessment Type: {assessment_type}"},
{"role": "user", "content": f"Duration: {total_duration}"}
],
]
# Add area_tags information if provided
if area_tags:
messages.append({"role": "user", "content": f"Available Area Tags: {json.dumps(area_tags)}"})
else:
messages.append({"role": "user", "content": "Area Tags: Not provided (empty array) - Please generate questions across multiple diverse area_names as instructed."})
response = self.client.beta.chat.completions.parse(
model=self.model,
messages=messages,
temperature=0.1,
response_format=AllQuestions, # Ensure you specify the correct format
max_tokens=6000
+57 -4
View File
@@ -106,10 +106,11 @@ class DocumentParser:
"""
try:
MODEL = "gpt-4o"
docs_text = self._extract_text_from_docs(docs)
prompt = get_vision_mission_extraction_from_doc2()
response = self.client.beta.chat.completions.parse(
model=self.model,
model=MODEL,
messages=[
{
"role": "system",
@@ -117,15 +118,15 @@ class DocumentParser:
},
{
"role": "user",
"content": f"Department to consider for the company goals generation{departments}"
"content": f"Department to consider for the company goals generation: {departments}"
},
{
"role": "user",
"content": [{"type": "text", "text": text} for text in docs_text],
}
],
response_format=VisionMissionResponse2,
max_tokens=10000,
response_format=VisionMissionResponseV3,
max_tokens=8000,
temperature=0.1
)
@@ -200,6 +201,58 @@ class DocumentParser:
temperature=0.1
)
return json.loads(response.choices[0].message.content)
def get_roles_using_slug(self, docs, role_slug):
# Extract the text content from the Document objects
docs_text = [doc.page_content for doc in docs]
response = self.client.beta.chat.completions.parse(
model=self.model,
messages=[
{
"role": "system",
"content": f'''You are a specialized role extractor for company documents. Your task is to identify and extract job roles/positions mentioned in the provided text.
TASK:
1. Extract ALL job roles/positions mentioned in the text as a list.
2. Filter the extracted roles based on the provided role_slug: "{role_slug}".
3. Return the filtered roles as a JSON list.
RULES:
- Return an empty list if no matching roles are found.
- The role_slug is a keyword or category used to filter relevant roles.
- Only include roles that semantically relate to the role_slug.
- Be precise in extracting official job titles rather than general descriptions.
EXAMPLES:
Example 1:
Text: "Our company is looking to hire a Senior Data Scientist, Junior Data Analyst, and Database Administrator for the Analytics department. We also have openings for Financial Manager and Customer Support Manager."
Role_slug: "data"
Expected output: ["Senior Data Scientist", "Junior Data Analyst", "Database Administrator"]
Example 2:
Text: "The restructuring process will affect several departments including the Financial Analysis team, Customer Relations department, and Sales Management. We are currently seeking a Regional Sales Manager, Sales Team Supervisor, and Customer Support Manager."
Role_slug: "manager"
Expected output: ["Regional Sales Manager", "Customer Support Manager"]
Provide the result as a valid JSON array of strings.
''',
},
{
"role": "user",
"content": [
{
"type": "text",
"text": text
} for text in docs_text
]
}
],
response_format=Roles_response,
max_tokens=4096,
temperature=0.1
)
return json.loads(response.choices[0].message.content)
'''def extract_departments_and_managers(self, docs):
+7 -3
View File
@@ -123,7 +123,7 @@ class SopPersonalAssessment:
return False
def generate_roles_from_questionnaire(self, questionnaire_data: List[dict]) -> Roles_response:
def generate_roles_from_questionnaire(self, questionnaire_data: List[dict], role_slug: str) -> Roles_response:
try:
# List of areas: ["communication", "development", etc.]
@@ -139,10 +139,14 @@ class SopPersonalAssessment:
{
"role": "user",
"content": f'''Questionnaire data: {questionnaire_data}''',
},
{
"role": "user",
"content": f'''Role slug to consider : {role_slug}''',
}
],
response_format=Roles_response,
max_tokens=16000,
max_tokens=4096,
temperature=0.1
)
extracted_text = json.loads(response.choices[0].message.content)
@@ -303,7 +307,7 @@ class SopGeneratorExecutive:
{"role": "system", "content": prompt},
{"role": "user", "content": f"questionnaire response:\n{user_content}"}
],
response_format=VisionMissionResponse2,
response_format=VisionMissionResponseV3,
max_tokens=16000,
temperature=0.1
)
+47 -15
View File
@@ -1,32 +1,55 @@
import os
from spire.doc import Document, FileFormat
from langchain_community.document_loaders import PyPDFLoader
from docx import Document as DocxDocument
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet
from langchain_community.document_loaders import PyPDFLoader, UnstructuredWordDocumentLoader
import pdfplumber
from langchain_core.documents import Document
def load_pdf_with_pdfplumber(file_path):
docs = []
with pdfplumber.open(file_path) as pdf:
for i, page in enumerate(pdf.pages):
text = page.extract_text() or ""  # guard against pages with no extractable text
docs.append(Document(page_content=text, metadata={"page": i}))
return docs
def convert_word_to_pdf(doc_path: str) -> str:
"""
Convert a .doc or .docx file to PDF using Spire.Doc.
Convert a .docx file to PDF using python-docx and reportlab.
Args:
doc_path (str): The path to the .doc or .docx file.
doc_path (str): The path to the .docx file.
Returns:
str: The path to the converted PDF file.
"""
pdf_path = os.path.splitext(doc_path)[0] + '.pdf'
# Create a Document object
document = Document()
# Load the Word document
document.LoadFromFile(doc_path)
# Save as PDF
document.SaveToFile(pdf_path, FileFormat.PDF)
document.Close()
doc = DocxDocument(doc_path)
# Create a PDF
pdf = SimpleDocTemplate(pdf_path, pagesize=letter)
styles = getSampleStyleSheet()
flowables = []
# Extract text from paragraphs and add to PDF
for para in doc.paragraphs:
if para.text:
p = Paragraph(para.text, styles['Normal'])
flowables.append(p)
flowables.append(Spacer(1, 12))
# Build the PDF
pdf.build(flowables)
return pdf_path
def load_document(file_path: str):
"""
Utility function to load a PDF, DOCX, or DOC file by first converting it to PDF.
Utility function to load a PDF, DOCX, or DOC file.
Args:
file_path (str): The path to the file to load.
@@ -34,15 +57,24 @@ def load_document(file_path: str):
Returns:
List[Document]: A list of Document objects representing the contents of the file.
"""
try:
extension = os.path.splitext(file_path)[1].lower()
if extension in ['.doc', '.docx']:
# Convert .doc or .docx to PDF first
if extension == '.docx':
# For .docx files, use UnstructuredWordDocumentLoader directly
loader = UnstructuredWordDocumentLoader(file_path)
return loader.load()
elif extension == '.doc':
# Convert .doc to .pdf first
pdf_path = convert_word_to_pdf(file_path)
loader = PyPDFLoader(pdf_path)
return loader.load()
elif extension == '.pdf':
loader = PyPDFLoader(file_path)
return load_pdf_with_pdfplumber(file_path)
else:
raise ValueError(f"Unsupported file type: {extension}. Only .pdf, .docx, and .doc are supported.")
return loader.load()
except Exception as e:
print(f"Error loading document: {str(e)}")
return None
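The extension-based dispatch in `load_document` can also be expressed as a lookup table, which keeps the supported types in one place. The sketch below uses stand-in loader functions that only return labelled strings; the real code calls `UnstructuredWordDocumentLoader`, `PyPDFLoader` and `pdfplumber`, none of which are used here. Unlike the function above, this version lets the `ValueError` propagate instead of printing it and returning `None`.

```python
import os
from typing import Callable, Dict, List

# Stand-in loaders (hypothetical): each returns a labelled list instead of Documents.
def load_docx(path: str) -> List[str]:
    return [f"docx:{path}"]

def load_doc(path: str) -> List[str]:
    return [f"doc:{path}"]

def load_pdf(path: str) -> List[str]:
    return [f"pdf:{path}"]

LOADERS: Dict[str, Callable[[str], List[str]]] = {
    ".docx": load_docx,
    ".doc": load_doc,
    ".pdf": load_pdf,
}

def load_document(file_path: str) -> List[str]:
    """Pick a loader by lower-cased file extension; reject anything else."""
    extension = os.path.splitext(file_path)[1].lower()
    loader = LOADERS.get(extension)
    if loader is None:
        raise ValueError(f"Unsupported file type: {extension}. Only .pdf, .docx, and .doc are supported.")
    return loader(file_path)

pages = load_document("Report.PDF")   # extension matching is case-insensitive
```

Whether to raise or return `None` is a design choice: raising forces callers to handle the failure, while the `None` return in the diff requires every caller to check the result before iterating.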
+244
View File
@@ -0,0 +1,244 @@
import json
import datetime
from reportlab.lib.pagesizes import letter
from reportlab.lib import colors
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, PageBreak, Image
from reportlab.platypus import Frame, PageTemplate, NextPageTemplate
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.enums import TA_CENTER, TA_JUSTIFY, TA_LEFT, TA_RIGHT
from reportlab.lib.units import inch, cm
from reportlab.pdfgen import canvas
def header_footer(canvas, doc):
"""Add the header and footer to each page"""
canvas.saveState()
# Header
header_text = "Standard Operating Procedures"
canvas.setFont('Helvetica-Bold', 10)
canvas.drawString(72, letter[1] - 40, header_text)
# Add a line below the header
canvas.setStrokeColor(colors.lightgrey)
canvas.line(72, letter[1] - 50, letter[0] - 72, letter[1] - 50)
# Footer with page number and date
current_date = datetime.datetime.now().strftime("%B %d, %Y")
page_num = f"Page {doc.page} | {current_date}"
canvas.setFont('Helvetica', 8)
canvas.drawString(letter[0] - 150, 40, page_num)
# Add a line above the footer
canvas.line(72, 50, letter[0] - 72, 50)
canvas.restoreState()
def convert_sop_to_pdf(sop_data, output_pdf="sop_document.pdf", company_name="Company Name"):
"""
Convert SOP data to a well-formatted PDF document
Args:
sop_data (dict or str): SOP data in dictionary format or JSON string
output_pdf (str): Output PDF filename
company_name (str): Name of the company to display on the cover page
"""
# Parse JSON string if needed
if isinstance(sop_data, str):
sop_data = json.loads(sop_data)
# Extract SOP details
sop_details = sop_data.get("sop_details", [])
# Create PDF document
doc = SimpleDocTemplate(output_pdf, pagesize=letter,
rightMargin=72, leftMargin=72,
topMargin=90, bottomMargin=72)
# Container for PDF elements
elements = []
# Get styles
styles = getSampleStyleSheet()
# Define custom styles
title_style = ParagraphStyle(
'TitleStyle',
parent=styles['Heading1'],
fontSize=24,
alignment=TA_CENTER,
spaceAfter=12,
fontName='Helvetica-Bold',
textColor=colors.darkblue
)
subtitle_style = ParagraphStyle(
'SubtitleStyle',
parent=styles['Heading2'],
fontSize=16,
alignment=TA_CENTER,
spaceAfter=36,
fontName='Helvetica-Bold',
textColor=colors.darkblue
)
heading2_style = ParagraphStyle(
'Heading2Style',
parent=styles['Heading2'],
fontSize=16,
spaceBefore=12,
spaceAfter=6,
fontName='Helvetica-Bold',
textColor=colors.darkblue,
borderWidth=0,
borderColor=colors.lightgrey,
borderPadding=5,
borderRadius=5
)
heading3_style = ParagraphStyle(
'Heading3Style',
parent=styles['Heading3'],
fontSize=12,
spaceBefore=6,
spaceAfter=3,
fontName='Helvetica-Bold',
textColor=colors.darkslategray
)
normal_style = ParagraphStyle(
'NormalStyle',
parent=styles['Normal'],
fontSize=10,
alignment=TA_JUSTIFY,
leading=14,
fontName='Helvetica'
)
bullet_style = ParagraphStyle(
'BulletStyle',
parent=normal_style,
leftIndent=20,
firstLineIndent=-15,
spaceBefore=2,
spaceAfter=2
)
# Create cover page
elements.append(Spacer(1, 2*inch))
elements.append(Paragraph("Standard Operating Procedures", title_style))
elements.append(Spacer(1, 0.5*inch))
#elements.append(Paragraph(company_name, subtitle_style))
elements.append(Spacer(1, 2*inch))
# Add current date
current_date = datetime.datetime.now().strftime("%B %d, %Y")
date_style = ParagraphStyle(
'DateStyle',
parent=styles['Normal'],
fontSize=12,
alignment=TA_CENTER,
fontName='Helvetica'
)
elements.append(Paragraph(f"Generated on: {current_date}", date_style))
# Add a page break after the cover page
elements.append(PageBreak())
# Process each role
for role_data in sop_details:
role_name = role_data.get("role", "Unnamed Role")
sops = role_data.get("sops", {})
narrative = role_data.get("narrative")
# Add role header with decorative element
elements.append(Paragraph(role_name, heading2_style))
# Add horizontal rule after heading
elements.append(Spacer(1, 0.05*inch))
# Add narrative if available
if narrative and narrative != "Narrative":
elements.append(Paragraph("Narrative:", heading3_style))
elements.append(Paragraph(narrative, normal_style))
elements.append(Spacer(1, 0.2*inch))
# Process SOPs
for sop_type in ["must", "shall", "will"]:
sop_items = sops.get(sop_type, [])
if sop_items:
# Capitalize the first letter of SOP type and make it bold
sop_type_title = sop_type.capitalize()
elements.append(Paragraph(f"{sop_type_title}:", heading3_style))
# Create bullet points for each SOP item with better formatting
for item in sop_items:
elements.append(Paragraph(f"{item}", bullet_style))
elements.append(Spacer(1, 0.15*inch))
# Add a page break between roles for cleaner separation
elements.append(PageBreak())
# Build the PDF document with header and footer
doc.build(elements, onFirstPage=header_footer, onLaterPages=header_footer)
return output_pdf
def main():
# Example usage
sop_json = """
{
"sop_details": [
{
"role": "Sales Manager",
"role_id": 140,
"sops": {
"must": [],
"shall": [
"Shall develop and implement sales strategies to achieve company targets.",
"Shall conduct regular performance reviews with the sales team to ensure targets are met.",
"Shall provide training and support to sales staff to enhance their skills and performance.",
"Shall analyze market trends and adjust sales strategies accordingly.",
"Shall maintain accurate records of sales activities and customer interactions."
],
"will": [
"Will lead the sales team to achieve monthly and quarterly sales goals.",
"Will collaborate with marketing to align sales strategies with promotional campaigns.",
"Will report sales performance to upper management on a regular basis.",
"Will identify potential new markets and customer segments for growth.",
"Will foster a positive team environment to motivate sales staff."
]
},
"areas": [],
"narrative": "The Sales Manager is responsible for leading and developing the sales team to achieve business targets and growth objectives. They will implement effective sales strategies, provide coaching to team members, and maintain strong customer relationships while ensuring all sales activities align with the company's overall goals."
},
{
"role": "Campaign Manager",
"role_id": 141,
"sops": {
"must": [],
"shall": [],
"will": [
"Will develop and execute marketing campaigns to promote products and services.",
"Will analyze campaign performance metrics to optimize future campaigns.",
"Will collaborate with cross-functional teams to ensure campaign alignment with business objectives.",
"Will manage campaign budgets and ensure effective allocation of resources.",
"Will stay updated on industry trends to inform campaign strategies."
]
},
"areas": [],
"narrative": "The Campaign Manager oversees the planning, execution, and analysis of marketing campaigns across various channels to drive brand awareness and customer acquisition. They work closely with creative teams, external vendors, and stakeholders to ensure campaigns are effective, on-brand, and deliver measurable results."
}
]
}
"""
# You can replace the company name with your own
output_file = convert_sop_to_pdf(sop_json, company_name="Strategic Business Solutions, Inc.")
print(f"PDF successfully generated: {output_file}")
if __name__ == "__main__":
main()
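The traversal that `convert_sop_to_pdf` performs (roles in order, then the `must`, `shall`, `will` groups, skipping empty ones) can be checked without ReportLab by producing a plain outline first. This is a sketch under that assumption; the `outline` helper and the shortened sample data are hypothetical.

```python
import json
from typing import List, Tuple, Union

SOP_ORDER = ["must", "shall", "will"]

def outline(sop_data: Union[str, dict]) -> List[Tuple[str, str, int]]:
    """Return (role, section title, item count) for every non-empty SOP group."""
    if isinstance(sop_data, str):
        sop_data = json.loads(sop_data)
    rows = []
    for role_data in sop_data.get("sop_details", []):
        role_name = role_data.get("role", "Unnamed Role")
        sops = role_data.get("sops", {})
        for sop_type in SOP_ORDER:
            items = sops.get(sop_type, [])
            if items:  # empty groups produce no heading in the PDF either
                rows.append((role_name, sop_type.capitalize(), len(items)))
    return rows

sample = json.dumps({
    "sop_details": [
        {"role": "Sales Manager",
         "sops": {"must": [], "shall": ["Shall review targets.", "Shall train staff."],
                  "will": ["Will report monthly."]}},
        {"sops": {"will": ["Will manage campaign budgets."]}},
    ]
})
rows = outline(sample)
```

The second entry has no `role` key, so it falls back to "Unnamed Role", mirroring the default used by the generator.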
+236 -132
View File
@@ -1,140 +1,244 @@
import requests
import pandas as pd
from dotenv import load_dotenv
load_dotenv()
import os
import json
import datetime
from reportlab.lib.pagesizes import letter
from reportlab.lib import colors
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, PageBreak, Image
from reportlab.platypus import Frame, PageTemplate, NextPageTemplate
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.enums import TA_CENTER, TA_JUSTIFY, TA_LEFT, TA_RIGHT
from reportlab.lib.units import inch, cm
from reportlab.pdfgen import canvas
DATA_KEY = os.getenv("AI_DATA_KEY")
# Constants for API requests
URL = "https://erpai.mkdlabs.com//v3/api/custom/erpai/common/get-data-ai"
HEADERS = {
"x-project": DATA_KEY # Replace with your actual key
}
def header_footer(canvas, doc):
"""Add the header and footer to each page"""
canvas.saveState()
# JSON bodies for API requests
def create_json_body(area_type, company_id):
return {
"type": area_type,
"options": {
"company_id": company_id
# Header
header_text = "Standard Operating Procedures"
canvas.setFont('Helvetica-Bold', 10)
canvas.drawString(72, letter[1] - 40, header_text)
# Add a line below the header
canvas.setStrokeColor(colors.lightgrey)
canvas.line(72, letter[1] - 50, letter[0] - 72, letter[1] - 50)
# Footer with page number and date
current_date = datetime.datetime.now().strftime("%B %d, %Y")
page_num = f"Page {doc.page} | {current_date}"
canvas.setFont('Helvetica', 8)
canvas.drawString(letter[0] - 150, 40, page_num)
# Add a line above the footer
canvas.line(72, 50, letter[0] - 72, 50)
canvas.restoreState()
def convert_sop_to_pdf(sop_data, output_pdf="sop_document.pdf", company_name="Company Name"):
"""
Convert SOP data to a well-formatted PDF document
Args:
sop_data (dict or str): SOP data in dictionary format or JSON string
output_pdf (str): Output PDF filename
company_name (str): Name of the company to display on the cover page
"""
# Parse JSON string if needed
if isinstance(sop_data, str):
sop_data = json.loads(sop_data)
# Extract SOP details
sop_details = sop_data.get("sop_details", [])
# Create PDF document
doc = SimpleDocTemplate(output_pdf, pagesize=letter,
rightMargin=72, leftMargin=72,
topMargin=90, bottomMargin=72)
# Container for PDF elements
elements = []
# Get styles
styles = getSampleStyleSheet()
# Define custom styles
title_style = ParagraphStyle(
'TitleStyle',
parent=styles['Heading1'],
fontSize=24,
alignment=TA_CENTER,
spaceAfter=12,
fontName='Helvetica-Bold',
textColor=colors.darkblue
)
subtitle_style = ParagraphStyle(
'SubtitleStyle',
parent=styles['Heading2'],
fontSize=16,
alignment=TA_CENTER,
spaceAfter=36,
fontName='Helvetica-Bold',
textColor=colors.darkblue
)
heading2_style = ParagraphStyle(
'Heading2Style',
parent=styles['Heading2'],
fontSize=16,
spaceBefore=12,
spaceAfter=6,
fontName='Helvetica-Bold',
textColor=colors.darkblue,
borderWidth=0,
borderColor=colors.lightgrey,
borderPadding=5,
borderRadius=5
)
heading3_style = ParagraphStyle(
'Heading3Style',
parent=styles['Heading3'],
fontSize=12,
spaceBefore=6,
spaceAfter=3,
fontName='Helvetica-Bold',
textColor=colors.darkslategray
)
normal_style = ParagraphStyle(
'NormalStyle',
parent=styles['Normal'],
fontSize=10,
alignment=TA_JUSTIFY,
leading=14,
fontName='Helvetica'
)
bullet_style = ParagraphStyle(
'BulletStyle',
parent=normal_style,
leftIndent=20,
firstLineIndent=-15,
spaceBefore=2,
spaceAfter=2
)
# Create cover page
elements.append(Spacer(1, 2*inch))
elements.append(Paragraph("Standard Operating Procedures", title_style))
elements.append(Spacer(1, 0.5*inch))
#elements.append(Paragraph(company_name, subtitle_style))
elements.append(Spacer(1, 2*inch))
# Add current date
current_date = datetime.datetime.now().strftime("%B %d, %Y")
date_style = ParagraphStyle(
'DateStyle',
parent=styles['Normal'],
fontSize=12,
alignment=TA_CENTER,
fontName='Helvetica'
)
elements.append(Paragraph(f"Generated on: {current_date}", date_style))
# Add a page break after the cover page
elements.append(PageBreak())
# Process each role
for role_data in sop_details:
role_name = role_data.get("role", "Unnamed Role")
sops = role_data.get("sops", {})
narrative = role_data.get("narrative")
# Add role header with decorative element
elements.append(Paragraph(role_name, heading2_style))
# Add horizontal rule after heading
elements.append(Spacer(1, 0.05*inch))
# Add narrative if available
if narrative and narrative != "Narrative":
elements.append(Paragraph("Narrative:", heading3_style))
elements.append(Paragraph(narrative, normal_style))
elements.append(Spacer(1, 0.2*inch))
# Process SOPs
for sop_type in ["must", "shall", "will"]:
sop_items = sops.get(sop_type, [])
if sop_items:
# Capitalize the first letter of SOP type and make it bold
sop_type_title = sop_type.capitalize()
elements.append(Paragraph(f"{sop_type_title}:", heading3_style))
# Create bullet points for each SOP item with better formatting
for item in sop_items:
elements.append(Paragraph(f"{item}", bullet_style))
elements.append(Spacer(1, 0.15*inch))
# Add a page break between roles for cleaner separation
elements.append(PageBreak())
# Build the PDF document with header and footer
doc.build(elements, onFirstPage=header_footer, onLaterPages=header_footer)
return output_pdf
def main():
    # Example usage
    sop_json = """
    {
        "sop_details": [
            {
                "role": "Sales Manager",
                "role_id": 140,
                "sops": {
                    "must": [],
                    "shall": [
                        "Shall develop and implement sales strategies to achieve company targets.",
                        "Shall conduct regular performance reviews with the sales team to ensure targets are met.",
                        "Shall provide training and support to sales staff to enhance their skills and performance.",
                        "Shall analyze market trends and adjust sales strategies accordingly.",
                        "Shall maintain accurate records of sales activities and customer interactions."
                    ],
                    "will": [
                        "Will lead the sales team to achieve monthly and quarterly sales goals.",
                        "Will collaborate with marketing to align sales strategies with promotional campaigns.",
                        "Will report sales performance to upper management on a regular basis.",
                        "Will identify potential new markets and customer segments for growth.",
                        "Will foster a positive team environment to motivate sales staff."
                    ]
                },
                "areas": [],
                "narrative": "The Sales Manager is responsible for leading and developing the sales team to achieve business targets and growth objectives. They will implement effective sales strategies, provide coaching to team members, and maintain strong customer relationships while ensuring all sales activities align with the company's overall goals."
            },
            {
                "role": "Campaign Manager",
                "role_id": 141,
                "sops": {
                    "must": [],
                    "shall": [],
                    "will": [
                        "Will develop and execute marketing campaigns to promote products and services.",
                        "Will analyze campaign performance metrics to optimize future campaigns.",
                        "Will collaborate with cross-functional teams to ensure campaign alignment with business objectives.",
                        "Will manage campaign budgets and ensure effective allocation of resources.",
                        "Will stay updated on industry trends to inform campaign strategies."
                    ]
                },
                "areas": [],
                "narrative": "The Campaign Manager oversees the planning, execution, and analysis of marketing campaigns across various channels to drive brand awareness and customer acquisition. They work closely with creative teams, external vendors, and stakeholders to ensure campaigns are effective, on-brand, and deliver measurable results."
            }
        ]
    }
    """
# Function to fetch data from the API
def fetch_data(json_body):
    # Make sure the "company_id" key is present in options (None if the caller omitted it)
    json_body["options"].setdefault("company_id", None)
    response = requests.post(URL, headers=HEADERS, json=json_body)
    response.raise_for_status()  # Raise an error for bad responses
    return response.json()


def convert_assessment_data_to_dataframe(assessment_data):
    df_assessment = []
    for assessment in assessment_data.get("data", []):
        assessment_id = assessment["assessment_id"]
        assessment_name = assessment["assessment_name"]
        start_date = assessment["start_date"]
        open_items = assessment["open_items"]
        completed_items = assessment["completed_items"]
        total_assigned_items = assessment["total_assigned_items"]
        red_flags = assessment["red_flags"]
        for user in assessment.get("user_details", []):
            user_name = user["name"]
            user_total_items = user["total_assigned_items"]
            user_completed_items = user["completed_items"]
            # One row per (assessment, user, area) combination
            for area in user.get("area_list", []):
                df_assessment.append({
                    "assessment_id": assessment_id,
                    "assessment_name": assessment_name,
                    "start_date": start_date,
                    "open_items_overall": open_items,
                    "completed_items_overall": completed_items,
                    "total_assigned_items_overall": total_assigned_items,
                    "user_name": user_name,
                    "user_total_assigned_items": user_total_items,
                    "user_completed_items": user_completed_items,
                    "area": area,
                    "red_flags": red_flags
                })
    return pd.DataFrame(df_assessment)
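
# Illustrative sketch (not part of the original module): the nested payload shape
# that convert_assessment_data_to_dataframe appears to expect, and the flattening
# it performs. The sample values and the helper name flatten_assessments are
# hypothetical; only the key names are taken from the function above. Plain dicts
# are used instead of pandas so the sketch runs with the standard library alone.

```python
# Hypothetical sample payload; key names mirror those read by the converter.
sample_assessment_data = {
    "data": [
        {
            "assessment_id": 1,
            "assessment_name": "Quarterly review",
            "start_date": "2025-01-01",
            "open_items": 3,
            "completed_items": 7,
            "total_assigned_items": 10,
            "red_flags": 1,
            "user_details": [
                {
                    "name": "Alice",
                    "total_assigned_items": 6,
                    "completed_items": 4,
                    "area_list": ["Sales", "Marketing"],
                },
                {
                    "name": "Bob",
                    "total_assigned_items": 4,
                    "completed_items": 3,
                    "area_list": ["Sales"],
                },
            ],
        }
    ]
}


def flatten_assessments(assessment_data):
    """Return one row (dict) per assessment/user/area combination."""
    rows = []
    for assessment in assessment_data.get("data", []):
        for user in assessment.get("user_details", []):
            for area in user.get("area_list", []):
                rows.append({
                    "assessment_id": assessment["assessment_id"],
                    "user_name": user["name"],
                    "area": area,
                })
    return rows


rows = flatten_assessments(sample_assessment_data)
```

# Note: with this row layout a user whose area_list is empty contributes no rows,
# and a user assigned to several areas is repeated once per area.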


# Summary statistics for overall assessment level
def generate_summary_statistics(df):
    total_assessments = df['assessment_id'].nunique()
    avg_open_items = df.groupby('assessment_id')['open_items_overall'].mean().mean()
    avg_completed_items = df.groupby('assessment_id')['completed_items_overall'].mean().mean()
    avg_total_assigned_items = df.groupby('assessment_id')['total_assigned_items_overall'].mean().mean()
    avg_red_flags = df['red_flags'].mean()
    total_users = df['user_name'].nunique()
    avg_user_total_items = df.groupby('user_name')['user_total_assigned_items'].mean().mean()
    avg_user_completed_items = df.groupby('user_name')['user_completed_items'].mean().mean()
    # Overall completion percentage; 0 when no items were assigned
    total_user_assigned = df['user_total_assigned_items'].sum()
    completion_rate_per_user = (
        (df['user_completed_items'].sum() / total_user_assigned) * 100
        if total_user_assigned > 0 else 0
    )
    area_summary = df['area'].value_counts()
    return {
        "total_assessments": total_assessments,
        "avg_open_items_per_assessment": avg_open_items,
        "avg_completed_items_per_assessment": avg_completed_items,
        "avg_total_assigned_items_per_assessment": avg_total_assigned_items,
        "avg_red_flags": avg_red_flags,
        "total_users": total_users,
        "avg_user_total_assigned_items": avg_user_total_items,
        "avg_user_completed_items": avg_user_completed_items,
        "completion_rate_per_user": completion_rate_per_user,
        "area_summary": area_summary.to_dict()
    }


# Additional statistics for efficiency and areas
def generate_extended_statistics(df):
    # Guard against division by zero: rows with no assigned items get a 0% rate
    # (a plain division would yield inf when completed > 0 and assigned == 0)
    assigned = df['user_total_assigned_items'].replace(0, float('nan'))
    df['user_completion_rate'] = (df['user_completed_items'] / assigned).fillna(0) * 100
    top_5_efficient_users = df.groupby('user_name')['user_completion_rate'].mean().nlargest(5).to_dict()
    bottom_5_least_efficient_users = df.groupby('user_name')['user_completion_rate'].mean().nsmallest(5).to_dict()
    df['uncompleted_items'] = df['user_total_assigned_items'] - df['user_completed_items']
    areas_with_most_uncompleted_items = df.groupby('area')['uncompleted_items'].sum().nlargest(5).to_dict()
    return {
        "top_5_efficient_users": top_5_efficient_users,
        "bottom_5_least_efficient_users": bottom_5_least_efficient_users,
        "areas_with_most_uncompleted_items": areas_with_most_uncompleted_items
    }
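
# Illustrative sketch (not part of the original module): the per-user completion
# rate used above, written as a scalar helper to make the zero-assignment case
# explicit. The name completion_rate is hypothetical, and treating "nothing
# assigned" as 0% is an assumption about the intended behaviour.

```python
def completion_rate(completed, total):
    """Percentage of completed items; 0.0 when nothing was assigned."""
    if total <= 0:
        # Avoid ZeroDivisionError and report no progress instead
        return 0.0
    return completed / total * 100


rate_normal = completion_rate(3, 4)
rate_unassigned = completion_rate(0, 0)
```

# The same convention is what fillna(0) aims for in the vectorised pandas version.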


# Generate statistics for problematic areas
def generate_problematic_area_statistics(df):
    total_open_items = df.groupby('name')['open_items'].sum().sort_values(ascending=False)
    total_red_flags = df.groupby('name')['red_flags'].sum().sort_values(ascending=False)
    return pd.DataFrame({
        "total_open_items": total_open_items,
        "total_red_flags": total_red_flags
    }).fillna(0)


def generate_summary_stats(assessment_data, area_data):
    assessment_df = convert_assessment_data_to_dataframe(assessment_data)
    problematic_area_df = pd.DataFrame(area_data.get("data", []))
    summary_stats = generate_summary_statistics(assessment_df)
    extended_stats = generate_extended_statistics(assessment_df)
    summary_stats["users(Workers) based stats"] = extended_stats
    problematic_stats = generate_problematic_area_statistics(problematic_area_df)
    summary_stats["Area based stats"] = problematic_stats.to_dict(orient='index')
    return summary_stats
    # You can replace the company name with your own
    output_file = convert_sop_to_pdf(sop_json, company_name="Strategic Business Solutions, Inc.")
    print(f"PDF successfully generated: {output_file}")
if __name__ == "__main__":
    from src.services.chatbot import Chatbot
    bot = Chatbot()
    res = bot.predict_next_n_assessment(companyid=12, N=3)
    main()
    print(res)
Binary file not shown.