Aherobo Ovie Victor 0e3e22e8cb Initial commit
2025-07-17 22:20:25 +01:00

Mini SpecsComply Pro (SCP)

Overview

Mini SpecsComply Pro (SCP) is a lightweight document compliance and validation tool that analyzes and verifies technical documents against predefined standards and project-specific requirements. It uses a GROQ LLM for reasoning and Cohere models for embedding and ranking.

Features

  • Document Analysis: Automated analysis of technical documents for compliance verification
  • AI-Powered Processing:
    • GROQ LLM for deep reasoning and compliance analysis
    • Cohere for document embedding and result ranking
  • Advanced Standards Matching:
    • Sophisticated matching algorithm to identify relevant standards
    • Section-based analysis for contextual understanding
    • Technical term recognition and keyword extraction
    • Relevance scoring system for accurate standard selection
  • Custom Standards Support:
    • Upload and manage your own compliance standards
    • JSON-based standard definitions with flexible structure
  • Vector Database Support:
    • Pinecone (default)
    • Weaviate (alternative)
  • RESTful API: Built with FastAPI for easy integration
  • Real-time Processing: Async support for efficient document handling
  • Structured Reports: Detailed compliance feedback and recommendations with applied standards tracking

Prerequisites

  • Python 3.8 or higher
  • pip or poetry for package management
  • API keys for:
    • GROQ
    • Cohere
    • Pinecone (if using Pinecone) or Weaviate URL (if using Weaviate)

Installation

  1. Clone the repository:
git clone http://23.29.118.76:3000/task/mini-specscomply-pro.git
cd mini-specscomply-pro
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Create a .env file in the project root:
# Required API Keys
GROQ_API_KEY=your_groq_api_key
COHERE_API_KEY=your_cohere_api_key

# Vector Database (Choose one)
# For Pinecone:
VECTOR_DB=pinecone
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_ENVIRONMENT=your_pinecone_environment  # e.g. us-east-1
PINECONE_INDEX_NAME=specscomply_documents

# Or for Weaviate:
# VECTOR_DB=weaviate
# WEAVIATE_URL=your_weaviate_url
# WEAVIATE_API_KEY=your_weaviate_api_key

# Optional Settings
APP_NAME="Mini SpecsComply Pro"
APP_VERSION="0.1.0"
DEBUG=False
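
Before launching, it can help to confirm that the required settings are present. The following is a minimal sketch, not the project's own check: the helper name `missing_settings` is hypothetical, and it assumes that Pinecone needs all three `PINECONE_*` values while Weaviate needs only `WEAVIATE_URL`, as the Prerequisites suggest.

```python
import os
from typing import Dict, List, Mapping

# Keys every deployment needs, per the .env template above.
REQUIRED_KEYS = ["GROQ_API_KEY", "COHERE_API_KEY"]

# Extra keys per vector database backend (assumption: Weaviate needs only its URL).
BACKEND_KEYS: Dict[str, List[str]] = {
    "pinecone": ["PINECONE_API_KEY", "PINECONE_ENVIRONMENT", "PINECONE_INDEX_NAME"],
    "weaviate": ["WEAVIATE_URL"],
}


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the names of required settings that are absent or empty."""
    # Pinecone is described as the default backend, so fall back to it.
    backend = env.get("VECTOR_DB", "pinecone").strip().lower()
    if backend not in BACKEND_KEYS:
        return ["VECTOR_DB"]
    needed = REQUIRED_KEYS + BACKEND_KEYS[backend]
    return [key for key in needed if not env.get(key, "").strip()]


if __name__ == "__main__":
    problems = missing_settings(os.environ)
    print("Missing settings:", ", ".join(problems) if problems else "none")
```

Run it in the same shell that will start the application, so it reads the same environment variables.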

Running the Application

Quick Start

python launch.py

This checks your environment setup and starts the application. Open http://localhost:8000 in your browser.

Interactive API documentation is available at:

  • API Documentation: http://localhost:8000/docs

API Endpoints

  • POST /api/documents/upload - Upload a document for analysis
  • GET /api/documents/{document_id} - Get document status and results
  • POST /api/documents/{document_id}/resubmit - Resubmit a document for re-analysis
  • GET /api/documents/{document_id}/analysis - Get detailed compliance analysis
  • GET /api/standards - List all available standards
  • POST /api/standards/upload - Upload a custom standard definition
  • GET /api/standards/{standard_id} - Get details of a specific standard
  • GET /api/health - Health check endpoint
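
The read-only endpoints above can be called with Python's standard library alone. This is a rough sketch that assumes the server runs at the default address and returns JSON bodies. The response schema is not documented here, so the code does not rely on any particular field. The helper names `endpoint`, `get_json`, and `document_paths` are illustrative.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:8000"  # default address from the Quick Start section


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an API path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str, base_url: str = BASE_URL, timeout: float = 10.0) -> Any:
    """Send a GET request and decode the response body as JSON (assumed format)."""
    with urllib.request.urlopen(endpoint(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def document_paths(document_id: str) -> Dict[str, str]:
    """Paths of the per-document endpoints listed above."""
    root = f"/api/documents/{document_id}"
    return {
        "status": root,
        "analysis": root + "/analysis",
        "resubmit": root + "/resubmit",  # POST endpoint, listed for completeness
    }


# Example calls, to be run while the server is up:
#   get_json("/api/health")
#   get_json("/api/standards")
#   get_json(document_paths("<document_id>")["analysis"])
```

Uploads are left out on purpose: the expected form field names for POST /api/documents/upload are best read from the interactive documentation at /docs.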

Configuration

The application can be configured through environment variables or the .env file. Key configuration options:

  • DEBUG: Enable debug mode (default: False)
  • VECTOR_DB: Choose vector database backend ("pinecone" or "weaviate")
  • EMBEDDING_MODEL: Cohere embedding model (default: "embed-english-v3.0")
  • RERANKER_MODEL: Cohere reranker model (default: "rerank-english-v2.0")
  • REASONING_MODEL: GROQ model (default: "llama-3.3-70b-versatile")
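
As an illustration of how these options and their defaults fit together, here is a small sketch that resolves them from a mapping such as `os.environ`. The `Settings` class and `load_settings` function are hypothetical and are not taken from the project's `app/core` package. The set of values accepted as true for DEBUG is also an assumption.

```python
from dataclasses import dataclass
from typing import Mapping

# Strings treated as "true" for DEBUG (assumption; the project may differ).
TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    """Defaults mirror the values documented in the list above."""
    debug: bool = False
    vector_db: str = "pinecone"
    embedding_model: str = "embed-english-v3.0"
    reranker_model: str = "rerank-english-v2.0"
    reasoning_model: str = "llama-3.3-70b-versatile"


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings from environment-style key/value pairs."""
    defaults = Settings()
    vector_db = env.get("VECTOR_DB", defaults.vector_db).strip().lower()
    if vector_db not in ("pinecone", "weaviate"):
        raise ValueError(f"Unsupported VECTOR_DB: {vector_db!r}")
    return Settings(
        debug=env.get("DEBUG", "").strip().lower() in TRUE_VALUES,
        vector_db=vector_db,
        embedding_model=env.get("EMBEDDING_MODEL", defaults.embedding_model),
        reranker_model=env.get("RERANKER_MODEL", defaults.reranker_model),
        reasoning_model=env.get("REASONING_MODEL", defaults.reasoning_model),
    )
```

Calling `load_settings(os.environ)` after the .env file has been loaded would give the effective configuration under these assumptions.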

Development

Project Structure

mini-specscomply-pro/
├── app/
│   ├── api/          # API routes and endpoints
│   ├── core/         # Core configuration and models
│   └── services/     # Business logic services
├── Data/             # Sample data and documents
├── requirements.txt  # Project dependencies
├── run.py            # Application runner
├── launch.py         # Setup and launch script
├── .env              # Environment variables
├── .gitignore        # Git ignore file
└── README.md         # Project documentation

Advanced Standards Matching

Mini SpecsComply Pro matches documents with relevant standards in three steps:

  1. Document Analysis

    • Extracts sections and headings from the document
    • Identifies key technical terms and phrases
    • Recognizes standard references (e.g., "ISO-9001", "IEEE 829")
  2. Relevance Scoring

    • Calculates weighted scores based on multiple factors:
      • Direct standard name matches (highest weight)
      • Keyword matches between document and standard
      • Section-specific matches (e.g., in References or Requirements sections)
      • Technical term matches
      • Requirement-specific matches
  3. Standard Selection

    • Selects the most relevant standards based on score threshold
    • Applies these standards during compliance analysis
    • Displays applied standards in the compliance report

This approach is intended to apply the most appropriate standards to each document, which improves the accuracy and relevance of the compliance analysis.
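
The three steps can be sketched in a few lines of Python. This is a simplified illustration rather than the project's algorithm: the weights, the threshold, and the crude keyword extraction are all assumed values chosen for the example, and only three of the listed factors (name match, References-section match, keyword overlap) are modelled.

```python
import re
from typing import Dict, List, Set, Tuple

# Illustrative weights and cut-off; the project's real values are not documented here.
WEIGHTS = {"name": 5.0, "references": 2.0, "keywords": 3.0}
THRESHOLD = 3.0


def normalize(text: str) -> str:
    """Upper-case and unify separators so 'IEEE 829' equals 'IEEE-829'."""
    return re.sub(r"[\s\-]+", "-", text.strip().upper())


def keywords(text: str) -> Set[str]:
    """Lower-case words of four or more letters, a crude stand-in for term extraction."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def references_section(document: str) -> str:
    """Return the body under a markdown 'References' heading, or '' if absent."""
    match = re.search(r"^#+[ \t]*References[ \t]*$(.*?)(?=^#|\Z)", document,
                      flags=re.M | re.S | re.I)
    return match.group(1) if match else ""


def score(document: str, standard: Dict) -> float:
    """Weighted relevance of one standard for one document."""
    name = normalize(standard["name"])
    total = 0.0
    if name in normalize(document):  # direct name match, highest weight
        total += WEIGHTS["name"]
    if name in normalize(references_section(document)):  # section-specific match
        total += WEIGHTS["references"]
    standard_text = standard.get("description", "") + " " + " ".join(
        req.get("description", "") for req in standard.get("requirements", []))
    terms = keywords(standard_text)
    if terms:  # fraction of the standard's terms that also occur in the document
        total += WEIGHTS["keywords"] * len(terms & keywords(document)) / len(terms)
    return total


def select_standards(document: str, standards: List[Dict]) -> List[Tuple[str, float]]:
    """Keep standards at or above the threshold, best match first."""
    scored = [(s["name"], score(document, s)) for s in standards]
    kept = [item for item in scored if item[1] >= THRESHOLD]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

With these example weights, a standard named in the document passes the threshold on the name match alone, while keyword overlap by itself can at most reach the threshold when every term matches.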

Document and Standard Formats

Compliance Documents

For best results, structure your compliance documents with clear sections and headings. The system performs better with well-organized documents that include:

  1. Clear Headings: Use markdown-style headings (e.g., # Section Title) to organize content
  2. Introduction Section: Provide context and purpose of the document
  3. Scope Section: Define what the document covers and doesn't cover
  4. Requirements Sections: Clearly state requirements using terms like "shall", "must", "should"
  5. References Section: List relevant standards, specifications, or other documents
  6. Technical Details: Include specific technical information relevant to compliance

Example document structure:

# System Compliance Specification

## Introduction
This document specifies the compliance requirements for the XYZ system.

## Scope
This specification applies to all components of the XYZ system.

## Requirements
### Functional Requirements
1. The system shall process user input within 500ms.
2. The system must maintain data integrity during power failures.

### Security Requirements
1. All data transmissions shall be encrypted using AES-256.
2. User authentication must comply with NIST guidelines.

## References
- ISO-9001:2015 Quality Management Systems
- IEEE-829 Software Test Documentation
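
To see why this structure helps, consider how a parser might split such a document into sections and pick out requirement statements. The sketch below is an assumption about one possible approach, not the project's parser. `split_sections` and `find_requirements` are hypothetical names.

```python
import re
from typing import Dict, List, Optional, Tuple

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")  # markdown-style heading line
REQUIREMENT = re.compile(r"\b(shall|must|should)\b", re.IGNORECASE)


def split_sections(markdown: str) -> Dict[str, List[str]]:
    """Map each heading title to the non-empty lines directly beneath it.

    Repeated titles overwrite earlier ones, which is acceptable for a sketch.
    """
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in markdown.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = heading.group(2)
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


def find_requirements(sections: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (section title, line) pairs that contain shall, must, or should."""
    return [
        (title, line)
        for title, lines in sections.items()
        for line in lines
        if REQUIREMENT.search(line)
    ]
```

Applied to the example document above, this would report the four numbered statements under Functional Requirements and Security Requirements, since those are the only lines containing "shall" or "must".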

Custom Standard Definitions

Custom standards are defined in JSON format with the following structure:

{
  "name": "ISO-9001",
  "description": "Quality Management System standard",
  "requirements": [
    {
      "id": "ISO-9001-4.1",
      "description": "The organization shall determine external and internal issues relevant to its purpose and strategic direction.",
      "severity": "major"
    },
    {
      "id": "ISO-9001-4.2",
      "description": "The organization shall monitor and review information about these external and internal issues.",
      "severity": "minor"
    }
  ]
}

You can also define multiple standards in a single file:

{
  "standards": [
    {
      "name": "ISO-9001",
      "description": "Quality Management System standard",
      "requirements": [...]
    },
    {
      "name": "IEEE-829",
      "description": "Software Test Documentation standard",
      "requirements": [...]
    }
  ]
}

Requirement severity levels:

  • critical: Severe non-compliance that must be addressed immediately
  • major: Significant issue that should be addressed soon
  • minor: Less significant issue that should be addressed when convenient
  • info: Informational note or suggestion
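
Before uploading a standards file, it may be worth checking it locally. The following sketch accepts both the single-standard and the multi-standard layout and normalizes them to a list. The function name `load_standards` is hypothetical, and the rules it enforces (a non-empty name, an id and description per requirement, and a severity from the four levels above when one is given) are assumptions based on the examples, not the server's actual validation.

```python
import json
from typing import Any, Dict, List

SEVERITIES = {"critical", "major", "minor", "info"}


def load_standards(raw: str) -> List[Dict[str, Any]]:
    """Parse a single-standard or multi-standard JSON document into a list."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    # The multi-standard form wraps the definitions in a "standards" array.
    standards = data["standards"] if "standards" in data else [data]
    for standard in standards:
        if not standard.get("name"):
            raise ValueError("standard is missing a 'name'")
        for req in standard.get("requirements", []):
            if "id" not in req or "description" not in req:
                raise ValueError(f"requirement in {standard['name']} needs 'id' and 'description'")
            severity = req.get("severity")
            if severity is not None and severity not in SEVERITIES:
                raise ValueError(f"unknown severity {severity!r} in {req['id']}")
    return standards
```

For example, `load_standards(open("my_standard.json").read())` would return a list with one entry for the single-standard layout, where `my_standard.json` is a placeholder file name.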

Troubleshooting

Common issues and solutions:

  1. Missing API Keys

    • Ensure all required API keys are set in your .env file
    • Check the API key format and validity
  2. Vector Database Connection

    • Verify the vector database configuration
    • Ensure the selected database service is running and accessible
  3. Model Errors

    • Check API quotas and limits
    • Verify model names in configuration
  4. Standards Not Being Applied

    • Verify that standards have been uploaded correctly
    • Check the logs for standards matching information
    • Ensure document content includes relevant terminology for matching