Mini SpecsComply Pro (SCP) is a lightweight document compliance and validation tool that analyzes and verifies technical documents against predefined standards and project-specific requirements. It uses a GROQ-hosted LLM for reasoning and Cohere models for embedding and ranking.
## Features
- **Document Analysis:** Automated analysis of technical documents for compliance verification
- **AI-Powered Processing:**
  - GROQ LLM for deep reasoning and compliance analysis
  - Cohere for document embedding and result ranking
- **Advanced Standards Matching:**
  - Sophisticated matching algorithm to identify relevant standards
  - Section-based analysis for contextual understanding
  - Technical term recognition and keyword extraction
  - Relevance scoring system for accurate standard selection
- **Custom Standards Support:**
  - Upload and manage your own compliance standards
  - JSON-based standard definitions with flexible structure
- **Vector Database Support:**
  - Pinecone (default)
  - Weaviate (alternative)
- **RESTful API:** Built with FastAPI for easy integration
- **Real-time Processing:** Async support for efficient document handling
- **Structured Reports:** Detailed compliance feedback and recommendations with applied standards tracking
## Prerequisites
- Python 3.8 or higher
- pip or poetry for package management
- API keys for:
  - GROQ
  - Cohere
  - Pinecone (if using Pinecone) or Weaviate URL (if using Weaviate)
The setup and launch script (`launch.py`) checks your environment setup and starts the application. Once it is running, go to `http://localhost:8000` in your browser.
The API will be available at:
- API Documentation: `http://localhost:8000/docs`
## API Endpoints
- `POST /api/documents/upload` - Upload a document for analysis
- `GET /api/documents/{document_id}` - Get document status and results
- `POST /api/documents/{document_id}/resubmit` - Resubmit a document for re-analysis
- `GET /api/documents/{document_id}/analysis` - Get detailed compliance analysis
- `GET /api/standards` - List all available standards
- `POST /api/standards/upload` - Upload a custom standard definition
- `GET /api/standards/{standard_id}` - Get details of a specific standard
- `GET /api/health` - Health check endpoint
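As a rough illustration of how a client might address these endpoints, the sketch below builds request objects with Python's standard library without sending them. The base URL is the local address mentioned earlier; the helper names `status_request` and `resubmit_request` are hypothetical and not part of the project.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: the default local address


def status_request(document_id: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a GET request for a document's status and results."""
    url = f"{base_url}/api/documents/{quote(document_id, safe='')}"
    return Request(url, method="GET")


def resubmit_request(document_id: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a POST request that resubmits a document."""
    url = f"{base_url}/api/documents/{quote(document_id, safe='')}/resubmit"
    return Request(url, data=b"", method="POST")


req = status_request("doc-123")
print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running instance. The document ID is percent-encoded so that unusual characters cannot change the path.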
## Configuration
The application can be configured through environment variables or the `.env` file. Key configuration options:
- `DEBUG`: Enable debug mode (default: False)
- `VECTOR_DB`: Choose vector database backend ("pinecone" or "weaviate"; default: "pinecone")
- `EMBEDDING_MODEL`: Cohere embedding model (default: "embed-english-v3.0")
- `RERANKER_MODEL`: Cohere reranker model (default: "rerank-english-v2.0")
- `REASONING_MODEL`: GROQ model (default: "llama-3.3-70b-versatile")
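The options above could be read from the environment roughly as follows. This is a minimal sketch, not the project's actual configuration code: the `Settings` class and `load_settings` function are hypothetical names, and the set of strings accepted as "true" for `DEBUG` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults taken from the option list above.
DEFAULTS = {
    "VECTOR_DB": "pinecone",
    "EMBEDDING_MODEL": "embed-english-v3.0",
    "RERANKER_MODEL": "rerank-english-v2.0",
    "REASONING_MODEL": "llama-3.3-70b-versatile",
}


@dataclass(frozen=True)
class Settings:
    debug: bool
    vector_db: str
    embedding_model: str
    reranker_model: str
    reasoning_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping of environment variables, applying defaults."""
    vector_db = env.get("VECTOR_DB", DEFAULTS["VECTOR_DB"]).lower()
    if vector_db not in {"pinecone", "weaviate"}:
        raise ValueError(f"Unsupported VECTOR_DB: {vector_db!r}")
    return Settings(
        # Assumption: these strings count as "enabled"; anything else is False.
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
        vector_db=vector_db,
        embedding_model=env.get("EMBEDDING_MODEL", DEFAULTS["EMBEDDING_MODEL"]),
        reranker_model=env.get("RERANKER_MODEL", DEFAULTS["RERANKER_MODEL"]),
        reasoning_model=env.get("REASONING_MODEL", DEFAULTS["REASONING_MODEL"]),
    )


print(load_settings({"VECTOR_DB": "weaviate", "DEBUG": "True"}))
```

Rejecting an unknown `VECTOR_DB` value at startup surfaces a typo immediately instead of as a connection error later.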
## Development
### Project Structure
```
mini-specscomply-pro/
├── app/
│   ├── api/          # API routes and endpoints
│   ├── core/         # Core configuration and models
│   └── services/     # Business logic services
├── Data/             # Sample data and documents
├── requirements.txt  # Project dependencies
├── run.py            # Application runner
├── launch.py         # Setup and launch script
├── .env              # Environment variables
├── .gitignore        # Git ignore file
└── README.md         # Project documentation
```
## Advanced Standards Matching
Mini SpecsComply Pro matches documents with relevant standards in three steps:
1. **Document Analysis**
   - Extracts sections and headings from the document
   - Identifies key technical terms and phrases
   - Recognizes standard references (e.g., "ISO-9001", "IEEE 829")
2. **Relevance Scoring**
   - Calculates weighted scores based on multiple factors:
     - Direct standard name matches (highest weight)
     - Keyword matches between document and standard
     - Section-specific matches (e.g., in References or Requirements sections)
     - Technical term matches
     - Requirement-specific matches
3. **Standard Selection**
   - Selects the most relevant standards based on score threshold
   - Applies these standards during compliance analysis
   - Displays applied standards in the compliance report
This approach ensures that the most appropriate standards are applied to each document, improving the accuracy and relevance of compliance analysis.
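The scoring and selection steps can be illustrated with a simplified sketch. It is not the project's implementation: the weights, the threshold, and the function names are all hypothetical, only three of the listed factors are modeled, and keywords are matched as plain substrings.

```python
import re
from typing import Dict, List

# Hypothetical weights; the only documented fact is that name matches weigh most.
WEIGHTS = {"name": 5.0, "section": 2.0, "keyword": 1.0}
KEY_SECTIONS = ("references", "requirements")


def _normalize(text: str) -> str:
    """Lowercase and treat hyphens as spaces so 'IEEE 829' matches 'IEEE-829'."""
    return re.sub(r"[-\s]+", " ", text.lower())


def score_standard(sections: Dict[str, str], name: str, keywords: List[str]) -> float:
    """Score one standard against a document given as {heading: body text}."""
    score = 0.0
    needle = _normalize(name)
    for heading, body in sections.items():
        text = _normalize(body)
        if needle in text:
            score += WEIGHTS["name"]  # direct standard name match
            if heading.lower() in KEY_SECTIONS:
                score += WEIGHTS["section"]  # bonus for a match in a key section
        score += WEIGHTS["keyword"] * sum(1 for kw in keywords if _normalize(kw) in text)
    return score


def select_standards(scores: Dict[str, float], threshold: float = 3.0) -> List[str]:
    """Return the standards at or above the threshold, highest score first."""
    picked = [name for name, value in scores.items() if value >= threshold]
    return sorted(picked, key=lambda n: scores[n], reverse=True)


doc = {
    "Introduction": "Quality management for the XYZ system.",
    "References": "- ISO-9001:2015 Quality Management Systems",
}
scores = {
    "ISO-9001": score_standard(doc, "ISO-9001", ["quality", "management"]),
    "IEEE-829": score_standard(doc, "IEEE-829", ["test"]),
}
print(select_standards(scores))
```

In this example only ISO-9001 clears the threshold, because it is named in the References section and its keywords appear in both sections.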
## Document and Standard Formats
### Compliance Documents
For best results, structure your compliance documents with clear sections and headings. The system performs better with well-organized documents that include:
1. **Clear Headings**: Use markdown-style headings (e.g., `# Section Title`) to organize content
2. **Introduction Section**: Provide context and purpose of the document
3. **Scope Section**: Define what the document covers and doesn't cover
4. **Requirements Sections**: Clearly state requirements using terms like "shall", "must", "should"
5. **References Section**: List relevant standards, specifications, or other documents
6. **Technical Details**: Include specific technical information relevant to compliance
Example document structure:
```markdown
# System Compliance Specification
## Introduction
This document specifies the compliance requirements for the XYZ system.
## Scope
This specification applies to all components of the XYZ system.
## Requirements
### Functional Requirements
1. The system shall process user input within 500ms.
2. The system must maintain data integrity during power failures.
### Security Requirements
1. All data transmissions shall be encrypted using AES-256.
2. User authentication must comply with NIST guidelines.
## References
- ISO-9001:2015 Quality Management Systems
- IEEE-829 Software Test Documentation
```
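To see why this structure helps, here is a small sketch of how a document like the one above might be split into sections and scanned for requirement statements. The function names are hypothetical and the parsing is deliberately simple; it is an assumption about the general approach, not the tool's actual parser.

```python
import re
from typing import Dict, List

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
REQUIREMENT_TERMS = re.compile(r"\b(shall|must|should)\b", re.IGNORECASE)


def split_sections(markdown: str) -> Dict[str, List[str]]:
    """Map each heading title to the non-empty lines that follow it."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(2)
            sections[current] = []  # a repeated title would overwrite the earlier one
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


def requirement_lines(sections: Dict[str, List[str]]) -> List[str]:
    """Return every line that contains 'shall', 'must', or 'should'."""
    return [
        line
        for lines in sections.values()
        for line in lines
        if REQUIREMENT_TERMS.search(line)
    ]


doc = """# Spec
## Scope
Applies to all components.
## Requirements
1. The system shall respond within 500ms.
2. Data must be encrypted.
"""
sections = split_sections(doc)
print(list(sections))
print(requirement_lines(sections))
```

A document without headings would produce no sections at all in this sketch, which is one reason clear headings matter.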
### Custom Standard Definitions
Custom standards are defined in JSON format with the following structure:
```json
{
  "name": "ISO-9001",
  "description": "Quality Management System standard",
  "requirements": [
    {
      "id": "ISO-9001-4.1",
      "description": "The organization shall determine external and internal issues relevant to its purpose and strategic direction.",
      "severity": "major"
    },
    {
      "id": "ISO-9001-4.2",
      "description": "The organization shall monitor and review information about these external and internal issues.",
      "severity": "minor"
    }
  ]
}
```
You can also define multiple standards in a single file:
```json
{
  "standards": [
    {
      "name": "ISO-9001",
      "description": "Quality Management System standard",
      "requirements": [...]
    },
    {
      "name": "IEEE-829",
      "description": "Software Test Documentation standard",
      "requirements": [...]
    }
  ]
}
```
Requirement severity levels:
- `critical`: Major non-compliance that must be addressed immediately
- `major`: Significant issue that should be addressed soon
- `minor`: Less significant issue that should be addressed when convenient
- `info`: Informational note or suggestion
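Before uploading a custom standard, it can help to check the file locally. The sketch below accepts both the single-standard and the multi-standard layout and checks the severity values listed above. The function names are hypothetical, and treating every field shown in the examples as required is an assumption, since the format is described as flexible.

```python
import json
from typing import Any, Dict, List

SEVERITIES = {"critical", "major", "minor", "info"}


def validate_standard(standard: Dict[str, Any]) -> None:
    """Raise ValueError if a standard lacks a field or uses an unknown severity."""
    for field in ("name", "description", "requirements"):
        if field not in standard:
            raise ValueError(f"standard is missing '{field}'")
    for req in standard["requirements"]:
        missing = {"id", "description", "severity"} - req.keys()
        if missing:
            raise ValueError(f"requirement is missing {sorted(missing)}")
        if req["severity"] not in SEVERITIES:
            raise ValueError(f"unknown severity: {req['severity']!r}")


def load_standards(raw: str) -> List[Dict[str, Any]]:
    """Parse a single-standard or multi-standard JSON document into a list."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    standards = data["standards"] if "standards" in data else [data]
    for standard in standards:
        validate_standard(standard)
    return standards


raw = json.dumps({
    "name": "ISO-9001",
    "description": "Quality Management System standard",
    "requirements": [
        {"id": "ISO-9001-4.1", "description": "Determine issues.", "severity": "major"}
    ],
})
print([s["name"] for s in load_standards(raw)])
```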
## Troubleshooting
Common issues and solutions:
1. **Missing API Keys**
   - Ensure all required API keys are set in your `.env` file
   - Check the API key format and validity
2. **Vector Database Connection**
   - Verify the vector database configuration
   - Ensure the selected database service is running and accessible
3. **Model Errors**
   - Check API quotas and limits
   - Verify model names in configuration
4. **Standards Not Being Applied**
   - Verify that standards have been uploaded correctly
   - Check the logs for standards matching information
   - Ensure document content includes relevant terminology for matching
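For the first issue, a quick check along these lines can show which settings are absent. The variable names `GROQ_API_KEY`, `COHERE_API_KEY`, `PINECONE_API_KEY`, and `WEAVIATE_URL` are assumptions made for illustration; use the names your `.env` file actually defines. The `missing_keys` helper is hypothetical as well.

```python
import os
from typing import List, Mapping


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """List the required settings that are unset or blank."""
    # Assumed variable names; adjust them to match your .env file.
    required = ["GROQ_API_KEY", "COHERE_API_KEY"]
    backend = env.get("VECTOR_DB", "pinecone").lower()
    required.append("WEAVIATE_URL" if backend == "weaviate" else "PINECONE_API_KEY")
    return [name for name in required if not env.get(name, "").strip()]


print(missing_keys({"GROQ_API_KEY": "example", "VECTOR_DB": "weaviate"}))
```

Called with no argument, the function inspects the current process environment, so it only sees values from `.env` once that file has been loaded.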