Initial commit for deployment

2025-05-09 15:41:16 +01:00
commit ac98999507
54 changed files with 4343 additions and 0 deletions
@@ -0,0 +1,173 @@
+# AI Service Workflow and Architecture
+
+## Overview
+
+The AI Service is a modular, API-driven system that provides document processing, embedding, and chat functionality with multiple AI models. It's designed to support a chatbot application with document training, private/team chat options, and model switching capabilities.
+
+## System Architecture
+
+```
+┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
+│                 │     │                 │     │                 │
+│  Client Apps    │────▶│  AI Service API │────▶│  Vector Store   │
+│                 │     │                 │     │   (Pinecone)    │
+└─────────────────┘     └────────┬────────┘     └─────────────────┘
+                                 │
+                                 ▼
+                        ┌─────────────────┐     ┌─────────────────┐
+                        │                 │     │                 │
+                        │   AI Models     │────▶│  Local Storage  │
+                        │                 │     │                 │
+                        └─────────────────┘     └─────────────────┘
+```
+
+## Core Components
+
+1. **Document Service**: Processes documents, splits them into chunks, and stores embeddings
+2. **Embedding Service**: Generates vector embeddings for text using sentence transformers
+3. **Model Service**: Manages different AI models and generates responses
+4. **Chat Service**: Handles chat creation, message history, and team chat functionality
+
+## API Endpoints Workflow
+
+### Health Check
+
+- **Endpoint**: `GET /health`
+- **Purpose**: Simple health check to verify the service is running
+- **Response**: `{"status": "healthy"}`
+
+### Document Management Workflow
+
+1. **Process Document**
+   - **Endpoint**: `POST /documents`
+   - **Purpose**: Process a document for embedding
+   - **Workflow**:
+     - Client submits document content, title, and optional metadata
+     - Document is split into chunks
+     - Embeddings are generated for each chunk
+     - Embeddings are stored in Pinecone
+     - Document metadata is stored locally
+   - **Response**: Document metadata including ID and chunk count
+
+2. **Get All Documents**
+   - **Endpoint**: `GET /documents`
+   - **Purpose**: Retrieve all processed documents
+   - **Response**: List of document metadata
+
+3. **Get Document by ID**
+   - **Endpoint**: `GET /documents/{doc_id}`
+   - **Purpose**: Retrieve a specific document's metadata
+   - **Response**: Document metadata
+
+4. **Delete Document**
+   - **Endpoint**: `DELETE /documents/{doc_id}`
+   - **Purpose**: Remove a document and its embeddings
+   - **Workflow**:
+     - Document chunks are deleted from Pinecone
+     - Document metadata is removed from local storage
+   - **Response**: Success status
+
+5. **Search Documents**
+   - **Endpoint**: `POST /documents/search`
+   - **Purpose**: Semantic search across document embeddings
+   - **Workflow**:
+     - Query text is converted to an embedding
+     - Similar embeddings are found in Pinecone
+     - Results are returned with metadata and similarity scores
+   - **Response**: List of search results with metadata
+
+### Model Management Workflow
+
+1. **Get Available Models**
+   - **Endpoint**: `GET /models`
+   - **Purpose**: List all available AI models
+   - **Response**: List of model information (ID, name, description, etc.)
+
+2. **Get Model Information**
+   - **Endpoint**: `GET /models/{model_id}`
+   - **Purpose**: Get details about a specific model
+   - **Response**: Model information
+
+### Chat Workflow
+
+1. **Create Chat**
+   - **Endpoint**: `POST /chats`
+   - **Purpose**: Create a new chat session
+   - **Workflow**:
+     - Client provides user ID, optional title, and model ID
+     - System generates a unique chat ID
+     - Chat metadata is stored locally
+   - **Response**: Created chat information
+
+2. **Get User Chats**
+   - **Endpoint**: `GET /chats/user/{user_id}`
+   - **Purpose**: Get all chats for a specific user
+   - **Response**: List of chat information
+
+3. **Get Chat by ID**
+   - **Endpoint**: `GET /chats/{chat_id}`
+   - **Purpose**: Get a specific chat's information and messages
+   - **Response**: Chat information including message history
+
+4. **Send Message**
+   - **Endpoint**: `POST /chats/{chat_id}/messages`
+   - **Purpose**: Send a message and get AI response
+   - **Workflow**:
+     - Client sends message with user ID and optional model parameters
+     - User message is added to chat history
+     - If RAG is enabled, relevant documents are retrieved
+     - AI model generates a response based on chat history and context
+     - Bot response is added to chat history
+   - **Response**: Bot response message
+
+5. **Team Chat Management**
+   - **Add Team Member**: `POST /chats/{chat_id}/members/{user_id}`
+   - **Remove Team Member**: `DELETE /chats/{chat_id}/members/{user_id}`
+   - **Purpose**: Manage team chat participants
+   - **Response**: Success status
+
+6. **Delete Chat**
+   - **Endpoint**: `DELETE /chats/{chat_id}`
+   - **Purpose**: Remove a chat and its messages
+   - **Response**: Success status
+
+## Retrieval-Augmented Generation (RAG) Workflow
+
+When RAG is enabled in a chat message request:
+
+1. User message is processed
+2. Message is converted to an embedding
+3. Similar document chunks are retrieved from Pinecone
+4. Retrieved chunks are added as context to the prompt
+5. AI model generates a response using both the chat history and document context
+6. Response is returned to the user
+
+## Model Parameters
+
+The API supports customizing AI model behavior through parameters:
+
+- `temperature`: Controls randomness (0.0-2.0)
+- `max_tokens`: Maximum response length
+- `top_p`: Nucleus sampling parameter (0.0-1.0)
+- `frequency_penalty`: Penalizes repeated tokens (-2.0-2.0)
+- `presence_penalty`: Penalizes repeated topics (-2.0-2.0)
+- `stop_sequences`: Sequences where generation stops
+- `system_prompt`: Custom system prompt to guide the model
+
+## Deployment
+
+The service is deployed using uvicorn:
+
+```bash
+nohup uvicorn ai_service.run:app --host 0.0.0.0 --port 5251 &
+```
+
+## Example Usage Flow
+
+1. Process documents for knowledge base
+2. Create a new chat session
+3. Send messages with or without RAG
+4. Optionally add team members for collaborative chats
+5. Switch models as needed for different capabilities
+
+This architecture provides a flexible, scalable foundation for building AI-powered chat applications with document training capabilities.