Ticket Scaling Microservice - Design Document

Table of Contents

  1. Architecture Overview
  2. System Components
  3. Scalability Strategies
  4. Atomic Operations
  5. Fallback Mechanisms
  6. Performance Optimizations
  7. Monitoring & Observability
  8. Security Considerations
  9. Deployment Strategy
  10. Future Enhancements

Architecture Overview

High-Level Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Load Balancer │    │   Prometheus    │    │    Grafana      │
│   (Optional)    │    │   Monitoring    │    │   Dashboard     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         │                       │                       │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│                 │    │                 │    │                 │
│  Ticket Service │◄───┤     Redis       │    │  In-Memory      │
│  (Node.js/      │    │   Primary Store │    │  Fallback Store │
│   Express)      │    │                 │    │                 │
│                 │    │                 │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │
         │
┌─────────────────┐
│   PDF Generator │
│   (PDFKit)      │
└─────────────────┘

Design Principles

  1. High Availability: Fallback mechanisms ensure service continuity
  2. Atomic Operations: Redis Lua scripts prevent race conditions
  3. Horizontal Scalability: Stateless design enables easy scaling
  4. Observability: Comprehensive logging and metrics
  5. Performance: Optimized for high-throughput scenarios

System Components

1. Core Application (server.js)

  • Technology: Node.js with Express framework
  • Responsibilities:
    • HTTP request handling
    • Business logic orchestration
    • Error handling and logging
    • PDF generation coordination

2. Redis Client (redis-client.js)

  • Technology: Redis with Lua scripting
  • Responsibilities:
    • Atomic ticket operations
    • Event metadata management
    • Connection health monitoring
    • Script execution

3. Fallback Store (fallback-store.js)

  • Technology: In-memory JavaScript Map
  • Responsibilities:
    • Emergency ticket storage
    • Temporary operation continuity
    • Graceful degradation

4. PDF Generator (pdf-generator.js)

  • Technology: PDFKit library
  • Responsibilities:
    • Professional ticket generation
    • File management
    • Cleanup operations

5. Logging System (logger.js)

  • Technology: Winston logging framework
  • Responsibilities:
    • Structured logging
    • Request tracking
    • Error reporting
    • Performance metrics

Scalability Strategies

Horizontal Scaling

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Instance 1    │    │   Instance 2    │    │   Instance N    │
│   Port: 3049    │    │   Port: 3050    │    │   Port: 305X    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                    ┌─────────────────┐
                    │  Shared Redis   │
                    │    Cluster      │
                    └─────────────────┘

Key Features:

  • Stateless application design
  • Shared Redis backend
  • Load balancer distribution
  • Independent scaling

Vertical Scaling

  • CPU: Multi-core utilization through Node.js cluster mode
  • Memory: Configurable heap sizes for high-throughput
  • I/O: Async operations prevent blocking

Database Scaling

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Redis Master  │    │  Redis Replica  │    │  Redis Replica  │
│   (Read/Write)  │───▶│   (Read Only)   │    │   (Read Only)   │
└─────────────────┘    └─────────────────┘    └─────────────────┘

Strategies:

  • Redis clustering for horizontal scaling
  • Read replicas for metrics/stats queries
  • Sharding by event ID for massive scale
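Sharding by event ID could be sketched as follows. This is a minimal illustration, not the service's actual routing code: `shardForEvent` and `eventKeys` are hypothetical helpers, the FNV-1a hash is one arbitrary choice of stable hash, and the braced key names are an assumption that differs from the `event:X:tickets` naming used elsewhere in this document.

```javascript
// Sketch: application-level sharding by event ID (helper names are hypothetical).

// FNV-1a 32-bit hash: stable across processes, so every instance picks the same shard.
function fnv1a(str) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned 32-bit
  }
  return hash >>> 0;
}

// Map an event ID to a shard index in [0, shardCount).
function shardForEvent(eventId, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError('shardCount must be a positive integer');
  }
  return fnv1a(String(eventId)) % shardCount;
}

// With Redis Cluster, only the text inside {...} (the hash tag) is hashed,
// so both keys of one event land in the same hash slot.
function eventKeys(eventId) {
  const tag = `{event:${eventId}}`;
  return { tickets: `${tag}:tickets`, meta: `${tag}:meta` };
}
```

Redis Cluster requires all keys touched by one Lua script to live in the same hash slot. Hash tags cover the per-event keys, but a single shared `global:stats` key would not share that slot, so a clustered deployment would likely need to update global stats outside the purchase script.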

Atomic Operations

Lua Script Design

Our core purchase operation uses a Redis Lua script to ensure atomicity:

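-- Atomic ticket purchase script (sketch: hash field names, KEYS[4], and ARGV[1] are assumptions)
local ticketKey = KEYS[1]    -- event:X:tickets
local metaKey = KEYS[2]      -- event:X:meta
local globalKey = KEYS[3]    -- global:stats
local purchasesKey = KEYS[4] -- event:X:purchases (assumed)

-- 1. Check event exists
if redis.call('EXISTS', metaKey) == 0 then return redis.error_reply('EVENT_NOT_FOUND') end
-- 2. Pop ticket from list (LPOP yields false when the list is empty)
local ticket = redis.call('LPOP', ticketKey)
if not ticket then return redis.error_reply('SOLD_OUT') end
-- 3. Update sold count
redis.call('HINCRBY', metaKey, 'sold', 1)
-- 4. Update global stats
redis.call('HINCRBY', globalKey, 'totalSold', 1)
-- 5. Store purchase record (ARGV[1] = purchase timestamp)
redis.call('HSET', purchasesKey, ticket, ARGV[1])
return ticket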

Benefits:

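  • Race Condition Prevention: Redis runs the script as a single unit, so no other command interleaves with it
  • Consistency: The existence check and the ticket pop happen before any counter is updated, so a rejected purchase leaves no partial state
  • Performance: Single round-trip to Redis
  • Reliability: Redis does not roll back earlier writes if a script fails midway, so the script validates everything before its first counter update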

Concurrency Handling

  • Serialized Execution: Redis runs each Lua script to completion before serving other commands, so concurrent purchases need no explicit locks
  • Queue Management: Redis lists provide FIFO ticket distribution
  • Connection Pooling: Efficient Redis connection reuse

Fallback Mechanisms

Activation Triggers

  1. Redis Connection Failure: Network issues or Redis downtime
  2. Script Execution Errors: Lua script failures
  3. Timeout Scenarios: Slow Redis responses

Fallback Architecture

┌─────────────────┐
│  Request Comes  │
└─────────────────┘
         │
         ▼
┌─────────────────┐    ┌─────────────────┐
│  Try Redis      │───▶│  Redis Success  │
│  Operation      │    │  Return Result  │
└─────────────────┘    └─────────────────┘
         │
         ▼ (On Failure)
┌─────────────────┐    ┌─────────────────┐
│  Activate       │───▶│  In-Memory      │
│  Fallback Store │    │  Operation      │
└─────────────────┘    └─────────────────┘
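The flow in the diagram can be sketched as a small wrapper that covers all three activation triggers: connection failure, script error, and timeout. This is a sketch under stated assumptions, not the service's implementation: `withTimeout`, `withFallback`, the 500 ms default, and the `onFallback` hook are all hypothetical.

```javascript
// Sketch: try the primary (Redis) operation, fall back on error or timeout.
// All names and defaults here are hypothetical.

// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function withFallback(primaryOp, fallbackOp, options = {}) {
  const { timeoutMs = 500, onFallback = () => {} } = options;
  try {
    const result = await withTimeout(primaryOp(), timeoutMs);
    return { source: 'redis', result };
  } catch (err) {
    onFallback(err); // e.g. log a warning that degraded mode is active
    return { source: 'fallback', result: await fallbackOp() };
  }
}
```

One caveat with the timeout path: a Redis call that times out on the client may still have completed on the server, so a real implementation would need some way to reconcile such purchases.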

Fallback Store Improvements

  • Automatic Seeding: Fallback store is seeded during server startup and when activated
  • Data Synchronization: Automatic attempt to sync with Redis data when activated
  • Manual Seeding: Admin endpoint to manually populate fallback store from Redis
  • Resilient Operation: Continues functioning even when Redis is completely unavailable
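The seeding behavior described above might look like the following Map-backed store. It is a minimal sketch: the class name, method names, snapshot shape, and error codes are hypothetical, and the snapshot is assumed to be read from Redis while it is still reachable.

```javascript
// Sketch of a Map-backed fallback store; names and shapes are hypothetical.
class FallbackStore {
  constructor() {
    this.events = new Map(); // eventId -> { tickets: string[], sold: number }
    this.active = false;
  }

  // Replace local state with a snapshot of [{ eventId, tickets, sold }].
  seed(snapshot) {
    this.events.clear();
    for (const { eventId, tickets, sold = 0 } of snapshot) {
      this.events.set(String(eventId), { tickets: [...tickets], sold });
    }
  }

  // Enter degraded mode. Reseed only when a fresh snapshot could be obtained;
  // otherwise keep whatever was seeded at startup.
  activate(snapshot = null) {
    if (snapshot) this.seed(snapshot);
    this.active = true;
  }

  deactivate() {
    this.active = false;
  }

  purchase(eventId) {
    const event = this.events.get(String(eventId));
    if (!event) return { ok: false, error: 'EVENT_NOT_FOUND' };
    const ticketId = event.tickets.shift(); // FIFO, mirroring the Redis list
    if (ticketId === undefined) return { ok: false, error: 'SOLD_OUT' };
    event.sold += 1;
    return { ok: true, ticketId };
  }
}
```

Seeding at startup means the store can still serve purchases when Redis is already down at activation time, at the cost of working from a possibly stale snapshot.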

Fallback Limitations

  • Non-Persistent: Data lost on restart (mitigated by automatic reseeding)
  • Single Instance: No cross-instance synchronization, so instances seeded with the same tickets may sell them more than once while in fallback mode
  • Capacity Limited: Memory constraints
  • Warning Logs: Clear indication of degraded mode

Performance Optimizations

Application Level

  1. Async Operations: Non-blocking I/O throughout
  2. Connection Pooling: Reuse Redis connections
  3. Batch Operations: Bulk ticket seeding
  4. Caching: Event metadata caching
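Event metadata caching could be as simple as a small time-to-live cache in front of Redis. The sketch below is illustrative only: `TtlCache` is a hypothetical class, and the injectable `now` clock exists purely to make expiry testable.

```javascript
// Sketch: in-process TTL cache for event metadata (class name is hypothetical).
class TtlCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop expired entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A cache like this suits fields that rarely change, such as an event's name or total capacity. Sold and remaining counts change on every purchase and are better read from Redis directly.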

Redis Optimizations

  1. Lua Scripts: Reduced network round-trips
  2. Pipeline Operations: Batch commands
  3. Memory Management: Efficient data structures
  4. Persistence: AOF for durability

PDF Generation

  1. Async Generation: Non-blocking PDF creation
  2. Stream Processing: Memory-efficient file handling
  3. Cleanup Jobs: Automatic old file removal
  4. Error Isolation: PDF failures don't affect purchases
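Error isolation for PDF generation can be sketched as a wrapper that treats the PDF as a best-effort step after the purchase has already succeeded. The function name, the `purchase` and `generatePdf` callbacks, and the `pdfPath` field are assumptions for illustration; this is not PDFKit's API.

```javascript
// Sketch: a failed PDF must not turn a successful purchase into an error.
// `purchase` and `generatePdf` are hypothetical async callbacks.
async function completePurchase(purchase, generatePdf, logger = console) {
  const result = await purchase();
  if (!result.ok) return result; // nothing was sold, so there is nothing to render

  try {
    const pdfPath = await generatePdf(result.ticketId);
    return { ...result, pdfPath };
  } catch (err) {
    // The ticket is already sold; report the PDF problem and carry on.
    logger.warn(`PDF generation failed for ticket ${result.ticketId}: ${err.message}`);
    return { ...result, pdfPath: null };
  }
}
```

A `null` path tells the caller that the purchase stands and the PDF can be regenerated later.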

Monitoring & Observability

Metrics Collection

{
  "global": {
    "totalEvents": 5,
    "totalTickets": 50000,
    "totalSold": 1250
  },
  "events": [
    {
      "eventId": "1",
      "soldTickets": 250,
      "remainingTickets": 9750
    }
  ],
  "system": {
    "usingFallback": false,
    "redisConnected": true,
    "uptime": 3600,
    "memoryUsage": {...}
  },
  "pdf": {
    "totalTickets": 1250,
    "totalSizeMB": "15.6"
  }
}
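A payload of this shape could be assembled roughly as follows. This is a sketch, not the service's metrics code: `buildMetrics` and its input shape (`totalTickets` and `soldTickets` per event) are assumptions, and the `pdf` section is left out because it depends on the PDF generator's file statistics.

```javascript
// Sketch: derive the metrics payload from per-event counts (names are hypothetical).
function buildMetrics(events, { usingFallback, redisConnected }) {
  return {
    global: {
      totalEvents: events.length,
      totalTickets: events.reduce((sum, e) => sum + e.totalTickets, 0),
      totalSold: events.reduce((sum, e) => sum + e.soldTickets, 0),
    },
    events: events.map(({ eventId, totalTickets, soldTickets }) => ({
      eventId: String(eventId),
      soldTickets,
      remainingTickets: totalTickets - soldTickets, // derived, never stored separately
    })),
    system: {
      usingFallback,
      redisConnected,
      uptime: Math.floor(process.uptime()), // seconds since process start
      memoryUsage: process.memoryUsage(),
    },
  };
}
```

With five events of 10,000 tickets each and 250 sold per event, this yields the totals shown in the sample payload: 50,000 tickets, 1,250 sold, and 9,750 remaining per event.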

Logging Strategy

  • Structured Logging: JSON format for parsing
  • Request Tracking: Unique IDs for tracing
  • Performance Metrics: Response times and throughput
  • Error Categorization: Different log levels
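The logging strategy can be illustrated without any dependency. The sketch below is a plain-Node stand-in and deliberately does not use Winston's API: `createLogger` and `trackRequest` are hypothetical helpers, and the handler is assumed to be synchronous to keep the example short.

```javascript
// Sketch: JSON-structured logs with a per-request ID (stand-in for the Winston setup).
const { randomUUID } = require('node:crypto');

// `write` receives one JSON line per log entry; it defaults to stdout.
function createLogger(write = (line) => process.stdout.write(line + '\n')) {
  const log = (level, message, fields = {}) =>
    write(JSON.stringify({ timestamp: new Date().toISOString(), level, message, ...fields }));
  return {
    info: (message, fields) => log('info', message, fields),
    warn: (message, fields) => log('warn', message, fields),
    error: (message, fields) => log('error', message, fields),
  };
}

// Wrap a handler so each call gets a unique request ID and a timed log entry.
function trackRequest(logger, handler) {
  return (req) => {
    const requestId = randomUUID();
    const start = process.hrtime.bigint();
    try {
      return handler({ ...req, requestId });
    } finally {
      const durationMs = Number(process.hrtime.bigint() - start) / 1e6;
      logger.info('request completed', { requestId, path: req.path, durationMs });
    }
  };
}
```

Because every entry is a single JSON object, log lines can be filtered by `requestId` or aggregated by `durationMs` without custom parsing.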

Health Checks

  • Application Health: /health endpoint
  • Redis Connectivity: Connection status
  • Fallback Status: Degraded mode indication
  • Resource Usage: Memory and CPU monitoring
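One way a /health response could combine these signals is sketched below. The status names, the choice of HTTP 200 for degraded mode and 503 when neither store is usable, and the `healthStatus` helper itself are assumptions, not the documented behavior of the endpoint.

```javascript
// Sketch: derive a health response from connectivity flags (all names hypothetical).
function healthStatus({ redisConnected, usingFallback }, memory = process.memoryUsage()) {
  let status;
  if (redisConnected && !usingFallback) {
    status = 'ok';
  } else if (usingFallback) {
    status = 'degraded'; // still serving purchases from the in-memory store
  } else {
    status = 'unavailable'; // Redis is down and the fallback is not active
  }
  return {
    httpStatus: status === 'unavailable' ? 503 : 200,
    body: {
      status,
      redisConnected,
      usingFallback,
      rssMB: Math.round(memory.rss / (1024 * 1024)),
    },
  };
}
```

Returning 200 while degraded keeps an orchestrator from restarting an instance that can still sell tickets, while the body still makes the degraded state visible to dashboards.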

Security Considerations

Input Validation

  • Event ID Validation: Numeric constraints
  • Request Rate Limiting: Mitigates request floods and abusive clients
  • Parameter Sanitization: Injection prevention
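The first two controls could be sketched as below. Both helpers are hypothetical: `parseEventId` assumes event IDs are positive integers of at most nine digits, and `FixedWindowLimiter` is one simple rate-limiting scheme chosen for brevity rather than the one the service necessarily uses.

```javascript
// Sketch: strict event ID parsing plus a fixed-window rate limiter (names hypothetical).

// Accept only plain decimal digits: no signs, whitespace, exponents, or leading zeros.
function parseEventId(raw) {
  if (typeof raw !== 'string' || !/^[1-9]\d{0,8}$/.test(raw)) return null;
  return Number(raw);
}

class FixedWindowLimiter {
  constructor({ limit, windowMs, now = Date.now }) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock
    this.windows = new Map(); // clientId -> { start, count }
  }

  // Returns true if the request is allowed, false if the client is over its limit.
  allow(clientId) {
    const t = this.now();
    const current = this.windows.get(clientId);
    if (!current || t - current.start >= this.windowMs) {
      this.windows.set(clientId, { start: t, count: 1 }); // start a new window
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

Validating against a whitelist pattern before the ID is used to build a Redis key keeps unexpected characters out of key names. The limiter shown here counts per process, so with several instances behind a load balancer the effective limit is multiplied unless the counters move to a shared store.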

Container Security

  • Non-Root User: Principle of least privilege
  • Minimal Base Image: Alpine Linux for smaller attack surface
  • Health Checks: Container monitoring

Data Protection

  • No Sensitive Data: Tickets are identifiers only
  • Audit Logging: Purchase tracking
  • Secure Defaults: Production-ready configuration

Deployment Strategy

Development Environment

# Local development
npm install
npm run docker:up    # Start Redis
npm run seed         # Seed events
npm run dev          # Start with nodemon

Production Environment

# Docker deployment
docker-compose up -d                    # Core services
docker-compose --profile monitoring up # With monitoring

Container Orchestration

  • Docker Compose: Local and small deployments
  • Kubernetes: Large-scale deployments
  • Health Checks: Automatic restart on failure
  • Resource Limits: CPU and memory constraints

Future Enhancements

Performance Improvements

  1. Redis Clustering: Horizontal database scaling
  2. CDN Integration: PDF delivery optimization
  3. Caching Layer: Application-level caching
  4. Connection Optimization: Advanced pooling

Feature Additions

  1. QR Code Generation: Enhanced ticket security
  2. Email Integration: Automatic ticket delivery
  3. Payment Processing: Complete purchase flow
  4. Event Management: Dynamic event creation

Monitoring Enhancements

  1. Distributed Tracing: Request flow tracking
  2. Custom Dashboards: Business metrics visualization
  3. Alerting: Proactive issue detection
  4. Performance Profiling: Bottleneck identification

Security Hardening

  1. Authentication: API key management
  2. Rate Limiting: Advanced throttling
  3. Encryption: Data in transit protection
  4. Audit Trails: Comprehensive logging

Conclusion

This design provides a robust, scalable foundation for high-volume ticket sales with the following key strengths:

  • Atomic Operations: Guaranteed consistency under load
  • High Availability: Graceful degradation capabilities
  • Observability: Comprehensive monitoring and logging
  • Scalability: Horizontal and vertical scaling support
  • Performance: Optimized for high-throughput scenarios

The architecture is designed to meet the challenge requirement of processing thousands of concurrent requests while maintaining data integrity and system reliability.