System Overview

This document provides a high-level overview of the Leichte Sprache API architecture, design decisions, and system components.

Architecture Philosophy

The system is designed with several key principles:

Modularity: Rules are self-contained and dynamically loaded
Extensibility: Easy addition of new rules and features
Performance: Efficient processing of large texts
Reliability: Robust error handling and graceful degradation
Maintainability: Clear separation of concerns and comprehensive testing

High-Level Architecture

graph TB
    subgraph "Client Layer"
        CLI[CLI Tools]
        WEB[Web Applications]
        API_CLIENT[API Clients]
    end

    subgraph "API Layer"
        FASTAPI[FastAPI Server]
        CORS[CORS Middleware]
        VALIDATION[Request Validation]
        AUTH[Authentication]
    end

    subgraph "Service Layer"
        ANALYZER[Analysis Service]
        GENERATOR[Generation Service]
        RULE_LOADER[Dynamic Rule Loader]
    end

    subgraph "Rule Engine"
        SYNTAX[Syntax Rules]
        LEXICAL[Lexical Rules]
        STYLISTIC[Stylistic Rules]
        TECHNICAL[Technical Rules]
    end

    subgraph "ML/NLP Layer"
        SPACY[spaCy German Model]
        BERT[BERT Models]
        PERPLEXITY[Perplexity Models]
        LLM[LLM Integration]
    end

    subgraph "Data Layer"
        MODELS[Model Files]
        CONFIG[Configuration]
        PROMPTS[LLM Prompts]
        CACHE[Model Cache]
    end

    CLI --> FASTAPI
    WEB --> FASTAPI
    API_CLIENT --> FASTAPI

    FASTAPI --> ANALYZER
    FASTAPI --> GENERATOR
    ANALYZER --> RULE_LOADER
    GENERATOR --> LLM

    RULE_LOADER --> SYNTAX
    RULE_LOADER --> LEXICAL
    RULE_LOADER --> STYLISTIC
    RULE_LOADER --> TECHNICAL

    SYNTAX --> SPACY
    LEXICAL --> BERT
    TECHNICAL --> PERPLEXITY
    GENERATOR --> SPACY

    SPACY --> MODELS
    BERT --> MODELS
    LLM --> PROMPTS
    RULE_LOADER --> CONFIG

Core Components

1. FastAPI Application (`api_main.py`)

The REST API server, built with FastAPI, provides:

HTTP endpoints for analysis and generation
Request/response validation with Pydantic models
OpenAPI documentation auto-generation
CORS support for web applications
Error handling and standardized responses

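Application configuration:

from fastapi import FastAPI

app = FastAPI(
    title="Leichte Sprache API",
    # Description in English: "REST API for the analysis of German texts"
    description="REST-API für die Analyse von deutschen Texten",
    version="1.0.0",
    docs_url="/docs",    # Swagger UI
    redoc_url="/redoc"   # ReDoc
)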

2. Analysis Service (`analysis_service.py`)

Core business logic for text analysis:

German text processing with spaCy
Dynamic rule discovery and loading
Violation detection and annotation
Statistics generation
Performance optimization with singleton patterns

Architecture Patterns:

Singleton: spaCy model loading
Factory: Rule instantiation
Strategy: Pluggable rule implementations
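
A minimal sketch of how the three patterns could fit together, using only the standard library. The names `get_model`, `long_sentence_rule`, `RULE_FACTORIES`, and `make_rule` are hypothetical and not taken from `analysis_service.py`; the model object is a stand-in for a real spaCy pipeline.

```python
from functools import lru_cache
from typing import Callable, Dict, List

# Strategy: every rule is a callable with the same signature.
Rule = Callable[[str], List[str]]


@lru_cache(maxsize=1)
def get_model() -> dict:
    """Singleton: the expensive load runs once; later calls reuse the result.

    A real service might call spacy.load("de_core_news_lg") here instead.
    """
    return {"name": "stand-in model"}


def long_sentence_rule(max_words: int) -> Rule:
    """Build a rule that flags sentences with more than max_words words."""
    def check(text: str) -> List[str]:
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        return [s for s in sentences if len(s.split()) > max_words]
    return check


# Factory: map a rule name to a constructor (hypothetical registry).
RULE_FACTORIES: Dict[str, Callable[..., Rule]] = {
    "long_sentence": long_sentence_rule,
}


def make_rule(name: str, **options) -> Rule:
    """Instantiate a rule by name with its options."""
    return RULE_FACTORIES[name](**options)
```

Because every rule shares one signature, the service can iterate over whatever `make_rule` returns without knowing how a given rule works internally.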

3. Rule Engine (`regeln/` directory)

Modular rule system with 18 specialized rules:

Rule Categories

| Category | Rules | Technology | Purpose |
|---|---|---|---|
| Syntax | Sentence length, subordinate clauses, passive voice | spaCy parsing | Grammar structure analysis |
| Lexical | Foreign words, compound words, complexity | BERT, spaCy | Vocabulary complexity |
| Stylistic | Negations, idioms, pronouns | Pattern matching | Writing style |
| Technical | Numbers, abbreviations, punctuation | BERT, regex | Technical compliance |

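Dynamic Rule Loading

import importlib
from pathlib import Path


def load_rules():
    """Dynamically discover and load all rule modules."""
    rules = {}
    regeln_dir = Path("regeln")

    # Each rule lives in regeln/<name>/regel.py and exposes pruefe_regel
    for rule_dir in regeln_dir.iterdir():
        if rule_dir.is_dir() and (rule_dir / "regel.py").exists():
            # Import the regel submodule whose existence was just checked
            module = importlib.import_module(f"regeln.{rule_dir.name}.regel")
            rules[rule_dir.name] = module.pruefe_regel

    return rules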
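
To try the discovery mechanism without the real rule set, the following sketch builds a throwaway package in a temporary directory and loads it. The package name `demo_regeln`, the rule `satzlaenge`, and the signature `pruefe_regel(text)` returning a list are assumptions for illustration; the real rules may take a spaCy `Doc` and return richer objects.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_rules(package_dir: Path) -> dict:
    """Import <package>/<rule>/regel.py for each rule directory found."""
    rules = {}
    for rule_dir in sorted(package_dir.iterdir()):
        if rule_dir.is_dir() and (rule_dir / "regel.py").exists():
            module = importlib.import_module(
                f"{package_dir.name}.{rule_dir.name}.regel"
            )
            rules[rule_dir.name] = module.pruefe_regel
    return rules


def build_demo_package(root: Path) -> Path:
    """Write a one-rule package to disk (hypothetical rule for the demo)."""
    rule_dir = root / "demo_regeln" / "satzlaenge"
    rule_dir.mkdir(parents=True)
    (rule_dir / "regel.py").write_text(
        "def pruefe_regel(text):\n"
        "    return [s for s in text.split('.') if len(s.split()) > 8]\n",
        encoding="utf-8",
    )
    return root / "demo_regeln"


with tempfile.TemporaryDirectory() as tmp:
    package_dir = build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)  # make the temporary package importable
    importlib.invalidate_caches()
    try:
        rules = load_rules(package_dir)
        found = rules["satzlaenge"]("Kurz. Gut.")
    finally:
        sys.path.remove(tmp)
```

Adding a rule then amounts to creating a new directory with a `regel.py`; no central registry needs editing.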

4. ML/NLP Components

spaCy Integration

German language model: de_core_news_lg
Linguistic analysis: POS tagging, dependency parsing, NER
Sentence segmentation: Accurate German sentence boundaries
Token classification: Word-level analysis

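BERT Models

Specialized BERT models for complex tasks:

from transformers import (
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
)

# Abbreviation detection (token-level labels, loaded from a local directory)
model = AutoModelForTokenClassification.from_pretrained(
    "regeln/abkuerzungen/model/"
)

# Complexity scoring (one label or score per text)
complexity_model = AutoModelForSequenceClassification.from_pretrained(
    "regeln/komplexitaet/textkomplexitaet/"
)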

LLM Integration (`tools/agent_optimizer.py`)

Multi-provider LLM support:

OpenAI: GPT-4, GPT-3.5
Anthropic: Claude Sonnet, Opus
Google: Gemini Pro, Flash
Ollama: Local model inference
Mistral: Mistral Large, Medium

5. Generation Engine

Iterative text improvement system:

graph LR
    INPUT[Input Text] --> ANALYZE[Analyze Violations]
    ANALYZE --> LLM[LLM Generation]
    LLM --> REANALYZE[Re-analyze Result]
    REANALYZE --> DECISION{Target Reached?}
    DECISION -->|No| ESCALATE[Escalation Strategy]
    ESCALATE --> LLM
    DECISION -->|Yes| OUTPUT[Final Result]

    style INPUT fill:#e1f5fe
    style OUTPUT fill:#e8f5e8
    style DECISION fill:#fff3e0
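
The loop in the diagram can be sketched with the analyzer and the LLM passed in as plain callables. This `optimize_text` is a simplified stand-in rather than the project's implementation, and `analyze`, `generate`, `max_iterations`, and `target_violations` are hypothetical names. The escalation step is reduced to raising a level counter, and text that already meets the target is returned without an LLM call.

```python
from typing import Callable, List, Tuple


def optimize_text(
    text: str,
    analyze: Callable[[str], List[str]],
    generate: Callable[[str, List[str], int], str],
    max_iterations: int = 4,
    target_violations: int = 0,
) -> Tuple[str, List[str], int]:
    """Rewrite text until the violation target is met or iterations run out.

    Returns the final text, its remaining violations, and the LLM call count.
    """
    current = text
    violations = analyze(current)
    level = 0  # escalation level passed to the generator
    calls = 0
    while len(violations) > target_violations and calls < max_iterations:
        current = generate(current, violations, level)
        calls += 1
        violations = analyze(current)  # re-analyze the rewritten text
        if len(violations) > target_violations:
            level = min(level + 1, 3)  # target not reached: escalate
    return current, violations, calls


def demo() -> Tuple[str, List[str], int]:
    """Run the loop with toy stand-ins for the analyzer and the LLM."""
    def analyze(t: str) -> List[str]:
        return ["passive"] * t.count("wird")

    def generate(t: str, found: List[str], level: int) -> str:
        return t.replace("wird", "ist", 1)

    return optimize_text("Es wird gesagt, es wird getan.", analyze, generate)
```

Capping `max_iterations` bounds both latency and LLM cost when a text never reaches the target.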

Escalation Strategies

| Level | Strategy | Description |
|---|---|---|
| 0 | Normal | Standard prompt with all violations |
| 1 | Focused | Top 5 priority violations only |
| 2 | High Temperature | Increased creativity (temp=0.9) |
| 3 | Rule-by-Rule | Single rule category focus |
Data Flow

Text Analysis Flow

sequenceDiagram
    participant C as Client
    participant A as API
    participant S as Service
    participant R as Rules
    participant M as ML Models

    C->>A: POST /analyse
    A->>A: Validate request
    A->>S: analyse_text(text)
    S->>M: Parse text with cached spaCy model
    M->>S: German Doc object
    S->>R: Apply all rules
    R->>M: Use BERT models
    M->>R: Predictions
    R->>S: Violation lists
    S->>S: Aggregate results
    S->>A: Analysis result
    A->>C: JSON response
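
The "Aggregate results" step in the sequence above could look roughly like the following. The function name `aggregate`, the response keys, and the assumption that each violation is a dict with a `start` character offset are hypothetical; the actual response schema is defined by the API's Pydantic models.

```python
from collections import Counter
from typing import Dict, List


def aggregate(results: Dict[str, List[dict]]) -> dict:
    """Merge per-rule violation lists into one response payload.

    `results` maps a rule name to the violations that rule reported.
    """
    violations = [
        {**violation, "rule": rule_name}
        for rule_name, found in results.items()
        for violation in found
    ]
    violations.sort(key=lambda v: v["start"])  # document order
    counts = Counter(v["rule"] for v in violations)
    return {
        "violations": violations,
        "statistics": {"total": len(violations), "by_rule": dict(counts)},
    }
```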

Text Generation Flow

sequenceDiagram
    participant C as Client
    participant A as API
    participant G as Generator
    participant L as LLM
    participant S as Service

    C->>A: POST /generate
    A->>G: optimize_text()
    loop Max Iterations
        G->>S: Analyze current text
        S->>G: Violations found
        G->>L: Generate improvement
        L->>G: Improved text
        G->>G: Check termination
    end
    G->>A: Final result
    A->>C: Generation response

Performance Characteristics

Analysis Performance

| Text Size | Processing Time | Memory Usage |
|---|---|---|
| < 1,000 words | < 2 seconds | ~2GB |
| 1,000-5,000 words | 2-10 seconds | ~2.5GB |
| > 5,000 words | 10-30 seconds | ~3GB |

Memory Usage by Component

pie title Memory Usage Distribution (MB)
    "spaCy Model" : 1200
    "BERT Models" : 800
    "Application Code" : 200
    "Cache & Buffers" : 300

Bottleneck Analysis

Model Loading: 30-60 seconds on first startup
BERT Inference: ~100ms per complex rule
Compound Word Analysis: ~840ms for regel_komposita
Network I/O: Variable for LLM calls

Scalability Considerations

Horizontal Scaling

# Docker Swarm stack file (Compose format); Kubernetes needs its own manifests
services:
  api:
    image: acolono/leichte-sprache-api
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 8G
          cpus: '2.0'

Caching Strategy

Model caching: Singleton spaCy and BERT models
Response caching: Redis for repeated text analysis (planned; see Planned Improvements)
LLM optimization: Cross-request learning patterns
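
A response cache for repeated texts can be sketched as follows. A plain dictionary stands in for Redis here so the example runs without a server, and the class name `AnalysisCache`, the SHA-256 key scheme, and the `hits` counter are assumptions rather than the project's design.

```python
import hashlib
from typing import Callable, Dict


class AnalysisCache:
    """In-memory stand-in for a Redis-backed response cache."""

    def __init__(self, analyze: Callable[[str], dict]) -> None:
        self._analyze = analyze
        self._store: Dict[str, dict] = {}
        self.hits = 0

    @staticmethod
    def key_for(text: str) -> str:
        # Hash the text so keys have a fixed size regardless of input length.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str) -> dict:
        """Return the cached result, or analyze the text and store the result."""
        key = self.key_for(text)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._analyze(text)
        self._store[key] = result
        return result
```

With a shared store such as Redis, the same key scheme would let all replicas behind the load balancer reuse each other's results; entries would also need to expire or be invalidated whenever the rule set changes.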

Load Balancing

upstream leichte_sprache {
    server api1:8000;
    server api2:8000;
    server api3:8000;
}

server {
    location / {
        proxy_pass http://leichte_sprache;
        proxy_set_header Host $host;
    }
}

Security Architecture

Input Validation

Request size limits: 50KB for analysis, 10KB for generation
Content filtering: Malicious input detection
Rate limiting: Per-IP request throttling
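
The size limits can be enforced before any model is invoked. The sketch below assumes that 1 KB means 1024 bytes and that size is measured on the UTF-8 encoded text, so German umlauts count as two bytes each; `MAX_BYTES`, `RequestTooLarge`, and `check_size` are hypothetical names.

```python
MAX_BYTES = {"analysis": 50 * 1024, "generation": 10 * 1024}


class RequestTooLarge(ValueError):
    """Raised when a text exceeds the limit for its endpoint."""


def check_size(text: str, kind: str) -> int:
    """Return the UTF-8 size of text, or raise if it exceeds the limit."""
    size = len(text.encode("utf-8"))
    limit = MAX_BYTES[kind]
    if size > limit:
        raise RequestTooLarge(f"{kind} text is {size} bytes; limit is {limit}")
    return size
```

In a FastAPI handler, a `RequestTooLarge` error would typically be mapped to an HTTP 413 response.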

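API Security

from fastapi.middleware.cors import CORSMiddleware

# CORS policy. A wildcard origin must not be combined with credentials,
# so list explicit origins before enabling allow_credentials.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Configure for production
    allow_credentials=False,
    allow_methods=["GET", "POST"],
    allow_headers=["*"]
)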

Secret Management

Environment variables: API keys and configuration
No secret logging: Sensitive data is excluded from logs
Secure defaults: Conservative security settings
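
A small sketch of reading a key from the environment and keeping it out of logs. The helper names `load_api_key` and `mask` are hypothetical, as is any variable name passed to them; the optional `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


def load_api_key(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Read a required secret from the environment, failing fast if absent."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Return a log-safe form that shows only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Failing at startup when a key is missing surfaces configuration errors before the first LLM request, and the error message names the variable without echoing any value.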

Monitoring and Observability

Health Checks

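import time

from fastapi.responses import JSONResponse


@app.get("/health")
async def health_check():
    """Health check that exercises the analysis pipeline end to end."""
    try:
        # Run a minimal analysis to confirm that models and rules are loaded
        analyse_text("Test.")
        return {"status": "healthy", "timestamp": time.time()}
    except Exception as e:
        # Answer with 503 so load balancers stop routing to this instance
        return JSONResponse(
            status_code=503,
            content={"status": "unhealthy", "error": str(e)},
        )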

Metrics Collection

Response times: Per endpoint performance
Error rates: 4xx/5xx response tracking
Resource usage: Memory and CPU monitoring
Rule performance: Individual rule execution times
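
Per-rule execution times can be collected with a thin wrapper around the rule loop. This is a sketch under the assumption that rules are callables taking the text; `run_timed` is a hypothetical helper, not part of the service.

```python
import time
from typing import Callable, Dict, List, Tuple


def run_timed(
    rules: Dict[str, Callable[[str], List]], text: str
) -> Tuple[Dict[str, List], Dict[str, float]]:
    """Run each rule and record its wall-clock duration in milliseconds."""
    results: Dict[str, List] = {}
    timings_ms: Dict[str, float] = {}
    for name, rule in rules.items():
        started = time.perf_counter()
        results[name] = rule(text)
        timings_ms[name] = (time.perf_counter() - started) * 1000.0
    return results, timings_ms
```

Timings gathered this way make slow rules stand out, such as the compound word analysis listed under Bottleneck Analysis.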

Logging Strategy

import logging

# Consistent log format: timestamp, logger name, level, message
logging.basicConfig(
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    level=logging.INFO
)

logger = logging.getLogger(__name__)

Design Decisions

Why FastAPI?

Performance: Async support and high throughput
Documentation: Auto-generated OpenAPI specs
Validation: Pydantic integration for type safety
Developer Experience: Excellent tooling and debugging

Why spaCy?

German support: Excellent German language model
Performance: Fast Cython implementation
Accuracy: State-of-the-art linguistic analysis
Ecosystem: Rich plugin ecosystem

Why Modular Rules?

Maintainability: Each rule is independently testable
Extensibility: Easy to add new rules
Performance: Can disable unused rules
Clarity: Clear separation of concerns

Why Multiple LLM Providers?

Redundancy: Fallback options for reliability
Cost optimization: Choose provider by price/performance
Local deployment: Ollama for data privacy
Quality diversity: Different models for different use cases
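
The redundancy argument can be illustrated with a fallback chain. Providers are modelled here as plain `(name, callable)` pairs, which is an assumption for the sketch; `generate_with_fallback` is a hypothetical helper and does not reflect the interface of `tools/agent_optimizer.py`.

```python
from typing import Callable, List, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]


def generate_with_fallback(
    prompt: str, providers: Sequence[Provider]
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, text) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All providers failed: " + "; ".join(errors))
```

Ordering the list by cost or by data-privacy requirements, for example a local Ollama model first, covers the other motivations with the same mechanism.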

Future Architecture Considerations

Planned Improvements

Microservices: Split rule engine into separate services
Message Queues: Async processing for long texts
Caching Layer: Redis for response caching
Real-time Monitoring: Prometheus and Grafana integration
A/B Testing: Framework for rule improvement testing

Scalability Roadmap

timeline
    title Architecture Evolution

    Phase 1 : Monolithic API
            : Single Docker container
            : File-based configuration

    Phase 2 : Horizontal Scaling
            : Load balancer
            : Health checks
            : Basic monitoring

    Phase 3 : Microservices
            : Rule engine services
            : Message queues
            : Distributed caching

    Phase 4 : Advanced Features
            : Real-time analytics
            : Auto-scaling
            : ML pipeline automation

Development Workflow

Local Development

# Start development environment
python api_main.py  # Auto-reload enabled

# Run tests
python test-suite/test_runner.py

# Check code quality
ruff check .
black --check .

CI/CD Pipeline

# .github/workflows/test.yml
name: Test Suite
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.12'
      - name: Run tests
        run: python test-suite/test_runner.py

This architecture provides a solid foundation for the Leichte Sprache API while maintaining flexibility for future enhancements and scalability requirements.