System Overview¶
This document provides a high-level overview of the Leichte Sprache API architecture, design decisions, and system components.
Architecture Philosophy¶
The system is designed with several key principles:
- Modularity: Rules are self-contained and dynamically loaded
- Extensibility: Easy addition of new rules and features
- Performance: Efficient processing of large texts
- Reliability: Robust error handling and graceful degradation
- Maintainability: Clear separation of concerns and comprehensive testing
High-Level Architecture¶
graph TB
subgraph "Client Layer"
CLI[CLI Tools]
WEB[Web Applications]
API_CLIENT[API Clients]
end
subgraph "API Layer"
FASTAPI[FastAPI Server]
CORS[CORS Middleware]
VALIDATION[Request Validation]
AUTH[Authentication]
end
subgraph "Service Layer"
ANALYZER[Analysis Service]
GENERATOR[Generation Service]
RULE_LOADER[Dynamic Rule Loader]
end
subgraph "Rule Engine"
SYNTAX[Syntax Rules]
LEXICAL[Lexical Rules]
STYLISTIC[Stylistic Rules]
TECHNICAL[Technical Rules]
end
subgraph "ML/NLP Layer"
SPACY[spaCy German Model]
BERT[BERT Models]
PERPLEXITY[Perplexity Models]
LLM[LLM Integration]
end
subgraph "Data Layer"
MODELS[Model Files]
CONFIG[Configuration]
PROMPTS[LLM Prompts]
CACHE[Model Cache]
end
CLI --> FASTAPI
WEB --> FASTAPI
API_CLIENT --> FASTAPI
FASTAPI --> ANALYZER
FASTAPI --> GENERATOR
ANALYZER --> RULE_LOADER
GENERATOR --> LLM
RULE_LOADER --> SYNTAX
RULE_LOADER --> LEXICAL
RULE_LOADER --> STYLISTIC
RULE_LOADER --> TECHNICAL
SYNTAX --> SPACY
LEXICAL --> BERT
TECHNICAL --> PERPLEXITY
GENERATOR --> SPACY
SPACY --> MODELS
BERT --> MODELS
LLM --> PROMPTS
RULE_LOADER --> CONFIG
Core Components¶
1. FastAPI Application (api_main.py)¶
The REST API server built with FastAPI providing:
- HTTP endpoints for analysis and generation
- Request/response validation with Pydantic models
- OpenAPI documentation auto-generation
- CORS support for web applications
- Error handling and standardized responses
Key Features:
app = FastAPI(
title="Leichte Sprache API",
description="REST-API für die Analyse von deutschen Texten",
version="1.0.0",
docs_url="/docs",
redoc_url="/redoc"
)
2. Analysis Service (analysis_service.py)¶
Core business logic for text analysis:
- German text processing with spaCy
- Dynamic rule discovery and loading
- Violation detection and annotation
- Statistics generation
- Performance optimization with singleton patterns
Architecture Patterns: - Singleton: SpaCy model loading - Factory: Rule instantiation - Strategy: Pluggable rule implementations
3. Rule Engine (regeln/ directory)¶
Modular rule system with 18 specialized rules:
Rule Categories¶
| Category | Rules | Technology | Purpose |
|---|---|---|---|
| Syntax | Sentence length, subordinate clauses, passive voice | spaCy parsing | Grammar structure analysis |
| Lexical | Foreign words, compound words, complexity | BERT, spaCy | Vocabulary complexity |
| Stylistic | Negations, idioms, pronouns | Pattern matching | Writing style |
| Technical | Numbers, abbreviations, punctuation | BERT, regex | Technical compliance |
Dynamic Rule Loading¶
def load_rules():
"""Dynamically discover and load all rule modules."""
rules = {}
regeln_dir = Path("regeln")
for rule_dir in regeln_dir.iterdir():
if rule_dir.is_dir() and (rule_dir / "regel.py").exists():
module = importlib.import_module(f"regeln.{rule_dir.name}")
rules[rule_dir.name] = module.pruefe_regel
return rules
4. ML/NLP Components¶
spaCy Integration¶
- German language model:
de_core_news_lg - Linguistic analysis: POS tagging, dependency parsing, NER
- Sentence segmentation: Accurate German sentence boundaries
- Token classification: Word-level analysis
BERT Models¶
Specialized BERT models for complex tasks:
# Abbreviation detection
model = AutoModelForTokenClassification.from_pretrained(
"regeln/abkuerzungen/model/"
)
# Complexity scoring
complexity_model = AutoModelForSequenceClassification.from_pretrained(
"regeln/komplexitaet/textkomplexitaet/"
)
LLM Integration (tools/agent_optimizer.py)¶
Multi-provider LLM support:
- OpenAI: GPT-4, GPT-3.5
- Anthropic: Claude Sonnet, Opus
- Google: Gemini Pro, Flash
- Ollama: Local model inference
- Mistral: Mistral Large, Medium
5. Generation Engine¶
Iterative text improvement system:
graph LR
INPUT[Input Text] --> ANALYZE[Analyze Violations]
ANALYZE --> LLM[LLM Generation]
LLM --> REANALYZE[Re-analyze Result]
REANALYZE --> DECISION{Target Reached?}
DECISION -->|No| ESCALATE[Escalation Strategy]
ESCALATE --> LLM
DECISION -->|Yes| OUTPUT[Final Result]
style INPUT fill:#e1f5fe
style OUTPUT fill:#e8f5e8
style DECISION fill:#fff3e0
Escalation Strategies¶
| Level | Strategy | Description |
|---|---|---|
| 0 | Normal | Standard prompt with all violations |
| 1 | Focused | Top 5 priority violations only |
| 2 | High Temperature | Increased creativity (temp=0.9) |
| 3 | Rule-by-Rule | Single rule category focus |
Data Flow¶
Text Analysis Flow¶
sequenceDiagram
participant C as Client
participant A as API
participant S as Service
participant R as Rules
participant M as ML Models
C->>A: POST /analyse
A->>A: Validate request
A->>S: analyse_text(text)
S->>M: Load spaCy model
M->>S: German Doc object
S->>R: Apply all rules
R->>M: Use BERT models
M->>R: Predictions
R->>S: Violation lists
S->>S: Aggregate results
S->>A: Analysis result
A->>C: JSON response
Text Generation Flow¶
sequenceDiagram
participant C as Client
participant A as API
participant G as Generator
participant L as LLM
participant S as Service
C->>A: POST /generate
A->>G: optimize_text()
loop Max Iterations
G->>S: Analyze current text
S->>G: Violations found
G->>L: Generate improvement
L->>G: Improved text
G->>G: Check termination
end
G->>A: Final result
A->>C: Generation response
Performance Characteristics¶
Analysis Performance¶
| Text Size | Processing Time | Memory Usage |
|---|---|---|
| < 1,000 words | < 2 seconds | ~2GB |
| 1,000-5,000 words | 2-10 seconds | ~2.5GB |
| > 5,000 words | 10-30 seconds | ~3GB |
Memory Usage by Component¶
pie title Memory Usage Distribution
"spaCy Model" : 1200
"BERT Models" : 800
"Application Code" : 200
"Cache & Buffers" : 300
Bottleneck Analysis¶
- Model Loading: 30-60 seconds on first startup
- BERT Inference: ~100ms per complex rule
- Compound Word Analysis: ~840ms for regel_komposita
- Network I/O: Variable for LLM calls
Scalability Considerations¶
Horizontal Scaling¶
# Docker Swarm / Kubernetes setup
services:
api:
image: acolono/leichte-sprache-api
replicas: 3
deploy:
resources:
limits:
memory: 8G
cpus: '2.0'
Caching Strategy¶
- Model caching: Singleton spaCy and BERT models
- Response caching: Redis for repeated text analysis
- LLM optimization: Cross-request learning patterns
Load Balancing¶
upstream leichte_sprache {
server api1:8000;
server api2:8000;
server api3:8000;
}
server {
location / {
proxy_pass http://leichte_sprache;
proxy_set_header Host $host;
}
}
Security Architecture¶
Input Validation¶
- Request size limits: 50KB for analysis, 10KB for generation
- Content filtering: Malicious input detection
- Rate limiting: Per-IP request throttling
API Security¶
# Security headers
app.add_middleware(
CORSMiddleware,
allow_origins=["*"], # Configure for production
allow_credentials=True,
allow_methods=["GET", "POST"],
allow_headers=["*"]
)
Secret Management¶
- Environment variables: API keys and configuration
- No logging: Sensitive data excluded from logs
- Secure defaults: Conservative security settings
Monitoring and Observability¶
Health Checks¶
@app.get("/health")
async def health_check():
"""Comprehensive health check."""
try:
# Test core functionality
result = analyse_text("Test.")
return {"status": "healthy", "timestamp": time.time()}
except Exception as e:
return {"status": "unhealthy", "error": str(e)}
Metrics Collection¶
- Response times: Per endpoint performance
- Error rates: 4xx/5xx response tracking
- Resource usage: Memory and CPU monitoring
- Rule performance: Individual rule execution times
Logging Strategy¶
import logging
# Structured logging
logging.basicConfig(
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
level=logging.INFO
)
logger = logging.getLogger(__name__)
Design Decisions¶
Why FastAPI?¶
- Performance: Async support and high throughput
- Documentation: Auto-generated OpenAPI specs
- Validation: Pydantic integration for type safety
- Developer Experience: Excellent tooling and debugging
Why spaCy?¶
- German support: Excellent German language model
- Performance: Fast C++ implementation
- Accuracy: State-of-the-art linguistic analysis
- Ecosystem: Rich plugin ecosystem
Why Modular Rules?¶
- Maintainability: Each rule is independently testable
- Extensibility: Easy to add new rules
- Performance: Can disable unused rules
- Clarity: Clear separation of concerns
Why Multiple LLM Providers?¶
- Redundancy: Fallback options for reliability
- Cost optimization: Choose provider by price/performance
- Local deployment: Ollama for data privacy
- Quality diversity: Different models for different use cases
Future Architecture Considerations¶
Planned Improvements¶
- Microservices: Split rule engine into separate services
- Message Queues: Async processing for long texts
- Caching Layer: Redis for response caching
- Real-time Monitoring: Prometheus and Grafana integration
- A/B Testing: Framework for rule improvement testing
Scalability Roadmap¶
timeline
title Architecture Evolution
Phase 1 : Monolithic API
: Single Docker container
: File-based configuration
Phase 2 : Horizontal Scaling
: Load balancer
: Health checks
: Basic monitoring
Phase 3 : Microservices
: Rule engine services
: Message queues
: Distributed caching
Phase 4 : Advanced Features
: Real-time analytics
: Auto-scaling
: ML pipeline automation
Development Workflow¶
Local Development¶
# Start development environment
python api_main.py # Auto-reload enabled
# Run tests
python test-suite/test_runner.py
# Check code quality
ruff check .
black --check .
CI/CD Pipeline¶
# .github/workflows/test.yml
name: Test Suite
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.12'
- name: Run tests
run: python test-suite/test_runner.py
This architecture provides a solid foundation for the Leichte Sprache API while maintaining flexibility for future enhancements and scalability requirements.