Data Layer Architecture
The AltSportsLeagues.ai data layer is the type-safe foundation for data operations across the platform. It is built on Pydantic v2 and supports multiple languages, keeping data consistent, validated, and interoperable throughout the system.
Architecture Overview
Core Principles
Type Safety First
All data structures are defined with strict typing and validation rules:
- Runtime Validation: Pydantic models ensure data integrity at runtime
- TypeScript Integration: Automatic generation of TypeScript interfaces
- Cross-Language Consistency: Same data structures across Python, TypeScript, and JSON
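The runtime-validation point above can be illustrated with a minimal sketch, assuming Pydantic v2. `QuestionnaireSketch` and its constraints are hypothetical stand-ins for illustration, not the platform's actual `LeagueQuestionnaireSchema`; only the field names are borrowed from the examples in this document.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class QuestionnaireSketch(BaseModel):
    """Hypothetical stand-in for a generated schema class."""

    # Reject unknown keys and disable implicit type coercion (e.g. 7 -> "7").
    model_config = ConfigDict(extra="forbid", strict=True)

    league_name: str = Field(min_length=1)
    sport_category: str
    contact_email: str = Field(pattern=r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validation_errors(payload: dict) -> list[str]:
    """Return the dotted location of every field that failed validation."""
    try:
        QuestionnaireSketch.model_validate(payload)
    except ValidationError as exc:
        return [".".join(str(part) for part in err["loc"]) for err in exc.errors()]
    return []


ok = {
    "league_name": "Canadian Premier League",
    "sport_category": "Soccer",
    "contact_email": "partnerships@cpl.ca",
}
bad = {"league_name": "", "sport_category": 7, "contact_email": "not-an-email"}

print(validation_errors(ok))
print(validation_errors(bad))
```

Collecting error locations rather than raising on the first failure lets a caller report every invalid field in one pass.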
Schema-Driven Development
Every data operation starts with a schema definition:
- Single Source of Truth: JSON schemas define the canonical data structure
- Auto-Generation: Python classes, TypeScript interfaces, and validation logic are auto-generated
- Version Control: Schema evolution is tracked and validated
Multi-Language Support
The data layer supports seamless integration across different programming environments:
| Target | Purpose | Generation Method |
|---|---|---|
| Python | Backend logic, AI flows, API endpoints | Direct Pydantic models |
| TypeScript | Frontend components, API clients | Auto-generated interfaces |
| JSON | API communication, configuration | Schema validation |
| Zustand | React state management | Auto-generated stores |
Shared Schemas
Core Business Objects
League Questionnaire Schema
- Purpose: Standardizes league partnership application data
- Usage: Intake forms, validation, AI processing
- Key Fields: League details, contact information, partnership requirements
```python
from data_layer.shared import LeagueQuestionnaireSchema

# Example usage
questionnaire = LeagueQuestionnaireSchema(
    league_name="Canadian Premier League",
    sport_category="Soccer",
    contact_email="partnerships@cpl.ca",
    # ... additional fields
)
```
Contract Terms Schema
- Purpose: Defines partnership agreement structures
- Usage: Contract generation, negotiation workflows
- Key Fields: Pricing tiers, terms, conditions, deliverables
Tier Classification Schema
- Purpose: Automated partnership tier assessment
- Usage: AI scoring, revenue modeling, risk assessment
- Key Fields: Tier scores, market analysis, revenue projections
Negotiation Package Schema
- Purpose: Structured negotiation data and proposals
- Usage: Automated proposal generation, stakeholder discussions
- Key Fields: Terms, pricing, customizations, timelines
Data Builders
The data layer includes intelligent builders that transform raw data into structured business objects:
Unified Builder
Location: data_layer/shared/builders/unified_builder.py
Purpose: Orchestrates complex data transformations across multiple schemas
Capabilities:
- Multi-schema validation and merging
- Business rule application
- Data enrichment and normalization
- Error handling and recovery
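```python
from data_layer.shared.builders.unified_builder import UnifiedBuilder


async def build_partnership(raw_questionnaire_data: dict):
    # `await` is only valid inside a coroutine, so the call is wrapped in one
    builder = UnifiedBuilder()
    # Returns fully validated, enriched partnership data
    return await builder.build_from_questionnaire(raw_questionnaire_data)
```
Schema Generators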
Automated tools for maintaining schema consistency across the platform:
Schema MDX Generator
- Purpose: Generates documentation from schema definitions
- Output: Markdown documentation with examples and validation rules
- Integration: Automatically updates documentation when schemas change
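To make the idea of schema-derived documentation concrete, here is a rough sketch using only the Python standard library. It is not the platform's generator: `schema_to_markdown` and the example schema are hypothetical, and a real implementation would also need to handle nested objects, examples, and validation rules.

```python
def schema_to_markdown(schema: dict) -> str:
    """Render a flat JSON Schema object as a Markdown field table."""
    required = set(schema.get("required", []))
    lines = [
        f"## {schema.get('title', 'Untitled schema')}",
        "",
        "| Field | Type | Required | Description |",
        "|---|---|---|---|",
    ]
    for name, spec in schema.get("properties", {}).items():
        is_required = "yes" if name in required else "no"
        lines.append(
            f"| {name} | {spec.get('type', 'any')} | {is_required} | "
            f"{spec.get('description', '')} |"
        )
    return "\n".join(lines)


# Hypothetical schema; descriptions are illustrative only.
example_schema = {
    "title": "LeagueQuestionnaire",
    "type": "object",
    "required": ["league_name"],
    "properties": {
        "league_name": {"type": "string", "description": "Official league name"},
        "sport_category": {"type": "string", "description": "Primary sport"},
    },
}

print(schema_to_markdown(example_schema))
```

Because the table is built from each property's `description`, the quality of the generated page depends directly on how well the schema itself is annotated.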
Type Generator
- Purpose: Creates language-specific type definitions
- Supported Languages: Python (Pydantic), TypeScript, JSON Schema
- Features: Type safety, validation rules, documentation
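As an illustration of what type generation involves, the sketch below turns a flat JSON Schema into a TypeScript interface using only the standard library. The function names, the type mapping, and the `team_count` and `broadcast_regions` fields are assumptions made for this example; the platform's actual generator is not shown in this document and may work differently.

```python
# Mapping from JSON Schema primitive types to TypeScript types.
TS_TYPES = {
    "string": "string",
    "integer": "number",
    "number": "number",
    "boolean": "boolean",
}


def ts_type(spec: dict) -> str:
    """Translate one property spec; unrecognized types fall back to `unknown`."""
    if spec.get("type") == "array":
        return f"{ts_type(spec.get('items', {}))}[]"
    return TS_TYPES.get(spec.get("type"), "unknown")


def schema_to_ts_interface(schema: dict) -> str:
    """Emit an `export interface` declaration for a flat object schema."""
    required = set(schema.get("required", []))
    lines = [f"export interface {schema['title']} {{"]
    for name, spec in schema.get("properties", {}).items():
        optional = "" if name in required else "?"
        lines.append(f"  {name}{optional}: {ts_type(spec)};")
    lines.append("}")
    return "\n".join(lines)


example_schema = {
    "title": "LeagueQuestionnaire",
    "required": ["league_name", "sport_category"],
    "properties": {
        "league_name": {"type": "string"},
        "sport_category": {"type": "string"},
        "contact_email": {"type": "string"},
        "team_count": {"type": "integer"},  # hypothetical field
        "broadcast_regions": {"type": "array", "items": {"type": "string"}},  # hypothetical field
    },
}

print(schema_to_ts_interface(example_schema))
```

Fields missing from `required` are emitted with `?`, so optionality in the schema carries over to the generated interface.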
Usage Patterns
In AI Flows (CrewAI)
```python
from crewai.flow.flow import Flow, listen, start

from data_layer.shared import LeagueQuestionnaireSchema


class PartnershipFlow(Flow):
    @start()
    def validate_questionnaire(self):
        # Type-safe validation
        questionnaire = LeagueQuestionnaireSchema(**self.state.input_data)

        # Business logic with full type safety
        if questionnaire.sport_category == "Soccer":
            self.state.tier_analysis = self.analyze_soccer_market(questionnaire)

        return questionnaire
```
In React Components
```typescript
import { useLeagueQuestionnaireStore } from '@/data-layer/zustand/league_questionnaire_store';

// Type-safe state management
const { questionnaire, updateQuestionnaire, validateQuestionnaire } = useLeagueQuestionnaireStore();

// Full TypeScript intellisense and validation
const handleUpdate = (field: keyof LeagueQuestionnaire, value: any) => {
  updateQuestionnaire({ [field]: value });
};
```
In API Endpoints
```python
from fastapi import APIRouter, HTTPException
from pydantic import ValidationError  # needed for the except clause below

from data_layer.shared import ContractTermsSchema

router = APIRouter()


@router.post("/contracts/generate")
async def generate_contract(contract_data: dict):
    try:
        # Runtime validation
        contract = ContractTermsSchema(**contract_data)

        # Generate contract with validated data
        # (contract_generator is assumed to be defined elsewhere in the application)
        result = await contract_generator.generate(contract)
        return {"contract": result, "status": "generated"}
    except ValidationError as e:
        raise HTTPException(status_code=422, detail=str(e))
```
Development Workflow
Schema-First Development
- Define Schema: Create JSON schema in database/schemas/domain/v1/
- Generate Adapters: Run schema generation scripts
- Implement Logic: Use generated types in your code
- Validate: Runtime validation ensures data integrity
- Document: Auto-generated documentation stays current
Schema Evolution
When schemas change:
- Update the JSON schema definition
- Run generation scripts: ./scripts/regenerate_adapters.sh
- Update dependent code (TypeScript interfaces auto-update)
- Validate all usages still work
- Update tests and documentation
Quality Assurance
Validation Layers
- Schema Validation: JSON Schema compliance
- Type Validation: Language-specific type checking
- Business Rule Validation: Domain-specific constraints
- Runtime Validation: Pydantic model validation
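The difference between field-level validation and business-rule validation can be sketched as follows, assuming Pydantic v2. `ContractTermsSketch`, its fields, the tier names, and the gold-tier rule are all invented for illustration; the real `ContractTermsSchema` may look quite different.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError, model_validator


class ContractTermsSketch(BaseModel):
    """Hypothetical model used only to illustrate layered validation."""

    tier: str = Field(pattern=r"^(bronze|silver|gold)$")
    annual_fee_usd: float = Field(ge=0)
    revenue_share_pct: float = Field(ge=0, le=100)

    @model_validator(mode="after")
    def check_business_rules(self) -> "ContractTermsSketch":
        # Illustrative business rule: gold-tier deals must carry a fee.
        if self.tier == "gold" and self.annual_fee_usd == 0:
            raise ValueError("gold tier requires a non-zero annual fee")
        return self


def failing_layer(payload: dict) -> Optional[str]:
    """Report which layer rejected the payload, or None if it is valid."""
    try:
        ContractTermsSketch.model_validate(payload)
    except ValidationError as exc:
        # Field-level errors carry a field location; model-level errors do not.
        return "field" if exc.errors()[0]["loc"] else "business"
    return None


print(failing_layer({"tier": "platinum", "annual_fee_usd": 1000, "revenue_share_pct": 10}))
print(failing_layer({"tier": "gold", "annual_fee_usd": 0, "revenue_share_pct": 10}))
```

The `mode="after"` validator runs only once every field has passed its own checks, so business rules can rely on well-typed values.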
Testing Strategy
- Unit Tests: Individual schema validation
- Integration Tests: Cross-schema interactions
- End-to-End Tests: Complete data flows
- Performance Tests: Large dataset validation
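A unit test for individual schema validation might look like the sketch below, assuming Pydantic v2 and the standard `unittest` module. `QuestionnaireSketch` is a hypothetical stand-in; in a real test suite the generated schema class would be imported instead.

```python
import unittest

from pydantic import BaseModel, Field, ValidationError


class QuestionnaireSketch(BaseModel):
    """Hypothetical stand-in for a generated schema class."""

    league_name: str = Field(min_length=1)
    sport_category: str


class QuestionnaireSchemaTest(unittest.TestCase):
    def test_accepts_valid_payload(self):
        model = QuestionnaireSketch(league_name="Test League", sport_category="Soccer")
        self.assertEqual(model.league_name, "Test League")

    def test_rejects_missing_field(self):
        with self.assertRaises(ValidationError):
            QuestionnaireSketch(league_name="Test League")

    def test_round_trips_through_json(self):
        original = QuestionnaireSketch(league_name="Test League", sport_category="Soccer")
        restored = QuestionnaireSketch.model_validate_json(original.model_dump_json())
        self.assertEqual(restored, original)


def run_tests() -> unittest.TestResult:
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(QuestionnaireSchemaTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The JSON round-trip test is a cheap check that serialization and validation agree, which matters when the same data also crosses into TypeScript.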
Performance Characteristics
Efficiency Metrics
| Operation | Performance | Notes |
|---|---|---|
| Schema Validation | < 10ms | Pydantic v2 optimization |
| Type Generation | < 5 seconds | Full codebase regeneration |
| Data Transformation | < 100ms | Unified builder processing |
| Cross-Language Sync | < 30 seconds | Full regeneration cycle |
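Figures like those in the table depend on payload size and hardware, so it can help to measure them locally. The following is a generic timing sketch using only the standard library; `measure_ms` and `stand_in_validator` are hypothetical names, and the stand-in would be replaced by a real validation call when measuring.

```python
import statistics
import time
from typing import Any, Callable


def measure_ms(validate: Callable[[Any], Any], payload: Any, repeats: int = 200) -> dict:
    """Time `validate(payload)` repeatedly and summarize latency in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        validate(payload)
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[max(0, int(len(samples) * 0.95) - 1)],
        "max_ms": samples[-1],
    }


def stand_in_validator(payload: dict) -> dict:
    # Placeholder work; swap in a real schema validation call when measuring.
    if not isinstance(payload.get("league_name"), str):
        raise ValueError("league_name must be a string")
    return payload


stats = measure_ms(stand_in_validator, {"league_name": "Test League"})
print(stats)
```

Reporting the median alongside a high percentile gives a fairer picture than a single run, since the first few calls often include warm-up cost.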
Scalability Features
- Lazy Loading: Schemas loaded on demand
- Caching: Generated types cached for performance
- Batch Processing: Bulk operations for large datasets
- Memory Efficient: Minimal memory footprint for validation
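Lazy loading and caching can be combined in a few lines with `functools.lru_cache`. This is a sketch under assumptions: `SCHEMA_DIR` points at a temporary directory standing in for the real schema location, and `load_schema` is a hypothetical helper rather than the platform's loader.

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path

# Stand-in for the real schema directory, so the example is self-contained.
SCHEMA_DIR = Path(tempfile.mkdtemp())


@lru_cache(maxsize=None)
def load_schema(name: str) -> dict:
    """Read and parse a schema file the first time it is requested."""
    path = SCHEMA_DIR / f"{name}.json"
    return json.loads(path.read_text(encoding="utf-8"))


# Write one example schema so there is something to load.
(SCHEMA_DIR / "league_questionnaire.json").write_text(
    json.dumps({"title": "LeagueQuestionnaire", "type": "object"}),
    encoding="utf-8",
)

first = load_schema("league_questionnaire")   # reads and parses the file
second = load_schema("league_questionnaire")  # served from the cache
print(load_schema.cache_info())
```

The cache hands back the same dictionary object on every call, so callers should treat the result as read-only or copy it before modifying.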
Integration Points
External Systems
- Jira/Atlassian: Partnership tracking and workflow management
- Google Workspace: Document processing and contract generation
- Firebase/Supabase: Real-time data synchronization
- AI Services: Claude, OpenAI, Vertex AI for intelligent processing
Internal Services
- API Layer: RESTful endpoints with automatic OpenAPI generation
- Workflow Layer: n8n automation with type-safe data flow
- Frontend Layer: React components with Zustand state management
- AI Layer: CrewAI flows with validated data structures
Best Practices
Schema Design
- Start Simple: Begin with minimal required fields
- Version Carefully: Use semantic versioning for schema changes
- Validate Early: Catch data issues at the source
- Document Well: Auto-generated docs need good schema descriptions
Development Guidelines
- Type Safety: Always use generated types, never raw dictionaries
- Validation: Validate data at every boundary (API, UI, processing)
- Testing: Test with real data, not just mock data
- Evolution: Plan schema changes carefully to avoid breaking changes
Performance Optimization
- Batch Operations: Use builders for bulk data processing
- Lazy Evaluation: Load schemas only when needed
- Caching: Cache validation results for repeated operations
- Monitoring: Track validation performance and optimize bottlenecks
Quick Start
Using Shared Schemas
```python
# Import and use schemas
from data_layer.shared import LeagueQuestionnaireSchema, ContractTermsSchema

# Validate data
questionnaire = LeagueQuestionnaireSchema(**input_data)
contract = ContractTermsSchema(**contract_data)

# Access with full type safety
print(f"Processing {questionnaire.league_name} for {contract.tier} tier")
```
Generating New Schemas
```bash
# Regenerate all shared schemas
cd data_layer/scripts
python generate_adapters.py --type shared

# Or use the unified script
./scripts/regenerate_adapters.sh
```
Building Complex Data Objects
```python
import asyncio

from data_layer.shared.builders.unified_builder import UnifiedBuilder


async def main():
    builder = UnifiedBuilder()
    result = await builder.build_from_questionnaire({
        "league_name": "Test League",
        "sport_category": "Soccer",
        # ... additional data
    })
    # Result includes validated schemas, tier analysis, contract terms, etc.
    return result


# build_from_questionnaire is a coroutine, so it needs an event loop to run
asyncio.run(main())
```
This data layer provides the foundation for all AltSportsLeagues.ai operations, ensuring type safety, validation, and consistency across the entire platform.