# Prompt Intelligence System - Deployment Architecture
## Deployment Structure (Google Cloud Run)
**A single Docker container deploys BOTH:**
```
Docker Container (apps/backend/)
├── apps/backend/              ← Python logic (services, API, stores)
│   ├── stores/prompts.py          ← InMemoryStore wrapper
│   ├── services/prompts.py        ← Workflow execution
│   ├── api/prompts.py             ← REST endpoints
│   └── server.py                  ← FastAPI app
│
└── database/                  ← Data files (deployed together!)
    ├── prompts/                   ← 135 .md source files
    │   ├── agents/*.md                ← Agent prompts
    │   └── commands/*.md              ← Command prompts
    ├── output-styles/             ← Complete knowledge base
    │   ├── league_questionnaire_to_contract/  ← 7-stage pipeline
    │   ├── examples/seeds/            ← Few-shot examples
    │   └── schemas/seeds/             ← Validation schemas
    └── .chroma/                   ← ChromaDB cache (optional)
```

**Key point:** `database/` and `apps/backend/` deploy as ONE container → same filesystem → fast file access.
## File Organization Strategy
### Data Files → `database/`
**What goes here:** Static data, prompts, examples, schemas
**Why:** Version controlled, shared across all backend instances
**Format:** `.md`, `.json`, `.jsonl`
```
database/
├── prompts/                     # Git-versioned prompts
│   ├── agents/
│   │   ├── contract.generator.agent.prompt.seed.v1.md
│   │   ├── document.pdf.agent.prompt.seed.v1.md
│   │   └── ... (28 agent prompts)
│   ├── commands/
│   │   └── ... (26 command prompts)
│   └── workflows/
│       └── ... (workflow definitions)
│
├── output-styles/               # Consolidated knowledge base
│   ├── league_questionnaire_to_contract/
│   │   ├── stage_2_questionnaire_extraction/
│   │   │   ├── examples/*.json      # 562 example files!
│   │   │   ├── schema/*.json        # Stage output schema
│   │   │   └── README.md            # Stage prompt/docs
│   │   ├── stage_3_document_enhancement/
│   │   │   ├── examples/*.json      # 9 examples
│   │   │   └── schema/*.json
│   │   ├── stage_4_classification/
│   │   ├── stage_5_upsert_databases/
│   │   ├── stage_6_contract_tier_suggestions/
│   │   │   └── examples/*.json      # 55 examples
│   │   ├── stage_7_contract_assembly/
│   │   │   └── examples/*.json      # 153 examples!
│   │   └── stage_7_output_*/        # 3 output variants
│   ├── examples/seeds/          # Consolidated few-shot examples
│   └── schemas/seeds/           # Consolidated schemas
│
└── scripts/                     # Build/validation scripts
    ├── build.py                 # Pre-builds workflows (runs at container startup)
    └── validate.py              # End-to-end tests
```

### Python Logic → `apps/backend/`
**What goes here:** Application code, business logic, APIs
**Why:** Separate concerns, follow FastAPI conventions
**Format:** `.py` files only
```
apps/backend/
├── stores/                      # Data access layer
│   ├── prompts.py                   # InMemoryStore + DB sync
│   ├── firebase_adapter.py          # Firebase operations (existing)
│   └── supabase_adapter.py          # Supabase operations (existing)
│
├── services/                    # Business logic layer
│   ├── prompts.py                   # Workflow execution
│   ├── firebase_adapter.py          # (symlink to stores/)
│   └── supabase_adapter.py          # (symlink to stores/)
│
├── api/                         # API endpoints
│   └── prompts.py                   # Prompt intelligence routes
│
└── server.py                    # FastAPI application
```

## Database Separation Strategy
### Firebase → User-Specific Data
**What:** User preferences, prompt history, feedback
**Why:** Real-time sync, authentication, personalization
```js
// Firebase structure
users/{userId}/
  prompt_preferences/
    detail_level: "technical"
    output_format: "markdown"
    favorite_workflows: ["questionnaire_to_contract"]
  prompt_history/{promptId}/
    total_uses: 45
    success_rate: 0.92
    last_used: timestamp
    feedback_ratings: [4.5, 4.8, 5.0]

prompts/
  workflows/
    questionnaire_to_contract: {...}  // Critical workflows for cross-instance sync
```
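For illustration, a minimal sketch of writing user preference data with the `firebase-admin` SDK, mapping the structure above onto Firestore. The document id `"current"` and the `save_preferences` helper are hypothetical; `FIREBASE_SERVICE_ACCOUNT_PATH` matches the environment variables listed later in this doc:

```python
# Hypothetical sketch: persisting user prompt preferences to Firestore.
# The "current" document id and save_preferences helper are illustrative.
import os

import firebase_admin
from firebase_admin import credentials, firestore

cred = credentials.Certificate(os.environ["FIREBASE_SERVICE_ACCOUNT_PATH"])
firebase_admin.initialize_app(cred)
db = firestore.client()

def save_preferences(user_id: str, prefs: dict) -> None:
    """Write under users/{userId}/prompt_preferences, as in the layout above."""
    (db.collection("users")
       .document(user_id)
       .collection("prompt_preferences")
       .document("current")
       .set(prefs, merge=True))

save_preferences("analyst_001", {
    "detail_level": "technical",
    "output_format": "markdown",
    "favorite_workflows": ["questionnaire_to_contract"],
})
```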
### Supabase → League Data + Analytics

**What:** League examples, workflow analytics, performance metrics
**Why:** SQL queries, complex analytics, reporting
```sql
-- League-specific examples (basketball, soccer, etc.)
CREATE TABLE league_examples (
    id SERIAL PRIMARY KEY,
    sport TEXT NOT NULL,
    tier TEXT,
    league_name TEXT,
    example_data JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Workflow definitions for analytics
CREATE TABLE workflow_definitions (
    workflow_name TEXT PRIMARY KEY,
    total_stages INT,
    version INT,
    metadata JSONB,
    updated_at TIMESTAMP
);

-- Execution tracking
CREATE TABLE workflow_executions (
    id SERIAL PRIMARY KEY,
    workflow_name TEXT,
    execution_time FLOAT,
    success BOOLEAN,
    user_id TEXT,
    executed_at TIMESTAMP DEFAULT NOW()
);

-- Prompt performance analytics
CREATE TABLE prompt_catalog (
    prompt_type TEXT,
    prompt_name TEXT,
    version INT,
    suggestions_count INT,
    metadata JSONB,
    updated_at TIMESTAMP,
    PRIMARY KEY (prompt_type, prompt_name)
);
```
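For illustration, a minimal sketch of writing to the `workflow_executions` table with the `supabase-py` client. The `track_execution` helper name is hypothetical; the URL/key variables match the environment section later in this doc:

```python
# Hypothetical sketch: recording one workflow execution in Supabase.
# Table and column names follow the SQL schema above.
import os

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"],
                         os.environ["SUPABASE_SERVICE_KEY"])

def track_execution(workflow_name: str, execution_time: float,
                    success: bool, user_id: str) -> None:
    """Insert one row into workflow_executions (executed_at defaults to NOW())."""
    supabase.table("workflow_executions").insert({
        "workflow_name": workflow_name,
        "execution_time": execution_time,
        "success": success,
        "user_id": user_id,
    }).execute()

track_execution("questionnaire_to_contract", 18.2, True, "analyst_001")
```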
## Data Flow (Production)

### On Container Startup
```
1. Container starts (apps/backend/ + database/ both present)
2. Run database/scripts/build.py
   ├─ Read database/prompts/*.md (135 files)
   ├─ Read database/output-styles/league_questionnaire_to_contract/ (9 stages)
   ├─ Build workflows in memory
   ├─ Save to InMemoryStore
   └─ Sync to Firebase (for other instances)
3. Server ready (workflows cached, <1ms retrieval)
```
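As a rough sketch of what this startup step amounts to (the `build_workflows` function and the generic `store.put` call are illustrative names; the real logic lives in `database/scripts/build.py` and `stores/prompts.py`):

```python
# Hypothetical sketch of the startup sequence; names are illustrative.
from pathlib import Path

def build_workflows(db_root: Path) -> dict[str, str]:
    """Read .md prompt sources and assemble workflow definitions in memory."""
    workflows = {}
    for md_file in (db_root / "prompts").rglob("*.md"):
        workflows[md_file.stem] = md_file.read_text()
    # Stage examples/schemas from output-styles/ would be merged in here.
    return workflows

def startup(store, db_root: Path = Path("database")) -> None:
    for name, definition in build_workflows(db_root).items():
        store.put(("workflows",), name, {"definition": definition})
    # A background task would then sync to Firebase for other instances.
```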
### On API Request: /api/prompts/search

```
1. User: POST /api/prompts/search {"query": "basketball contract"}
2. API (apps/backend/api/prompts.py)
   └─ Call service.store.search_prompts()
3. Store (apps/backend/stores/prompts.py)
   ├─ Search InMemoryStore with vector similarity
   └─ Return top matches
4. Response: {"results": [...]}
```
### On Workflow Execution: /api/prompts/execute

```
1. User: POST /api/prompts/execute {"workflow": "questionnaire_to_contract"}
2. Service (apps/backend/services/prompts.py)
   ├─ Get workflow from store (<1ms, cached)
   ├─ Build LangGraph from stages
   ├─ Execute 7-stage pipeline
   └─ Track metrics in Supabase
3. Response: {"status": "completed", "result": {...}}
```
### On Prompt Update: /api/prompts/update

```
1. User: POST /api/prompts/update {"suggestions": ["Add NBA examples"]}
2. Store (apps/backend/stores/prompts.py)
   ├─ Get current from InMemoryStore
   ├─ Apply suggestions (increment version)
   ├─ Save to InMemoryStore (immediate)
   └─ Background async:
      ├─ Firebase.set()    (user data sync)
      └─ Supabase.upsert() (analytics tracking)
3. Response: {"status": "updated", "new_version": 3}
```
## Database Choice Logic (Built-In)

Automatic routing in `stores/prompts.py`:
```python
# Condensed from stores/prompts.py, lines 241-274: the routing rules as a function.
def route_databases(prompt_type: str, prompt_name: str) -> set[str]:
    targets: set[str] = set()
    is_user_data = ("user" in prompt_name
                    or prompt_type in ["user_preferences", "user_history"])
    is_league_data = ("league" in prompt_name
                      or "questionnaire" in prompt_name
                      or prompt_type in ["workflows", "examples"])
    if is_user_data:
        targets.add("firebase")              # real-time, personalization
    if is_league_data:
        targets.add("supabase")              # SQL analytics, reporting
    if prompt_type == "workflows":
        targets |= {"firebase", "supabase"}  # BOTH: critical data redundancy
    return targets
```

**User data examples (Firebase):**
- "user_preference_detailed"
- "user_history_analyst_001"
- "feedback_user_xyz"League Data Example (Supabase):
- "questionnaire_to_contract" (workflow)
- "league_example_nba_premium"
- "schema_extracted_data"π Google Cloud Run Configuration
### Dockerfile Strategy
```dockerfile
# Build stage
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Production stage
FROM python:3.12-slim
WORKDIR /app

# Copy Python code
COPY apps/backend/ ./apps/backend/

# Copy database files (deployed together!)
COPY database/ ./database/

# Install dependencies
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages

# Pre-build workflows at container build time
RUN python database/scripts/build.py

# Run server
CMD exec uvicorn apps.backend.server:app --host 0.0.0.0 --port $PORT
```

### Environment Variables
```bash
# .env or Cloud Run environment
OPENAI_API_KEY=sk-...                                 # Required for embeddings
FIREBASE_SERVICE_ACCOUNT_PATH=./config/firebase.json  # User data
SUPABASE_URL=https://xxx.supabase.co                  # League data/analytics
SUPABASE_SERVICE_KEY=eyJ...                           # Supabase access
```

### Cloud Run Service Configuration
```yaml
# .cloudrun.yaml or cloudbuild.yaml
service:
  containers:
    - image: gcr.io/PROJECT_ID/altsportsdata-backend
      resources:
        limits:
          memory: 2Gi   # InMemoryStore + embeddings
          cpu: 2
      env:
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              key: OPENAI_API_KEY
              name: openai-secret
```

## Performance Characteristics (Production)
### With Firebase/Supabase Connected
| Operation | Time | Data path |
|---|---|---|
| First workflow query | ~9ms | Read `database/` files → cache |
| Cached workflow query | <1ms | InMemoryStore only |
| Firebase sync (write) | ~20ms | Background async (doesn't block) |
| Supabase sync (write) | ~15ms | Background async (doesn't block) |
| Workflow execution | 15-30s | LangGraph pipeline |
| Batch of 10 in parallel | ~30s | Async concurrent execution |
### Container Restart (Stateful)
```
1. Container restarts (Cloud Run auto-scaling)
2. Load from Firebase (~200ms)
   ├─ Restore last workflow versions
   └─ Populate InMemoryStore
3. Server ready
4. Future queries <1ms (already cached)
```

**Benefit:** No rebuild from source files on restart, so startup is faster.
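A rough sketch of that restore step, assuming a hypothetical Firestore layout of `prompts/workflows/items/{name}` (the actual collection structure is whatever `stores/prompts.py` writes):

```python
# Hypothetical sketch: repopulating InMemoryStore from Firebase after a
# restart; the prompts/workflows/items path is assumed, not confirmed.
from firebase_admin import firestore

def restore_from_firebase(store) -> int:
    db = firestore.client()  # assumes firebase_admin.initialize_app() ran already
    restored = 0
    docs = (db.collection("prompts").document("workflows")
              .collection("items").stream())
    for doc in docs:
        store.put(("workflows",), doc.id, doc.to_dict())
        restored += 1
    return restored  # subsequent reads come from memory at <1ms
```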
## Best Practices for Google Cloud Run
### 1. Separate Concerns

- ✅ Data files (`database/`): version controlled, static
- ✅ Python logic (`apps/backend/`): application code
- ✅ Deployment: both together in one container
### 2. Multi-Instance Safety

- ✅ InMemoryStore: each container has its own
- ✅ Firebase: syncs state between instances
- ✅ Supabase: centralized analytics
- ✅ Startup: loads from Firebase (consistent state)
### 3. Graceful Degradation

- ✅ InMemoryStore works offline: no network failures
- ✅ Firebase/Supabase optional: the system runs without them
- ✅ File fallback: can always rebuild from source (see the sketch below)
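A minimal sketch of that fallback chain on the read path (`firebase_get` is a stub and the file-fallback path is illustrative):

```python
# Hypothetical sketch of the memory -> Firebase -> source-file fallback.
from pathlib import Path

def firebase_get(name: str) -> dict | None:
    """Stub; the real helper reads from Firestore, returning None when offline."""
    return None

def get_workflow(store, name: str) -> dict | None:
    # 1. In-process cache: the normal <1ms path.
    item = store.get(("workflows",), name)
    if item is not None:
        return item

    # 2. Firebase: another instance may have synced newer state.
    item = firebase_get(name)
    if item is not None:
        store.put(("workflows",), name, item)
        return item

    # 3. Source files: always present inside the container.
    path = Path("database/prompts/workflows") / f"{name}.md"
    return {"definition": path.read_text()} if path.exists() else None
```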
### 4. Performance

- ✅ Reads: InMemoryStore only (no DB queries)
- ✅ Writes: async background sync (doesn't block responses)
- ✅ Startup: pre-build during container build (warm cache)
## Development Workflow
### Local Development
```bash
# 1. Build workflows
python database/scripts/build.py

# 2. Validate system
python database/scripts/validate.py

# 3. Start server
cd apps/backend
python server.py

# 4. Test API
curl http://localhost:8080/api/prompts/catalog
```

### Add a New Workflow
```bash
# 1. Add .md files to database/prompts/
echo "New workflow prompt" > database/prompts/workflows/new_workflow.md

# 2. Rebuild
python database/scripts/build.py

# 3. Test
curl -X POST http://localhost:8080/api/prompts/search \
  -H "Content-Type: application/json" \
  -d '{"query": "new workflow"}'
```

### Update an Existing Workflow
```bash
# Via API (recommended)
curl -X POST http://localhost:8080/api/prompts/update \
  -H "Content-Type: application/json" \
  -d '{
    "prompt_type": "workflow",
    "prompt_name": "questionnaire_to_contract",
    "suggestions": ["Add more NBA examples", "Improve tier logic"]
  }'

# The system automatically:
# - Increments the version
# - Updates InMemoryStore (immediate)
# - Syncs to Firebase (user/workflow data)
# - Syncs to Supabase (analytics)
```

## What Each Layer Does
### Layer 1: Source Files (`database/`)
**Role:** Static data, version controlled

**Contents:**
- Prompt templates (.md files)
- Few-shot examples (.json files)
- Validation schemas (.json files)
- Pipeline stages (organized by stage_N/)
**Access:** Read-only during operation (writes via API updates)
### Layer 2: InMemoryStore (`apps/backend/stores/prompts.py`)
**Role:** Fast cache layer
**Performance:** <1ms retrieval
**Persistence:** Lost on container restart (restored from Firebase)
**Capabilities:** Vector search with OpenAI embeddings
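A hedged sketch of what this vector-search capability can look like with LangGraph's `InMemoryStore` and an OpenAI embedding index (the index settings are illustrative; the repo's wrapper may configure it differently):

```python
# Sketch assuming LangGraph's InMemoryStore; index values are illustrative.
from langchain.embeddings import init_embeddings
from langgraph.store.memory import InMemoryStore

store = InMemoryStore(
    index={"embed": init_embeddings("openai:text-embedding-3-small"), "dims": 1536}
)

# Cache a workflow under the ("workflows",) namespace.
store.put(("workflows",), "questionnaire_to_contract",
          {"description": "7-stage league questionnaire-to-contract pipeline"})

# Vector-similarity search over cached prompts (needs OPENAI_API_KEY).
for hit in store.search(("workflows",), query="basketball contract", limit=3):
    print(hit.key, hit.score)
```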
### Layer 3: Firebase (User + Critical Data)
**Role:** Real-time sync, user personalization

**Stores:**
- User prompt preferences
- User history/feedback
- Critical workflows (cross-instance sync)
**Use when:** User-specific, real-time, or authenticated data is needed
### Layer 4: Supabase (League + Analytics)
**Role:** Complex queries, analytics, reporting

**Stores:**
- League-specific examples (NBA, MLB, soccer, etc.)
- Workflow execution metrics
- Prompt performance analytics
**Use when:** SQL queries, reporting, or data analysis are needed
## Real Example: Full-Stack Flow
### User Query from the Next.js Frontend
```js
// Next.js client
const response = await fetch('/api/prompts/search', {
  method: 'POST',
  body: JSON.stringify({
    query: "generate NBA partnership contract",
    namespace: "workflows"
  })
});
// Backend retrieval (<1ms)
// → InMemoryStore vector search
// → Returns: questionnaire_to_contract workflow

// Execute workflow
const execution = await fetch('/api/prompts/execute', {
  method: 'POST',
  body: JSON.stringify({
    workflow: "questionnaire_to_contract",
    input_data: {file: nbaQuestionnaire}
  })
});
// Backend execution (18s)
// → Loads workflow from InMemoryStore
// → Builds 7-stage LangGraph
// → Executes: extract → enhance → classify → upsert → price → assemble → export
// → Tracks in Supabase
// → Returns contract

// Frontend displays result
```

## Deployment Checklist
### Pre-Deployment

- All .md prompts in `database/prompts/`
- All examples in `database/output-styles/*/examples/`
- All schemas in `database/output-styles/*/schema/`
- Build script tested: `python database/scripts/build.py`
- Validation passed: `python database/scripts/validate.py`

### Dockerfile

- Copies `apps/backend/` (Python code)
- Copies `database/` (data files)
- Runs `database/scripts/build.py` (pre-warm cache)
- Exposes the PORT env variable

### Cloud Run

- Environment variables set (OPENAI_API_KEY, FIREBASE_*, SUPABASE_*)
- Memory ≥ 2GB (for InMemoryStore + embeddings)
- CPU ≥ 2 (for parallel batch processing)
- Timeout ≥ 300s (for long workflow executions)

### Database Setup

- Firebase: create the `prompts` collection
- Supabase: run the SQL schema creation (above)
- Test connectivity from the Cloud Run container
## Why This Architecture Works
### ✅ Fast (<1ms cached)

- InMemoryStore is in-process memory
- No network calls for reads
- Vector search with OpenAI embeddings

### ✅ Reliable (multi-tier fallback)

- InMemoryStore → Firebase → files
- Always has data to serve
- Graceful degradation

### ✅ Scalable (Cloud Run auto-scaling)

- Each instance has its own InMemoryStore
- Firebase syncs across instances
- Supabase centralizes analytics

### ✅ Maintainable (clear separation)

- Data files in `database/`
- Python logic in `apps/backend/`
- Easy to version control and update

### ✅ Production-Ready

- Database sync for persistence
- Analytics for monitoring
- Battle-tested patterns (lesson_5.py)
*Built for Google Cloud Run containerized deployment, with `database/` and `apps/backend/` deployed together.*