# Prompt Intelligence System - Deployment Architecture
## Deployment Structure (Google Cloud Run)
**A single Docker container deploys BOTH:**
```
Docker Container (apps/backend/)
├── apps/backend/              ← Python logic (services, API, stores)
│   ├── stores/prompts.py          ← InMemoryStore wrapper
│   ├── services/prompts.py        ← Workflow execution
│   ├── api/prompts.py             ← REST endpoints
│   └── server.py                  ← FastAPI app
│
└── database/                  ← Data files (deployed together!)
    ├── prompts/                   ← 135 .md source files
    │   ├── agents/*.md                ← Agent prompts
    │   └── commands/*.md              ← Command prompts
    ├── output-styles/             ← Complete knowledge base
    │   ├── league_questionnaire_to_contract/  ← 7-stage pipeline
    │   ├── examples/seeds/            ← Few-shot examples
    │   └── schemas/seeds/             ← Validation schemas
    └── .chroma/                   ← ChromaDB cache (optional)
```

**Key point:** `database/` and `apps/backend/` deploy as ONE container → same filesystem → fast file access.
## File Organization Strategy
### Data Files → `database/`
**What goes here:** Static data, prompts, examples, schemas
**Why:** Version controlled, shared across all backend instances
**Format:** `.md`, `.json`, `.jsonl`
```
database/
├── prompts/                     # Git-versioned prompts
│   ├── agents/
│   │   ├── contract.generator.agent.prompt.seed.v1.md
│   │   ├── document.pdf.agent.prompt.seed.v1.md
│   │   └── ... (28 agent prompts)
│   ├── commands/
│   │   └── ... (26 command prompts)
│   └── workflows/
│       └── ... (workflow definitions)
│
├── output-styles/               # Consolidated knowledge base
│   ├── league_questionnaire_to_contract/
│   │   ├── stage_2_questionnaire_extraction/
│   │   │   ├── examples/*.json      # 562 example files!
│   │   │   ├── schema/*.json        # Stage output schema
│   │   │   └── README.md            # Stage prompt/docs
│   │   ├── stage_3_document_enhancement/
│   │   │   ├── examples/*.json      # 9 examples
│   │   │   └── schema/*.json
│   │   ├── stage_4_classification/
│   │   ├── stage_5_upsert_databases/
│   │   ├── stage_6_contract_tier_suggestions/
│   │   │   └── examples/*.json      # 55 examples
│   │   ├── stage_7_contract_assembly/
│   │   │   └── examples/*.json      # 153 examples!
│   │   └── stage_7_output_*/        # 3 output variants
│   ├── examples/seeds/          # Consolidated few-shot examples
│   └── schemas/seeds/           # Consolidated schemas
│
└── scripts/                     # Build/validation scripts
    ├── build.py                 # Pre-builds workflows (runs at container startup)
    └── validate.py              # End-to-end tests
```

### Python Logic → `apps/backend/`
**What goes here:** Application code, business logic, APIs
**Why:** Separate concerns, follow FastAPI conventions
**Format:** `.py` files only
```
apps/backend/
├── stores/                      # Data access layer
│   ├── prompts.py                   # InMemoryStore + DB sync
│   ├── firebase_adapter.py          # Firebase operations (existing)
│   └── supabase_adapter.py          # Supabase operations (existing)
│
├── services/                    # Business logic layer
│   ├── prompts.py                   # Workflow execution
│   ├── firebase_adapter.py          # (symlink to stores/)
│   └── supabase_adapter.py          # (symlink to stores/)
│
├── api/                         # API endpoints
│   └── prompts.py                   # Prompt intelligence routes
│
└── server.py                    # FastAPI application
```

## Database Separation Strategy
### Firebase → User-Specific Data
**What:** User preferences, prompt history, feedback
**Why:** Real-time sync, authentication, personalization
```js
// Firebase structure
users/{userId}/
  prompt_preferences/
    detail_level: "technical"
    output_format: "markdown"
    favorite_workflows: ["questionnaire_to_contract"]
  prompt_history/{promptId}/
    total_uses: 45
    success_rate: 0.92
    last_used: timestamp
    feedback_ratings: [4.5, 4.8, 5.0]

prompts/
  workflows/
    questionnaire_to_contract: {...}  // Critical workflows for cross-instance sync
```
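For illustration, a minimal sketch of writing user preference data with the `firebase-admin` SDK, mapping the structure above onto Firestore. The document id `"current"` and the `save_preferences` helper are hypothetical; `FIREBASE_SERVICE_ACCOUNT_PATH` matches the environment variables listed later in this doc:

```python
# Hypothetical sketch: persisting user prompt preferences to Firestore.
# The "current" document id and save_preferences helper are illustrative.
import os

import firebase_admin
from firebase_admin import credentials, firestore

cred = credentials.Certificate(os.environ["FIREBASE_SERVICE_ACCOUNT_PATH"])
firebase_admin.initialize_app(cred)
db = firestore.client()

def save_preferences(user_id: str, prefs: dict) -> None:
    """Write under users/{userId}/prompt_preferences, as in the layout above."""
    (db.collection("users")
       .document(user_id)
       .collection("prompt_preferences")
       .document("current")
       .set(prefs, merge=True))

save_preferences("analyst_001", {
    "detail_level": "technical",
    "output_format": "markdown",
    "favorite_workflows": ["questionnaire_to_contract"],
})
```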
### Supabase → League Data + Analytics

**What:** League examples, workflow analytics, performance metrics
**Why:** SQL queries, complex analytics, reporting
```sql
-- League-specific examples (basketball, soccer, etc.)
CREATE TABLE league_examples (
    id SERIAL PRIMARY KEY,
    sport TEXT NOT NULL,
    tier TEXT,
    league_name TEXT,
    example_data JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Workflow definitions for analytics
CREATE TABLE workflow_definitions (
    workflow_name TEXT PRIMARY KEY,
    total_stages INT,
    version INT,
    metadata JSONB,
    updated_at TIMESTAMP
);

-- Execution tracking
CREATE TABLE workflow_executions (
    id SERIAL PRIMARY KEY,
    workflow_name TEXT,
    execution_time FLOAT,
    success BOOLEAN,
    user_id TEXT,
    executed_at TIMESTAMP DEFAULT NOW()
);

-- Prompt performance analytics
CREATE TABLE prompt_catalog (
    prompt_type TEXT,
    prompt_name TEXT,
    version INT,
    suggestions_count INT,
    metadata JSONB,
    updated_at TIMESTAMP,
    PRIMARY KEY (prompt_type, prompt_name)
);
```
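For illustration, a minimal sketch of writing to the `workflow_executions` table with the `supabase-py` client. The `track_execution` helper name is hypothetical; the URL/key variables match the environment section later in this doc:

```python
# Hypothetical sketch: recording one workflow execution in Supabase.
# Table and column names follow the SQL schema above.
import os

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"],
                         os.environ["SUPABASE_SERVICE_KEY"])

def track_execution(workflow_name: str, execution_time: float,
                    success: bool, user_id: str) -> None:
    """Insert one row into workflow_executions (executed_at defaults to NOW())."""
    supabase.table("workflow_executions").insert({
        "workflow_name": workflow_name,
        "execution_time": execution_time,
        "success": success,
        "user_id": user_id,
    }).execute()

track_execution("questionnaire_to_contract", 18.2, True, "analyst_001")
```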
## Data Flow (Production)

### On Container Startup
```
1. Container starts (apps/backend/ + database/ both present)
2. Run database/scripts/build.py
   ├─ Read database/prompts/*.md (135 files)
   ├─ Read database/output-styles/league_questionnaire_to_contract/ (9 stages)
   ├─ Build workflows in memory
   ├─ Save to InMemoryStore
   └─ Sync to Firebase (for other instances)
3. Server ready (workflows cached, <1ms retrieval)
```
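As a rough sketch of what this startup step amounts to (the `build_workflows` function and the generic `store.put` call are illustrative names; the real logic lives in `database/scripts/build.py` and `stores/prompts.py`):

```python
# Hypothetical sketch of the startup sequence; names are illustrative.
from pathlib import Path

def build_workflows(db_root: Path) -> dict[str, str]:
    """Read .md prompt sources and assemble workflow definitions in memory."""
    workflows = {}
    for md_file in (db_root / "prompts").rglob("*.md"):
        workflows[md_file.stem] = md_file.read_text()
    # Stage examples/schemas from output-styles/ would be merged in here.
    return workflows

def startup(store, db_root: Path = Path("database")) -> None:
    for name, definition in build_workflows(db_root).items():
        store.put(("workflows",), name, {"definition": definition})
    # A background task would then sync to Firebase for other instances.
```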
### On API Request: /api/prompts/search

```
1. User: POST /api/prompts/search {"query": "basketball contract"}
2. API (apps/backend/api/prompts.py)
   └─ Call service.store.search_prompts()
3. Store (apps/backend/stores/prompts.py)
   ├─ Search InMemoryStore with vector similarity
   └─ Return top matches
4. Response: {"results": [...]}
```
### On Workflow Execution: /api/prompts/execute

```
1. User: POST /api/prompts/execute {"workflow": "questionnaire_to_contract"}
2. Service (apps/backend/services/prompts.py)
   ├─ Get workflow from store (<1ms, cached)
   ├─ Build LangGraph from stages
   ├─ Execute 7-stage pipeline
   └─ Track metrics in Supabase
3. Response: {"status": "completed", "result": {...}}
```
### On Prompt Update: /api/prompts/update

```
1. User: POST /api/prompts/update {"suggestions": ["Add NBA examples"]}
2. Store (apps/backend/stores/prompts.py)
   ├─ Get current from InMemoryStore
   ├─ Apply suggestions (increment version)
   ├─ Save to InMemoryStore (immediate)
   └─ Background async:
      ├─ Firebase.set()    (user data sync)
      └─ Supabase.upsert() (analytics tracking)
3. Response: {"status": "updated", "new_version": 3}
```
## Database Choice Logic (Built-In)

Automatic routing in `stores/prompts.py`:
```python
# Condensed from stores/prompts.py, lines 241-274: the routing rules as a function.
def route_databases(prompt_type: str, prompt_name: str) -> set[str]:
    targets: set[str] = set()
    is_user_data = ("user" in prompt_name
                    or prompt_type in ["user_preferences", "user_history"])
    is_league_data = ("league" in prompt_name
                      or "questionnaire" in prompt_name
                      or prompt_type in ["workflows", "examples"])
    if is_user_data:
        targets.add("firebase")              # real-time, personalization
    if is_league_data:
        targets.add("supabase")              # SQL analytics, reporting
    if prompt_type == "workflows":
        targets |= {"firebase", "supabase"}  # BOTH: critical data redundancy
    return targets
```

**User data examples (Firebase):**
- "user_preference_detailed"
- "user_history_analyst_001"
- "feedback_user_xyz"League Data Example (Supabase):
- "questionnaire_to_contract" (workflow)
- "league_example_nba_premium"
- "schema_extracted_data"π Google Cloud Run Configuration
### Dockerfile Strategy
```dockerfile
# Build stage
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Production stage
FROM python:3.12-slim
WORKDIR /app

# Copy Python code
COPY apps/backend/ ./apps/backend/

# Copy database files (deployed together!)
COPY database/ ./database/

# Install dependencies
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages

# Pre-build workflows at container build time
RUN python database/scripts/build.py

# Run server
CMD exec uvicorn apps.backend.server:app --host 0.0.0.0 --port $PORT
```

### Environment Variables
```bash
# .env or Cloud Run environment
OPENAI_API_KEY=sk-...                                 # Required for embeddings
FIREBASE_SERVICE_ACCOUNT_PATH=./config/firebase.json  # User data
SUPABASE_URL=https://xxx.supabase.co                  # League data/analytics
SUPABASE_SERVICE_KEY=eyJ...                           # Supabase access
```

### Cloud Run Service Configuration
```yaml
# .cloudrun.yaml or cloudbuild.yaml
service:
  containers:
    - image: gcr.io/PROJECT_ID/altsportsdata-backend
      resources:
        limits:
          memory: 2Gi   # InMemoryStore + embeddings
          cpu: 2
      env:
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              key: OPENAI_API_KEY
              name: openai-secret
```

## Performance Characteristics (Production)
### With Firebase/Supabase Connected
| Operation | Time | Data path |
|---|---|---|
| First workflow query | ~9ms | Read `database/` files → cache |
| Cached workflow query | <1ms | InMemoryStore only |
| Firebase sync (write) | ~20ms | Background async (doesn't block) |
| Supabase sync (write) | ~15ms | Background async (doesn't block) |
| Workflow execution | 15-30s | LangGraph pipeline |
| Batch of 10 in parallel | ~30s | Async concurrent execution |
### Container Restart (Stateful)
```
1. Container restarts (Cloud Run auto-scaling)
2. Load from Firebase (~200ms)
   ├─ Restore last workflow versions
   └─ Populate InMemoryStore
3. Server ready
4. Future queries <1ms (already cached)
```

**Benefit:** No rebuild from source files on restart, so startup is faster.
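A rough sketch of that restore step, assuming a hypothetical Firestore layout of `prompts/workflows/items/{name}` (the actual collection structure is whatever `stores/prompts.py` writes):

```python
# Hypothetical sketch: repopulating InMemoryStore from Firebase after a
# restart; the prompts/workflows/items path is assumed, not confirmed.
from firebase_admin import firestore

def restore_from_firebase(store) -> int:
    db = firestore.client()  # assumes firebase_admin.initialize_app() ran already
    restored = 0
    docs = (db.collection("prompts").document("workflows")
              .collection("items").stream())
    for doc in docs:
        store.put(("workflows",), doc.id, doc.to_dict())
        restored += 1
    return restored  # subsequent reads come from memory at <1ms
```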
## Best Practices for Google Cloud Run
### 1. Separate Concerns

- ✅ Data files (`database/`): version controlled, static
- ✅ Python logic (`apps/backend/`): application code
- ✅ Deployment: both together in one container
### 2. Multi-Instance Safety

- ✅ InMemoryStore: each container has its own
- ✅ Firebase: syncs state between instances
- ✅ Supabase: centralized analytics
- ✅ Startup: loads from Firebase (consistent state)
### 3. Graceful Degradation

- ✅ InMemoryStore works offline: no network failures
- ✅ Firebase/Supabase optional: the system runs without them
- ✅ File fallback: can always rebuild from source (see the sketch below)
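A minimal sketch of that fallback chain on the read path (`firebase_get` is a stub and the file-fallback path is illustrative):

```python
# Hypothetical sketch of the memory -> Firebase -> source-file fallback.
from pathlib import Path

def firebase_get(name: str) -> dict | None:
    """Stub; the real helper reads from Firestore, returning None when offline."""
    return None

def get_workflow(store, name: str) -> dict | None:
    # 1. In-process cache: the normal <1ms path.
    item = store.get(("workflows",), name)
    if item is not None:
        return item

    # 2. Firebase: another instance may have synced newer state.
    item = firebase_get(name)
    if item is not None:
        store.put(("workflows",), name, item)
        return item

    # 3. Source files: always present inside the container.
    path = Path("database/prompts/workflows") / f"{name}.md"
    return {"definition": path.read_text()} if path.exists() else None
```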
### 4. Performance

- ✅ Reads: InMemoryStore only (no DB queries)
- ✅ Writes: async background sync (doesn't block responses)
- ✅ Startup: pre-build during container build (warm cache)
## Development Workflow
### Local Development
```bash
# 1. Build workflows
python database/scripts/build.py

# 2. Validate system
python database/scripts/validate.py

# 3. Start server
cd apps/backend
python server.py

# 4. Test API
curl http://localhost:8080/api/prompts/catalog
```

### Add a New Workflow
```bash
# 1. Add .md files to database/prompts/
echo "New workflow prompt" > database/prompts/workflows/new_workflow.md

# 2. Rebuild
python database/scripts/build.py

# 3. Test
curl -X POST http://localhost:8080/api/prompts/search \
  -H "Content-Type: application/json" \
  -d '{"query": "new workflow"}'
```

### Update an Existing Workflow
```bash
# Via API (recommended)
curl -X POST http://localhost:8080/api/prompts/update \
  -H "Content-Type: application/json" \
  -d '{
    "prompt_type": "workflow",
    "prompt_name": "questionnaire_to_contract",
    "suggestions": ["Add more NBA examples", "Improve tier logic"]
  }'

# The system automatically:
# - Increments the version
# - Updates InMemoryStore (immediate)
# - Syncs to Firebase (user/workflow data)
# - Syncs to Supabase (analytics)
```

## What Each Layer Does
### Layer 1: Source Files (`database/`)
**Role:** Static data, version controlled

**Contents:**
- Prompt templates (.md files)
- Few-shot examples (.json files)
- Validation schemas (.json files)
- Pipeline stages (organized by stage_N/)
**Access:** Read-only during operation (writes via API updates)
### Layer 2: InMemoryStore (`apps/backend/stores/prompts.py`)
**Role:** Fast cache layer
**Performance:** <1ms retrieval
**Persistence:** Lost on container restart (restored from Firebase)
**Capabilities:** Vector search with OpenAI embeddings
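A hedged sketch of what this vector-search capability can look like with LangGraph's `InMemoryStore` and an OpenAI embedding index (the index settings are illustrative; the repo's wrapper may configure it differently):

```python
# Sketch assuming LangGraph's InMemoryStore; index values are illustrative.
from langchain.embeddings import init_embeddings
from langgraph.store.memory import InMemoryStore

store = InMemoryStore(
    index={"embed": init_embeddings("openai:text-embedding-3-small"), "dims": 1536}
)

# Cache a workflow under the ("workflows",) namespace.
store.put(("workflows",), "questionnaire_to_contract",
          {"description": "7-stage league questionnaire-to-contract pipeline"})

# Vector-similarity search over cached prompts (needs OPENAI_API_KEY).
for hit in store.search(("workflows",), query="basketball contract", limit=3):
    print(hit.key, hit.score)
```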
### Layer 3: Firebase (User + Critical Data)
**Role:** Real-time sync, user personalization

**Stores:**
- User prompt preferences
- User history/feedback
- Critical workflows (cross-instance sync)
**Use when:** User-specific, real-time, or authenticated data is needed
### Layer 4: Supabase (League + Analytics)
**Role:** Complex queries, analytics, reporting

**Stores:**
- League-specific examples (NBA, MLB, soccer, etc.)
- Workflow execution metrics
- Prompt performance analytics
**Use when:** SQL queries, reporting, or data analysis are needed
## Real Example: Full-Stack Flow
### User Query from the Next.js Frontend
```js
// Next.js client
const response = await fetch('/api/prompts/search', {
  method: 'POST',
  body: JSON.stringify({
    query: "generate NBA partnership contract",
    namespace: "workflows"
  })
});
// Backend retrieval (<1ms)
// → InMemoryStore vector search
// → Returns: questionnaire_to_contract workflow

// Execute workflow
const execution = await fetch('/api/prompts/execute', {
  method: 'POST',
  body: JSON.stringify({
    workflow: "questionnaire_to_contract",
    input_data: {file: nbaQuestionnaire}
  })
});
// Backend execution (18s)
// → Loads workflow from InMemoryStore
// → Builds 7-stage LangGraph
// → Executes: extract → enhance → classify → upsert → price → assemble → export
// → Tracks in Supabase
// → Returns contract

// Frontend displays result
```

## Deployment Checklist
### Pre-Deployment

- All .md prompts in `database/prompts/`
- All examples in `database/output-styles/*/examples/`
- All schemas in `database/output-styles/*/schema/`
- Build script tested: `python database/scripts/build.py`
- Validation passed: `python database/scripts/validate.py`

### Dockerfile

- Copies `apps/backend/` (Python code)
- Copies `database/` (data files)
- Runs `database/scripts/build.py` (pre-warm cache)
- Exposes the PORT env variable

### Cloud Run

- Environment variables set (OPENAI_API_KEY, FIREBASE_*, SUPABASE_*)
- Memory ≥ 2GB (for InMemoryStore + embeddings)
- CPU ≥ 2 (for parallel batch processing)
- Timeout ≥ 300s (for long workflow executions)

### Database Setup

- Firebase: create the `prompts` collection
- Supabase: run the SQL schema creation (above)
- Test connectivity from the Cloud Run container
## Why This Architecture Works
### ✅ Fast (<1ms cached)

- InMemoryStore is in-process memory
- No network calls for reads
- Vector search with OpenAI embeddings

### ✅ Reliable (multi-tier fallback)

- InMemoryStore → Firebase → files
- Always has data to serve
- Graceful degradation

### ✅ Scalable (Cloud Run auto-scaling)

- Each instance has its own InMemoryStore
- Firebase syncs across instances
- Supabase centralizes analytics

### ✅ Maintainable (clear separation)

- Data files in `database/`
- Python logic in `apps/backend/`
- Easy to version control and update

### ✅ Production-Ready

- Database sync for persistence
- Analytics for monitoring
- Battle-tested patterns (lesson_5.py)
*Built for Google Cloud Run containerized deployment, with `database/` and `apps/backend/` deployed together.*