Architecture
Multi-Database Quick Start (Supabase Edition)

Source: data_layer/docs/QUICKSTART_SUPABASE.md

Multi-Database Quick Start (Supabase Edition)

Get up and running with the Supabase + Neo4j + Pinecone multi-database system in 10 minutes.

Prerequisites

  • Python 3.8+
  • Supabase account and project
  • (Optional) Neo4j database
  • (Optional) Pinecone account
  • (Optional) OpenAI API key

Step 1: Install Dependencies

pip install supabase python-dotenv
 
# Optional dependencies
pip install neo4j           # For graph relationships
pip install pinecone-client # For vector search
pip install openai          # For embeddings

Step 2: Configure Environment

Create or update .env file in project root:

# Required
SUPABASE_URL=postgresql://postgres:password@db.project.supabase.co:5432/postgres
SUPABASE_API_KEY=your_anon_key_here
 
# Optional
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password
 
PINECONE_API_KEY=your_pinecone_key
OPENAI_API_KEY=your_openai_key

Step 3: Create Supabase Tables

Run this SQL in your Supabase SQL Editor:

-- Sport Archetypes
CREATE TABLE IF NOT EXISTS sport_archetypes (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  archetype_name TEXT UNIQUE NOT NULL,
  display_name TEXT NOT NULL,
  code_pattern TEXT NOT NULL,
  characteristics JSONB DEFAULT '{}',
  embedding_vector JSONB,
  neo4j_node_id TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW()
);
 
-- Prospective Leagues (with multi-DB sync tracking)
CREATE TABLE IF NOT EXISTS prospective_leagues (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  league_tag TEXT UNIQUE NOT NULL,
  league_name TEXT,
  sport_name TEXT,
  sport_tier TEXT DEFAULT 'TIER2',
  sport_archetype TEXT,
  status TEXT DEFAULT 'unverified',
  verification_status TEXT DEFAULT 'unverified',
  source_type TEXT DEFAULT 'web_scrape',
 
  -- Embedding fields
  embedding_vector JSONB,
  embedding_model TEXT,
  embedding_updated_at TIMESTAMPTZ,
 
  -- Neo4j sync tracking
  neo4j_node_id TEXT,
  neo4j_synced_at TIMESTAMPTZ,
  neo4j_sync_status TEXT DEFAULT 'pending',
 
  -- Pinecone sync tracking
  pinecone_id TEXT,
  pinecone_namespace TEXT,
  pinecone_synced_at TIMESTAMPTZ,
  pinecone_sync_status TEXT DEFAULT 'pending',
 
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW()
);
 
-- Few-Shot Examples
CREATE TABLE IF NOT EXISTS few_shot_examples (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  example_id TEXT UNIQUE NOT NULL,
  category TEXT NOT NULL,
  scenario TEXT,
  sport TEXT DEFAULT 'general',
  tier TEXT DEFAULT 'mixed',
  complexity TEXT DEFAULT 'moderate',
  version TEXT DEFAULT '2.0',
  quality_score DECIMAL DEFAULT 0.80,
  usage_count INTEGER DEFAULT 0,
  input_data JSONB DEFAULT '{}',
  output_data JSONB DEFAULT '{}',
  tags JSONB DEFAULT '[]',
  embedding_vector JSONB,
  embedding_model TEXT,
  pinecone_sync_status TEXT DEFAULT 'pending',
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW()
);
 
-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_leagues_archetype ON prospective_leagues(sport_archetype);
CREATE INDEX IF NOT EXISTS idx_leagues_tier ON prospective_leagues(sport_tier);
CREATE INDEX IF NOT EXISTS idx_leagues_status ON prospective_leagues(status);
CREATE INDEX IF NOT EXISTS idx_leagues_verification ON prospective_leagues(verification_status);
CREATE INDEX IF NOT EXISTS idx_examples_category ON few_shot_examples(category);
CREATE INDEX IF NOT EXISTS idx_examples_sport ON few_shot_examples(sport);

Step 4: Validate Setup

cd database
python scripts/validate_supabase_multi_db.py

Expected output:

βœ… SUPABASE_URL: Supabase project URL
βœ… SUPABASE_API_KEY: Supabase API key
βœ… Supabase connection successful
βœ… OpenAI connection successful

Step 5: Seed Data

Seed Sport Archetypes (Required)

python scripts/seed_supabase_multi_db.py --archetypes

Seed Sample Leagues (Optional)

First, create some JSON seed files in database/seeds/:

{
  "league_tag": "ufc",
  "league_name": "Ultimate Fighting Championship",
  "sport_name": "Mixed Martial Arts",
  "sport_tier": "TIER1",
  "sport_archetype": "combat",
  "status": "verified",
  "verification_status": "human_verified"
}

Then seed:

python scripts/seed_supabase_multi_db.py --leagues

Seed Everything at Once

python scripts/seed_supabase_multi_db.py --all

Step 6: Query Your Data

Using Python Supabase Client

from supabase import create_client
import os
 
# Connect
url = os.getenv('SUPABASE_URL')
key = os.getenv('SUPABASE_API_KEY')
supabase = create_client(url, key)
 
# Query combat sports leagues
result = supabase.table('prospective_leagues')\
    .select('*')\
    .eq('sport_archetype', 'combat')\
    .execute()
 
for league in result.data:
    print(f"{league['league_name']} - {league['status']}")

Using SQL in Supabase

-- Find all combat sports leagues
SELECT league_name, sport_tier, verification_status
FROM prospective_leagues
WHERE sport_archetype = 'combat'
ORDER BY league_name;
 
-- Get sync status summary
SELECT
  neo4j_sync_status,
  pinecone_sync_status,
  COUNT(*) as count
FROM prospective_leagues
GROUP BY neo4j_sync_status, pinecone_sync_status;
 
-- Find high-quality examples by category
SELECT example_id, scenario, quality_score
FROM few_shot_examples
WHERE category = 'triage'
  AND quality_score > 0.85
ORDER BY quality_score DESC
LIMIT 10;

Architecture Overview

JSONL Seed Files (source of truth)
         ↓
    Supabase PostgreSQL
         ↓
    β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”
    ↓         ↓
  Neo4j    Pinecone
  (graph)   (vectors)

Database Roles

DatabasePurposeRequired?
SupabasePrimary data store, fast queries, ACIDβœ… Yes
Neo4jGraph relationships, archetype navigationβ­• Optional
PineconeVector search, semantic similarityβ­• Optional

Key Features

  1. Sport Archetypes - 5 master categories for classification
  2. Sync Tracking - Per-database sync status (pending/synced/failed)
  3. Graceful Degradation - Works with just Supabase
  4. Auto-Embeddings - OpenAI embeddings generated during seeding

Common Queries

Get Leagues by Archetype

result = supabase.table('prospective_leagues')\
    .select('*')\
    .eq('sport_archetype', 'team_sport')\
    .execute()

Find Unverified High-Value Opportunities

result = supabase.table('prospective_leagues')\
    .select('*')\
    .eq('verification_status', 'unverified')\
    .eq('sport_tier', 'TIER1')\
    .execute()

Get Examples for Prompt Engineering

result = supabase.table('few_shot_examples')\
    .select('*')\
    .eq('category', 'contract_generation')\
    .gte('quality_score', 0.85)\
    .limit(5)\
    .execute()

Next Steps

  1. Add More Leagues - Create JSON files in seeds/
  2. Set Up Neo4j (Optional) - For graph relationships
  3. Configure Pinecone (Optional) - For semantic search
  4. Explore Notebooks - See _notebooks/multi_db_exploration.ipynb

Troubleshooting

Supabase Connection Fails

  • Verify SUPABASE_URL is the PostgreSQL connection string
  • Check SUPABASE_API_KEY is the anon key from Supabase dashboard
  • Ensure tables are created (run SQL from Step 3)

Neo4j Connection Fails (Optional)

  • Check Neo4j is running: docker ps or brew services list neo4j
  • Verify NEO4J_PASSWORD is correct
  • This is optional - system works without Neo4j

No Embeddings Generated

  • Verify OPENAI_API_KEY is set
  • Check OpenAI account has credits
  • Embeddings are optional but recommended

Resources


Built with: Supabase + Neo4j + Pinecone + OpenAI Pattern: JSONL β†’ Multi-DB Sync Status: Production Ready πŸš€

Platform

Documentation

Community

Support

partnership@altsportsdata.comdev@altsportsleagues.ai

2025 Β© AltSportsLeagues.ai. Powered by AI-driven sports business intelligence.

πŸ€– AI-Enhancedβ€’πŸ“Š Data-Drivenβ€’βš‘ Real-Time