Source: data_layer/docs/QUICKSTART_SUPABASE.md
Multi-Database Quick Start (Supabase Edition)
Get up and running with the Supabase + Neo4j + Pinecone multi-database system in 10 minutes.
Prerequisites
- Python 3.8+
- Supabase account and project
- (Optional) Neo4j database
- (Optional) Pinecone account
- (Optional) OpenAI API key
Step 1: Install Dependencies
pip install supabase python-dotenv
# Optional dependencies
pip install neo4j # For graph relationships
pip install pinecone-client # For vector search
pip install openai # For embeddings
Step 2: Configure Environment
Create or update the .env file in the project root:
# Required
SUPABASE_URL=https://your-project-ref.supabase.co
SUPABASE_API_KEY=your_anon_key_here
# Optional
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password
PINECONE_API_KEY=your_pinecone_key
OPENAI_API_KEY=your_openai_key
Step 3: Create Supabase Tables
Run this SQL in your Supabase SQL Editor:
-- Sport Archetypes
CREATE TABLE IF NOT EXISTS sport_archetypes (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    archetype_name TEXT UNIQUE NOT NULL,
    display_name TEXT NOT NULL,
    code_pattern TEXT NOT NULL,
    characteristics JSONB DEFAULT '{}',
    embedding_vector JSONB,
    neo4j_node_id TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Prospective Leagues (with multi-DB sync tracking)
CREATE TABLE IF NOT EXISTS prospective_leagues (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    league_tag TEXT UNIQUE NOT NULL,
    league_name TEXT,
    sport_name TEXT,
    sport_tier TEXT DEFAULT 'TIER2',
    sport_archetype TEXT,
    status TEXT DEFAULT 'unverified',
    verification_status TEXT DEFAULT 'unverified',
    source_type TEXT DEFAULT 'web_scrape',
    -- Embedding fields
    embedding_vector JSONB,
    embedding_model TEXT,
    embedding_updated_at TIMESTAMPTZ,
    -- Neo4j sync tracking
    neo4j_node_id TEXT,
    neo4j_synced_at TIMESTAMPTZ,
    neo4j_sync_status TEXT DEFAULT 'pending',
    -- Pinecone sync tracking
    pinecone_id TEXT,
    pinecone_namespace TEXT,
    pinecone_synced_at TIMESTAMPTZ,
    pinecone_sync_status TEXT DEFAULT 'pending',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Few-Shot Examples
CREATE TABLE IF NOT EXISTS few_shot_examples (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    example_id TEXT UNIQUE NOT NULL,
    category TEXT NOT NULL,
    scenario TEXT,
    sport TEXT DEFAULT 'general',
    tier TEXT DEFAULT 'mixed',
    complexity TEXT DEFAULT 'moderate',
    version TEXT DEFAULT '2.0',
    quality_score DECIMAL DEFAULT 0.80,
    usage_count INTEGER DEFAULT 0,
    input_data JSONB DEFAULT '{}',
    output_data JSONB DEFAULT '{}',
    tags JSONB DEFAULT '[]',
    embedding_vector JSONB,
    embedding_model TEXT,
    pinecone_sync_status TEXT DEFAULT 'pending',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_leagues_archetype ON prospective_leagues(sport_archetype);
CREATE INDEX IF NOT EXISTS idx_leagues_tier ON prospective_leagues(sport_tier);
CREATE INDEX IF NOT EXISTS idx_leagues_status ON prospective_leagues(status);
CREATE INDEX IF NOT EXISTS idx_leagues_verification ON prospective_leagues(verification_status);
CREATE INDEX IF NOT EXISTS idx_examples_category ON few_shot_examples(category);
CREATE INDEX IF NOT EXISTS idx_examples_sport ON few_shot_examples(sport);
Step 4: Validate Setup
cd database
python scripts/validate_supabase_multi_db.py
Expected output:
✅ SUPABASE_URL: Supabase project URL
✅ SUPABASE_API_KEY: Supabase API key
✅ Supabase connection successful
✅ OpenAI connection successful
Step 5: Seed Data
Seed Sport Archetypes (Required)
python scripts/seed_supabase_multi_db.py --archetypes
Seed Sample Leagues (Optional)
First, create some JSON seed files in database/seeds/:
{
  "league_tag": "ufc",
  "league_name": "Ultimate Fighting Championship",
  "sport_name": "Mixed Martial Arts",
  "sport_tier": "TIER1",
  "sport_archetype": "combat",
  "status": "verified",
  "verification_status": "human_verified"
}
Then seed:
python scripts/seed_supabase_multi_db.py --leagues
Seed Everything at Once
python scripts/seed_supabase_multi_db.py --all
Step 6: Query Your Data
Using Python Supabase Client
import os

from dotenv import load_dotenv
from supabase import create_client

# Load the variables from .env into the process environment
load_dotenv()

# Connect
url = os.getenv('SUPABASE_URL')
key = os.getenv('SUPABASE_API_KEY')
supabase = create_client(url, key)

# Query combat sports leagues
result = (
    supabase.table('prospective_leagues')
    .select('*')
    .eq('sport_archetype', 'combat')
    .execute()
)

# Print each league with its current status
for league in result.data:
    print(f"{league['league_name']} - {league['status']}")
Using SQL in Supabase
-- Find all combat sports leagues
SELECT league_name, sport_tier, verification_status
FROM prospective_leagues
WHERE sport_archetype = 'combat'
ORDER BY league_name;

-- Get sync status summary
SELECT
    neo4j_sync_status,
    pinecone_sync_status,
    COUNT(*) AS count
FROM prospective_leagues
GROUP BY neo4j_sync_status, pinecone_sync_status;

-- Find high-quality examples by category
SELECT example_id, scenario, quality_score
FROM few_shot_examples
WHERE category = 'triage'
  AND quality_score > 0.85
ORDER BY quality_score DESC
LIMIT 10;
Architecture Overview
JSONL Seed Files (source of truth)
         ↓
Supabase PostgreSQL
         ↓
    ┌────┴────┐
    ↓         ↓
  Neo4j    Pinecone
 (graph)  (vectors)
Database Roles
| Database | Purpose | Required? |
|---|---|---|
| Supabase | Primary data store, fast queries, ACID transactions | Yes |
| Neo4j | Graph relationships, archetype navigation | Optional |
| Pinecone | Vector search, semantic similarity | Optional |
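The required/optional split in the table can be sketched as a small configuration check. This is only an illustration: `detect_backends` and `REQUIRED_VARS` are hypothetical names, not part of the project scripts, and the variable list is taken from Step 2 of this guide.

```python
from typing import Dict, Mapping

# Environment variables each backend needs (from Step 2 of this guide).
# The grouping itself is an assumption made for this sketch.
REQUIRED_VARS = {
    "supabase": ("SUPABASE_URL", "SUPABASE_API_KEY"),
    "neo4j": ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"),
    "pinecone": ("PINECONE_API_KEY",),
}


def detect_backends(env: Mapping[str, str]) -> Dict[str, bool]:
    """Report which backends are fully configured; Supabase is mandatory."""
    enabled = {
        name: all(env.get(var) for var in variables)
        for name, variables in REQUIRED_VARS.items()
    }
    if not enabled["supabase"]:
        raise RuntimeError(
            "Supabase is required: set SUPABASE_URL and SUPABASE_API_KEY"
        )
    return enabled


# Example with only the required variables set: the optional backends
# are reported as disabled instead of causing an error.
example_env = {
    "SUPABASE_URL": "https://your-project-ref.supabase.co",
    "SUPABASE_API_KEY": "your_anon_key_here",
}
backends = detect_backends(example_env)
print(backends)
```

In a real script, passing `os.environ` instead of `example_env` would check the live configuration.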
Key Features
- Sport Archetypes - 5 master categories for classification
- Sync Tracking - Per-database sync status (pending/synced/failed)
- Graceful Degradation - Works with just Supabase
- Auto-Embeddings - OpenAI embeddings generated during seeding
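To make the sync-tracking feature concrete, the following sketch tallies rows by their per-database sync status, mirroring the SQL sync summary from Step 6. It works on plain dictionaries such as those returned in `result.data`; the function name `summarize_sync` and the sample rows are made up for illustration.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional, Tuple

SyncKey = Tuple[Optional[str], Optional[str]]


def summarize_sync(rows: Iterable[Mapping[str, object]]) -> "Counter[SyncKey]":
    """Count rows per (neo4j_sync_status, pinecone_sync_status) pair."""
    return Counter(
        (row.get("neo4j_sync_status"), row.get("pinecone_sync_status"))
        for row in rows
    )


# Illustrative rows only; real rows would come from prospective_leagues.
sample_rows = [
    {"league_tag": "a", "neo4j_sync_status": "synced", "pinecone_sync_status": "pending"},
    {"league_tag": "b", "neo4j_sync_status": "synced", "pinecone_sync_status": "pending"},
    {"league_tag": "c", "neo4j_sync_status": "failed", "pinecone_sync_status": "synced"},
]

summary = summarize_sync(sample_rows)
for (neo4j_status, pinecone_status), count in summary.most_common():
    print(f"neo4j={neo4j_status} pinecone={pinecone_status}: {count}")
```

When only Supabase is configured, every row would be expected to stay at the `'pending'` default set by the table definition.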
Common Queries
Get Leagues by Archetype
result = supabase.table('prospective_leagues')\
    .select('*')\
    .eq('sport_archetype', 'team_sport')\
    .execute()
Find Unverified High-Value Opportunities
result = supabase.table('prospective_leagues')\
    .select('*')\
    .eq('verification_status', 'unverified')\
    .eq('sport_tier', 'TIER1')\
    .execute()
Get Examples for Prompt Engineering
result = supabase.table('few_shot_examples')\
    .select('*')\
    .eq('category', 'contract_generation')\
    .gte('quality_score', 0.85)\
    .limit(5)\
    .execute()
Next Steps
- Add More Leagues - Create JSON files in seeds/
- Set Up Neo4j (Optional) - For graph relationships
- Configure Pinecone (Optional) - For semantic search
- Explore Notebooks - See _notebooks/multi_db_exploration.ipynb
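Adding a league could look roughly like the sketch below, which writes one JSON seed file per league. The helper `write_league_seed` and the `<league_tag>.json` naming are assumptions for illustration; check what the seeding script actually expects before relying on them. The only validation applied is that `league_tag` is present, since that column is `UNIQUE NOT NULL` in `prospective_leagues`.

```python
import json
import tempfile
from pathlib import Path


def write_league_seed(league: dict, seeds_dir: Path) -> Path:
    """Write one league record as pretty-printed JSON and return its path."""
    tag = league.get("league_tag")
    if not tag:
        raise ValueError("league_tag is required (UNIQUE NOT NULL in the schema)")
    seeds_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming convention: one file per league, named after its tag.
    path = seeds_dir / f"{tag}.json"
    path.write_text(json.dumps(league, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder values; a temporary directory keeps the demo side-effect free.
    league = {
        "league_tag": "example_league",
        "league_name": "Example League",
        "sport_tier": "TIER2",
        "status": "unverified",
    }
    with tempfile.TemporaryDirectory() as tmp:
        written = write_league_seed(league, Path(tmp) / "seeds")
        print(written.name)
```

After writing real files into the seeds directory, the `--leagues` command from Step 5 loads them.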
Troubleshooting
Supabase Connection Fails
- Verify SUPABASE_URL is the project URL (https://your-project-ref.supabase.co) that create_client expects, not the PostgreSQL connection string
- Check SUPABASE_API_KEY is the anon key from the Supabase dashboard
- Ensure tables are created (run the SQL from Step 3)
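A quick way to see which kind of value SUPABASE_URL currently holds is to classify it by scheme. The sketch below is a hypothetical diagnostic, not part of the project scripts. As far as the supabase-py client is concerned, `create_client` takes the HTTPS project URL, while a `postgresql://` string is what a direct PostgreSQL driver would use; which one the project scripts need should be confirmed against the scripts themselves.

```python
from urllib.parse import urlparse


def classify_supabase_url(url: str) -> str:
    """Return 'rest', 'postgres', 'missing', or 'unknown' based on the URL scheme."""
    if not url:
        return "missing"
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https"):
        return "rest"        # project URL, as used by create_client
    if scheme in ("postgres", "postgresql"):
        return "postgres"    # direct database connection string
    return "unknown"


def looks_like_placeholder(key: str) -> bool:
    """Detect an unset key or the placeholder value from the .env template."""
    return not key or key.startswith("your_")


print(classify_supabase_url("https://your-project-ref.supabase.co"))
print(classify_supabase_url("postgresql://postgres:password@db.project.supabase.co:5432/postgres"))
print(looks_like_placeholder("your_anon_key_here"))
```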
Neo4j Connection Fails (Optional)
- Check Neo4j is running: docker ps or brew services list neo4j
- Verify NEO4J_PASSWORD is correct
- This is optional - the system works without Neo4j
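Before digging into credentials, it can help to confirm that anything is listening on the Bolt port at all. The following is a minimal standard-library sketch; `bolt_port_open` is a hypothetical helper. It only tests TCP reachability and does not authenticate, so a wrong NEO4J_PASSWORD will not be detected here.

```python
import os
import socket
from urllib.parse import urlparse


def bolt_port_open(uri: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host and port in `uri` succeeds."""
    parsed = urlparse(uri)
    host = parsed.hostname or "localhost"
    port = parsed.port or 7687  # default Bolt port, as in the .env example
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    uri = os.getenv("NEO4J_URI", "bolt://localhost:7687")
    state = "reachable" if bolt_port_open(uri) else "not reachable"
    print(f"{uri} is {state}")
```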
No Embeddings Generated
- Verify OPENAI_API_KEY is set
- Check the OpenAI account has credits
- Embeddings are optional but recommended
Resources
- INDEX.md - Complete navigation guide
- MULTI_DATABASE_ARCHITECTURE.md - Architecture deep dive
- DATABASE_ARCHITECTURE.md - Supabase/Firebase details
Built with: Supabase + Neo4j + Pinecone + OpenAI
Pattern: JSONL → Multi-DB Sync
Status: Production Ready