Consolidation Cleanup Guide

Source: data_layer/docs/CONSOLIDATION_CLEANUP_GUIDE.md

Overview

The schema and seed consolidation is complete. This guide explains what can be safely cleaned up and how to do it.

What Was Consolidated

✅ Completed Consolidations

  1. Type Definitions: kb_catalog/schemas/ → schemas/types/
  2. Archetypes: kb_catalog/schemas/archetypes/ → schemas/archetypes/
  3. API Schemas: kb_catalog/schemas/api/ → schemas/integrations/api/
  4. Forms: kb_catalog/schemas/forms/ → schemas/integrations/forms/
  5. Output Schemas: kb_catalog/schemas/output/ → schemas/integrations/output/
  6. ASD Integration: kb_catalog/schemas/asd/ → schemas/integrations/asd/
  7. Prisma Docs: kb_catalog/schemas/prisma/ → schemas/infrastructure/prisma/
  8. Workflows: kb_catalog/schemas/workflow-*.md → schemas/workflows/
  9. Seed Files: output-styles/schemas/seeds/ → few_shot_examples_training_data/data/*.jsonl
  10. Domain Seeds: output-styles/schemas/domain/v1/seeds/ → few_shot_examples_training_data/data/*.jsonl
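
The directory moves listed above can be spot-checked before anything is deleted. The following is a minimal sketch, not part of the consolidation tooling: `MOVED_DIRS` and `missing_targets` are hypothetical names, and the mapping covers only the one-to-one directory moves (items 2 to 7).

```python
from pathlib import Path

# Old -> new locations, copied from the consolidation list (directory moves only).
# Hypothetical helper data; the project itself may track this differently.
MOVED_DIRS = {
    "kb_catalog/schemas/archetypes": "schemas/archetypes",
    "kb_catalog/schemas/api": "schemas/integrations/api",
    "kb_catalog/schemas/forms": "schemas/integrations/forms",
    "kb_catalog/schemas/output": "schemas/integrations/output",
    "kb_catalog/schemas/asd": "schemas/integrations/asd",
    "kb_catalog/schemas/prisma": "schemas/infrastructure/prisma",
}


def missing_targets(root: Path) -> list[str]:
    """Return new-location directories that are absent or empty under root."""
    missing = []
    for new_rel in MOVED_DIRS.values():
        target = root / new_rel
        if not target.is_dir() or not any(target.iterdir()):
            missing.append(new_rel)
    return missing
```

Calling `missing_targets(Path("."))` from the database directory should return an empty list if every new location is populated; any path it returns is a reason to hold off on cleanup.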

📊 Statistics

  • 327 seed examples converted from JSON to JSONL
  • 8 schema categories reorganized
  • 5 JSONL files created:
    • league_examples.jsonl (38 examples)
    • questionnaires.jsonl (53 examples)
    • schema_definitions.jsonl (75 examples)
    • sample_data.jsonl (5 examples)
    • legacy_seeds.jsonl (156 examples)
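
These figures can double as an acceptance check, since the five per-file counts sum to the 327 total. A minimal sketch follows, assuming one example per non-blank line; `EXPECTED_COUNTS`, `count_examples`, and `count_mismatches` are illustrative names, not existing project code.

```python
from pathlib import Path

# Expected example counts, taken from the statistics in this guide.
EXPECTED_COUNTS = {
    "league_examples.jsonl": 38,
    "questionnaires.jsonl": 53,
    "schema_definitions.jsonl": 75,
    "sample_data.jsonl": 5,
    "legacy_seeds.jsonl": 156,
}


def count_examples(path: Path) -> int:
    """Count non-blank lines, i.e. one example per JSONL record."""
    with path.open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def count_mismatches(data_dir: Path) -> dict[str, tuple[int, int]]:
    """Map filename -> (expected, actual) for every file whose count differs."""
    mismatches = {}
    for name, expected in EXPECTED_COUNTS.items():
        path = data_dir / name
        actual = count_examples(path) if path.exists() else 0
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches
```

Running `count_mismatches(Path("few_shot_examples_training_data/data"))` returns an empty dict when every file matches the numbers listed here.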

Verification Steps

Before running cleanup, verify the consolidation is working:

1. Test Schema Imports

cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
 
# Test Python imports
python -c "
from schemas.types.types import *
from schemas.types.sport_types import *
print('✅ Python imports working')
"
 
# Check files exist
ls -lh schemas/types/
ls -lh schemas/integrations/api/
ls -lh schemas/integrations/forms/
ls -lh schemas/archetypes/

2. Verify JSONL Files

# Check JSONL files were created
ls -lh few_shot_examples_training_data/data/*.jsonl
 
# Count examples
echo "League examples:" && wc -l few_shot_examples_training_data/data/league_examples.jsonl
echo "Questionnaires:" && wc -l few_shot_examples_training_data/data/questionnaires.jsonl
echo "Schema definitions:" && wc -l few_shot_examples_training_data/data/schema_definitions.jsonl
echo "Sample data:" && wc -l few_shot_examples_training_data/data/sample_data.jsonl
echo "Legacy seeds:" && wc -l few_shot_examples_training_data/data/legacy_seeds.jsonl
 
# Validate JSON format
python -c "
import json
from pathlib import Path
 
files = [
    'league_examples.jsonl',
    'questionnaires.jsonl',
    'schema_definitions.jsonl',
    'sample_data.jsonl',
    'legacy_seeds.jsonl'
]
 
for filename in files:
    path = Path('few_shot_examples_training_data/data') / filename
    if not path.exists():
        print(f'❌ {filename} not found')
        continue
    
    with open(path) as f:
        for i, line in enumerate(f, 1):
            try:
                json.loads(line)
            except Exception as e:
                print(f'❌ {filename} line {i}: {e}')
                break
        else:
            print(f'✅ {filename} is valid')
"

3. Test Seed Retrieval (Optional)

# Test the few-shot examples API
python -c "
from few_shot_examples_training_data import ExampleManager
 
manager = ExampleManager()
 
# Test loading different categories
categories = ['triage', 'contract-generation']  # Existing categories
for cat in categories:
    examples = manager.load_examples(cat)
    print(f'βœ… Loaded {len(examples)} examples for {cat}')
"

Running Cleanup

Automated Cleanup (Recommended)

cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
 
# Make script executable
chmod +x scripts/cleanup_consolidated_dirs.sh
 
# Run cleanup (will prompt for confirmation)
./scripts/cleanup_consolidated_dirs.sh

The script will:

  1. Show what will be deleted
  2. Prompt for confirmation
  3. Create a timestamped backup
  4. Remove consolidated directories
  5. Provide restore instructions
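
For readers who want to see the shape of those five steps, here is a rough Python sketch. It is not the contents of cleanup_consolidated_dirs.sh: the function name `cleanup`, the injected `confirm` callback, and the two lists are assumptions, and the backup layout is inferred from the rollback paths used later in this guide. It handles only the directories, not the individual files such as types.py.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Callable, Optional

# Trees to back up, and the directories this guide lists for removal.
# Assumption: the real script may cover more (e.g. the individual files).
BACKUP_SOURCES = ["kb_catalog/schemas", "output-styles/schemas"]
REMOVE_DIRS = [
    "kb_catalog/schemas/archetypes",
    "kb_catalog/schemas/api",
    "kb_catalog/schemas/forms",
    "kb_catalog/schemas/output",
    "kb_catalog/schemas/asd",
    "kb_catalog/schemas/prisma",
    "output-styles/schemas/seeds",
    "output-styles/schemas/domain/v1/seeds",
]


def cleanup(root: Path, confirm: Callable[[str], bool]) -> Optional[Path]:
    """Back up, then remove, the consolidated directories under root.

    Returns the backup directory, or None if nothing was removed.
    """
    targets = [root / rel for rel in REMOVE_DIRS if (root / rel).is_dir()]

    # 1. Show what will be deleted.
    for target in targets:
        print(f"Will delete: {target}")

    # 2. Prompt for confirmation (confirm is injected so it can be tested).
    if not targets or not confirm("Delete these directories? [y/N] "):
        return None

    # 3. Create a timestamped backup of both source trees.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup = root / f"_consolidation_backup_{stamp}"
    for rel in BACKUP_SOURCES:
        if (root / rel).is_dir():
            name = rel.replace("/", "_").replace("-", "_") + "_backup"
            shutil.copytree(root / rel, backup / name)

    # 4. Remove the consolidated directories.
    for target in targets:
        shutil.rmtree(target)

    # 5. Provide restore instructions.
    print(f"Backup written to {backup}; copy its contents back to restore.")
    return backup
```

An interactive call might look like `cleanup(Path("."), lambda msg: input(msg).strip().lower() == "y")`. Taking the backup before any removal means a failed copy aborts the run with nothing deleted.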

Manual Cleanup

If you prefer manual control:

cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
 
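# Create a timestamped backup first (same layout the Rollback Instructions expect)
BACKUP_DIR="_consolidation_backup_$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"
cp -R kb_catalog/schemas "$BACKUP_DIR/kb_catalog_schemas_backup"
cp -R output-styles/schemas "$BACKUP_DIR/output_styles_schemas_backup"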
 
# Remove individual files
rm -f kb_catalog/schemas/types.py
rm -f kb_catalog/schemas/types.js
rm -f kb_catalog/schemas/sport_types.py
rm -f kb_catalog/schemas/user-roles.ts
rm -f kb_catalog/schemas/workflow-*.md
 
# Remove directories
rm -rf kb_catalog/schemas/archetypes/
rm -rf kb_catalog/schemas/api/
rm -rf kb_catalog/schemas/forms/
rm -rf kb_catalog/schemas/output/
rm -rf kb_catalog/schemas/asd/
rm -rf kb_catalog/schemas/prisma/
 
# Remove old seed directories
rm -rf output-styles/schemas/seeds/
rm -rf output-styles/schemas/domain/v1/seeds/

What Remains

kb_catalog/schemas/

This directory now contains ONLY knowledge base metadata:

kb_catalog/schemas/
├── mappings/              # API-to-schema mappings (keep)
├── metadata/              # Schema registry (keep)
├── usage-guides/          # Documentation (keep)
└── README.md             # Knowledge base docs (keep)

These files provide business context ABOUT the schemas and should be kept.
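
One way to confirm the directory ended up in this state is to compare its top-level entries against the four names in the tree. This is a small sketch; `EXPECTED_ENTRIES` and `unexpected_entries` are hypothetical names, and stray files such as .DS_Store will also be reported.

```python
from pathlib import Path

# Top-level entries that should remain in kb_catalog/schemas/ after cleanup.
EXPECTED_ENTRIES = {"mappings", "metadata", "usage-guides", "README.md"}


def unexpected_entries(schemas_dir: Path) -> set[str]:
    """Return names found in schemas_dir that are not in the keep list."""
    return {p.name for p in schemas_dir.iterdir()} - EXPECTED_ENTRIES
```

`unexpected_entries(Path("kb_catalog/schemas"))` returning an empty set suggests the cleanup removed everything it was meant to.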

output-styles/schemas/

After cleanup, this directory should be empty, or contain only domain/ if that directory holds files other than seeds.

Rollback Instructions

If something goes wrong, restore from backup:

cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
 
# Find your backup
ls -lhd _consolidation_backup_*
 
# Restore (replace timestamp with your backup's timestamp)
BACKUP_DIR="_consolidation_backup_20250111_120000"
cp -r "$BACKUP_DIR"/kb_catalog_schemas_backup/* kb_catalog/schemas/
cp -r "$BACKUP_DIR"/output_styles_schemas_backup/* output-styles/schemas/
 
echo "✅ Restored from backup"

Post-Cleanup Tasks

After cleanup is complete:

1. Update Import References

Search for old imports and update them:

# Find Python imports
grep -rE "(from|import) kb_catalog\.schemas" . --include="*.py" | grep -v "_backup"
 
# Find TypeScript imports
grep -r "kb_catalog/schemas" . --include="*.ts" --include="*.tsx" | grep -v "_backup"
 
# Update to new paths
# Old: from kb_catalog.schemas.types import SportType
# New: from schemas.types.types import SportType
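
The Old/New rewrite can be applied mechanically once the affected files are found. The sketch below is one possible approach, not an existing project script: `MODULE_MAP` and `rewrite_imports` are hypothetical names, and the map lists only the two modules this guide imports from schemas.types (types and sport_types). Review the resulting diff before committing.

```python
import re

# Old module path -> new module path. Only the two modules named in this
# guide are listed; extend the map for any others found by grep.
MODULE_MAP = {
    "kb_catalog.schemas.types": "schemas.types.types",
    "kb_catalog.schemas.sport_types": "schemas.types.sport_types",
}

# Longest names first, so a longer module path wins over a shorter prefix.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(m) for m in sorted(MODULE_MAP, key=len, reverse=True))
    + r")\b"
)


def rewrite_imports(source: str) -> str:
    """Replace old module paths in a source string with their new paths."""
    return _PATTERN.sub(lambda m: MODULE_MAP[m.group(1)], source)
```

For example, `rewrite_imports("from kb_catalog.schemas.types import SportType")` yields `from schemas.types.types import SportType`, while a name such as `kb_catalog.schemas.types_extra` is left alone because of the word-boundary check.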

2. Seed Database

Load the new JSONL files into the database:

cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
 
# Seed all new examples
uv run python scripts/seed.examples.py
 
# Verify seeding
python -c "
from prisma import Prisma
import asyncio
 
async def check():
    db = Prisma()
    await db.connect()
    
    count = await db.fewshotexample.count()
    print(f'Total examples in database: {count}')
    
    # Check by scenario
    scenarios = await db.fewshotexample.group_by(
        by=['scenario'],
        count={'_all': True}
    )
    
    print('\\nExamples by scenario:')
    for s in scenarios:
        print(f'  {s[\"scenario\"]}: {s[\"_count\"][\"_all\"]}')
    
    await db.disconnect()
 
asyncio.run(check())
"

3. Update Documentation

Update any project documentation that references the old paths:

  • Update import examples
  • Update path references
  • Update seed file documentation
  • Update architecture diagrams

4. Remove Backup (After Verification Period)

After 1-2 weeks of verified stable operation:

cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
 
# List backups
ls -lhd _consolidation_backup_*
 
# Remove old backup (replace with actual name)
rm -rf _consolidation_backup_20250111_120000
 
echo "✅ Cleanup fully complete"
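
To decide which backups are past the verification period, the timestamp in the directory name can be parsed instead of being read by eye. This is a minimal sketch that assumes the `_consolidation_backup_YYYYMMDD_HHMMSS` naming shown in this guide; `stale_backups` and its 14-day default are illustrative choices. It only lists candidates and deletes nothing.

```python
from datetime import datetime, timedelta
from pathlib import Path

PREFIX = "_consolidation_backup_"


def stale_backups(root: Path, now: datetime, min_age_days: int = 14) -> list[Path]:
    """Return backup directories whose name timestamp is at least min_age_days old."""
    cutoff = now - timedelta(days=min_age_days)
    stale = []
    for path in sorted(root.glob(PREFIX + "*")):
        try:
            created = datetime.strptime(path.name[len(PREFIX):], "%Y%m%d_%H%M%S")
        except ValueError:
            # Name does not follow the timestamp convention; leave it alone.
            continue
        if path.is_dir() and created <= cutoff:
            stale.append(path)
    return stale
```

`stale_backups(Path("."), datetime.now())` gives the directories that are safe to consider for the `rm -rf` step.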

Troubleshooting

Problem: Import errors after cleanup

Solution:

# Check if files exist in new location
ls -lh schemas/types/types.py
ls -lh schemas/integrations/api/
 
# Update Python path if needed
export PYTHONPATH="/Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database:$PYTHONPATH"

Problem: Seed data not loading

Solution:

# Verify JSONL files
python -c "
import json
from pathlib import Path
 
for f in Path('few_shot_examples_training_data/data').glob('*.jsonl'):
    with open(f) as file:
        lines = list(file)
        print(f'{f.name}: {len(lines)} lines')
"
 
# Try rerunning consolidation script
python scripts/consolidate_seeds.py

Problem: Can't find old files

Solution:

# Search for file in backup
find _consolidation_backup_* -name "types.py"
 
# Restore specific file
cp _consolidation_backup_*/kb_catalog_schemas_backup/types.py kb_catalog/schemas/

Summary

✅ Before running cleanup:

  • Verify schema imports work
  • Verify JSONL files are valid
  • Test seed retrieval

✅ Run cleanup:

  • Use automated script OR manual commands
  • Backup is created automatically by the script (create it yourself for manual cleanup)
  • Old directories removed

✅ After cleanup:

  • Update import references
  • Seed database with new JSONL files
  • Update documentation
  • Remove backup after verification period

Documentation References

  • Consolidation Summary: docs/schema.consolidation-complete.md
  • Schema README: schemas/README.md
  • Types README: schemas/types/README.md
  • Few-Shot Examples: few_shot_examples_training_data/README.md

Questions?

If you encounter issues, check:

  1. Backup directory exists: _consolidation_backup_*/
  2. New files exist: schemas/types/, schemas/integrations/
  3. JSONL files valid: few_shot_examples_training_data/data/*.jsonl

For major issues, restore from backup and review the consolidation process.
