Source: data_layer/docs/CONSOLIDATION_CLEANUP_GUIDE.md
Consolidation Cleanup Guide
Overview
The schema and seed consolidation is complete. This guide explains what can be safely cleaned up and how to do it.
What Was Consolidated
✅ Completed Consolidations
- Type Definitions: kb_catalog/schemas/ → schemas/types/
- Archetypes: kb_catalog/schemas/archetypes/ → schemas/archetypes/
- API Schemas: kb_catalog/schemas/api/ → schemas/integrations/api/
- Forms: kb_catalog/schemas/forms/ → schemas/integrations/forms/
- Output Schemas: kb_catalog/schemas/output/ → schemas/integrations/output/
- ASD Integration: kb_catalog/schemas/asd/ → schemas/integrations/asd/
- Prisma Docs: kb_catalog/schemas/prisma/ → schemas/infrastructure/prisma/
- Workflows: kb_catalog/schemas/workflow-*.md → schemas/workflows/
- Seed Files: output-styles/schemas/seeds/ → few_shot_examples_training_data/data/*.jsonl
- Domain Seeds: output-styles/schemas/domain/v1/seeds/ → few_shot_examples_training_data/data/*.jsonl
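The directory moves listed above can be written down as a lookup table, which is handy when translating an old path found in code or docs. This is a minimal sketch, not part of the consolidation tooling: `MOVED_DIRS`, `TYPE_FILES`, and `new_location` are hypothetical names, the list of type files is taken from the manual cleanup commands later in this guide, and it assumes files kept their base names when they moved. Seed directories are left out because their contents were converted to JSONL rather than moved one-to-one.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Optional

# Old directory -> new directory, copied from the consolidation list.
MOVED_DIRS = {
    "kb_catalog/schemas/archetypes": "schemas/archetypes",
    "kb_catalog/schemas/api": "schemas/integrations/api",
    "kb_catalog/schemas/forms": "schemas/integrations/forms",
    "kb_catalog/schemas/output": "schemas/integrations/output",
    "kb_catalog/schemas/asd": "schemas/integrations/asd",
    "kb_catalog/schemas/prisma": "schemas/infrastructure/prisma",
}

# Loose type files that sat directly in kb_catalog/schemas/ (assumed list).
TYPE_FILES = {"types.py", "types.js", "sport_types.py", "user-roles.ts"}


def new_location(old_path: str) -> Optional[str]:
    """Return the post-consolidation path, or None if the path did not move."""
    path = PurePosixPath(old_path)
    for old_dir, new_dir in MOVED_DIRS.items():
        try:
            rest = path.relative_to(old_dir)
        except ValueError:
            continue  # not under this old directory
        return str(PurePosixPath(new_dir) / rest)
    if path.parent == PurePosixPath("kb_catalog/schemas"):
        if path.name in TYPE_FILES:
            return f"schemas/types/{path.name}"
        if fnmatch(path.name, "workflow-*.md"):
            return f"schemas/workflows/{path.name}"
    return None
```

Paths under the directories that stay in place (for example `kb_catalog/schemas/mappings/`) return `None`.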
Statistics
- 327 seed examples converted from JSON to JSONL
- 8 schema categories reorganized
- 5 JSONL files created:
  - league_examples.jsonl (38 examples)
  - questionnaires.jsonl (53 examples)
  - schema_definitions.jsonl (75 examples)
  - sample_data.jsonl (5 examples)
  - legacy_seeds.jsonl (156 examples)
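The per-file counts above add up to the 327 converted examples, so they can double as a quick regression check. The sketch below compares the number of non-blank lines in each JSONL file with those counts; `EXPECTED_COUNTS` and `count_mismatches` are hypothetical names introduced for illustration, and it assumes one example per line.

```python
from pathlib import Path

# Expected example counts, copied from the statistics above (total: 327).
EXPECTED_COUNTS = {
    "league_examples.jsonl": 38,
    "questionnaires.jsonl": 53,
    "schema_definitions.jsonl": 75,
    "sample_data.jsonl": 5,
    "legacy_seeds.jsonl": 156,
}


def count_mismatches(data_dir, expected=EXPECTED_COUNTS):
    """Return {filename: (expected, actual)} for files whose count differs.

    `actual` is None when the file does not exist.
    """
    mismatches = {}
    for name, want in expected.items():
        path = Path(data_dir) / name
        if not path.exists():
            mismatches[name] = (want, None)
            continue
        with open(path, encoding="utf-8") as f:
            got = sum(1 for line in f if line.strip())  # ignore blank lines
        if got != want:
            mismatches[name] = (want, got)
    return mismatches
```

Calling `count_mismatches("few_shot_examples_training_data/data")` from the database directory should return an empty dict if every file has the expected number of examples.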
Verification Steps
Before running cleanup, verify the consolidation is working:
1. Test Schema Imports
cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
# Test Python imports
python -c "
from schemas.types.types import *
from schemas.types.sport_types import *
print('✅ Python imports working')
"
# Check files exist
ls -lh schemas/types/
ls -lh schemas/integrations/api/
ls -lh schemas/integrations/forms/
ls -lh schemas/archetypes/
2. Verify JSONL Files
# Check JSONL files were created
ls -lh few_shot_examples_training_data/data/*.jsonl
# Count examples
echo "League examples:" && wc -l few_shot_examples_training_data/data/league_examples.jsonl
echo "Questionnaires:" && wc -l few_shot_examples_training_data/data/questionnaires.jsonl
echo "Schema definitions:" && wc -l few_shot_examples_training_data/data/schema_definitions.jsonl
echo "Sample data:" && wc -l few_shot_examples_training_data/data/sample_data.jsonl
echo "Legacy seeds:" && wc -l few_shot_examples_training_data/data/legacy_seeds.jsonl
# Validate JSON format
python -c "
import json
from pathlib import Path
files = [
    'league_examples.jsonl',
    'questionnaires.jsonl',
    'schema_definitions.jsonl',
    'sample_data.jsonl',
    'legacy_seeds.jsonl'
]
for filename in files:
    path = Path('few_shot_examples_training_data/data') / filename
    if not path.exists():
        print(f'❌ {filename} not found')
        continue
    with open(path) as f:
        for i, line in enumerate(f, 1):
            try:
                json.loads(line)
            except Exception as e:
                print(f'❌ {filename} line {i}: {e}')
                break
        else:
            # for/else: runs only if no line failed to parse
            print(f'✅ {filename} is valid')
"
3. Test Seed Retrieval (Optional)
# Test the few-shot examples API
python -c "
from few_shot_examples_training_data import ExampleManager
manager = ExampleManager()
# Test loading different categories
categories = ['triage', 'contract-generation'] # Existing categories
for cat in categories:
    examples = manager.load_examples(cat)
    print(f'✅ Loaded {len(examples)} examples for {cat}')
"
Running Cleanup
Automated Cleanup (Recommended)
cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
# Make script executable
chmod +x scripts/cleanup_consolidated_dirs.sh
# Run cleanup (will prompt for confirmation)
./scripts/cleanup_consolidated_dirs.sh
The script will:
- Show what will be deleted
- Prompt for confirmation
- Create a timestamped backup
- Remove consolidated directories
- Provide restore instructions
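The five steps above can be sketched in Python for readers who want to see the flow or adapt it. This is an illustrative sketch only, not the contents of `cleanup_consolidated_dirs.sh`: the `cleanup` function and its parameters are hypothetical, and the backup here mirrors each target's relative path, which may differ from the layout the real script produces. It also assumes no target is nested inside another target.

```python
import shutil
from datetime import datetime
from pathlib import Path


def cleanup(root, targets, confirm=input, now=None):
    """Back up and remove `targets` (paths relative to `root`).

    Returns the backup directory, or None if nothing was removed.
    """
    root = Path(root)
    existing = [t for t in targets if (root / t).exists()]
    if not existing:
        print("Nothing to clean up.")
        return None

    # 1. Show what will be deleted.
    print("The following paths will be removed:")
    for t in existing:
        print(f"  {t}")

    # 2. Prompt for confirmation; anything other than "y" aborts.
    if confirm("Proceed? [y/N] ").strip().lower() != "y":
        print("Aborted; nothing was changed.")
        return None

    # 3. Create a timestamped backup that mirrors the relative paths.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    backup = root / f"_consolidation_backup_{stamp}"
    for t in existing:
        src, dst = root / t, backup / t
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)

    # 4. Remove the originals only after every copy has succeeded.
    for t in existing:
        src = root / t
        if src.is_dir():
            shutil.rmtree(src)
        else:
            src.unlink()

    # 5. Provide restore instructions.
    print(f"Backup written to {backup}")
    print(f"To restore: cp -r {backup}/. {root}/")
    return backup
```

Copying everything before deleting anything means a failed copy leaves the original tree untouched.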
Manual Cleanup
If you prefer manual control:
cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
# Create backup first
mkdir -p _consolidation_backup
cp -r kb_catalog/schemas/ _consolidation_backup/kb_catalog_schemas/
cp -r output-styles/schemas/ _consolidation_backup/output_styles_schemas/
# Remove individual files
rm -f kb_catalog/schemas/types.py
rm -f kb_catalog/schemas/types.js
rm -f kb_catalog/schemas/sport_types.py
rm -f kb_catalog/schemas/user-roles.ts
rm -f kb_catalog/schemas/workflow-*.md
# Remove directories
rm -rf kb_catalog/schemas/archetypes/
rm -rf kb_catalog/schemas/api/
rm -rf kb_catalog/schemas/forms/
rm -rf kb_catalog/schemas/output/
rm -rf kb_catalog/schemas/asd/
rm -rf kb_catalog/schemas/prisma/
# Remove old seed directories
rm -rf output-styles/schemas/seeds/
rm -rf output-styles/schemas/domain/v1/seeds/
What Remains
kb_catalog/schemas/
This directory now contains ONLY knowledge base metadata:
kb_catalog/schemas/
├── mappings/       # API-to-schema mappings (keep)
├── metadata/       # Schema registry (keep)
├── usage-guides/   # Documentation (keep)
└── README.md       # Knowledge base docs (keep)
These files provide business context about the schemas and should be kept.
output-styles/schemas/
After cleanup, this directory should be empty or only contain domain/ if there are other domain files that weren't seeds.
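To confirm that kb_catalog/schemas/ holds only the entries marked "keep" above, a short check can list anything else that is still there. This is a sketch under the assumption that the four names in the tree are the complete keep list; `KEEP` and `unexpected_entries` are hypothetical names.

```python
from pathlib import Path

# Entries that should remain in kb_catalog/schemas/ after cleanup.
KEEP = {"mappings", "metadata", "usage-guides", "README.md"}


def unexpected_entries(schemas_dir, keep=KEEP):
    """Return the sorted names directly under `schemas_dir` that are not in `keep`."""
    return sorted(p.name for p in Path(schemas_dir).iterdir() if p.name not in keep)
```

An empty list from `unexpected_entries("kb_catalog/schemas")` means the cleanup removed everything it was supposed to. Hidden files such as `.DS_Store` will also be reported.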
Rollback Instructions
If something goes wrong, restore from backup:
cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
# Find your backup
ls -lhd _consolidation_backup_*
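# Restore (replace the timestamp with your backup's timestamp)
BACKUP_DIR="_consolidation_backup_20250111_120000"
cp -r "$BACKUP_DIR"/kb_catalog_schemas_backup/* kb_catalog/schemas/
cp -r "$BACKUP_DIR"/output_styles_schemas_backup/* output-styles/schemas/
# If you backed up with the Manual Cleanup commands instead, the backup
# directory has no timestamp and its subdirectories are named differently:
#   cp -r _consolidation_backup/kb_catalog_schemas/* kb_catalog/schemas/
#   cp -r _consolidation_backup/output_styles_schemas/* output-styles/schemas/
echo "✅ Restored from backup"
Post-Cleanup Tasks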
After cleanup is complete:
1. Update Import References
Search for old imports and update them:
# Find Python imports
grep -r "from kb_catalog.schemas" . --include="*.py" | grep -v "_backup"
# Find TypeScript imports
grep -r "kb_catalog/schemas" . --include="*.ts" --include="*.tsx" | grep -v "_backup"
# Update to new paths
# Old: from kb_catalog.schemas.types import SportType
# New: from schemas.types.types import SportType
2. Seed Database
Load the new JSONL files into the database:
cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
# Seed all new examples
uv run python scripts/seed.examples.py
# Verify seeding
python -c "
from prisma import Prisma
import asyncio
async def check():
    db = Prisma()
    await db.connect()
    count = await db.fewshotexample.count()
    print(f'Total examples in database: {count}')
    # Check by scenario
    scenarios = await db.fewshotexample.group_by(
        by=['scenario'],
        count={'_all': True}
    )
    print('\\nExamples by scenario:')
    for s in scenarios:
        print(f' {s[\"scenario\"]}: {s[\"_count\"][\"_all\"]}')
    await db.disconnect()
asyncio.run(check())
"
3. Update Documentation
Update any project documentation that references the old paths:
- Update import examples
- Update path references
- Update seed file documentation
- Update architecture diagrams
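Finding the documentation that still mentions the old locations can be partly automated. The sketch below walks a directory tree and reports every line in a Markdown file that references `kb_catalog/schemas`, `kb_catalog.schemas`, or `output-styles/schemas`, skipping backup directories. `stale_references` and `OLD_PATH` are hypothetical names, and the `.md` default is an assumption about where the documentation lives.

```python
import re
from pathlib import Path

# Matches both path-style and import-style references to the old locations.
OLD_PATH = re.compile(r"kb_catalog[./]schemas|output-styles/schemas")


def stale_references(root, suffixes=(".md",)):
    """Return (relative_path, line_number, line) for each old-path mention."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root)
        # Skip anything inside a consolidation backup directory.
        if any(part.startswith("_consolidation_backup") for part in rel.parts):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            if OLD_PATH.search(line):
                hits.append((rel.as_posix(), lineno, line.strip()))
    return hits
```

Hits need a manual look before editing, because references to the directories that remain (kb_catalog/schemas/mappings/, metadata/, usage-guides/) are still valid.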
4. Remove Backup (After Verification Period)
After 1-2 weeks of verified stable operation:
cd /Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database
# List backups
ls -lhd _consolidation_backup_*
# Remove old backup (replace with actual name)
rm -rf _consolidation_backup_20250111_120000
echo "✅ Cleanup fully complete"
Troubleshooting
Problem: Import errors after cleanup
Solution:
# Check if files exist in new location
ls -lh schemas/types/types.py
ls -lh schemas/integrations/api/
# Update Python path if needed
export PYTHONPATH="/Users/kbselander/Developer/Notebook/mcp-servers/servers/mcp-server-altsportsleagues.ai/2.1-cloud-run-docker-mcp/database:$PYTHONPATH"
Problem: Seed data not loading
Solution:
# Verify JSONL files
python -c "
import json
from pathlib import Path
for f in Path('few_shot_examples_training_data/data').glob('*.jsonl'):
    with open(f) as file:
        lines = list(file)
    print(f'{f.name}: {len(lines)} lines')
"
# Try rerunning consolidation script
python scripts/consolidate_seeds.py
Problem: Can't find old files
Solution:
# Search for file in backup
find _consolidation_backup_* -name "types.py"
# Restore specific file
cp _consolidation_backup_*/kb_catalog_schemas_backup/types.py kb_catalog/schemas/
Summary
✅ Before running cleanup:
- Verify schema imports work
- Verify JSONL files are valid
- Test seed retrieval
✅ Run cleanup:
- Use automated script OR manual commands
- The automated script creates a backup automatically; with manual commands, create the backup first
- Old directories removed
✅ After cleanup:
- Update import references
- Seed database with new JSONL files
- Update documentation
- Remove backup after verification period
Documentation References
- Consolidation Summary: docs/schema.consolidation-complete.md
- Schema README: schemas/README.md
- Types README: schemas/types/README.md
- Few-Shot Examples: few_shot_examples_training_data/README.md
Questions?
If you encounter issues, check:
- Backup directory exists: _consolidation_backup_*/
- New files exist: schemas/types/, schemas/integrations/
- JSONL files valid: few_shot_examples_training_data/data/*.jsonl
For major issues, restore from backup and review the consolidation process.