
Source: data_layer/docs/MIGRATION_GUIDE_PRACTICAL.md

# πŸš€ Practical Migration Guide

**Goal:** Reorganize `data_fabric/` from a mixed organization to a hybrid lifecycle structure

**Timeline:** 3 weeks (non-breaking, incremental)


## πŸ“‹ Pre-Migration Checklist

### 1. Backup Current State

```bash
# Create timestamped backup
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
tar -czf "data_fabric_backup_${TIMESTAMP}.tar.gz" data_fabric/
echo "Backup created: data_fabric_backup_${TIMESTAMP}.tar.gz"
```

### 2. Document Current Import Paths

```bash
# Find all Python imports referencing data_fabric
grep -r "from data_fabric" . --include="*.py" > migration_imports_before.txt
grep -r "import data_fabric" . --include="*.py" >> migration_imports_before.txt

# Find all file references in configs
grep -r "data_fabric/" . --include="*.json" --include="*.yaml" > migration_paths_before.txt
```

### 3. Create .gitignore for views/

```bash
# Add to data_fabric/.gitignore
cat >> data_fabric/.gitignore << 'EOF'
# Generated outputs (views/)
views/*
!views/README.md
!views/**/README.md

# Runtime caches
weave/storage/examples/data/
weave/knowledge/storage/cache/
EOF
```
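
A quick `git check-ignore` confirms the new rules behave as intended (the sample file below is hypothetical):

```bash
# The matching ignore rule is printed; a non-zero exit means the path is not ignored
mkdir -p data_fabric/views
touch data_fabric/views/sample_output.json
git check-ignore -v data_fabric/views/sample_output.json
rm data_fabric/views/sample_output.json
```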

πŸ—οΈ Week 1: Build New Structure (Non-Breaking)

Phase 1A: Create Directory Structure

# Execute this script to create all directories at once
cat > scripts/create_new_structure.sh << 'EOF'
#!/bin/bash
set -e
 
echo "Creating new directory structure..."
 
# Level 1: definitions/
mkdir -p data_fabric/definitions/{schemas,config,templates,examples,catalog}
 
# schemas/ (already exists, just ensure structure)
mkdir -p data_fabric/definitions/schemas/{domain,generated,seeds}
mkdir -p data_fabric/definitions/schemas/generated/{drizzle,pydantic,typescript}
 
# config/
mkdir -p data_fabric/definitions/config/{business,sports,pipeline}
mkdir -p data_fabric/definitions/config/business/{pricing,scoring,contracts}
 
# templates/
mkdir -p data_fabric/definitions/templates/{prompts,contracts}
mkdir -p data_fabric/definitions/templates/prompts/{onboarding,contracts,classification,components}
 
# examples/
mkdir -p data_fabric/definitions/examples/{onboarding,sports_classification}
mkdir -p data_fabric/definitions/examples/onboarding/{questionnaire_extraction,tier_classification,contract_assembly}
 
# catalog/ (if kb_catalog should be here)
mkdir -p data_fabric/definitions/catalog/{constants,registry,manifests}
 
# Level 2: weave/
mkdir -p data_fabric/weave/{knowledge,storage,prompts,generators,validators}
 
# knowledge/ (already exists, ensure subdirs)
mkdir -p data_fabric/weave/knowledge/{embeddings,intent,retrieval,storage,templates}
 
# storage/ (already exists, ensure subdirs)
mkdir -p data_fabric/weave/storage/{examples,postgres,redis,supabase}
 
# prompts/
mkdir -p data_fabric/weave/prompts/{builders,registry}
 
# generators/ (new)
mkdir -p data_fabric/weave/generators
 
# validators/ (new)
mkdir -p data_fabric/weave/validators
 
# Level 3: views/
mkdir -p data_fabric/views/{onboarding,analytics,contracts,uploads}
mkdir -p data_fabric/views/onboarding/{02-ingest-validate-questionnaire,03-enhance-documents,04-classify-and-score,05-upsert-and-crossref,06-suggest-tiers-and-terms,07-assemble-contract,07a-output-contract-export,07b-output-gamekeeper-scorekeeper-ui,07c-output-marketing-nxt-onboarding-materials}
 
echo "βœ“ Directory structure created successfully"
EOF
 
chmod +x scripts/create_new_structure.sh
./scripts/create_new_structure.sh
```
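
One thing the structure script does not create: if the project imports these directories as regular Python packages (rather than implicit namespace packages), each new directory under `weave/` needs an `__init__.py` before the Week 2 imports will resolve. A minimal sketch, assuming regular packages:

```bash
# Add empty __init__.py files throughout the new weave/ tree
# (skip this if the project relies on PEP 420 namespace packages)
find data_fabric/weave -type d -exec sh -c 'touch "$0/__init__.py"' {} \;
```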

### Phase 1B: Create README Files

````bash
# Create READMEs to explain each directory
cat > scripts/create_readmes.sh << 'EOF'
#!/bin/bash
 
# definitions/ README
cat > data_fabric/definitions/README.md << 'INNER_EOF'
# definitions/ - Source of Truth
 
This directory contains all **canonical, version-controlled definitions**.
 
## Structure
 
- `schemas/` - Data structure definitions (JSON Schema, SQL DDL)
- `config/` - Business rules and configuration files
- `templates/` - Jinja2/Mustache templates for prompts and documents
- `examples/` - Training data and reference examples (JSONL)
- `catalog/` - System metadata and inventories
 
## Principles
 
- βœ… All files are **git-tracked**
- βœ… Files are **immutable** (don't change at runtime)
- βœ… These are **sources**, not generated outputs
- βœ… Changes require code review and versioning
 
## What Goes Here?
 
- Hand-written schemas
- Business configuration (pricing, scoring rules)
- Prompt templates
- Training examples for ML/LLM
- System constants and enums
INNER_EOF
 
# weave/ README
cat > data_fabric/weave/README.md << 'INNER_EOF'
# weave/ - Operational Runtime
 
This directory contains all **operational Python code** that runs the system.
 
## Structure
 
- `knowledge/` - AI/ML operations (embeddings, RAG, intent classification)
- `storage/` - Database operations (PostgreSQL, Redis, Supabase)
- `prompts/` - Dynamic prompt assembly and building
- `generators/` - Data transformation pipelines
- `validators/` - Data validation logic
 
## Principles
 
- βœ… All files are **Python modules** (.py)
- βœ… Code is **imported and executed**
- βœ… These are **operational services**, not data
- βœ… Well-tested with unit tests
 
## What Goes Here?
 
- Python modules that perform operations
- Services that interact with databases/APIs
- Code that transforms data
- Logic that validates data
INNER_EOF
 
# views/ README
cat > data_fabric/views/README.md << 'INNER_EOF'
# views/ - Generated Outputs
 
This directory contains all **generated, materialized outputs**.
 
## Structure
 
- `onboarding/` - Onboarding pipeline stage outputs
- `analytics/` - Analytics and reporting outputs
- `contracts/` - Generated contracts and documents
- `uploads/` - User-uploaded files
 
## Principles
 
- ⚠️ **ALL files are .gitignored**
- ⚠️ Files are **ephemeral** (can be deleted/regenerated)
- ⚠️ These are **outputs**, not sources
- ⚠️ No code reviews needed (auto-generated)
 
## What Goes Here?
 
- Pipeline stage artifacts
- Generated contracts/documents
- User uploads
- Cached/materialized data
- Temporary processing files
 
## Cleanup
 
```bash
# Safe to delete everything (will be regenerated)
rm -rf data_fabric/views/*
```
INNER_EOF

echo "βœ“ README files created" EOF

chmod +x scripts/create_readmes.sh ./scripts/create_readmes.sh


---

## πŸ“¦ Week 1: Copy Files (Non-Destructive)

### Phase 2A: Copy Config Files

```bash
# Copy (don't move) config files
cat > scripts/copy_configs.sh << 'EOF'
#!/bin/bash
set -e

echo "Copying config files..."

# Business config
if [ -d "data_fabric/output-styles/config/business/pricing" ]; then
  cp -r data_fabric/output-styles/config/business/pricing/* \
        data_fabric/definitions/config/business/pricing/
  echo "βœ“ Copied pricing configs"
fi

if [ -d "data_fabric/output-styles/config/business/scoring" ]; then
  cp -r data_fabric/output-styles/config/business/scoring/* \
        data_fabric/definitions/config/business/scoring/
  echo "βœ“ Copied scoring configs"
fi

# Verify copies
echo ""
echo "Verification:"
ls -la data_fabric/definitions/config/business/pricing/
ls -la data_fabric/definitions/config/business/scoring/

echo ""
echo "βœ“ Config files copied (originals preserved)"
EOF

chmod +x scripts/copy_configs.sh
./scripts/copy_configs.sh
```
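
Since this phase copies rather than moves, `diff -r` gives a stronger check than eyeballing `ls` output:

```bash
# Byte-level comparison of originals vs. copies (no output means identical)
diff -r data_fabric/output-styles/config/business/pricing \
        data_fabric/definitions/config/business/pricing
diff -r data_fabric/output-styles/config/business/scoring \
        data_fabric/definitions/config/business/scoring
```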

### Phase 2B: Copy Prompt Components

```bash
# Copy prompt components to templates
cat > scripts/copy_prompts.sh << 'EOF'
#!/bin/bash
set -e
 
echo "Copying prompt components..."
 
if [ -d "data_fabric/prompts/components" ]; then
  cp -r data_fabric/prompts/components/* \
        data_fabric/definitions/templates/prompts/components/
  echo "βœ“ Copied prompt components"
fi
 
# Verify
echo ""
echo "Verification:"
ls -la data_fabric/definitions/templates/prompts/components/
 
echo ""
echo "βœ“ Prompt components copied (originals preserved)"
EOF
 
chmod +x scripts/copy_prompts.sh
./scripts/copy_prompts.sh
```

### Phase 2C: Move Code Files (Builders)

```bash
# Move (not copy) code files to weave/
cat > scripts/move_builders.sh << 'EOF'
#!/bin/bash
set -e
 
echo "Moving prompt builders to weave..."
 
if [ -d "data_fabric/prompts/builders" ]; then
  # Builders are code, should be in weave/
  mv data_fabric/prompts/builders/* \
     data_fabric/weave/prompts/builders/
  echo "βœ“ Moved prompt builders"
fi
 
# Verify
echo ""
echo "Verification:"
ls -la data_fabric/weave/prompts/builders/
 
echo ""
echo "βœ“ Builders moved to weave/"
EOF
 
chmod +x scripts/move_builders.sh
./scripts/move_builders.sh
```
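
Because the builders are moved rather than copied, staging the old and new locations together lets git record the operations as renames, preserving file history:

```bash
# Stage deletions and additions together so git detects the renames
git add -A data_fabric/prompts/builders data_fabric/weave/prompts/builders
git status --short   # moved files should show up as renamed (R)
```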

## πŸ”„ Week 2: Update Import Paths

### Phase 3A: Create Import Mapping

```python
# scripts/update_imports.py
"""
Automated import path updater for migration
"""
import re
from pathlib import Path
from typing import Dict
 
# Define import mappings
IMPORT_MAPPINGS: Dict[str, str] = {
    # Config imports
    r'from data_fabric\.output_styles\.config\.business\.pricing': 
        'from data_fabric.definitions.config.business.pricing',
    
    r'from data_fabric\.output_styles\.config\.business\.scoring': 
        'from data_fabric.definitions.config.business.scoring',
    
    # Prompt imports
    r'from data_fabric\.prompts\.components': 
        'from data_fabric.definitions.templates.prompts.components',
    
    r'from data_fabric\.prompts\.builders': 
        'from data_fabric.weave.prompts.builders',
    
    # Knowledge imports (if needed)
    r'from data_fabric\.knowledge': 
        'from data_fabric.weave.knowledge',
    
    # Storage imports (if needed)
    r'from data_fabric\.storage': 
        'from data_fabric.weave.storage',
}
 
def update_imports_in_file(file_path: Path) -> bool:
    """Update imports in a single Python file"""
    try:
        content = file_path.read_text()
        original_content = content
        
        # Apply all mappings
        for old_pattern, new_import in IMPORT_MAPPINGS.items():
            content = re.sub(old_pattern, new_import, content)
        
        # Only write if changes were made
        if content != original_content:
            file_path.write_text(content)
            print(f"βœ“ Updated: {file_path}")
            return True
        
        return False
    except Exception as e:
        print(f"βœ— Error updating {file_path}: {e}")
        return False
 
def main():
    """Update all Python files"""
    root = Path(".")
    
    # Find all Python files
    py_files = list(root.rglob("*.py"))
    
    # Exclude certain directories
    excluded = {"node_modules", ".git", "__pycache__", "venv", ".venv"}
    py_files = [
        f for f in py_files 
        if not any(ex in f.parts for ex in excluded)
    ]
    
    print(f"Found {len(py_files)} Python files")
    print("Updating imports...")
    print()
    
    updated_count = 0
    for py_file in py_files:
        if update_imports_in_file(py_file):
            updated_count += 1
    
    print()
    print(f"βœ“ Updated {updated_count} files")
    print(f"βœ“ Skipped {len(py_files) - updated_count} files (no changes needed)")
 
if __name__ == "__main__":
    main()
```

### Phase 3B: Run Import Updates

```bash
# Run the import updater
python scripts/update_imports.py
 
# Review changes
git diff --stat
 
# If satisfied, commit
git add .
git commit -m "refactor: Update imports for new data_fabric structure"

βœ… Week 2: Test & Validate

Phase 4A: Test Imports

# scripts/test_imports.py
"""
Validate that all imports still work
"""
import sys
import importlib
from pathlib import Path
 
def test_config_imports():
    """Test config imports"""
    try:
        from data_fabric.definitions.config.business.pricing import tier_presets
        print("βœ“ Config imports work")
        return True
    except ImportError as e:
        print(f"βœ— Config import failed: {e}")
        return False
 
def test_prompt_imports():
    """Test prompt imports"""
    try:
        from data_fabric.weave.prompts.builders import onboarding_prompts
        print("βœ“ Prompt builder imports work")
        return True
    except ImportError as e:
        print(f"βœ— Prompt import failed: {e}")
        return False
 
def test_knowledge_imports():
    """Test knowledge imports"""
    try:
        from data_fabric.weave.knowledge.retrieval import rag_service
        print("βœ“ Knowledge imports work")
        return True
    except ImportError as e:
        print(f"βœ— Knowledge import failed: {e}")
        return False
 
def main():
    print("Testing imports after migration...")
    print()
    
    results = [
        test_config_imports(),
        test_prompt_imports(),
        test_knowledge_imports(),
    ]
    
    print()
    if all(results):
        print("βœ“ All imports working!")
        return 0
    else:
        print("βœ— Some imports failed")
        return 1
 
if __name__ == "__main__":
    sys.exit(main())
```

```bash
# Run import tests
python scripts/test_imports.py
```

### Phase 4B: Run Existing Tests

```bash
# Run all unit tests
python -m pytest tests/ -v
 
# Run specific integration tests
python -m pytest tests/integration/ -v
 
# Check for any import errors
python -m pytest --co  # Collect tests (will fail if imports are broken)
```

πŸ—‘οΈ Week 3: Clean Up Old Structure

Phase 5A: Create Cleanup Script (DRY RUN FIRST!)

# scripts/cleanup_old_structure.sh
#!/bin/bash
set -e
 
DRY_RUN=${1:-"--dry-run"}
 
echo "Cleanup script starting..."
echo "Mode: ${DRY_RUN}"
echo ""
 
if [ "$DRY_RUN" == "--dry-run" ]; then
  echo "πŸ” DRY RUN MODE (no files will be deleted)"
  echo ""
fi
 
cleanup_file() {
  local file=$1
  local reason=$2
  
  if [ "$DRY_RUN" == "--dry-run" ]; then
    echo "Would delete: ${file} (${reason})"
  else
    if [ -e "$file" ]; then
      git rm -r "$file"
      echo "βœ“ Deleted: ${file}"
    fi
  fi
}
 
# Remove config from old location (now in definitions/config/)
cleanup_file "data_fabric/output-styles/config/" "moved to definitions/config/"
 
# Remove prompt components (now in definitions/templates/)
cleanup_file "data_fabric/prompts/components/" "moved to definitions/templates/prompts/"
 
# Note: Keep prompts/builders/ empty since files moved to weave/
 
if [ "$DRY_RUN" == "--dry-run" ]; then
  echo ""
  echo "βœ“ Dry run complete. Review changes above."
  echo ""
  echo "To actually delete files, run:"
  echo "  ./scripts/cleanup_old_structure.sh --execute"
else
  echo ""
  echo "βœ“ Cleanup complete"
  echo ""
  echo "Don't forget to commit:"
  echo "  git commit -m 'refactor: Remove old data_fabric structure after migration'"
fi
```

### Phase 5B: Run Cleanup (Carefully!)

```bash
# First, DRY RUN to see what would be deleted
chmod +x scripts/cleanup_old_structure.sh
./scripts/cleanup_old_structure.sh --dry-run
 
# Review the output carefully!
 
# If everything looks good, execute
./scripts/cleanup_old_structure.sh --execute
 
# Commit the cleanup
git add .
git commit -m "refactor: Remove old data_fabric structure after migration"

## 🎯 Post-Migration Checklist

### 1. Verify Structure

```bash
# Check new structure exists
ls -la data_fabric/definitions/
ls -la data_fabric/weave/
ls -la data_fabric/views/
 
# Check files are in correct locations
ls -la data_fabric/definitions/config/business/pricing/
ls -la data_fabric/weave/prompts/builders/
```

### 2. Run Full Test Suite

```bash
# All tests should still pass
python -m pytest tests/ -v --tb=short
 
# Integration tests
python -m pytest tests/integration/ -v
 
# E2E tests
python -m pytest tests/e2e/ -v
```

### 3. Update Documentation

- Update main README
- Update architecture docs
- Update developer onboarding docs

### 4. Team Communication

```markdown
## Migration Complete! πŸŽ‰
 
The `data_fabric/` directory has been reorganized for better clarity:
 
**New Structure:**
- `definitions/` - Source of truth (git-tracked)
- `weave/` - Operational code (Python modules)
- `views/` - Generated outputs (gitignored)
 
**What Changed:**
- Config files moved: `output-styles/config/` β†’ `definitions/config/`
- Prompt templates moved: `prompts/components/` β†’ `definitions/templates/prompts/`
- Builders moved: `prompts/builders/` β†’ `weave/prompts/builders/`
 
**Action Items:**
- Pull latest changes: `git pull`
- No code changes needed (imports auto-updated)
- Read new READMEs in each directory
 
**Questions?** See `data_fabric/ORGANIZATION_STRATEGY_COMPLETE.md`
```

## 🚨 Rollback Plan (If Needed)

If something goes wrong:

```bash
# 1. Restore from backup
tar -xzf data_fabric_backup_YYYYMMDD_HHMMSS.tar.gz
 
# 2. Reset git changes
git reset --hard HEAD~1  # Go back one commit
# or
git reset --hard <commit-hash>  # Go back to specific commit
 
# 3. Verify restoration
python -m pytest tests/ -v
 
# 4. Document what went wrong
# ... and plan better next time
```

## πŸ“Š Success Metrics

Migration is complete when:

  • βœ… All files moved to correct locations
  • βœ… No broken imports
  • βœ… All tests passing
  • βœ… Old structure removed
  • βœ… Documentation updated
  • βœ… Team informed
  • βœ… CI/CD pipeline green

**Bottom Line:** Follow this guide step by step, **test everything**, and you'll have a clean, well-organized `data_fabric/` structure in 3 weeks.
