Source: data_layer/docs/EXECUTION_PLAN.md
# Few-Shot Examples Restructuring - Execution Plan

## 🎯 Objective

Move few-shot examples infrastructure to apps/backend/services/few_shot_examples/ and consolidate data under database/output-styles/examples/.

## ⚠️ Pre-Migration Checklist

### 1. Create Backup

```bash
# Backup current structure
timestamp=$(date +%Y%m%d_%H%M%S)
mkdir -p backups/pre-migration-${timestamp}

# Backup data (no trailing slash on the source, so the directory itself
# is copied under both GNU and BSD cp, which the rollback script expects)
cp -r database/few_shot_examples_training_data \
  backups/pre-migration-${timestamp}/

# Backup any scripts that use it
cp scripts/seed.examples.py \
  backups/pre-migration-${timestamp}/ 2>/dev/null || true

echo "✅ Backup created: backups/pre-migration-${timestamp}/"
```

### 2. Identify Dependencies

```bash
# Find all files importing from old location
echo "Finding dependencies..."
grep -r "from database.few_shot_examples_training_data" \
  apps/ \
  scripts/ \
  2>/dev/null | tee migration-dependencies.txt

echo "✅ Dependencies saved to: migration-dependencies.txt"
```

### 3. Run Existing Tests

```bash
# Run any existing tests to establish baseline
pytest database/few_shot_examples_training_data/ -v 2>/dev/null || \
  echo "⚠️ No existing tests found"

# Note: We'll verify these still pass after migration
```

## Step-by-Step Execution

### STEP 1: Create New Structure (No Breaking Changes)
Duration: 5 minutes
Risk: Low (only creating new directories)

```bash
# Create service directory structure
mkdir -p apps/backend/services/few_shot_examples/tests
touch apps/backend/services/few_shot_examples/__init__.py
touch apps/backend/services/few_shot_examples/tests/__init__.py

# Create data directory structure
mkdir -p database/output-styles/examples/seeds
mkdir -p database/output-styles/examples/embeddings
touch database/output-styles/examples/seeds/README.md
touch database/output-styles/examples/embeddings/README.md

echo "✅ STEP 1 COMPLETE: New structure created"
```

Verification:

```bash
# Verify directories exist
test -d apps/backend/services/few_shot_examples && echo "✅ Service dir exists"
test -d database/output-styles/examples/seeds && echo "✅ Seeds dir exists"
test -d database/output-styles/examples/embeddings && echo "✅ Embeddings dir exists"
```

### STEP 2: Create Configuration File
Duration: 5 minutes
Risk: Low (new file, doesn't break existing)
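
One detail of the config in this step is easier to see in isolation: `SEEDS_DIR` and its sibling directories are computed from the default `EXAMPLES_ROOT` when the class body is evaluated, so setting `FEW_SHOT_EXAMPLES_ROOT` alone would not move them. The stdlib-only sketch below illustrates prefix-based overrides with the derived directories recomputed from the root. It does not use `pydantic_settings`, and `ExamplePaths` and `load_paths` are hypothetical names introduced only for illustration.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping

ENV_PREFIX = "FEW_SHOT_"  # same prefix the config in this step uses


@dataclass(frozen=True)
class ExamplePaths:
    """Hypothetical stand-in for the path fields of FewShotConfig."""

    examples_root: Path

    @property
    def seeds_dir(self) -> Path:
        # Derived on access, so it always follows examples_root.
        return self.examples_root / "seeds"

    @property
    def embeddings_dir(self) -> Path:
        return self.examples_root / "embeddings"


def load_paths(env: Mapping[str, str] = os.environ) -> ExamplePaths:
    """Read FEW_SHOT_EXAMPLES_ROOT, falling back to the plan's default root."""
    root = env.get(ENV_PREFIX + "EXAMPLES_ROOT", "database/output-styles/examples")
    return ExamplePaths(examples_root=Path(root))


if __name__ == "__main__":
    print(load_paths({}).seeds_dir)
    print(load_paths({"FEW_SHOT_EXAMPLES_ROOT": "/tmp/examples"}).seeds_dir)
```

If the root is never overridden in practice, the class-level defaults in the config file are sufficient as written.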

```bash
# Create config.py
cat > apps/backend/services/few_shot_examples/config.py << 'EOF'
"""Configuration for Few-Shot Examples Service."""

from pathlib import Path
from typing import Optional

from pydantic_settings import BaseSettings, SettingsConfigDict


class FewShotConfig(BaseSettings):
    """Configuration for Few-Shot Examples Service."""

    # Read FEW_SHOT_* environment variables (names are case-insensitive)
    model_config = SettingsConfigDict(env_prefix="FEW_SHOT_", case_sensitive=False)

    # Data paths (relative to project root)
    EXAMPLES_ROOT: Path = Path("database/output-styles/examples")
    SEEDS_DIR: Path = EXAMPLES_ROOT / "seeds"
    STRUCTURED_DIR: Path = EXAMPLES_ROOT / "by-scenario"
    EDGE_CASES_DIR: Path = EXAMPLES_ROOT / "edge-cases"
    EMBEDDINGS_DIR: Path = EXAMPLES_ROOT / "embeddings"

    # Cache configuration
    CACHE_MAX_SIZE: int = 2000
    CACHE_TTL_SECONDS: int = 7200  # 2 hours

    # Retrieval configuration
    DEFAULT_MAX_EXAMPLES: int = 5
    DEFAULT_QUALITY_THRESHOLD: float = 0.80

    # Database configuration
    DATABASE_URL: Optional[str] = None


# Global config instance
config = FewShotConfig()
EOF

echo "✅ STEP 2 COMPLETE: Config file created"
```

Verification:

```bash
# Verify config file exists and is valid Python
python3 -c "import sys; sys.path.insert(0, '.'); from apps.backend.services.few_shot_examples.config import config; print(f'✅ Config valid: {config.SEEDS_DIR}')"
```

### STEP 3: Copy (Don't Move Yet) Infrastructure Files
Duration: 5 minutes
Risk: Low (copying, not moving)

```bash
# Copy Python modules (keeping originals as backup)
cp database/few_shot_examples_training_data/api.py \
  apps/backend/services/few_shot_examples/
cp database/few_shot_examples_training_data/retriever.py \
  apps/backend/services/few_shot_examples/
cp database/few_shot_examples_training_data/matcher.py \
  apps/backend/services/few_shot_examples/
cp database/few_shot_examples_training_data/cache.py \
  apps/backend/services/few_shot_examples/
cp database/few_shot_examples_training_data/example_manager.py \
  apps/backend/services/few_shot_examples/

# Copy tests if they exist
cp database/few_shot_examples_training_data/test_*.py \
  apps/backend/services/few_shot_examples/tests/ 2>/dev/null || true

# Copy README
cp database/few_shot_examples_training_data/README.md \
  apps/backend/services/few_shot_examples/README.old.md

echo "✅ STEP 3 COMPLETE: Files copied to new location"
```

Verification:

```bash
# Count files in both locations
old_count=$(ls -1 database/few_shot_examples_training_data/*.py 2>/dev/null | wc -l)
new_count=$(ls -1 apps/backend/services/few_shot_examples/*.py 2>/dev/null | wc -l)
echo "Old location: $old_count Python files"
echo "New location: $new_count Python files"
echo "Expected: At least 6 files (config + 5 copied)"
```

### STEP 4: Update Imports in New Service Files
Duration: 15 minutes
Risk: Medium (modifying code, but originals still exist)
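
Because this step rewrites source files with regular expressions, it can help to preview the effect before anything is written to disk. The sketch below applies two of the step's pattern/replacement pairs to an in-memory string and returns a unified diff. `preview_rewrite` is a hypothetical helper, and the sample source in the demo is invented for illustration.

```python
import difflib
import re

# Pattern/replacement pairs taken from the update script in this step.
REPLACEMENTS = [
    (r"from database\.few_shot_examples_training_data",
     r"from apps.backend.services.few_shot_examples"),
    (r"DEFAULT_DATA_DIR = Path.*",
     r"DEFAULT_DATA_DIR = config.SEEDS_DIR"),
]


def preview_rewrite(source: str, name: str = "<memory>") -> str:
    """Return a unified diff of what the replacements would change."""
    updated = source
    for pattern, replacement in REPLACEMENTS:
        updated = re.sub(pattern, replacement, updated, flags=re.MULTILINE)
    diff = difflib.unified_diff(
        source.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=f"{name} (before)",
        tofile=f"{name} (after)",
    )
    return "".join(diff)  # empty string when nothing would change


if __name__ == "__main__":
    sample = (
        "from database.few_shot_examples_training_data.cache import ExampleCache\n"
        'DEFAULT_DATA_DIR = Path(__file__).parent / "data"\n'
    )
    print(preview_rewrite(sample, "retriever.py"))
```

An empty result means the file would be left untouched, which is also a quick way to spot files where the patterns never match.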
Create update script:

```bash
cat > scripts/update_service_imports.py << 'EOF'
#!/usr/bin/env python3
"""Update imports in newly copied service files."""

import re
from pathlib import Path

SERVICE_DIR = Path("apps/backend/services/few_shot_examples")

# Files to update
files_to_update = [
    "api.py",
    "retriever.py",
    "matcher.py",
    "cache.py",
    "example_manager.py",
]

# Import replacements
replacements = [
    # Add config import (anchored to the whole line, so a line such as
    # "from pathlib import Path, PurePath" is left alone rather than split)
    (r'^(from pathlib import Path)$',
     r'\1\nfrom apps.backend.services.few_shot_examples.config import config'),
    # Update internal imports
    (r'from database\.few_shot_examples_training_data',
     r'from apps.backend.services.few_shot_examples'),
    # Update data directory references
    (r'Path\(__file__\)\.parent / "data"',
     r'config.SEEDS_DIR'),
    (r'DEFAULT_DATA_DIR = Path.*',
     r'DEFAULT_DATA_DIR = config.SEEDS_DIR'),
]

for filename in files_to_update:
    filepath = SERVICE_DIR / filename
    if not filepath.exists():
        print(f"⚠️ Skipping {filename} - not found")
        continue

    print(f"Updating {filename}...")
    content = filepath.read_text()
    original_content = content

    for pattern, replacement in replacements:
        content = re.sub(pattern, replacement, content, flags=re.MULTILINE)

    if content != original_content:
        filepath.write_text(content)
        print("   ✅ Updated")
    else:
        print("   ℹ️ No changes needed")

print("\n✅ Import updates complete")
EOF

chmod +x scripts/update_service_imports.py
python3 scripts/update_service_imports.py
```

Manual Review Required:

```bash
# Review changes in each file
echo "Please review these files manually:"
echo "  - apps/backend/services/few_shot_examples/api.py"
echo "  - apps/backend/services/few_shot_examples/retriever.py"
echo "  - apps/backend/services/few_shot_examples/matcher.py"
echo "  - apps/backend/services/few_shot_examples/cache.py"
echo "  - apps/backend/services/few_shot_examples/example_manager.py"
echo ""
echo "Look for:"
echo "  1. Config imports added"
echo "  2. Internal imports updated"
echo "  3. Path references using config"
```

### STEP 5: Create Service `__init__.py`
Duration: 5 minutes
Risk: Low

```bash
cat > apps/backend/services/few_shot_examples/__init__.py << 'EOF'
"""Few-Shot Examples Service.

Provides intelligent retrieval of few-shot examples for prompt engineering.

Usage:
    from apps.backend.services.few_shot_examples import FewShotExamplesAPI

    api = FewShotExamplesAPI()
    examples = await api.get_examples_for_prompt(
        prompt_text="partnership inquiry",
        prompt_type="triage",
        max_examples=5
    )
"""

from .api import FewShotExamplesAPI
from .retriever import FewShotRetriever, RetrievalContext, RetrievalStrategy
from .matcher import SemanticMatcher
from .cache import ExampleCache
from .example_manager import ExampleManager
from .config import config, FewShotConfig

__all__ = [
    "FewShotExamplesAPI",
    "FewShotRetriever",
    "RetrievalContext",
    "RetrievalStrategy",
    "SemanticMatcher",
    "ExampleCache",
    "ExampleManager",
    "config",
    "FewShotConfig",
]

__version__ = "1.0.0"
EOF

echo "✅ STEP 5 COMPLETE: Service __init__.py created"
```

Verification:

```bash
# Test import
python3 -c "from apps.backend.services.few_shot_examples import config; print(f'✅ Service imports work: {config.SEEDS_DIR}')" || echo "❌ Import failed - review STEP 4 & 5"
```

### STEP 6: Copy Data Files
Duration: 5 minutes
Risk: Low (copying, not moving)
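
The verification at the end of this step compares file counts, which would not notice a truncated or stale copy. A content-level check is a small addition: the sketch below hashes every `*.jsonl` file under two directories and reports relative paths that are missing or differ. `compare_jsonl_trees` is a hypothetical helper, not part of the existing scripts.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def _digests(root: Path) -> dict[str, str]:
    """Map each *.jsonl path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.jsonl"))
    }


def compare_jsonl_trees(old_root: Path, new_root: Path) -> list[str]:
    """Return human-readable problems; an empty list means the copies match."""
    old, new = _digests(old_root), _digests(new_root)
    problems = [
        f"missing in new location: {name}" for name in sorted(old.keys() - new.keys())
    ]
    problems += [
        f"content differs: {name}"
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    ]
    return problems


if __name__ == "__main__":
    issues = compare_jsonl_trees(
        Path("database/few_shot_examples_training_data/data"),
        Path("database/output-styles/examples/seeds"),
    )
    print("\n".join(issues) or "All JSONL files match")
```

An empty result is only meaningful while the old data directory still exists, so this check belongs before STEP 11.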

```bash
# Copy JSONL files to seeds/
echo "📦 Copying JSONL files..."
cp database/few_shot_examples_training_data/data/*.jsonl \
  database/output-styles/examples/seeds/ 2>/dev/null || true

# Copy any subdirectories
if [ -d "database/few_shot_examples_training_data/data/contract_sections" ]; then
  cp -r database/few_shot_examples_training_data/data/contract_sections \
    database/output-styles/examples/seeds/
fi

if [ -d "database/few_shot_examples_training_data/data/league_matching" ]; then
  cp -r database/few_shot_examples_training_data/data/league_matching \
    database/output-styles/examples/seeds/
fi

echo "✅ STEP 6 COMPLETE: Data files copied"
```

Verification:

```bash
# Count JSONL files
old_jsonl=$(find database/few_shot_examples_training_data/data -name "*.jsonl" 2>/dev/null | wc -l)
new_jsonl=$(find database/output-styles/examples/seeds -name "*.jsonl" 2>/dev/null | wc -l)
echo "Old location: $old_jsonl JSONL files"
echo "New location: $new_jsonl JSONL files"
[ "$old_jsonl" -eq "$new_jsonl" ] && echo "✅ Counts match" || echo "⚠️ Counts don't match - review"
```

### STEP 7: Update Scripts That Use Old Paths
Duration: 10 minutes
Risk: Medium (modifying scripts)

```bash
# Create update script
cat > scripts/update_script_imports.py << 'EOF'
#!/usr/bin/env python3
"""Update imports in scripts that use few-shot examples."""

import re
from pathlib import Path

# Find scripts to update
scripts_dir = Path("scripts")
this_script = Path(__file__).resolve()

# Import replacements
old_import = r'from database\.few_shot_examples_training_data'
new_import = r'from apps.backend.services.few_shot_examples'
old_path = r'database/few_shot_examples_training_data/data'
new_path = r'database/output-styles/examples/seeds'

files_updated = []

for script_file in scripts_dir.glob("*.py"):
    # Skip this file: it contains old_path literally and would rewrite itself
    if script_file.resolve() == this_script:
        continue

    content = script_file.read_text()
    original = content

    content = re.sub(old_import, new_import, content)
    content = re.sub(old_path, new_path, content)

    if content != original:
        script_file.write_text(content)
        files_updated.append(str(script_file))
        print(f"✅ Updated: {script_file}")

print(f"\nUpdated {len(files_updated)} files")
EOF

chmod +x scripts/update_script_imports.py
python3 scripts/update_script_imports.py
```

Manual Check:

```bash
# List files that still have old imports
echo "Checking for remaining old imports..."
grep -r "from database.few_shot_examples_training_data" \
  scripts/ \
  apps/ \
  2>/dev/null || echo "✅ No old imports found"
```

### STEP 8: Test New Service
Duration: 15 minutes
Risk: Low (testing only)

```bash
# Create test script
cat > scripts/test_new_service.py << 'EOF'
#!/usr/bin/env python3
"""Test the new service location."""

import sys
from pathlib import Path

# Add project root to path
sys.path.insert(0, str(Path(__file__).parent.parent))


def test_imports():
    """Test that imports work."""
    print("1️⃣ Testing imports...")
    try:
        from apps.backend.services.few_shot_examples import (
            FewShotExamplesAPI,
            config,
            ExampleManager,
        )
        print("   ✅ Imports successful")
        return True
    except Exception as e:
        print(f"   ❌ Import failed: {e}")
        return False


def test_config():
    """Test configuration."""
    print("2️⃣ Testing configuration...")
    try:
        from apps.backend.services.few_shot_examples import config
        print(f"   ✅ Seeds dir: {config.SEEDS_DIR}")
        print(f"   ✅ Structured dir: {config.STRUCTURED_DIR}")
        print(f"   ✅ Cache size: {config.CACHE_MAX_SIZE}")
        return True
    except Exception as e:
        print(f"   ❌ Config test failed: {e}")
        return False


def test_paths():
    """Test that the directories created in STEP 1 exist."""
    print("3️⃣ Testing paths...")
    from apps.backend.services.few_shot_examples import config

    # STEP 1 creates only seeds/ and embeddings/. This plan never creates
    # by-scenario/ or edge-cases/, so checking them would always fail.
    paths_to_check = [
        config.EXAMPLES_ROOT,
        config.SEEDS_DIR,
        config.EMBEDDINGS_DIR,
    ]

    all_exist = True
    for path in paths_to_check:
        if path.exists():
            print(f"   ✅ Exists: {path}")
        else:
            print(f"   ❌ Missing: {path}")
            all_exist = False
    return all_exist


def test_example_manager():
    """Test ExampleManager can load files."""
    print("4️⃣ Testing ExampleManager...")
    try:
        from apps.backend.services.few_shot_examples import ExampleManager
        manager = ExampleManager()
        # Try to load triage examples
        examples = manager.load_examples("triage")
        print(f"   ✅ Loaded {len(examples)} triage examples")
        return True
    except Exception as e:
        print(f"   ⚠️ ExampleManager test: {e}")
        print("   ℹ️ This may fail if seed script hasn't run yet")
        return True  # Don't fail on this


if __name__ == "__main__":
    print("🧪 Testing New Few-Shot Examples Service\n")

    results = [
        test_imports(),
        test_config(),
        test_paths(),
        test_example_manager(),
    ]

    print(f"\nResults: {sum(results)}/{len(results)} tests passed")

    if all(results):
        print("✅ All tests passed! Service is ready.")
        sys.exit(0)
    else:
        print("⚠️ Some tests failed. Review above for details.")
        sys.exit(1)
EOF

chmod +x scripts/test_new_service.py
python3 scripts/test_new_service.py
```

### STEP 9: Update All Dependent Code
Duration: 20 minutes
Risk: High (modifying working code)
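
The text search used throughout this plan matches only the `from database.few_shot_examples_training_data ...` form, so a plain `import database.few_shot_examples_training_data.api` or `from database import few_shot_examples_training_data` would slip past it. As a cross-check, the sketch below parses source code with the standard `ast` module and reports all three forms. `find_old_imports` is a hypothetical helper, and scanning `apps/backend` in the demo is only an example choice.

```python
from __future__ import annotations

import ast
from pathlib import Path

OLD_PACKAGE = "database.few_shot_examples_training_data"


def _is_old(module: str) -> bool:
    """True for the old package itself or any of its submodules."""
    return module == OLD_PACKAGE or module.startswith(OLD_PACKAGE + ".")


def find_old_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module) for every import of the old package."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Check the module itself, then module.name for each imported name.
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
            old = next((n for n in names if _is_old(n)), None)
            if old:
                hits.append((node.lineno, old))
        elif isinstance(node, ast.Import):
            hits.extend((node.lineno, a.name) for a in node.names if _is_old(a.name))
    return sorted(hits)


if __name__ == "__main__":
    for py_file in sorted(Path("apps/backend").rglob("*.py")):
        try:
            found = find_old_imports(py_file.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError) as exc:
            print(f"skipped {py_file}: {exc}")
            continue
        for lineno, module in found:
            print(f"{py_file}:{lineno}: {module}")
```

Unlike a text search, this ignores matches inside comments and strings, so it will not flag the migration scripts' own pattern definitions.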

```bash
# Review the dependency file we created earlier
echo "Files that need updating:"
cat migration-dependencies.txt

# For each file, update imports
echo ""
echo "🔧 Manual updates required:"
echo "  Replace: from database.few_shot_examples_training_data import X"
echo "  With:    from apps.backend.services.few_shot_examples import X"
echo ""
echo "  Replace: database/few_shot_examples_training_data/data"
echo "  With:    database/output-styles/examples/seeds"
```

Create helper script:

```bash
cat > scripts/final_import_update.py << 'EOF'
#!/usr/bin/env python3
"""Final pass to update all imports."""

import re
from pathlib import Path

# Directories to search
search_dirs = ["apps/backend", "scripts"]

old_import = re.compile(r'from database\.few_shot_examples_training_data')
new_import = 'from apps.backend.services.few_shot_examples'

files_updated = []

for search_dir in search_dirs:
    for py_file in Path(search_dir).rglob("*.py"):
        try:
            content = py_file.read_text()
            # Search with the compiled regex: a substring test against the
            # escaped pattern text (with its backslash) would never match
            if old_import.search(content):
                py_file.write_text(old_import.sub(new_import, content))
                files_updated.append(str(py_file))
                print(f"✅ Updated: {py_file}")
        except Exception as e:
            print(f"⚠️ Error with {py_file}: {e}")

print(f"\nUpdated {len(files_updated)} files")
if files_updated:
    print("\nUpdated files:")
    for f in files_updated:
        print(f"  - {f}")
EOF

chmod +x scripts/final_import_update.py
python3 scripts/final_import_update.py
```

### STEP 10: Run Full Test Suite
Duration: 10 minutes
Risk: Low (verification only)

```bash
# Run pytest on backend
echo "🧪 Running backend tests..."
pytest apps/backend/services/few_shot_examples/tests/ -v 2>/dev/null || \
  echo "⚠️ No tests found or tests failed"

# Test imports across codebase
echo "🧪 Testing imports..."
python3 scripts/test_new_service.py

# Try seed script if it exists
if [ -f "scripts/seed.examples.py" ]; then
  echo "🧪 Testing seed script..."
  python3 scripts/seed.examples.py --help || \
    echo "⚠️ Seed script needs review"
fi

echo "✅ STEP 10 COMPLETE: Testing done"
```

### STEP 11: Remove Old Structure (Point of No Return)
Duration: 2 minutes
Risk: HIGH (deleting old files)

⚠️ ONLY proceed if all tests pass!

```bash
# Final verification before deletion
echo "⚠️ FINAL CHECK before deletion:"
echo ""
echo "1. Have all tests passed? (y/n)"
read -r tests_pass

echo "2. Have you reviewed all changed files? (y/n)"
read -r reviewed

echo "3. Is there a backup? (y/n)"
read -r has_backup

if [ "$tests_pass" = "y" ] && [ "$reviewed" = "y" ] && [ "$has_backup" = "y" ]; then
  echo ""
  echo "🗑️ Removing old structure..."
  mv database/few_shot_examples_training_data \
    backups/few_shot_examples_training_data.old
  echo "✅ STEP 11 COMPLETE: Old structure archived"
  echo "   Location: backups/few_shot_examples_training_data.old/"
else
  echo "❌ Skipping deletion - review criteria not met"
  exit 1
fi
```

### STEP 12: Update Documentation
Duration: 10 minutes
Risk: Low
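
The seeds README written in this step describes the data as one complete JSON object per line. That property can be checked mechanically before seeding, and the sketch below shows one way to do it. `validate_jsonl` is a hypothetical helper, and treating `id` as the only required key is an assumption based on the sample records shown in that README.

```python
import json
from pathlib import Path
from typing import Iterable


def validate_jsonl(lines: Iterable[str], required: tuple = ("id",)) -> list:
    """Return one message per bad line; an empty list means the input is valid."""
    errors = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            errors.append(f"line {number}: expected a JSON object")
            continue
        missing = [key for key in required if key not in record]
        if missing:
            errors.append(f"line {number}: missing {', '.join(missing)}")
    return errors


if __name__ == "__main__":
    seeds = Path("database/output-styles/examples/seeds")
    for path in sorted(seeds.rglob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            problems = validate_jsonl(handle)
        print(f"{path}: {'OK' if not problems else problems}")
```

Passing a different `required` tuple (for example `("id", "scenario")`) tightens the check once the real schema is confirmed.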

````bash
# Create service README
cat > apps/backend/services/few_shot_examples/README.md << 'EOF'
# Few-Shot Examples Service

## Quick Start

```python
from apps.backend.services.few_shot_examples import FewShotExamplesAPI

api = FewShotExamplesAPI()
examples = await api.get_examples_for_prompt(
    prompt_text="partnership inquiry",
    prompt_type="triage",
    max_examples=5
)
```

## Location

- Service: `apps/backend/services/few_shot_examples/`
- Data: `database/output-styles/examples/`

## Data Structure

- `database/output-styles/examples/seeds/` - JSONL for DB seeding
- `database/output-styles/examples/by-scenario/` - Structured JSON
- `database/output-styles/examples/edge-cases/` - Edge cases
- `database/output-styles/examples/embeddings/` - Vector embeddings

## Configuration

Environment variables:

```bash
FEW_SHOT_CACHE_MAX_SIZE=2000
FEW_SHOT_CACHE_TTL_SECONDS=7200
```

## Seeding Database

```bash
uv run python scripts/seed.examples.py --category triage
```

See service documentation for more details.
EOF

# Create data README
cat > database/output-styles/examples/seeds/README.md << 'EOF'
# Few-Shot Examples Seeds

This directory contains JSONL files for seeding the few-shot examples database.

## Format

Each line is a complete JSON object:

```json
{"id": "triage_001", "scenario": "partnership", "sport": "soccer", ...}
{"id": "triage_002", "scenario": "support", "sport": "basketball", ...}
```

## Usage

Seed database:

```bash
uv run python scripts/seed.examples.py --category triage
```

## Files

- `triage.jsonl` - Email triage examples
- `contract-generation.jsonl` - Contract examples
- `pdf-processing.jsonl` - PDF extraction examples
- `response-generation.jsonl` - Response templates
- `onboarding-response.jsonl` - Onboarding workflows

See main service README for more details.
EOF

echo "✅ STEP 12 COMPLETE: Documentation created"
````

---

## ✅ Post-Migration Verification

### Final Checklist

```bash
cat > scripts/post_migration_checklist.sh << 'EOF'
#!/bin/bash

echo "Post-Migration Verification Checklist"
echo ""

# 1. Service imports work
echo "1️⃣ Testing service imports..."
python3 -c "from apps.backend.services.few_shot_examples import FewShotExamplesAPI, config; print('✅ Imports work')" || echo "❌ Imports failed"

# 2. Data files exist
echo "2️⃣ Checking data files..."
jsonl_count=$(find database/output-styles/examples/seeds -name "*.jsonl" | wc -l)
echo "   Found $jsonl_count JSONL files"
[ "$jsonl_count" -gt 0 ] && echo "✅ Data files exist" || echo "❌ No data files"

# 3. Old location removed
echo "3️⃣ Checking old location removed..."
[ ! -d "database/few_shot_examples_training_data" ] && \
  echo "✅ Old directory removed" || \
  echo "⚠️ Old directory still exists"

# 4. No old imports remain
echo "4️⃣ Checking for old imports..."
old_imports=$(grep -r "from database.few_shot_examples_training_data" apps/ scripts/ 2>/dev/null | wc -l)
[ "$old_imports" -eq 0 ] && \
  echo "✅ No old imports found" || \
  echo "⚠️ Found $old_imports old imports - review needed"

# 5. Configuration works
echo "5️⃣ Testing configuration..."
python3 -c "from apps.backend.services.few_shot_examples import config; assert config.SEEDS_DIR.exists(); print('✅ Config valid')" || echo "❌ Config issue"

# 6. Backup exists
echo "6️⃣ Checking backup..."
backup_exists=$(find backups/ -type d -name "*pre-migration*" 2>/dev/null | wc -l)
[ "$backup_exists" -gt 0 ] && \
  echo "✅ Backup found" || \
  echo "⚠️ No backup found"

echo ""
echo "✅ Migration verification complete"
EOF

chmod +x scripts/post_migration_checklist.sh
./scripts/post_migration_checklist.sh
```

## 🚨 Rollback Procedure
If something goes wrong:

```bash
#!/bin/bash
# rollback.sh

echo "Rolling back migration..."

# Find most recent backup
backup=$(find backups/ -type d -name "pre-migration-*" | sort -r | head -1)

if [ -z "$backup" ]; then
  echo "❌ No backup found!"
  exit 1
fi

echo "Using backup: $backup"

# Restore old structure
rm -rf database/few_shot_examples_training_data/
cp -r "$backup/few_shot_examples_training_data" database/

# Remove new structure
rm -rf apps/backend/services/few_shot_examples/

# Restore scripts
if [ -f "$backup/seed.examples.py" ]; then
  cp "$backup/seed.examples.py" scripts/
fi

echo "✅ Rollback complete"
echo "⚠️ You may need to manually revert code changes"
```

## Success Criteria
Migration is complete when:

- ✅ Service imports work: `from apps.backend.services.few_shot_examples import FewShotExamplesAPI`
- ✅ All data files in `database/output-styles/examples/seeds/`
- ✅ No files in `database/few_shot_examples_training_data/`
- ✅ No old imports remain in codebase
- ✅ All tests pass
- ✅ Seed script works
- ✅ Documentation updated
- ✅ Backup created

## 🎯 Quick Start (Run All Steps)

```bash
#!/bin/bash
# execute_migration.sh
# Assumes the commands from each step above have been saved as
# scripts/migration_step_N.sh; this plan does not create those files itself.

set -e  # Exit on error

echo "Starting Few-Shot Examples Migration"
echo "======================================"

# Run all steps
./scripts/migration_step_1.sh   # Create structure
./scripts/migration_step_2.sh   # Create config
./scripts/migration_step_3.sh   # Copy files
./scripts/migration_step_4.sh   # Update imports
./scripts/migration_step_5.sh   # Create __init__
./scripts/migration_step_6.sh   # Copy data
./scripts/migration_step_7.sh   # Update scripts
./scripts/migration_step_8.sh   # Test service
./scripts/migration_step_9.sh   # Update code
./scripts/migration_step_10.sh  # Test all

# MANUAL: Review everything
./scripts/migration_step_11.sh  # Remove old (requires confirmation)
./scripts/migration_step_12.sh  # Update docs

./scripts/post_migration_checklist.sh

echo ""
echo "✅ Migration complete!"
```

## Notes
- Each step is idempotent where possible
- Steps 1-10 can be run multiple times safely
- Step 11 (deletion) is the point of no return
- Always verify backups before step 11
- Test thoroughly before removing old structure
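
Whether a step is safe to repeat depends on the individual edits it makes. For example, the Step 4 replacement that adds the config import after `from pathlib import Path` would add a second copy if the script ran twice on the same file. A guard like the one below keeps that edit idempotent. `add_config_import` is a hypothetical helper sketched to illustrate the idea, not code from the existing scripts.

```python
import re

CONFIG_IMPORT = "from apps.backend.services.few_shot_examples.config import config"
ANCHOR = re.compile(r"^from pathlib import Path$", flags=re.MULTILINE)


def add_config_import(source: str) -> str:
    """Insert the config import after the pathlib import, at most once."""
    if CONFIG_IMPORT in source:
        return source  # already applied; a rerun changes nothing
    # A callable replacement avoids any backslash processing of the new text.
    return ANCHOR.sub(lambda m: f"{m.group(0)}\n{CONFIG_IMPORT}", source, count=1)


if __name__ == "__main__":
    original = "import json\nfrom pathlib import Path\n\nDATA = Path('data')\n"
    once = add_config_import(original)
    twice = add_config_import(once)
    print(once == twice)
```

The same "check before you write" pattern applies to any step that appends text rather than replacing it.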

## Getting Help

If stuck at any step:

- Check the step's verification section
- Review error messages
- Check backup exists
- Consider rollback if needed
- Review `migration-dependencies.txt` for affected files