Source: data_layer/docs/QUICK_START.md
# Quick Start Guide: Prompt Management System
Last Updated: October 18, 2025
## 5-Minute Quickstart
### 1. Verify System Works

```bash
# Run comprehensive tests (proves both use cases)
python data_layer/scripts/test_prompt_retrieval.py

# Expected: ✅ All tests passing
```

### 2. Search for Prompts
```python
from data_layer.scripts.test_prompt_retrieval import SimplePromptRetriever

# Initialize
retriever = SimplePromptRetriever()

# Find prompts for league onboarding
results = retriever.search_by_keywords(
    keywords=["league", "questionnaire", "extraction", "database"],
    top_k=5
)

# Print results
for i, r in enumerate(results, 1):
    print(f"{i}. {r['title']} (score: {r['score']:.1f})")
    print(f"   ID: {r['id']}")
    print(f"   Type: {r['type']}")
    print()
```

### 3. Get Specific Prompt
```python
# Get by ID
prompt = retriever.get_by_id("workflows.league-questionnaire-extraction.v1")

print(f"Title: {prompt['title']}")
print(f"Type: {prompt['type']}")
print(f"Confidence: {prompt['confidence']*100:.0f}%")
print(f"Tags: {', '.join(prompt['tags'][:5])}")
```

## Common Use Cases
### Use Case 1: League Onboarding Pipeline

```python
from data_layer.scripts.test_prompt_retrieval import SimplePromptRetriever
from data_layer.scripts.generate_adapters import (
    LeagueQuestionnaireSchema,
    TierClassificationSchema
)

# 1. Find relevant prompts
retriever = SimplePromptRetriever()
prompts = retriever.search_by_keywords(
    keywords=["league", "questionnaire", "extraction", "database", "upsert"],
    top_k=5
)

# 2. Build workflow
print("League Onboarding Workflow:")
print()
print("Step 1: Extract Questionnaire")
print(f"  Prompt: {prompts[0]['id']}")
print("  Output: LeagueQuestionnaireSchema")
print()
print("Step 2: Enrich Data")
print(f"  Prompt: {prompts[1]['id']}")
print("  Output: EnrichedLeagueDataSchema")
print()
print("Step 3: Classify Tier")
print(f"  Prompt: {prompts[2]['id']}")
print("  Output: TierClassificationSchema")
print()
print("Step 4: Upsert to Database")
print("  Output: DatabaseUpsertResultSchema")
```

### Use Case 2: Contract Generation
```python
from data_layer.scripts.test_prompt_retrieval import SimplePromptRetriever
from data_layer.scripts.generate_adapters import (
    ContractTermsSchema,
    NegotiationPackageSchema
)

# 1. Find contract templates
retriever = SimplePromptRetriever()
prompts = retriever.search_by_keywords(
    keywords=["tier", "contract", "partnership", "pricing", "premium"],
    top_k=5
)

# 2. Build workflow
print("Contract Generation Workflow:")
print()
print("Step 1: Load Profile")
print("  Input: league_id")
print("  Output: LeagueProfileSchema")
print()
print("Step 2: Generate Terms")
print(f"  Prompt: {prompts[0]['id']}")
print("  Output: ContractTermsSchema")
print()
print("Step 3: Create Variants")
print(f"  Prompt: {prompts[1]['id']}")
print("  Output: PricingVariantsSchema")
print()
print("Step 4: Generate Documents")
print(f"  Prompt: {prompts[2]['id']}")
print("  Output: NegotiationPackageSchema")
print()
print("Step 5: Save Outputs")
print("  Location: ./output/contracts/League_Name_TIMESTAMP/")
print("  Files: contract_deal.md, contract_list.md, contract_ceiling.md")
```

### Use Case 3: Search with Filters
```python
# Search only workflows
workflow_results = retriever.search_by_keywords(
    keywords=["email", "processing"],
    top_k=3,
    filter_type="workflow"
)

# Search only contract templates
contract_results = retriever.search_by_keywords(
    keywords=["tier", "partnership"],
    top_k=3,
    filter_type="contract_template"
)

# Search with minimum confidence
high_quality = retriever.search_by_keywords(
    keywords=["league", "analysis"],
    top_k=5,
    min_confidence=0.80  # Only prompts with 80%+ confidence
)
```

## System Maintenance
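The four maintenance scripts can also be chained from a small driver. This is a sketch, not part of the shipped tooling: the script paths are the ones listed in this guide, and whether you execute them via `subprocess` or a Makefile depends on your environment (the `subprocess.run` call is left commented out so the sketch only prints the plan).

```python
import subprocess
import sys

# Maintenance pipeline, in the order documented in this guide.
MAINTENANCE_STEPS = [
    "data_layer/scripts/scan_prompts.py",          # 1. rebuild registry
    "data_layer/scripts/generate_prompt_docs.py",  # 2. regenerate docs
    "data_layer/scripts/sync_to_drive.py",         # 3. sync to Google Drive
    "data_layer/scripts/index_prompts.py",         # 4. re-index (optional)
]

def build_commands(python: str = sys.executable) -> list:
    """Return the commands in order without running them."""
    return [[python, script] for script in MAINTENANCE_STEPS]

if __name__ == "__main__":
    for cmd in build_commands():
        print("would run:", " ".join(cmd))
        # Uncomment to actually execute each step from the repo root:
        # subprocess.run(cmd, check=True)
```

Run from the repository root so the relative script paths resolve.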
### Rebuild Registry (When Prompts Change)

```bash
# Step 1: Scan prompts and rebuild registry
python data_layer/scripts/scan_prompts.py

# Step 2: Regenerate enriched documentation
python data_layer/scripts/generate_prompt_docs.py

# Step 3: Sync to Google Drive
python data_layer/scripts/sync_to_drive.py

# Step 4: Re-index for semantic search (optional)
python data_layer/scripts/index_prompts.py
```

### Check System Status
```bash
# View registry statistics
python -c "
import json
from pathlib import Path

registry_file = Path('data_layer/kb_catalog/manifests/prompt_registry.json')
registry = json.loads(registry_file.read_text())

print(f'Total Prompts: {len(registry[\"prompts\"])}')
print(f'Last Updated: {registry.get(\"generated_at\", \"N/A\")}')

# Count by type
types = {}
for p in registry['prompts']:
    t = p['type']
    types[t] = types.get(t, 0) + 1

print('\\nBy Type:')
for t, count in sorted(types.items()):
    print(f'  {t}: {count}')
"
```

## Performance Benchmarks
| Operation | Time | Notes |
|---|---|---|
| Registry lookup by ID | < 1ms | Direct dictionary access |
| Keyword search | < 10ms | 116 prompts scanned |
| LangMem semantic search | < 100ms | With embeddings (optional) |
| Registry rebuild | ~2-3 seconds | Scans all .md files |
| Docs generation | ~5-10 seconds | Enriches 116 prompts |
| Google Drive sync | ~30-60 seconds | Uploads 116 files |
| LangMem indexing | ~2-3 minutes | Creates embeddings |
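The sub-millisecond ID lookup in the table above is what you would expect from loading the registry JSON once and indexing it by ID. A self-contained sketch of that pattern (the two-entry registry below is made up for illustration, not real registry data):

```python
import json
import time

# Illustrative mini-registry; the real file lives at
# data_layer/kb_catalog/manifests/prompt_registry.json
REGISTRY_JSON = json.dumps({"prompts": [
    {"id": "workflows.league-questionnaire-extraction.v1", "type": "workflow"},
    {"id": "contracts.tier-1-premium.v1", "type": "contract_template"},
]})

# Parse once at startup...
registry = json.loads(REGISTRY_JSON)
# ...then index by ID so every subsequent lookup is O(1) dict access.
by_id = {p["id"]: p for p in registry["prompts"]}

start = time.perf_counter()
prompt = by_id["workflows.league-questionnaire-extraction.v1"]
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"type={prompt['type']}  lookup={elapsed_ms:.4f} ms")
```

The one-time JSON parse dominates startup cost; the per-lookup cost is just a hash-table access.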
## Prompt Types
| Type | Count | Description |
|---|---|---|
| agent | 25 | AI agent prompts with specific capabilities |
| workflow | 22 | Multi-step process prompts |
| contract_template | 20 | Partnership contract templates |
| legal_template | 3 | Legal document templates |
| component | 4 | Reusable prompt components |
| general | 42 | General-purpose prompts |
| **Total** | **116** | All prompts cataloged |
## Search Tips
### Effective Keywords

**For League Onboarding:**
- "league", "questionnaire", "extraction"
- "database", "upsert", "fingerprint"
- "processing", "data", "classification"

**For Contract Generation:**
- "tier", "contract", "partnership"
- "pricing", "terms", "agreement"
- "premium", "negotiation"

**For Email Processing:**
- "email", "classification", "routing"
- "processing", "workflow"

**For Data Validation:**
- "validation", "quality", "consistency"
- "sanity", "completeness"
### Best Practices
- Use 3-5 keywords for best results
- Include domain terms (league, contract, tier, etc.)
- Add action words (extraction, generation, validation)
- Filter by type when searching for specific formats
- Set minimum confidence to ensure quality
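To see why 3-5 keywords work well, here is an illustrative re-implementation of keyword scoring. It is not the actual `SimplePromptRetriever` logic (whose internals may differ): each keyword that hits the title or the tags adds to the score, so every extra relevant keyword widens the gap between good matches and near-misses.

```python
# Hypothetical scorer: title hits weigh more than tag hits.
def keyword_score(prompt: dict, keywords: list) -> float:
    title = prompt["title"].lower()
    tags = {t.lower() for t in prompt["tags"]}
    score = 0.0
    for kw in (k.lower() for k in keywords):
        if kw in title:
            score += 2.0  # title match
        if kw in tags:
            score += 1.0  # tag match
    return score

# Toy catalog (illustrative entries, not the real registry)
catalog = [
    {"title": "League Questionnaire Extraction",
     "tags": ["league", "extraction", "database"]},
    {"title": "Email Routing Workflow",
     "tags": ["email", "routing", "processing"]},
]

keywords = ["league", "questionnaire", "extraction"]
ranked = sorted(catalog, key=lambda p: keyword_score(p, keywords), reverse=True)
print(ranked[0]["title"])  # the league prompt outranks the email one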
## Schema Integration

All prompts integrate with Pydantic schemas from `generate_adapters.py`:
```python
from data_layer.scripts.generate_adapters import (
    # League schemas
    LeagueQuestionnaireSchema,
    LeagueProfileSchema,
    TierClassificationSchema,
    # Contract schemas
    ContractTermsSchema,
    NegotiationPackageSchema,
    PricingVariantsSchema,
    # Database schemas
    DatabaseUpsertResultSchema,
    # And many more...
)
```

**Usage Example:**
```python
# Get prompt with schema info
prompt = retriever.get_by_id("workflows.league-questionnaire-extraction.v1")

print(f"Required Input Schemas: {prompt['requires_schemas']}")
# Output: ['LeagueQuestionnaireSchema']

print(f"Output Schema: {prompt['output_schema']}")
# Output: 'ExtractedLeagueDataSchema'

# Use schema for validation
from data_layer.scripts.generate_adapters import LeagueQuestionnaireSchema

data = extract_questionnaire("./questionnaire.pdf")  # your extraction step
validated = LeagueQuestionnaireSchema(**data)  # Validates structure
```

## Agent Suggestions
Each prompt suggests relevant agents:
```python
prompt = retriever.get_by_id("workflows.league-questionnaire-extraction.v1")

print(f"Suggested Agents: {prompt['agents_suggested']}")
# Output: ['document-processor', 'email-handler', 'ocr-agent']

# Use agents from kb_catalog
from pathlib import Path
import json

agents_file = Path("data_layer/kb_catalog/manifests/agents.json")
agents = json.loads(agents_file.read_text())

for agent_id in prompt['agents_suggested']:
    agent = next((a for a in agents if a['id'] == agent_id), None)
    if agent:
        print(f"\nAgent: {agent['name']}")
        print(f"Description: {agent['description']}")
        print(f"Tools: {', '.join(agent['tools'][:3])}")
```

## File Locations
### Key Files
```
data_layer/
├── scripts/
│   ├── scan_prompts.py            # Build registry
│   ├── generate_prompt_docs.py    # Generate docs
│   ├── sync_to_drive.py           # Sync to Drive
│   ├── index_prompts.py           # LangMem indexing
│   ├── test_prompt_retrieval.py   # Test suite
│   └── demo_prompt_workflows.py   # Demo workflows
│
├── kb_catalog/manifests/
│   ├── prompt_registry.json       # Main registry (116 prompts)
│   └── agents.json                # Agent catalog
│
├── storage/
│   ├── prompts/docs/              # Enriched documentation
│   ├── prompts/drive_sync/        # Drive sync state
│   └── embeddings/langmem_index/  # Semantic search index
│
└── prompts/                       # Source .md files (source of truth)
    ├── workflows/
    ├── agents/
    ├── specs/contracts/
    ├── specs/legal/
    └── components/
```

### Viewing Prompts
```bash
# List all prompts by type
ls data_layer/storage/prompts/docs/workflow/
ls data_layer/storage/prompts/docs/contract_template/
ls data_layer/storage/prompts/docs/agent/

# Read specific prompt doc
cat "data_layer/storage/prompts/docs/workflow/workflows.league-questionnaire-extraction.v1.md"

# View source prompt
cat "data_layer/prompts/workflows/league_questionnaire_extraction_v1.md"
```

## Quick Verification
Run this to verify the system is operational:

```bash
python data_layer/scripts/test_prompt_retrieval.py
```

**Expected Output:**

```
✅ Found 5 relevant prompts (League Onboarding)
✅ Found 5 relevant prompts (Contract Generation)
✅ All workflows generated successfully
✅ SYSTEM STATUS: FULLY OPERATIONAL
```

## Next Steps
- Try the examples above with your own queries
- Explore the registry at `kb_catalog/manifests/prompt_registry.json`
- Read the enriched docs in `storage/prompts/docs/`
- Browse in Google Drive (if Phase 3 sync completed)
- Install LangMem (optional) for semantic search:

  ```bash
  pip install langmem
  python data_layer/scripts/index_prompts.py
  ```
## Additional Documentation
- `EXECUTIVE_SUMMARY.md` - High-level overview and business impact
- `PROMPT_SYSTEM_IMPLEMENTATION.md` - Complete technical documentation
- `PHASE_4_COMPLETE.md` - Phase 4 results and proof
- `LANGMEM_SETUP.md` - LangMem installation and usage guide
- `SYSTEM_COMPLETE_SUMMARY.md` - Overall system status
**Status:** ✅ OPERATIONAL | **Test Results:** ✅ ALL PASSING | **Ready for Production:** ✅ YES
Need help? Check the documentation files above or run the test suite.