MCP Integration for Knowledge Catalogs
AI-powered schema management and knowledge graph integration using the Model Context Protocol.
Overview
The AltSportsData MCP server provides Knowledge Catalog managers with tools for:
- Schema Management - Create, update, and version data schemas
- Knowledge Graph Operations - Navigate and query graph relationships
- Semantic Search - AI-powered search across schemas and documentation
- Schema Validation - Ensure data quality and consistency
- Documentation Generation - Auto-generate API docs from schemas
Quick Setup
Configure for Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "altsportsdata-catalogs": {
      "command": "npx",
      "args": ["-y", "@altsportsdata/mcp-server"],
      "env": {
        "API_KEY": "your-catalog-manager-api-key",
        "ROLE": "knowledge_catalog",
        "CATALOG_ID": "your-catalog-id"
      }
    }
  }
}
Available Tools
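Each tool in this section is shown as a `name` plus an `arguments` object. In the Model Context Protocol these two fields travel as the `params` of a JSON-RPC 2.0 `tools/call` request. The sketch below wraps a tool payload in that envelope. The helper name `build_tool_call` and the fixed request `id` are illustrative assumptions; an MCP client library would normally build this message for you.

```python
import json
from typing import Any


def build_tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> dict[str, Any]:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Example payload taken from the search_schemas tool in this section.
request = build_tool_call(
    "search_schemas",
    {"query": "basketball player statistics", "limit": 10},
)
print(json.dumps(request, indent=2))
```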
1. Search Schemas
Semantic search across all schemas in the knowledge catalog.
{
  "name": "search_schemas",
  "arguments": {
    "query": "basketball player statistics",
    "filters": {
      "sport": "basketball",
      "category": "player",
      "status": "published"
    },
    "format": "typescript",
    "limit": 10
  }
}
Response:
{
  "results": [
    {
      "schema_id": "basketball.player.stats.v2",
      "name": "BasketballPlayerStats",
      "description": "Comprehensive basketball player statistics",
      "version": "2.1.0",
      "fields": 45,
      "relevance_score": 0.95,
      "preview": {
        "points": "number",
        "rebounds": "number",
        "assists": "number"
      }
    }
  ],
  "total_results": 8,
  "search_time_ms": 42
}
2. Get Schema by ID
Retrieve complete schema definition with documentation.
{
  "name": "get_schema",
  "arguments": {
    "schema_id": "soccer.match.events.v3",
    "format": "json_schema",
    "include_examples": true,
    "include_relationships": true
  }
}
3. Create or Update Schema
Add new schemas or update existing ones.
{
  "name": "upsert_schema",
  "arguments": {
    "schema_id": "hockey.player.advanced_stats.v1",
    "definition": {
      "type": "object",
      "properties": {
        "player_id": {"type": "string"},
        "corsi_for_percentage": {"type": "number"},
        "fenwick_rating": {"type": "number"}
      }
    },
    "metadata": {
      "sport": "hockey",
      "category": "advanced_analytics",
      "maintainer": "analytics_team"
    },
    "version": "1.0.0"
  }
}
4. Navigate Knowledge Graph
Explore relationships between schemas and entities.
{
  "name": "navigate_graph",
  "arguments": {
    "start_node": "League:EPL",
    "relationship_types": ["HAS_TEAM", "HAS_SEASON"],
    "depth": 2,
    "include_properties": true
  }
}
Response:
{
  "start_node": {
    "id": "League:EPL",
    "labels": ["League", "Soccer"],
    "properties": {
      "name": "English Premier League",
      "country": "England",
      "tier": 1
    }
  },
  "relationships": [
    {
      "type": "HAS_TEAM",
      "nodes": [
        {"id": "Team:ManchesterUnited", "name": "Manchester United"},
        {"id": "Team:Liverpool", "name": "Liverpool"}
      ]
    },
    {
      "type": "HAS_SEASON",
      "nodes": [
        {"id": "Season:2024-25", "year": "2024-25"}
      ]
    }
  ],
  "total_nodes": 23,
  "total_relationships": 42
}
5. Validate Data Against Schema
Check if data conforms to a schema.
{
  "name": "validate_data",
  "arguments": {
    "schema_id": "basketball.match.result.v2",
    "data": {
      "match_id": "nba-2024-001",
      "home_team": "Lakers",
      "away_team": "Celtics",
      "home_score": 108,
      "away_score": 105
    }
  }
}
Knowledge Catalog Workflows
Workflow 1: Schema Discovery
User: "Find schemas related to soccer match events"
AI Agent (using MCP):
1. Searches schemas with search_schemas
2. Returns relevant schemas with descriptions
3. Shows relationships with navigate_graph
4. Provides code examples with generate_examples
5. Suggests related schemas with suggest_related
Result: Comprehensive list of relevant schemas with usage examples
Workflow 2: Schema Creation
User: "Create a new schema for lacrosse player statistics"
AI Agent (using MCP):
1. Analyzes similar schemas with find_similar_schemas
2. Generates base structure with generate_schema_template
3. Validates structure with validate_schema
4. Creates schema with upsert_schema
5. Generates documentation with generate_docs
Result: New schema created with full documentation
Workflow 3: Data Quality Audit
User: "Validate all match data for the past week"
AI Agent (using MCP):
1. Retrieves match data with get_match_data
2. Identifies applicable schema with resolve_schema
3. Validates each record with validate_data
4. Identifies errors with find_validation_errors
5. Suggests fixes with suggest_corrections
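The validation and error-collection steps above could be driven by a loop like the following. This is a sketch only: `call_tool` is a stand-in supplied by the caller, `fake_call_tool` is an offline stub, and the response fields `valid` and `errors` are assumed, since the shape of the `validate_data` response is not documented here.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def audit_records(call_tool: ToolCaller, schema_id: str, records: list[dict[str, Any]]) -> dict[str, Any]:
    """Validate each record and collect the failures into a small report."""
    failures = []
    for record in records:
        result = call_tool("validate_data", {"schema_id": schema_id, "data": record})
        # `valid` and `errors` are assumed (hypothetical) response fields.
        if not result.get("valid", False):
            failures.append({"record": record, "errors": result.get("errors", [])})
    return {"checked": len(records), "failed": len(failures), "failures": failures}


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Offline stub standing in for the server: scores must be non-negative integers."""
    data = arguments["data"]
    errors = [
        f"{field} must be a non-negative integer"
        for field in ("home_score", "away_score")
        if not isinstance(data.get(field), int) or data[field] < 0
    ]
    return {"valid": not errors, "errors": errors}


report = audit_records(
    fake_call_tool,
    "basketball.match.result.v2",
    [
        {"match_id": "nba-2024-001", "home_score": 108, "away_score": 105},
        {"match_id": "nba-2024-002", "home_score": -3, "away_score": 99},
    ],
)
print(report["checked"], report["failed"])
```

Passing the tool caller in as a parameter keeps the loop testable without a running server; a real client method can be substituted once the response shape is known.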
Result: Data quality report with actionable fixes
Workflow 4: Documentation Generation
User: "Generate API documentation for all basketball schemas"
AI Agent (using MCP):
1. Lists basketball schemas with list_schemas
2. Retrieves each schema with get_schema
3. Generates docs with generate_documentation
4. Creates examples with generate_examples
5. Builds interactive docs with build_docs_site
Result: Complete API documentation website
Schema Management
Schema Versioning
Follow semantic versioning for schemas:
{
  "name": "version_schema",
  "arguments": {
    "schema_id": "soccer.match.result",
    "current_version": "2.1.0",
    "new_version": "3.0.0",
    "change_type": "breaking",
    "changelog": "Added required field: referee_id"
  }
}
Version Strategy:
- Major (3.0.0): Breaking changes
- Minor (2.1.0): New optional fields
- Patch (2.0.1): Documentation or bug fixes
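The version strategy above maps directly onto a small helper that computes the next version number. In this sketch only `"breaking"` comes from the `version_schema` example; the `"feature"` and `"fix"` change types and the function name `bump_version` are assumptions made for illustration.

```python
def bump_version(current: str, change_type: str) -> str:
    """Return the next semantic version for a given kind of change."""
    major, minor, patch = (int(part) for part in current.split("."))
    if change_type == "breaking":   # major: breaking changes
        return f"{major + 1}.0.0"
    if change_type == "feature":    # minor: new optional fields (assumed label)
        return f"{major}.{minor + 1}.0"
    if change_type == "fix":        # patch: documentation or bug fixes (assumed label)
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change_type}")


print(bump_version("2.1.0", "breaking"))
print(bump_version("2.1.0", "feature"))
print(bump_version("2.1.0", "fix"))
```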
Schema Inheritance
Create schema hierarchies:
{
  "base_schema": "sports.match.base",
  "derived_schemas": [
    "soccer.match",
    "basketball.match",
    "hockey.match"
  ],
  "inheritance_type": "composition"
}
Schema Relationships
Define how schemas relate to each other:
{
  "relationships": [
    {
      "from": "Match",
      "to": "Team",
      "type": "HAS_HOME_TEAM",
      "cardinality": "one-to-one"
    },
    {
      "from": "Match",
      "to": "Player",
      "type": "HAS_PLAYER",
      "cardinality": "one-to-many"
    }
  ]
}
Knowledge Graph Operations
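The graph tools walk typed relationships outward from a node, usually to a bounded distance (`depth` in `navigate_graph`, `max_hops` in `discover_relationships`). As a mental model only, the sketch below runs a depth-limited breadth-first traversal over an in-memory edge list. The edge data, the `Arena:Example` node, and the function name are illustrative assumptions and say nothing about how the server stores or traverses its graph.

```python
from collections import deque

# Each edge is (source, relationship_type, target).
Edge = tuple[str, str, str]


def neighbors_within(edges: list[Edge], start: str, relationship_types: set[str], depth: int) -> dict[str, int]:
    """Return every node reachable from `start` within `depth` hops, with its hop count."""
    adjacency: dict[str, list[str]] = {}
    for source, rel, target in edges:
        if rel in relationship_types:
            adjacency.setdefault(source, []).append(target)

    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == depth:
            continue  # do not expand beyond the requested depth
        for nxt in adjacency.get(node, []):
            if nxt not in distances:
                distances[nxt] = distances[node] + 1
                queue.append(nxt)
    return distances


edges = [
    ("League:NBA", "HAS_TEAM", "Team:Lakers"),
    ("Team:Lakers", "HAS_PLAYER", "Player:LeBronJames"),
    ("Team:Lakers", "PLAYS_IN_ARENA", "Arena:Example"),  # hypothetical node
]
print(neighbors_within(edges, "League:NBA", {"HAS_TEAM", "HAS_PLAYER"}, depth=2))
```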
Graph Queries
Use Cypher-like queries to explore relationships:
{
  "name": "query_graph",
  "arguments": {
    "query": "MATCH (l:League)-[:HAS_TEAM]->(t:Team)-[:HAS_PLAYER]->(p:Player) WHERE l.country = 'USA' AND p.position = 'QB' RETURN l.name, t.name, p.name, p.stats",
    "parameters": {},
    "limit": 100
  }
}
Graph Visualization
Generate visual representations:
{
  "name": "visualize_graph",
  "arguments": {
    "root_node": "League:NBA",
    "relationships": ["HAS_TEAM", "HAS_PLAYER", "PLAYS_IN_ARENA"],
    "depth": 3,
    "layout": "force_directed",
    "output_format": "svg"
  }
}
Relationship Discovery
Find hidden connections:
{
  "name": "discover_relationships",
  "arguments": {
    "node_a": "Player:LeBronJames",
    "node_b": "Team:Lakers",
    "max_hops": 4,
    "relationship_types": ["all"]
  }
}
Semantic Search
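Vector-based search usually ranks catalog entries by the similarity between a query embedding and stored embeddings. The toy sketch below uses hand-written three-dimensional vectors and cosine similarity purely to illustrate that ranking step. The vectors, the `top_k` helper, and the choice of cosine similarity are assumptions, not a description of how the server scores results.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], catalog: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Return the k catalog entries most similar to the query, best first."""
    scored = [(schema_id, cosine_similarity(query_vec, vec)) for schema_id, vec in catalog.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up embeddings; real ones would come from an embedding model.
catalog = {
    "basketball.player.stats.v2": [0.9, 0.1, 0.0],
    "soccer.match.events.v3": [0.1, 0.9, 0.0],
    "hockey.player.advanced_stats.v1": [0.7, 0.0, 0.7],
}
results = top_k([1.0, 0.0, 0.0], catalog, k=2)
print([schema_id for schema_id, _score in results])
```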
Vector-Based Search
Use AI embeddings for semantic understanding:
{
  "name": "semantic_search",
  "arguments": {
    "query": "How do I track player injuries across seasons?",
    "search_targets": ["schemas", "documentation", "examples"],
    "embedding_model": "text-embedding-3-large",
    "top_k": 5
  }
}
Search Filters
Narrow results with filters:
{
  "filters": {
    "sport": ["basketball", "soccer"],
    "category": "medical",
    "min_version": "2.0.0",
    "status": "published",
    "tags": ["injuries", "health"]
  }
}
Data Validation
Schema Validation
Ensure schemas follow standards:
{
  "name": "validate_schema",
  "arguments": {
    "schema_definition": {
      "type": "object",
      "properties": {...}
    },
    "validation_rules": [
      "check_naming_conventions",
      "check_required_fields",
      "check_data_types",
      "check_relationships"
    ]
  }
}
Data Quality Rules
Define custom validation rules:
{
  "name": "add_validation_rule",
  "arguments": {
    "schema_id": "basketball.player.stats",
    "rule": {
      "name": "points_must_be_non_negative",
      "field": "points",
      "condition": "value >= 0",
      "error_message": "Points cannot be negative"
    }
  }
}
Integration Examples
TypeScript Integration
import { MCPClient } from '@modelcontextprotocol/sdk';

const catalogClient = new MCPClient({
  serverUrl: 'http://localhost:3000/mcp',
  apiKey: process.env.CATALOG_API_KEY,
  role: 'knowledge_catalog'
});

// Search schemas
const schemas = await catalogClient.call('search_schemas', {
  query: 'basketball player statistics',
  format: 'typescript',
  limit: 10
});

// Get specific schema
const schema = await catalogClient.call('get_schema', {
  schema_id: 'basketball.player.stats.v2',
  format: 'json_schema',
  include_examples: true
});

// Navigate knowledge graph
const graph = await catalogClient.call('navigate_graph', {
  start_node: 'League:NBA',
  relationship_types: ['HAS_TEAM', 'HAS_PLAYER'],
  depth: 2
});
Python Integration
import os

from altsportsdata_mcp import MCPClient

client = MCPClient(
    api_key=os.getenv('CATALOG_API_KEY'),
    role='knowledge_catalog'
)

# Validate data
validation = client.call_tool('validate_data', {
    'schema_id': 'soccer.match.result.v2',
    'data': {
        'match_id': 'epl-2024-001',
        'home_score': 2,
        'away_score': 1
    }
})

# Create new schema
new_schema = client.call_tool('upsert_schema', {
    'schema_id': 'hockey.player.advanced_stats.v1',
    'definition': {...},
    'version': '1.0.0'
})
Example Prompts
Schema Discovery
"Find all schemas related to player injuries"
"What schemas are available for soccer match data?"
"Search for basketball statistics schemas"
Schema Management
"Create a new schema for rugby player stats"
"Update the basketball match schema to version 3.0"
"Show me the changelog for soccer.match.events"
Knowledge Graph
"Show me all teams in the Premier League"
"Find relationships between LeBron James and the Lakers"
"Visualize the NBA team hierarchy"
Validation
"Validate this match data against the schema"
"Check if all player records follow the required format"
"Find data quality issues in the last 100 matches"
Best Practices
Schema Design
- Consistency: Use consistent naming conventions
- Documentation: Every field should have a description
- Versioning: Follow semantic versioning strictly
- Validation: Define validation rules for critical fields
- Examples: Provide real-world examples
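Two of these practices, consistent naming and a description on every field, are easy to check mechanically before a schema is submitted. The sketch below lints a JSON-Schema-style definition locally. The snake_case convention is inferred from the field names used in this document, and `lint_schema` is a hypothetical helper, not one of the server's tools.

```python
import re
from typing import Any

# Assumed naming convention: lowercase words separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def lint_schema(definition: dict[str, Any]) -> list[str]:
    """Report fields that break the naming convention or lack a description."""
    problems = []
    for field, spec in definition.get("properties", {}).items():
        if not SNAKE_CASE.match(field):
            problems.append(f"{field}: name is not snake_case")
        if not spec.get("description"):
            problems.append(f"{field}: missing description")
    return problems


definition = {
    "type": "object",
    "properties": {
        "player_id": {"type": "string", "description": "Unique player identifier"},
        "corsi_for_percentage": {"type": "number"},
        "FenwickRating": {"type": "number", "description": "Fenwick rating"},
    },
}
problems = lint_schema(definition)
print(problems)
```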
Knowledge Graph
- Use clear, descriptive relationship names
- Avoid overly deep hierarchies (max 5 levels)
- Index frequently queried properties
- Regularly audit for orphaned nodes
- Document relationship semantics
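An orphaned-node audit amounts to finding nodes that appear in no relationship. A minimal sketch over an exported node set and edge list follows. The export format and the `Player:Unassigned` node are assumptions for illustration; on a live graph the same check would more likely run as a query.

```python
def find_orphans(nodes: set[str], edges: list[tuple[str, str, str]]) -> set[str]:
    """Return nodes that are neither the source nor the target of any edge."""
    connected: set[str] = set()
    for source, _relationship, target in edges:
        connected.add(source)
        connected.add(target)
    return nodes - connected


nodes = {
    "League:EPL",
    "Team:Liverpool",
    "Team:ManchesterUnited",
    "Season:2024-25",
    "Player:Unassigned",  # hypothetical node with no relationships
}
edges = [
    ("League:EPL", "HAS_TEAM", "Team:Liverpool"),
    ("League:EPL", "HAS_TEAM", "Team:ManchesterUnited"),
    ("League:EPL", "HAS_SEASON", "Season:2024-25"),
]
orphans = find_orphans(nodes, edges)
print(orphans)
```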
Search Optimization
- Maintain rich metadata on schemas
- Update search indices regularly
- Use tags for better categorization
- Provide multiple examples per schema
- Keep documentation current
Advanced Features
Schema Generation from Data
Automatically generate schemas from sample data:
{
  "name": "generate_schema_from_data",
  "arguments": {
    "sample_data": [
      {"player_id": "p1", "points": 25, "rebounds": 10},
      {"player_id": "p2", "points": 18, "rebounds": 7}
    ],
    "schema_name": "basketball.player.basic_stats",
    "infer_types": true,
    "suggest_constraints": true
  }
}
Cross-Catalog Synchronization
Sync schemas across multiple catalogs:
{
  "name": "sync_catalogs",
  "arguments": {
    "source_catalog": "production",
    "target_catalog": "staging",
    "schema_filters": {"status": "published"},
    "sync_mode": "incremental"
  }
}
Schema Analytics
Track schema usage and health:
{
  "name": "analyze_schema_usage",
  "arguments": {
    "schema_id": "basketball.match.result.v2",
    "metrics": [
      "validation_rate",
      "error_rate",
      "usage_count",
      "api_calls"
    ],
    "timeframe": "30d"
  }
}
Troubleshooting
Common Issues
Issue: Schema validation failing unexpectedly
Solutions:
- Check for schema version mismatches
- Verify required fields are present
- Review data types carefully
- Check for custom validation rules
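When a record fails unexpectedly, checking it locally for missing required fields and type mismatches can narrow down the cause before another round trip to the server. The sketch below is a deliberately simplified check against a JSON-Schema-style definition. The schema fragment is hypothetical (the real `basketball.match.result.v2` may differ), and it does not cover custom validation rules or the full JSON Schema specification.

```python
from typing import Any

# Map JSON Schema type names to the Python types they accept (simplified).
JSON_TYPES: dict[str, Any] = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def diagnose(definition: dict[str, Any], data: dict[str, Any]) -> list[str]:
    """List missing required fields and type mismatches for one record."""
    issues = [
        f"missing required field: {field}"
        for field in definition.get("required", [])
        if field not in data
    ]
    for field, spec in definition.get("properties", {}).items():
        if field not in data:
            continue
        type_name = spec.get("type")
        expected = JSON_TYPES.get(type_name)
        if expected is None:
            continue  # unknown or absent type: nothing to compare against
        value = data[field]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        bool_as_number = isinstance(value, bool) and type_name in ("number", "integer")
        if bool_as_number or not isinstance(value, expected):
            issues.append(f"{field}: expected {type_name}, got {type(value).__name__}")
    return issues


# Hypothetical schema fragment used only for this illustration.
definition = {
    "type": "object",
    "required": ["match_id", "home_score"],
    "properties": {
        "match_id": {"type": "string"},
        "home_score": {"type": "integer"},
        "away_score": {"type": "integer"},
    },
}
issues = diagnose(definition, {"match_id": "nba-2024-001", "away_score": "105"})
print(issues)
```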
Issue: Slow semantic search
Solutions:
- Update search indices
- Use more specific filters
- Limit search scope
- Cache frequent queries
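Caching frequent queries can be as simple as memoizing the search call on the client side. The sketch below uses `functools.lru_cache` from the Python standard library; `run_search` is a stand-in for a real `semantic_search` call, and the cache size of 256 is an arbitrary assumption.

```python
from functools import lru_cache

call_count = 0  # counts how often the (simulated) backend is actually hit


def run_search(query: str, top_k: int) -> tuple[str, ...]:
    """Stand-in for a real semantic_search call; returns placeholder results."""
    global call_count
    call_count += 1
    return tuple(f"{query}-result-{i}" for i in range(top_k))


@lru_cache(maxsize=256)
def cached_search(query: str, top_k: int = 5) -> tuple[str, ...]:
    """Memoized wrapper: identical (query, top_k) calls are served from memory."""
    return run_search(query, top_k)


first = cached_search("player injuries")
second = cached_search("player injuries")  # served from the cache
print(call_count, cached_search.cache_info().hits)
```

Cached results go stale when schemas or search indices change, so a real client would call `cached_search.cache_clear()` after such updates or add an expiry policy.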
Issue: Graph queries time out
Solutions:
- Reduce query depth
- Add indexes on query properties
- Use pagination
- Optimize relationship traversal
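Pagination keeps each graph request small instead of pulling one large result set. The sketch below is a generic offset-based paginator. The `limit` argument mirrors the one shown for `query_graph`, but the `offset` parameter and the `fetch_page` callable are assumptions, since this document does not describe how the server pages results.

```python
from typing import Any, Callable, Iterator


def paginate(fetch_page: Callable[[int, int], list[Any]], page_size: int = 100) -> Iterator[Any]:
    """Yield items page by page until a short (or empty) page signals the end."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size


# Offline stand-in for a paged graph query over 250 rows.
rows = [f"Player:{i}" for i in range(250)]


def fake_fetch(offset: int, limit: int) -> list[str]:
    return rows[offset:offset + limit]


collected = list(paginate(fake_fetch, page_size=100))
print(len(collected))
```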