Source: data_layer/docs/SCHEMA_MAPPING.md
Schema Mapping: JSON Schema ↔ Neo4j Cypher
Architecture Decision
Decision: Maintain separate JSON Schema and Neo4j Cypher schema files (no GraphQL, no x-graph extensions)
Rationale:
- JSON Schema = Validation layer (API contracts, data validation)
- Neo4j Cypher = Graph topology layer (relationships, constraints, indexes)
- GraphQL = Query language (not needed for internal AI-driven workflows)
- x-graph extensions = Non-standard hack (pollutes schemas, no tooling support)
Current Architecture
database/
├── schemas/domain/v1/*.json    # JSON Schema (validation)
│   └── Generated → Pydantic, TypeScript, Drizzle
│
└── sql/neo4j_*.cypher          # Neo4j schema (graph)
    └── Applied → Neo4j database
Manual sync is maintained via documentation (this file).
Entity Mapping
League Entity
JSON Schema: database/schemas/domain/v1/league_payload_schema.json
{
  "type": "object",
  "properties": {
    "league_id": {"type": "string", "pattern": "^[A-Z0-9_-]{3,40}$"},
    "name": {"type": "string", "minLength": 2},
    "sport": {"type": "string"},
    "tier": {"type": "string", "enum": ["T1", "T2", "T3", "T4"]},
    "verified": {"type": "boolean"}
  },
  "required": ["league_id", "name", "sport"]
}
Neo4j Cypher: database/sql/neo4j_comprehensive_schema.cypher
// Constraints
CREATE CONSTRAINT league_id_unique IF NOT EXISTS
FOR (l:League) REQUIRE l.id IS UNIQUE;
// Indexes
CREATE INDEX league_name_idx IF NOT EXISTS
FOR (l:League) ON (l.name);
// Node properties (template with placeholder values)
CREATE (league:League {
  id: 'league_id',   // maps to league_id in JSON
  name: 'string',    // maps to name in JSON
  sport: 'string',   // maps to sport in JSON
  tier: 'T1',        // one of T1, T2, T3, T4; maps to tier in JSON
  verified: false    // boolean; maps to verified in JSON
})
Mapping:
| JSON Schema Property | Neo4j Property | Notes |
|---|---|---|
| `league_id` | `id` | Neo4j uses `id` (shorter) |
| `name` | `name` | 1:1 mapping |
| `sport` | `sport` | 1:1 mapping |
| `tier` | `tier` | Enum validated in JSON, string in Neo4j |
| `verified` | `verified` | 1:1 mapping |
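The League mapping table can be applied mechanically when a validated payload is written to the graph. The following is a minimal sketch, not the project's actual adapter: the names `LEAGUE_PROPERTY_MAP`, `league_payload_to_node_params`, and `MERGE_LEAGUE`, as well as the sample payload values, are hypothetical, and the driver call that would execute the query is left out.

```python
# Property renames taken from the League mapping table (JSON name -> Neo4j name).
LEAGUE_PROPERTY_MAP = {
    "league_id": "id",
    "name": "name",
    "sport": "sport",
    "tier": "tier",
    "verified": "verified",
}


def league_payload_to_node_params(payload: dict) -> dict:
    """Rename payload keys to Neo4j property names; unmapped keys are dropped."""
    return {
        neo4j_key: payload[json_key]
        for json_key, neo4j_key in LEAGUE_PROPERTY_MAP.items()
        if json_key in payload
    }


# Parameterized Cypher (hypothetical query text); MERGE keeps the write idempotent.
MERGE_LEAGUE = "MERGE (l:League {id: $props.id}) SET l += $props"

# Illustrative payload, not real data.
params = league_payload_to_node_params(
    {"league_id": "NBA_2025", "name": "NBA", "sport": "basketball", "tier": "T1"}
)
print(params)
```

Keeping the rename table in one dictionary means the only place `league_id` turns into `id` is next to the documented mapping, which makes drift easier to spot.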
Team Entity
JSON Schema: database/schemas/domain/v1/team_schema.json (if exists)
{
  "type": "object",
  "properties": {
    "team_id": {"type": "string"},
    "name": {"type": "string", "minLength": 2},
    "league_id": {"type": "string"}
  }
}
Neo4j Cypher:
CREATE CONSTRAINT team_id_unique IF NOT EXISTS
FOR (t:Team) REQUIRE t.id IS UNIQUE;
CREATE (team:Team {
  id: 'team_id',
  name: 'string'
})
// Relationship to League
CREATE (team)-[:COMPETES_IN {season: 'string'}]->(league:League)
Mapping:
| JSON Schema | Neo4j Graph | Notes |
|---|---|---|
| `team_id` | `Team.id` | Property mapping |
| `name` | `Team.name` | Property mapping |
| `league_id` (foreign key) | `[:COMPETES_IN]->(League)` | FK becomes relationship |
Player Entity
JSON Schema: database/schemas/domain/v1/player_schema.json (if exists)
{
  "type": "object",
  "properties": {
    "player_id": {"type": "string"},
    "full_name": {"type": "string"},
    "team_id": {"type": "string"}
  }
}
Neo4j Cypher:
CREATE CONSTRAINT player_id_unique IF NOT EXISTS
FOR (p:Player) REQUIRE p.id IS UNIQUE;
CREATE (player:Player {
  id: 'player_id',
  fullName: 'string'
})
CREATE (player)-[:PLAYS_FOR {since: date()}]->(team:Team)
Mapping:
| JSON Schema | Neo4j Graph | Notes |
|---|---|---|
| `player_id` | `Player.id` | Property mapping |
| `full_name` | `Player.fullName` | camelCase in Neo4j |
| `team_id` (FK) | `[:PLAYS_FOR]->(Team)` | FK becomes relationship |
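The Team and Player mappings above follow two recurring rules: snake_case property names become camelCase, and foreign-key fields become relationships. A small sketch of both rules follows. The helper names and the `FK_RELATIONSHIPS` table are hypothetical, and the `*_id` to `id` rename is assumed to be handled separately as a documented exception.

```python
def snake_to_camel(name: str) -> str:
    """Convert a snake_case JSON property name to a camelCase Neo4j property name."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


# Foreign-key fields from the mapping tables:
# (source label, JSON field) -> (relationship type, target label).
FK_RELATIONSHIPS = {
    ("Team", "league_id"): ("COMPETES_IN", "League"),
    ("Player", "team_id"): ("PLAYS_FOR", "Team"),
}


def fk_to_relationship_cypher(label: str, fk_field: str) -> str:
    """Build parameterized Cypher linking two existing nodes for a foreign key."""
    rel_type, target_label = FK_RELATIONSHIPS[(label, fk_field)]
    return (
        f"MATCH (a:{label} {{id: $source_id}}) "
        f"MATCH (b:{target_label} {{id: $target_id}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )


print(snake_to_camel("full_name"))
print(fk_to_relationship_cypher("Team", "league_id"))
```

Using `MATCH` for both endpoints avoids the side effect of a bare `CREATE ... ->(league:League)`, which would create a new, empty League node on every write.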
Relationship Mapping Patterns
Pattern 1: Foreign Key → Relationship
JSON Schema (relational thinking; league_id is a foreign key):
{
  "team": {
    "team_id": "TEAM_001",
    "league_id": "LEAGUE_001"
  }
}
Neo4j (graph thinking):
(team:Team {id: 'TEAM_001'})-[:COMPETES_IN]->(league:League {id: 'LEAGUE_001'})
Pattern 2: Nested Objects → Relationship Properties
JSON Schema:
{
"player": {
"player_id": "PLR_001",
"contract": {
"start_date": "2025-01-01",
"end_date": "2026-12-31"
}
}
}
Neo4j:
(player:Player)-[:HAS_CONTRACT {
  startDate: date('2025-01-01'),
  endDate: date('2026-12-31')
}]->(contract:Contract)
Pattern 3: Array of IDs → Multiple Relationships
JSON Schema:
{
"league": {
"league_id": "LEAGUE_001",
"team_ids": ["TEAM_001", "TEAM_002", "TEAM_003"]
}
}
Neo4j (one COMPETES_IN relationship per team, all pointing at the same League node):
(team1:Team {id: 'TEAM_001'})-[:COMPETES_IN]->(league:League {id: 'LEAGUE_001'}),
(team2:Team {id: 'TEAM_002'})-[:COMPETES_IN]->(league),
(team3:Team {id: 'TEAM_003'})-[:COMPETES_IN]->(league)
Type Mapping
| JSON Schema Type | Neo4j Type | Notes |
|---|---|---|
| `string` | STRING | Direct mapping |
| `number` | INTEGER or FLOAT | Depends on use case; a JSON Schema `integer` maps to INTEGER |
| `boolean` | BOOLEAN | Direct mapping |
| `string` (format: `date`) | DATE | Use `date()` function |
| `string` (format: `date-time`) | DATETIME | Use `datetime()` function |
| `array` | LIST | Neo4j native list type |
| `object` | Node with relationship | Nested object → separate node |
| `enum` | STRING | Enum validated in JSON, stored as a plain string in Neo4j |
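The type table can be expressed as a small lookup function, which is one way a generator or sync check might apply it. This is a sketch under assumptions: the function name `neo4j_type_for` is hypothetical, `number` is mapped to FLOAT and `integer` to INTEGER as a default for the "depends on use case" row, and `NODE` is only a marker meaning "model as a separate node plus relationship".

```python
def neo4j_type_for(prop_schema: dict) -> str:
    """Return the Neo4j type name for one JSON Schema property definition."""
    if "enum" in prop_schema:
        # Enum membership is validated in JSON Schema; Neo4j stores a plain string.
        return "STRING"

    json_type = prop_schema.get("type")
    fmt = prop_schema.get("format")

    if json_type == "string":
        if fmt == "date":
            return "DATE"
        if fmt == "date-time":
            return "DATETIME"
        return "STRING"

    simple_types = {
        "integer": "INTEGER",
        "number": "FLOAT",   # assumption: default for the INTEGER-or-FLOAT row
        "boolean": "BOOLEAN",
        "array": "LIST",
        "object": "NODE",    # marker: nested object becomes a separate node
    }
    if json_type in simple_types:
        return simple_types[json_type]
    raise ValueError(f"unsupported JSON Schema type: {json_type!r}")


print(neo4j_type_for({"type": "string", "format": "date-time"}))
print(neo4j_type_for({"type": "string", "enum": ["T1", "T2", "T3", "T4"]}))
```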
Validation Responsibilities
JSON Schema Validates:
- ✅ Data types (string, number, boolean)
- ✅ Required fields
- ✅ Format constraints (email, URL, date)
- ✅ Pattern matching (regex)
- ✅ Min/max length
- ✅ Enum values
- ✅ Nested object structures
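Neo4j Validates:
- ✅ Uniqueness constraints (node IDs)
- ✅ Existence constraints (required properties; available in Neo4j Enterprise Edition)
- ✅ Relationship cardinality (by convention; Neo4j has no built-in cardinality constraint, so this is enforced in write queries or application code)
- ✅ Graph topology (valid relationships; likewise enforced by the write path rather than by a database constraint)
- ✅ Index performance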
Principle: JSON Schema guards API boundaries, Neo4j guards graph integrity.
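To make the boundary check concrete, here is a hand-rolled sketch of the League rules from the JSON Schema earlier in this document, using only the standard library. It is a simplified illustration, not the project's validator: a real setup would more likely run a JSON Schema validator or the generated Pydantic models, and the function name `league_payload_errors` is hypothetical.

```python
import re

# Rules copied from league_payload_schema.json as shown in this document.
LEAGUE_ID_PATTERN = re.compile(r"^[A-Z0-9_-]{3,40}$")
ALLOWED_TIERS = ("T1", "T2", "T3", "T4")
REQUIRED_FIELDS = ("league_id", "name", "sport")


def league_payload_errors(payload: dict) -> list:
    """Return a list of validation errors; an empty list means the payload passes."""
    errors = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in payload
    ]

    league_id = payload.get("league_id")
    if league_id is not None and not (
        isinstance(league_id, str) and LEAGUE_ID_PATTERN.fullmatch(league_id)
    ):
        errors.append("league_id must match ^[A-Z0-9_-]{3,40}$")

    name = payload.get("name")
    if name is not None and not (isinstance(name, str) and len(name) >= 2):
        errors.append("name must be a string of at least 2 characters")

    tier = payload.get("tier")
    if tier is not None and tier not in ALLOWED_TIERS:
        errors.append("tier must be one of T1, T2, T3, T4")

    return errors


# Only payloads with no errors should reach the Neo4j write path.
print(league_payload_errors({"league_id": "NBA_2025", "name": "NBA", "sport": "basketball"}))
```

Uniqueness of `id` is deliberately not checked here; that stays with the Neo4j constraint.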
Sync Workflow
When Creating New Entity:
- Define JSON Schema first (database/schemas/domain/v1/new_entity.schema.json)
  - Define properties, types, validation rules
  - Generate Pydantic models: ./scripts/regenerate_adapters.sh
- Define Neo4j schema (database/sql/neo4j_comprehensive_schema.cypher)
  - Add constraints for unique IDs
  - Add indexes for common queries
  - Define node label and properties
  - Define relationships to other entities
- Document mapping (update this file)
  - Add entity to "Entity Mapping" section
  - Document property name differences
  - Document relationship patterns
- Validate sync (manual check)
  - Every JSON Schema property has corresponding Neo4j property OR relationship
  - Every Neo4j node has corresponding JSON Schema
  - Foreign keys in JSON → relationships in Neo4j
When Modifying Entity:
- Update JSON Schema → regenerate adapters
- Update Neo4j Cypher → apply migration
- Update this mapping doc → keep documentation in sync
- Run validation script (see below)
Validation Script
Location: database/scripts/validate_schema_sync.py (to be created)
Purpose: Automated check that JSON Schema and Neo4j Cypher stay in sync
Checks:
- Every JSON Schema entity has Neo4j node definition
- Every Neo4j node has JSON Schema definition
- Property names match (with documented exceptions)
- Foreign keys map to relationships
- Required fields in JSON have existence constraints in Neo4j
Usage:
python database/scripts/validate_schema_sync.py
# Output: PASS or list of mismatches
Examples: Common Scenarios
Scenario 1: Adding New Property
Step 1: Update JSON Schema (website_url is the new property)
{
  "properties": {
    "league_id": {"type": "string"},
    "name": {"type": "string"},
    "website_url": {"type": "string", "format": "uri"}
  }
}
Step 2: Update Neo4j Cypher
// Add property to node template
CREATE (league:League {
  id: 'league_id',
  name: 'string',
  websiteUrl: 'string' // NEW (camelCase)
})
Step 3: Document in this file
| `website_url` | `websiteUrl` | Snake case → camel case |
Scenario 2: Adding New Relationship
Step 1: Update JSON Schema (optional; the relationship may be implicit via a foreign key). Here sportsbook_ids is the new field:
{
  "league": {
    "sportsbook_ids": ["array", "of", "ids"]
  }
}
Step 2: Define Neo4j relationship
CREATE (league:League)-[:PARTNERS_WITH {
  since: date('2025-01-01')
}]->(sportsbook:Sportsbook)
Step 3: Document pattern
### Pattern: Many-to-Many via Array
JSON: `sportsbook_ids` array
Neo4j: `[:PARTNERS_WITH]` relationships
Future Enhancements
Option 1: Unified YAML Spec (Long-term)
Proposed: Single source of truth generating both JSON Schema and Cypher
# unified.graph.yaml
entities:
  League:
    properties:
      id: {type: string, unique: true, indexed: true}
      name: {type: string, required: true, indexed: true}
      sport: {type: string, required: true}
    relationships:
      teams:
        type: COMPETES_IN
        direction: in
        from: Team
Generated:
- league.schema.json (JSON Schema)
- league.cypher (Neo4j constraints + indexes)
- league.py (Pydantic models)
- league.ts (TypeScript types)
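To illustrate how a unified spec could drive both outputs, the sketch below generates a JSON Schema fragment and Cypher statements from a Python dict that mirrors the proposed YAML. Everything here is hypothetical: the functions `to_json_schema` and `to_cypher` do not exist in the repository, the dict stands in for a parsed YAML file, and relationships are not handled.

```python
# Python mirror of the League entry in the proposed unified.graph.yaml.
LEAGUE_SPEC = {
    "properties": {
        "id": {"type": "string", "unique": True, "indexed": True},
        "name": {"type": "string", "required": True, "indexed": True},
        "sport": {"type": "string", "required": True},
    }
}


def to_json_schema(spec: dict) -> dict:
    """Emit the validation side: property types and the required list."""
    props = spec["properties"]
    return {
        "type": "object",
        "properties": {name: {"type": p["type"]} for name, p in props.items()},
        "required": [name for name, p in props.items() if p.get("required")],
    }


def to_cypher(label: str, spec: dict) -> list:
    """Emit the graph side: uniqueness constraints and indexes."""
    var = label[0].lower()
    prefix = label.lower()
    statements = []
    for name, p in spec["properties"].items():
        if p.get("unique"):
            # A uniqueness constraint is backed by an index, so no separate index.
            statements.append(
                f"CREATE CONSTRAINT {prefix}_{name}_unique IF NOT EXISTS "
                f"FOR ({var}:{label}) REQUIRE {var}.{name} IS UNIQUE;"
            )
        elif p.get("indexed"):
            statements.append(
                f"CREATE INDEX {prefix}_{name}_idx IF NOT EXISTS "
                f"FOR ({var}:{label}) ON ({var}.{name});"
            )
    return statements


print(to_json_schema(LEAGUE_SPEC)["required"])
for statement in to_cypher("League", LEAGUE_SPEC):
    print(statement)
```

The generated statement names follow the `league_id_unique` and `league_name_idx` convention already used in the Cypher schema shown earlier.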
Option 2: Validation Automation
Script: database/scripts/validate_schema_sync.py
Features:
- Parse JSON Schema files
- Parse Cypher schema files
- Compare entity definitions
- Report mismatches
- Integrate into CI/CD pipeline
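Since validate_schema_sync.py has not been written yet, the following sketch shows one possible starting point for the parse, compare, and report features listed above. It is an assumption-heavy outline: the regex only recognizes uniqueness constraints in the `FOR (x:Label) REQUIRE x.prop IS UNIQUE` form used in this document, and the function names and argument shapes are hypothetical.

```python
import re

# Matches e.g. "FOR (l:League) REQUIRE l.id IS UNIQUE".
UNIQUE_CONSTRAINT_RE = re.compile(
    r"FOR\s+\((\w+):(\w+)\)\s+REQUIRE\s+\1\.(\w+)\s+IS\s+UNIQUE",
    re.IGNORECASE,
)


def unique_constraints(cypher: str) -> dict:
    """Map each node label to the property covered by a uniqueness constraint."""
    return {m.group(2): m.group(3) for m in UNIQUE_CONSTRAINT_RE.finditer(cypher)}


def property_mismatches(json_props, neo4j_props, renames, fk_fields) -> list:
    """Compare property names, honoring documented renames and FK-to-relationship fields."""
    expected = {renames.get(p, p) for p in json_props if p not in fk_fields}
    actual = set(neo4j_props)
    problems = [f"missing in Neo4j: {p}" for p in sorted(expected - actual)]
    problems += [f"missing in JSON Schema: {p}" for p in sorted(actual - expected)]
    return problems


CYPHER = """
CREATE CONSTRAINT league_id_unique IF NOT EXISTS
FOR (l:League) REQUIRE l.id IS UNIQUE;
CREATE CONSTRAINT team_id_unique IF NOT EXISTS
FOR (t:Team) REQUIRE t.id IS UNIQUE;
"""

print(unique_constraints(CYPHER))
team_problems = property_mismatches(
    json_props=["team_id", "name", "league_id"],
    neo4j_props=["id", "name"],
    renames={"team_id": "id"},
    fk_fields={"league_id"},
)
print(team_problems or "PASS")
```

In a CI job, a non-empty mismatch list would translate into a non-zero exit code so that drift blocks the merge.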
Key Principles
- JSON Schema = Validation → API contract, data validation, type safety
- Neo4j Cypher = Graph → Relationships, constraints, topology
- Keep them separate → Different concerns, different tools
- Document mapping → This file is the bridge
- Automate checks → Validation script prevents drift
Questions?
- "Should I add x-graph extensions to JSON Schema?" → No, keep concerns separate
- "Should I use GraphQL?" → No, not needed for internal AI workflows
- "How do I keep them in sync?" → Follow this doc + validation script
- "Can I generate one from the other?" → Future enhancement (unified YAML spec)
Related Documentation
- database/schemas/README.md - JSON Schema system
- database/sql/neo4j_comprehensive_schema.cypher - Current Neo4j schema
- Parent CLAUDE.md - Overall architecture
- database/CLAUDE.md - Database layer architecture
Last Updated: 2025-01-14
Maintained By: Development team
Review Frequency: Update when entities change