Source: data_layer/docs/ADR-001-SCHEMA-ARCHITECTURE.md

ADR-001: Schema Architecture - JSON Schema + Neo4j Cypher

Status: Accepted
Date: 2025-01-14
Decision Maker: Development Team
Context: ChatGPT conversation comparing GraphQL, x-graph JSON Schema, and separate schemas

Decision

Use JSON Schema + Neo4j Cypher as separate, synchronized schema systems with documented mappings.

Context

We needed to decide how to model data structures for:

API validation (FastAPI + Pydantic)
Graph relationships (Neo4j)
Type safety (TypeScript frontend)
Code generation (adapters)

Three approaches were considered:

GraphQL - Single SDL schema generating resolvers
x-graph JSON Schema - Vendor extensions embedding graph metadata
Separate schemas - JSON Schema (validation) + Cypher (graph)

Options Considered

Option 1: GraphQL

Pros:

Single source of truth
Flexible client queries
Strong tooling ecosystem

Cons:

❌ Wrong layer - query language, not data modeling
❌ Resolver infrastructure overhead
❌ Not needed for internal/AI-driven workflows
❌ Doesn't help with core need: validation + graph topology
❌ New paradigm for existing Python/FastAPI stack

Verdict: ❌ Rejected

Option 2: x-graph JSON Schema

Pros:

Single file per entity
Extends familiar JSON Schema

Cons:

❌ Non-standard vendor extensions
❌ No tooling support for x-graph conventions
❌ Mixes concerns (validation + graph topology)
❌ Custom parsing required

Example:

{
  "properties": {"id": {"type": "string"}},
  "x-graph": {
    "label": "League",
    "relationships": [{"type": "COMPETES_IN", "to": "Team"}]
  }
}
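The "custom parsing required" drawback can be made concrete with a minimal sketch. It assumes the x-graph layout shown in the example above; the function name `extract_graph_edges` is hypothetical and not part of any existing tool. Because `x-graph` is not a standard JSON Schema keyword, any graph metadata has to be pulled out by hand-written code like this:

```python
import json

# The x-graph example from this ADR, as a JSON document.
SCHEMA_TEXT = """
{
  "properties": {"id": {"type": "string"}},
  "x-graph": {
    "label": "League",
    "relationships": [{"type": "COMPETES_IN", "to": "Team"}]
  }
}
"""


def extract_graph_edges(schema: dict) -> list[tuple[str, str, str]]:
    """Return (from_label, relationship_type, to_label) triples from x-graph.

    Hypothetical helper: no standard JSON Schema tool provides this.
    """
    graph = schema.get("x-graph", {})
    label = graph.get("label")
    if label is None:
        return []
    return [(label, rel["type"], rel["to"]) for rel in graph.get("relationships", [])]


schema = json.loads(SCHEMA_TEXT)
edges = extract_graph_edges(schema)
print(edges)  # [('League', 'COMPETES_IN', 'Team')]
```

Every convention inside `x-graph` (key names, relationship shape, direction) would need the same kind of bespoke handling, which is the maintenance cost this option carries.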

Verdict: ❌ Rejected

Option 3: Separate Schemas (JSON + Cypher)

Pros:

✅ Separation of concerns (validation ≠ graph)
✅ Standard tooling works (JSON Schema validators, Neo4j tools)
✅ Already aligned with existing architecture
✅ Clear mental model
✅ Can evolve independently

Cons:

⚠️ Manual sync required
⚠️ Two files per entity

Mitigation:

Documentation in SCHEMA_MAPPING.md
Validation script (validate_schema_sync.py)
Future: unified YAML spec → generate both

Verdict: ✅ ACCEPTED

Implementation

Architecture

database/
├── schemas/domain/v1/*.json     # JSON Schema (validation)
│   └── → generate_adapters.py → Pydantic, TypeScript, Drizzle
│
└── sql/neo4j_*.cypher           # Neo4j schema (graph)
    └── → apply to Neo4j database

Documentation: docs/SCHEMA_MAPPING.md
Validation: scripts/validate_schema_sync.py

Responsibilities

JSON Schema:

API contracts
Data validation (types, formats, required fields)
Code generation (Pydantic, TypeScript)
Self-documentation

Neo4j Cypher:

Graph topology (nodes, relationships)
Constraints (uniqueness, existence)
Indexes (performance)
Graph traversal patterns
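
To illustrate the split in responsibilities, here is a minimal sketch of the two artifacts for one entity, held as Python values for convenience. The `League` fields (`id`, `name`), the constraint and index names, and the helper `missing_required` are assumptions made for illustration; they are not taken from the project's actual schema files. The Cypher uses the `CREATE CONSTRAINT ... FOR ... REQUIRE` form from recent Neo4j versions.

```python
# JSON Schema side: types, formats, required fields (illustrative fields only).
LEAGUE_JSON_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "League",
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
    },
    "required": ["id", "name"],
}

# Neo4j side: uniqueness constraint and index (illustrative names only).
LEAGUE_CYPHER = """
CREATE CONSTRAINT league_id_unique IF NOT EXISTS
FOR (l:League) REQUIRE l.id IS UNIQUE;

CREATE INDEX league_name_index IF NOT EXISTS
FOR (l:League) ON (l.name);
"""


def missing_required(schema: dict, payload: dict) -> list[str]:
    """List required fields absent from a payload.

    A stand-in for a real JSON Schema validator, which would also check types.
    """
    return [field for field in schema.get("required", []) if field not in payload]


print(missing_required(LEAGUE_JSON_SCHEMA, {"id": "league-1"}))  # ['name']
```

The schema knows nothing about relationships or indexes, and the Cypher knows nothing about payload shape; each file stays readable by its own standard tooling.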

Workflow

Define JSON Schema → regenerate adapters
Define Neo4j Cypher → apply to database
Document mapping in SCHEMA_MAPPING.md
Run validate_schema_sync.py
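
The final workflow step can be sketched as follows. This is not the contents of validate_schema_sync.py, which this ADR does not show; it is one possible shape for such a check, assuming constraints are written in the `FOR (x:Label) REQUIRE x.prop IS ...` form and that schemas are keyed by node label. The names `constrained_properties` and `find_drift` are hypothetical.

```python
import re

# Matches e.g. "FOR (l:League) REQUIRE l.id IS UNIQUE" or "... IS NOT NULL".
CONSTRAINT_RE = re.compile(
    r"FOR\s*\(\s*\w+\s*:\s*(\w+)\s*\)\s*REQUIRE\s+\w+\.(\w+)\s+IS\s+(?:UNIQUE|NOT\s+NULL)",
    re.IGNORECASE,
)


def constrained_properties(cypher: str) -> dict[str, set[str]]:
    """Map each node label to the properties that carry a constraint."""
    found: dict[str, set[str]] = {}
    for label, prop in CONSTRAINT_RE.findall(cypher):
        found.setdefault(label, set()).add(prop)
    return found


def find_drift(schemas: dict[str, dict], cypher: str) -> list[str]:
    """Report constrained labels/properties that have no JSON Schema counterpart."""
    problems: list[str] = []
    for label, props in sorted(constrained_properties(cypher).items()):
        schema = schemas.get(label)
        if schema is None:
            problems.append(f"{label}: constrained in Cypher but has no JSON Schema")
            continue
        for prop in sorted(props - set(schema.get("properties", {}))):
            problems.append(f"{label}.{prop}: constrained in Cypher but missing from JSON Schema")
    return problems


# Illustrative inputs: League lacks "name" in its schema, Team has no schema at all.
schemas = {"League": {"properties": {"id": {"type": "string"}}}}
cypher = """
CREATE CONSTRAINT league_id_unique IF NOT EXISTS FOR (l:League) REQUIRE l.id IS UNIQUE;
CREATE CONSTRAINT league_name_exists IF NOT EXISTS FOR (l:League) REQUIRE l.name IS NOT NULL;
CREATE CONSTRAINT team_id_unique IF NOT EXISTS FOR (t:Team) REQUIRE t.id IS UNIQUE;
"""
problems = find_drift(schemas, cypher)
for line in problems:
    print(line)
```

A regex is a deliberately simple choice here; it only covers single-property constraints and would need extending for composite keys or relationship constraints.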

Consequences

Positive

✅ Clear separation of concerns

Validation logic separate from graph topology
Standard tools for each domain

✅ Existing architecture preserved

No need to refactor Python/FastAPI stack
Schema-driven generation still works

✅ Tooling compatibility

JSON Schema validators work
Neo4j tools work
No custom parsing needed

✅ Evolution path

Can migrate to unified YAML spec later
Both schemas version independently

Negative

⚠️ Manual synchronization

Must keep JSON and Cypher in sync manually
Documentation overhead
Potential for drift

Mitigation: Validation script + clear documentation

⚠️ Two files per entity

More files to manage

Mitigation: Clear naming conventions + directory structure

Neutral

📝 Learning curve

Team must understand both systems
But both are industry standards

Future Enhancements

Phase 1 (Current)

✅ Separate JSON Schema + Cypher
✅ Documentation in SCHEMA_MAPPING.md
✅ Validation script

Phase 2 (Post-MVP)

Unified YAML spec format
Code generator reads YAML → outputs JSON + Cypher
Single source of truth maintained
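
As a rough sketch of what Phase 2 might look like, the generator below turns one unified spec into both outputs. The spec layout (`entity`, `properties`, `required`, `unique`) and the function names are assumptions, since the YAML format has not been designed yet. The spec is written as a Python dict to keep the example dependency-free; loading it from a YAML file would need a YAML parser.

```python
# Hypothetical unified spec, as it might look after parsing a YAML file.
SPEC = {
    "entity": "League",
    "properties": {
        "id": {"type": "string", "required": True, "unique": True},
        "name": {"type": "string", "required": True},
    },
}


def to_json_schema(spec: dict) -> dict:
    """Generate the validation side from the unified spec."""
    props = spec["properties"]
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": spec["entity"],
        "type": "object",
        "properties": {name: {"type": p["type"]} for name, p in props.items()},
        "required": [name for name, p in props.items() if p.get("required")],
    }


def to_cypher(spec: dict) -> list[str]:
    """Generate uniqueness constraints for the graph side from the same spec."""
    label = spec["entity"]
    var = label[0].lower()
    return [
        f"CREATE CONSTRAINT {label.lower()}_{name}_unique IF NOT EXISTS "
        f"FOR ({var}:{label}) REQUIRE {var}.{name} IS UNIQUE;"
        for name, p in spec["properties"].items()
        if p.get("unique")
    ]


json_schema = to_json_schema(SPEC)
cypher_statements = to_cypher(SPEC)
print(json_schema["required"])  # ['id', 'name']
print(cypher_statements[0])
```

With both files generated from one spec, the manual sync step and its drift risk would go away, at the cost of maintaining the generator.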

Phase 3 (Long-term)

CI/CD integration (fail builds on schema drift)
Auto-migration generator
Schema diff/change detection
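
Schema diff/change detection could start as small as the sketch below, which compares the `properties` of two versions of one JSON Schema. The function name `diff_properties` and the sample fields are hypothetical; a CI job might, for example, fail the build when the `removed` list is non-empty.

```python
def diff_properties(old: dict, new: dict) -> dict[str, list[str]]:
    """Compare top-level properties of two JSON Schema versions."""
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    shared = old_props.keys() & new_props.keys()
    return {
        "added": sorted(new_props.keys() - old_props.keys()),
        "removed": sorted(old_props.keys() - new_props.keys()),
        "changed": sorted(name for name in shared if old_props[name] != new_props[name]),
    }


# Illustrative schema versions (fields are made up for the example).
v1 = {"properties": {
    "id": {"type": "string"},
    "name": {"type": "string"},
    "founded": {"type": "integer"},
}}
v2 = {"properties": {
    "id": {"type": "string"},
    "name": {"type": "string", "minLength": 1},
    "country": {"type": "string"},
}}

changes = diff_properties(v1, v2)
print(changes)
# {'added': ['country'], 'removed': ['founded'], 'changed': ['name']}
```

This only looks at top-level properties; nested objects, `required` changes, and the Cypher side would each need their own comparison.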

References

ChatGPT Conversation: User provided side-by-side comparison
JSON Schema: https://json-schema.org/draft/2020-12/schema
Neo4j Cypher: https://neo4j.com/docs/cypher-manual/current/
Implementation Docs:

Related ADRs

ADR-002: Schema versioning strategy (future)
ADR-003: Migration to unified YAML spec (future)

Approval

Architecture reviewed
Implementation plan documented
Validation tooling created
Team informed

Author: AI Development Team
Reviewers: Project Architecture Team
Date: 2025-01-14
