Architecture

Interactive diagrams of the Health Dataspace v2 architecture — 5-layer graph model, data flows, deployment topology, service dependencies, and identity trust framework.

1. Five-Layer Knowledge Graph

The Neo4j knowledge graph organises health data across five architectural layers: DSP Marketplace (connector discovery), HealthDCAT-AP (dataset metadata), FHIR R4 (clinical data), OMOP CDM (research analytics), and Ontology (terminology alignment).

Fig 1. Five-layer knowledge graph architecture

2. Data Flow Pipeline

Synthetic patient data is generated by Synthea, loaded into Neo4j as FHIR R4 resources, and then transformed to OMOP CDM for research analytics. Each stage preserves full provenance through graph relationships.

Fig 2. End-to-end data flow from Synthea to analytics
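
To illustrate how provenance can be carried from stage to stage as graph relationships, here is a minimal in-memory sketch. The stage names follow the pipeline described above; the `DERIVED_FROM` relationship name, the `ProvenanceGraph` class, and the node identifiers are assumptions made for illustration and may not match the actual Neo4j model.

```python
from dataclasses import dataclass, field

# Stage names follow the pipeline described above. The DERIVED_FROM
# relationship name is an assumption made for this sketch.
STAGES = ["Synthea", "FHIR R4", "Neo4j", "OMOP CDM"]


@dataclass
class ProvenanceGraph:
    """Tiny in-memory stand-in for provenance edges in a graph database."""

    edges: list = field(default_factory=list)

    def link(self, derived: str, source: str) -> None:
        # Record that `derived` was produced from `source`.
        self.edges.append((derived, "DERIVED_FROM", source))

    def lineage(self, node: str) -> list:
        # Walk DERIVED_FROM edges back to the original source record.
        parents = {derived: source for derived, _, source in self.edges}
        chain = [node]
        while chain[-1] in parents:
            chain.append(parents[chain[-1]])
        return chain


def run_pipeline(record_id: str) -> ProvenanceGraph:
    """Pass one record through every stage, linking each output to its input."""
    graph = ProvenanceGraph()
    previous = f"{STAGES[0]}:{record_id}"
    for stage in STAGES[1:]:
        current = f"{stage}:{record_id}"
        graph.link(current, previous)
        previous = current
    return graph
```

Calling `run_pipeline("p1").lineage("OMOP CDM:p1")` walks the chain back through the Neo4j and FHIR R4 stages to the Synthea record, which is the kind of trace a provenance query over the graph would return.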

3. Deployment Topology

The full JAD stack runs 19+ Docker Compose services across six layers: infrastructure (Traefik, PostgreSQL, Vault, NATS, Keycloak), EDC-V / DCore (Control Plane, dual Data Planes), Identity (Identity Hub, Issuer Service), CFM (Tenant/Provision Managers, 4 background agents), Application (Neo4j, Proxy, UI), and a static GitHub Pages export. The same topology is deployed to Azure Container Apps (13 apps + 3 jobs, see ADR-012). Arrows show runtime dependencies.

Fig 3. Full deployment topology — 19+ services with dependency graph

4. Service Dependencies

Complete inventory of all services in the docker-compose.yml and docker-compose.jad.yml stacks, their exposed ports, upstream dependencies, and purpose.

| Service | Layer | Port(s) | Depends On | Purpose |
| --- | --- | --- | --- | --- |
| Traefik | Infrastructure | :80 / :8090 | -- | API gateway, reverse proxy, *.localhost routing |
| PostgreSQL 17 | Infrastructure | :5432 | -- | Shared database (8 DBs: controlplane, dataplane-fhir, dataplane-omop, identityhub, issuerservice, keycloak, tenant-mgr, provision-mgr) |
| HashiCorp Vault | Infrastructure | :8200 | -- | Secrets management (dev/in-memory mode, lost on restart) |
| NATS JetStream | Infrastructure | :4222 / :8222 | -- | Async event mesh for DSP protocol events |
| Keycloak | Infrastructure | :8080 / :9000 | PostgreSQL | OIDC SSO provider, realm edcv, 7 personas |
| vault-bootstrap | Infrastructure | -- | Vault, Keycloak | Init sidecar: seeds Vault secrets and Keycloak config |
| Control Plane | EDC-V / DCore | :11003 | PostgreSQL, Vault, NATS, Keycloak | EDC-V runtime: DSP negotiation, management API, policy engine |
| Data Plane FHIR | EDC-V / DCore | :11002 | PostgreSQL, Vault, Control Plane | DCore data plane for FHIR R4 resource transfer |
| Data Plane OMOP | EDC-V / DCore | :11012 | PostgreSQL, Vault, Control Plane | DCore data plane for OMOP CDM data transfer |
| Identity Hub | Identity | :11005 | PostgreSQL, Vault, Keycloak | DCP: DID resolution, Verifiable Credential storage |
| Issuer Service | Identity | :10013 | PostgreSQL, Vault, Keycloak | VC issuance: EHDS membership, data permits, org credentials |
| Tenant Manager | CFM | :11006 | PostgreSQL, Keycloak | CFM: multi-tenant participant management |
| Provision Manager | CFM | :11007 | PostgreSQL, Keycloak, Control Plane | CFM: automated resource provisioning |
| cfm-keycloak-agent | CFM | -- | Keycloak | Background: syncs Keycloak realm configuration |
| cfm-edcv-agent | CFM | -- | Control Plane | Background: manages EDC-V connector lifecycle |
| cfm-registration-agent | CFM | -- | Identity Hub | Background: handles participant DID registration |
| cfm-onboarding-agent | CFM | -- | Tenant Manager | Background: automates tenant onboarding workflows |
| Neo4j 5 | Application | :7474 / :7687 | -- | Knowledge graph: 5-layer model, APOC + n10s plugins |
| Neo4j SPE2 | Application | :7475 / :7688 | -- | Secondary graph instance (federated profile) |
| Neo4j Proxy | Application | :9090 | Neo4j, Control Plane | Express bridge: FHIR/OMOP REST endpoints over Neo4j |
| Next.js UI | Application | :3000 / :3003 | Neo4j Proxy | Graph Explorer: 16 pages, 36 API routes, 7 personas |
| jad-seed | Seed | -- | All services | One-shot: phases 1-7 data seeding (Synthea, FHIR, OMOP, DSP) |
| GitHub Pages | Static | -- | Next.js UI (static export) | Public demo site with mock data fixtures |
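
The "Depends On" column implies a valid startup order. As a rough sketch, the dependencies can be fed to a topological sort from Python's standard library. The service names are shortened from the inventory above, `jad-seed` and GitHub Pages are omitted for brevity, and the resulting order is derived for illustration only; it is not taken from the actual `depends_on` entries in the compose files.

```python
from graphlib import TopologicalSorter

# Upstream dependencies copied from the "Depends On" column above.
# Names are shortened (e.g. "PostgreSQL 17" -> "PostgreSQL"); jad-seed
# ("All services") and GitHub Pages are left out for brevity.
DEPENDS_ON = {
    "Traefik": [],
    "PostgreSQL": [],
    "Vault": [],
    "NATS": [],
    "Neo4j": [],
    "Keycloak": ["PostgreSQL"],
    "vault-bootstrap": ["Vault", "Keycloak"],
    "Control Plane": ["PostgreSQL", "Vault", "NATS", "Keycloak"],
    "Data Plane FHIR": ["PostgreSQL", "Vault", "Control Plane"],
    "Data Plane OMOP": ["PostgreSQL", "Vault", "Control Plane"],
    "Identity Hub": ["PostgreSQL", "Vault", "Keycloak"],
    "Issuer Service": ["PostgreSQL", "Vault", "Keycloak"],
    "Tenant Manager": ["PostgreSQL", "Keycloak"],
    "Provision Manager": ["PostgreSQL", "Keycloak", "Control Plane"],
    "cfm-keycloak-agent": ["Keycloak"],
    "cfm-edcv-agent": ["Control Plane"],
    "cfm-registration-agent": ["Identity Hub"],
    "cfm-onboarding-agent": ["Tenant Manager"],
    "Neo4j Proxy": ["Neo4j", "Control Plane"],
    "Next.js UI": ["Neo4j Proxy"],
}


def startup_order(deps: dict) -> list:
    """Return the services so that each one follows all of its dependencies."""
    # TopologicalSorter expects {node: predecessors}. static_order() raises
    # graphlib.CycleError if the dependency graph contains a cycle.
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for position, service in enumerate(startup_order(DEPENDS_ON), start=1):
        print(f"{position:2d}. {service}")
```

Several orders are valid for the same graph, so the exact sequence printed may differ from the order Docker Compose chooses; the only guarantee is that no service appears before something it depends on.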

5. DSP Contract Negotiation

The Dataspace Protocol (DSP) governs how data holders and data users negotiate access to health datasets. The EHDS regulation adds a prerequisite: a Health Data Access Body (HDAB) must approve the request and issue a data permit before contract negotiation can proceed.

Fig 4. DSP contract negotiation with EHDS compliance
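
The negotiation flow with the EHDS gate can be sketched as a small state machine. The state names follow the DSP contract negotiation states, but the transition table is simplified and should be checked against the DSP specification before relying on it. The `has_data_permit` flag and the `Negotiation` class are hypothetical and stand in for the HDAB approval and data permit check; they are not part of EDC-V.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "REQUESTED"
    OFFERED = "OFFERED"
    ACCEPTED = "ACCEPTED"
    AGREED = "AGREED"
    VERIFIED = "VERIFIED"
    FINALIZED = "FINALIZED"
    TERMINATED = "TERMINATED"


# Simplified transition table modelled on the DSP negotiation states.
ALLOWED = {
    State.REQUESTED: {State.OFFERED, State.AGREED, State.TERMINATED},
    State.OFFERED: {State.REQUESTED, State.ACCEPTED, State.TERMINATED},
    State.ACCEPTED: {State.AGREED, State.TERMINATED},
    State.AGREED: {State.VERIFIED, State.TERMINATED},
    State.VERIFIED: {State.FINALIZED, State.TERMINATED},
    State.FINALIZED: set(),
    State.TERMINATED: set(),
}


class Negotiation:
    """Consumer-initiated negotiation that refuses to start without a permit."""

    def __init__(self, has_data_permit: bool):
        # EHDS gate (hypothetical flag): the HDAB-issued data permit must
        # exist before the first DSP message is sent.
        if not has_data_permit:
            raise PermissionError("HDAB data permit required before negotiation")
        self.state = State.REQUESTED

    def advance(self, target: State) -> None:
        # Reject any move that the transition table does not list.
        if target not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.name} -> {target.name} is not allowed")
        self.state = target
```

For example, a permitted negotiation can move through `AGREED`, `VERIFIED`, and `FINALIZED` in turn, while `Negotiation(has_data_permit=False)` fails immediately with a `PermissionError`.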

6. Identity & Trust Framework

The Decentralized Claims Protocol (DCP) manages identity, credentials, and trust. Identity Hub stores DIDs and Verifiable Credentials, the Issuer Service mints EHDS-specific credentials, and Keycloak provides SSO/OIDC authentication.

Fig 5. DCP identity and trust architecture
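
Participant identities use the did:web method (see ADR-007), which maps a DID to an HTTPS URL where the DID document is hosted. The helper below is a minimal sketch of that mapping as defined by the did:web method specification; the function name and the example identifiers are illustrative and are not taken from the Identity Hub code.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Map a did:web identifier to the HTTPS URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web identifier")
    # Colons separate the domain from optional path segments.
    parts = did[len(prefix):].split(":")
    # A port is percent-encoded in the domain part (e.g. localhost%3A8080).
    domain = unquote(parts[0])
    path = parts[1:]
    if not domain or any(segment == "" for segment in path):
        raise ValueError("malformed did:web identifier")
    if path:
        # With path segments, did.json lives directly under that path.
        return f"https://{domain}/{'/'.join(path)}/did.json"
    # Without a path, the document sits under /.well-known/.
    return f"https://{domain}/.well-known/did.json"
```

With hypothetical identifiers, `did:web:example.com` maps to `https://example.com/.well-known/did.json`, and `did:web:example.com:participants:clinic-a` maps to `https://example.com/participants/clinic-a/did.json`.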

7. SIMPL-Open & Compliance

This reference implementation aligns with the EU SIMPL-Open programme for federated data spaces. The architecture satisfies EHDS regulation, DSP 2025-1, DCP v1.0, and supply chain transparency requirements.

SIMPL-Open Alignment

  • DSP 2025-1: Sovereign data exchange via Control Plane
  • DCP v1.0: DID:web identity + Verifiable Credentials
  • Trust Framework: Gaia-X compatible credential attestation
  • Federated Catalog: HealthDCAT-AP 3.0 metadata profiles
  • SBOM: CycloneDX 1.5 supply chain transparency

Regulatory Compliance

  • EHDS Art. 3-12: Patient rights (access, rectification, portability)
  • EHDS Art. 50-51: Secondary use — HDAB approval, data permits
  • GDPR Art. 15-22: Data subject rights enforcement
  • EU CRA Art. 13: SBOM mandate, vulnerability disclosure
  • BSI C5: Cloud security baseline (DEV, OPS controls)

8. Architecture Decision Records

All ADRs are maintained as standalone Markdown files in docs/ADRs/.

ADR-001
PostgreSQL / Neo4j Split

EDC runtime metadata in PostgreSQL (8 databases), health knowledge graph in Neo4j.

ADR-002
Dual EDC Data Planes

Separate FHIR R4 (PUSH) and OMOP CDM (PULL) data planes for type-safe access.

ADR-003
HealthDCAT-AP Alignment

Graph nodes aligned with HealthDCAT-AP 3.0 profile for EU catalog interoperability.

ADR-004
Next.js App Router

Client SPA with 36 API routes proxying to Neo4j and EDC-V; static export for demo.

ADR-005
Source Builds from Public Repos

Build EDC-V, DCore, CFM from source using Gradle multi-module layout.

ADR-006
GHCR Image Publishing

Publish OCI images to GitHub Container Registry for consistent deployments.

ADR-007
DID:web for Participant Identity

W3C DID:web method for decentralised participant identification.

ADR-008
Vitest + MSW + Playwright Testing

1,613 unit tests with MSW API mocking, 19 Playwright E2E specs, pre-push gate.

ADR-009
Credential Issuance Flow

EHDS membership, data permits, and org VCs issued via DCP Issuer Service.

ADR-010
WCAG 2.2 AA Accessibility

Zero WCAG violations enforced by automated axe-core audits in CI.

ADR-011
Security Penetration Testing

OWASP ZAP + BSI C5 baseline scans integrated into CI pipeline.

ADR-012
Azure Container Apps Deployment

13 Container Apps + 3 jobs on Azure with OIDC federation and VNet isolation.

ADR-013
SIMPL-Open Alignment

Gap analysis and alignment roadmap for EU SIMPL-Open programme compatibility.

ADR-014
Weekly Demo Reset

Scheduled Monday 05:15 UTC reset of demo state for GDPR data minimisation.

ADR-015
Single-VM Dev Deployment

Single-VM fallback deployment mode for personal Visual Studio subscription budgets.

ADR-016
ACA Off-Hours Scale-Down

Weekday business-hours scaling schedule to reduce Azure Container Apps cost by ~60%.

ADR-017
Persistent Storage for Stateful Services

Azure Files volume mounts for Neo4j, PostgreSQL, and Vault on Container Apps.

ADR-018
24×7 Operation — Workaround B

Postgres-on-ACA workaround for INF-STG-EU_EHDS subscription policy constraints.
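
As a worked example of the ADR-014 schedule, the sketch below computes the next Monday 05:15 UTC reset from a given time. In standard five-field cron syntax the same schedule is `15 5 * * 1`. The function name is hypothetical, and which scheduler actually triggers the reset is not stated here.

```python
from datetime import datetime, timedelta, timezone


def next_reset(now: datetime) -> datetime:
    """Return the next Monday 05:15 UTC strictly after `now`.

    `now` should be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=5, minute=15, second=0, microsecond=0)
    # weekday() is 0 for Monday, so this adds the days left until Monday.
    candidate += timedelta(days=(0 - now.weekday()) % 7)
    if candidate <= now:
        # This week's reset time has already passed; use the following Monday.
        candidate += timedelta(days=7)
    return candidate
```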

Diagram Legend

Solid lines — direct data flow or API calls
Dashed lines — mapping / transformation relationships
Subgraphs — logical boundary groupings
Participants — protocol actors in sequence diagrams