Developer Guide

Technical documentation for developing, testing, and deploying the Health Dataspace v2 platform — an EHDS regulation reference implementation built on Eclipse Dataspace Components.

Onboarding

New to the project? Follow this quick-start path to get productive within your first day.

1. Get Running

  1. Clone the repository
  2. docker compose up -d (Neo4j)
  3. Seed schema & data (cypher-shell)
  4. cd ui && npm install && npm run dev
  5. Open http://localhost:3000

2. Explore the Platform

  • Switch personas via the User Menu
  • Explore the Graph Explorer (center node)
  • Browse the Data Catalog
  • Check Patient Portal (PATIENT role)
  • Review ODRL policies (HDAB role)

3. Key Concepts

  • DSP: Dataspace Protocol for sovereign data exchange
  • DCP: Decentralized Claims Protocol for DID + VC identity
  • FHIR R4: Clinical data standard (EHR)
  • OMOP CDM: Analytics layer for research
  • EHDS: EU regulation for health data sharing

Full Onboarding Guide →

JAD Stack Architecture

The full JAD (Java Application Deployment) stack runs 19 Docker services orchestrated via docker-compose.yml + docker-compose.jad.yml. Services are grouped into five layers:

JAD stack — 19 services with dependency relationships
| Service | Port | Traefik | Purpose | Depends On |
|---|---|---|---|---|
| Traefik | :80 / :8090 | traefik.localhost | API gateway & reverse proxy | - |
| PostgreSQL 17 | :5432 | - | Runtime store (8 databases) | - |
| Vault | :8200 | vault.localhost | Secret management (dev mode) | - |
| Keycloak | :8080 | keycloak.localhost | OIDC SSO (realm: edcv, 7 users) | PostgreSQL |
| NATS | :4222 / :8222 | - | Async event mesh (JetStream) | - |
| Control Plane | :11003 | cp.localhost | DSP protocol + management API | PG, Vault, NATS, KC |
| Data Plane FHIR | :11002 | dp-fhir.localhost | FHIR PUSH transfer type | PG, Vault, CP |
| Data Plane OMOP | :11012 | dp-omop.localhost | OMOP PULL transfer type | PG, Vault, CP |
| Identity Hub | :11005 | ih.localhost | DCP v1.0: DID + VC store | PG, Vault, KC |
| Issuer Service | :10013 | issuer.localhost | VC issuance + DID:web | PG, Vault, KC |
| Tenant Manager | :11006 | tm.localhost | CFM tenant lifecycle | PG, KC |
| Provision Manager | :11007 | pm.localhost | CFM resource provisioning | PG, KC, CP |
| Neo4j 5 | :7474 / :7687 | - | Knowledge graph (APOC + n10s) | - |
| Neo4j Proxy | :9090 | proxy.localhost | Express bridge: UI ↔ Neo4j | CP |
| Next.js UI | :3000 / :3003 | - | Application frontend | Neo4j |

Plus 4 background CFM agents (keycloak, edcv, registration, onboarding) and 1 one-shot seed container. Vault-bootstrap runs as a sidecar.
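
The Depends On column implies a startup order. The following is a minimal sketch of how that order could be derived with a topological sort; it is not the project's orchestration, which Docker Compose handles. It assumes PG means PostgreSQL, KC means Keycloak, and CP means Control Plane, and the names dependsOn and startupOrder are illustrative only.

```typescript
// Dependency edges as read from the service table (abbreviations expanded by assumption).
const dependsOn: Record<string, string[]> = {
  "Traefik": [],
  "PostgreSQL": [],
  "Vault": [],
  "Keycloak": ["PostgreSQL"],
  "NATS": [],
  "Control Plane": ["PostgreSQL", "Vault", "NATS", "Keycloak"],
  "Data Plane FHIR": ["PostgreSQL", "Vault", "Control Plane"],
  "Data Plane OMOP": ["PostgreSQL", "Vault", "Control Plane"],
  "Identity Hub": ["PostgreSQL", "Vault", "Keycloak"],
  "Issuer Service": ["PostgreSQL", "Vault", "Keycloak"],
  "Tenant Manager": ["PostgreSQL", "Keycloak"],
  "Provision Manager": ["PostgreSQL", "Keycloak", "Control Plane"],
  "Neo4j": [],
  "Neo4j Proxy": ["Control Plane"],
  "Next.js UI": ["Neo4j"],
};

// Depth-first topological sort: a service is emitted only after all of its dependencies.
function startupOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const state: Record<string, "visiting" | "done"> = {};

  const visit = (name: string): void => {
    if (state[name] === "done") return;
    if (state[name] === "visiting") {
      throw new Error(`Dependency cycle detected at ${name}`);
    }
    state[name] = "visiting";
    for (const dep of graph[name] ?? []) visit(dep);
    state[name] = "done";
    order.push(name);
  };

  Object.keys(graph).forEach((name) => visit(name));
  return order;
}

const order = startupOrder(dependsOn);
```

In the resulting list, PostgreSQL precedes Keycloak, which precedes the Control Plane, which precedes both data planes. A cycle in the edges raises an error instead of looping forever.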

Prerequisites

Required

  • Node.js 20+ — runtime for UI and proxy
  • Docker Desktop — with Docker Compose V2
  • 8 GB Docker RAM — required for full JAD stack
  • Git — with pre-commit hooks enabled

Optional

  • Python 3.11+ — for Synthea FHIR data loading
  • gitleaks — local secret scanning (brew install gitleaks)
  • lychee — broken link checker (brew install lychee)
  • Playwright browsers — for E2E tests (npx playwright install)

Port Requirements

The JAD stack requires these ports to be free: 80, 3000, 3003, 4222, 5432, 7474, 7687, 8080, 8090, 8200, 8222, 9090, 10013, 11002, 11003, 11005, 11006, 11007, 11012

Quick Start — Minimal Stack

Run Neo4j + Next.js UI with synthetic data. No JAD services needed.

1. Start Neo4j & load schema

docker compose up -d

# Initialize schema (idempotent — safe to re-run)
cat neo4j/init-schema.cypher | \
  docker exec -i health-dataspace-neo4j \
  cypher-shell -u neo4j -p healthdataspace

# Load synthetic data (127 patients, 5300+ nodes)
cat neo4j/insert-synthetic-schema-data.cypher | \
  docker exec -i health-dataspace-neo4j \
  cypher-shell -u neo4j -p healthdataspace

2. Start the UI

cd ui
npm install
npm run dev          # → http://localhost:3000

npm test             # Run 1,613 unit tests
npm run lint         # ESLint (max 55 warnings)

Quick Start — Full JAD Stack

The bootstrap script starts all 19 services with health checks, initializes Vault secrets, imports the Keycloak realm, and runs the 7-phase seed pipeline.

1. Bootstrap everything

# Full stack — takes ~3-5 min on first run
./scripts/bootstrap-jad.sh

# Check status & endpoints
./scripts/bootstrap-jad.sh --status

2. Seed the dataspace

# Run all 7 seed phases (sequential, strict order)
./jad/seed-all.sh

# Resume from a specific phase
./jad/seed-all.sh --from 3

# Run only one phase
./jad/seed-all.sh --only 5

3. Access the platform

# Live UI (production build)
open http://localhost:3003

# Keycloak Admin Console
open http://keycloak.localhost  # admin / admin

# Neo4j Browser
open http://localhost:7474      # neo4j / healthdataspace

# Traefik Dashboard
open http://traefik.localhost

Common operations

./scripts/bootstrap-jad.sh --ui-only   # Rebuild UI only (fast)
./scripts/bootstrap-jad.sh --seed      # Re-run seed pipeline
./scripts/bootstrap-jad.sh --pull      # Pull latest images
./scripts/bootstrap-jad.sh --down      # Stop all services
./scripts/bootstrap-jad.sh --reset     # Stop + remove volumes

Data Seeding Pipeline

The 7-phase seed pipeline populates the dataspace with tenants, credentials, policies, assets, and contracts. Phases must run in strict order — each depends on the previous.

| Phase | Script | Target Service | What It Does |
|---|---|---|---|
| 1 | seed-health-tenants.sh | Tenant Manager | Create 5 participant tenants via CFM |
| 2 | seed-ehds-credentials.sh | Issuer Service | Register EHDS credential types |
| 3 | seed-ehds-policies.sh | Control Plane | Create ODRL policies for all participants |
| 4 | seed-data-assets.sh | Control Plane | Register data assets + contracts |
| 5 | seed-contract-negotiation.sh | Control Plane | PharmaCo ↔ AlphaKlinik negotiations + data planes |
| 6 | seed-federated-catalog.sh | Control Plane | MedReg ↔ LMC federated catalog negotiation |
| 7 | seed-data-transfer.sh | Data Plane | Verify EDR tokens and data plane transfers |
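
To make the phase selection concrete, here is a small model of how the documented --from and --only flags narrow the list of phases. The real seed-all.sh is a shell script whose internals are not shown in this guide, so selectPhases, SeedOptions, and the rule that only takes precedence over from are assumptions made for illustration.

```typescript
// Script names in phase order, as listed in the table above.
const SEED_SCRIPTS: string[] = [
  "seed-health-tenants.sh",
  "seed-ehds-credentials.sh",
  "seed-ehds-policies.sh",
  "seed-data-assets.sh",
  "seed-contract-negotiation.sh",
  "seed-federated-catalog.sh",
  "seed-data-transfer.sh",
];

interface SeedPhase { phase: number; script: string; }
interface SeedOptions { from?: number; only?: number; }

// Returns the phases that would run, always in ascending (strict) order.
function selectPhases(opts: SeedOptions = {}): SeedPhase[] {
  const all: SeedPhase[] = SEED_SCRIPTS.map((script, i) => ({ phase: i + 1, script }));

  const inRange = (n: number): number => {
    if (!Number.isInteger(n) || n < 1 || n > all.length) {
      throw new RangeError(`phase must be an integer in 1..${all.length}, got ${n}`);
    }
    return n;
  };

  // Assumption: if both flags are given, "only" wins.
  if (opts.only !== undefined) {
    const only = inRange(opts.only);
    return all.filter((p) => p.phase === only);
  }
  if (opts.from !== undefined) {
    const from = inRange(opts.from);
    return all.filter((p) => p.phase >= from);
  }
  return all;
}

const resumed = selectPhases({ from: 3 }).map((p) => p.phase);
```

Because each phase depends on the previous one, resuming with from is only meaningful when the earlier phases already completed against the same running stack.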

Important: Vault secrets are lost on Docker restart (in-memory dev mode). Re-run ./scripts/bootstrap-jad.sh --seed after any docker compose down.

Project Structure

├── .github/workflows/         # CI/CD (test.yml, pages.yml, compliance.yml)
├── connector/                 # EDC-V connector (Gradle multi-module)
│   ├── controlplane/          # DSP + Management API
│   ├── dataplane/             # FHIR + OMOP data planes
│   └── identityhub/           # DCP Identity Hub
├── docs/                      # Architecture docs, journeys, ADRs, reports
├── jad/                       # JAD infrastructure configs
│   ├── keycloak-realm.json    # Keycloak realm (edcv, 7 users, 6 roles)
│   ├── edcv-assets/           # Contract definitions & ODRL policies
│   ├── seed-*.sh              # 7-phase seed scripts
│   └── openapi/               # OpenAPI specifications
├── k8s/                       # Kubernetes / OrbStack manifests
├── neo4j/                     # Cypher scripts & data
│   ├── init-schema.cypher     # Constraints, indexes, vector indexes
│   ├── insert-synthetic-schema-data.cypher
│   └── fhir-to-omop-transform.cypher
├── scripts/                   # Automation (bootstrap, synthea, compliance)
├── services/neo4j-proxy/      # Express bridge (Neo4j ↔ UI)
├── ui/                        # Next.js 14 application
│   ├── src/app/               # 16 pages, 36 API routes
│   ├── src/components/        # Shared React components
│   ├── src/lib/               # auth.ts, api.ts, graph-constants.ts
│   ├── __tests__/unit/        # Vitest unit tests
│   ├── __tests__/e2e/         # Playwright specs (journeys/)
│   └── public/mock/           # 38 JSON fixtures for static export
├── docker-compose.yml         # Minimal stack (Neo4j + UI)
└── docker-compose.jad.yml     # Full JAD stack (19 services)

Neo4j Graph Schema

The 5-layer knowledge graph spans 27 node labels with 70+ indexes and 3 vector indexes for GraphRAG. Schema defined in neo4j/init-schema.cypher (idempotent — safe to re-run).

Core entity-relationship diagram (FHIR ↔ OMOP mapping)

5 Semantic Layers

  • L1 Marketplace: Participant, DataProduct, Contract, HDABApproval, OdrlPolicy
  • L2 HealthDCAT-AP: Catalogue, HealthDataset, Distribution, DataService
  • L3 FHIR R4: Patient, Encounter, Condition, Observation, MedicationRequest
  • L4 OMOP CDM: OMOPPerson, ConditionOccurrence, DrugExposure, Measurement
  • L5 Ontology: SnomedConcept, ICD10Code, RxNormConcept, LoincCode

Key Conventions

  • Labels: PascalCase
  • Relationships: UPPER_SNAKE_CASE
  • Properties: camelCase
  • Always MERGE, never CREATE
  • Constraints use IF NOT EXISTS
  • 3 fulltext indexes (clinical, catalog, ontology search)
  • 3 vector indexes (384-dim, cosine — for GraphRAG)
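
The naming conventions above are easy to check mechanically. The sketch below is a loose lint, not the project's tooling: the regular expressions, the helper names, and the sample relationship HAS_ENCOUNTER are hypothetical, and the CREATE check is a text heuristic that can misfire on comments or string literals.

```typescript
// Node labels per semantic layer, copied from the list above (a subset of the 27 labels).
const LAYERS: Record<string, string[]> = {
  "L1 Marketplace": ["Participant", "DataProduct", "Contract", "HDABApproval", "OdrlPolicy"],
  "L2 HealthDCAT-AP": ["Catalogue", "HealthDataset", "Distribution", "DataService"],
  "L3 FHIR R4": ["Patient", "Encounter", "Condition", "Observation", "MedicationRequest"],
  "L4 OMOP CDM": ["OMOPPerson", "ConditionOccurrence", "DrugExposure", "Measurement"],
  "L5 Ontology": ["SnomedConcept", "ICD10Code", "RxNormConcept", "LoincCode"],
};

// Loose shape checks: leading capital for labels, capitals and underscores for
// relationships, leading lowercase for properties.
const isLabel = (s: string): boolean => /^[A-Z][A-Za-z0-9]*$/.test(s);
const isRelationship = (s: string): boolean => /^[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*$/.test(s);
const isProperty = (s: string): boolean => /^[a-z][A-Za-z0-9]*$/.test(s);

// Flags CREATE used for data. "MERGE ... ON CREATE SET" and schema statements
// such as CREATE CONSTRAINT or CREATE VECTOR INDEX are allowed.
function usesBareCreate(cypher: string): boolean {
  const withoutOnCreate = cypher.replace(/\bON\s+CREATE\b/gi, "");
  const bareCreate =
    /\bCREATE\b(?!\s+(?:CONSTRAINT|INDEX|FULLTEXT|VECTOR|RANGE|TEXT|POINT|LOOKUP)\b)/i;
  return bareCreate.test(withoutOnCreate);
}

// Looks up which layer a label belongs to, or undefined if it is not listed.
function layerOf(label: string): string | undefined {
  return Object.keys(LAYERS).find((layer) => LAYERS[layer].indexOf(label) !== -1);
}

const observationLayer = layerOf("Observation");
```

A check like this could run in a pre-commit hook over neo4j/*.cypher, although a Cypher parser would be more reliable than regular expressions.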

PostgreSQL Schema

PostgreSQL serves as the runtime store for all JAD services — EDC-V state machines, Keycloak identity, and CFM tenant metadata. Neo4j holds the health knowledge graph. This split follows ADR-1.

| Database | Service | Contents |
|---|---|---|
| controlplane | EDC-V Control Plane | Contract negotiations, transfer processes, asset definitions, policy store |
| dataplane_fhir | DCore FHIR | FHIR data plane state, EDR tokens, transfer tracking |
| dataplane_omop | DCore OMOP | OMOP data plane state, EDR tokens, transfer tracking |
| identityhub | Identity Hub | DID documents, verifiable credential store, key pairs |
| issuerservice | Issuer Service | Credential definitions, attestation records, issued VCs |
| keycloak | Keycloak | Users, roles, realm config, sessions, client scopes |
| cfm_tenant | Tenant Manager | Tenant records, VPA (Virtual Participant Agents) |
| cfm_provision | Provision Manager | Provisioning tasks, resource allocation records |

Neo4j vs PostgreSQL Split

  • Neo4j: Health knowledge graph (FHIR, OMOP, ontologies), graph traversal queries, semantic search, GraphRAG vectors
  • PostgreSQL: EDC-V runtime state machines, OIDC sessions, tenant metadata, credential storage — transactional ACID workloads
  • Rationale: Multi-hop graph queries over clinical relationships can run orders of magnitude faster in Neo4j than as multi-join SQL; EDC-V requires PostgreSQL for its state machine persistence

Integration Flows

Two canonical flows cover 90% of real-world EHDS integrations. Pick the one that matches your role, then use the Scalar API Reference to try each endpoint from the browser.

Data Consumer flow

Researcher / pharma / HTA body discovers and requests cross-border health data.

  1. Discover: GET /api/catalog returns HealthDCAT-AP datasets
  2. Inspect: GET /api/assets for access policies (ODRL)
  3. Negotiate: POST /api/negotiations opens a DSP 2025-1 contract
  4. Attest: POST /api/credentials/present proves role via DCP VC
  5. Transfer: GET /api/transfers/:id monitors the data plane
  6. Analyse: GET /api/analytics or POST /api/nlq

Role: DATA_USER · Persona: Dr. Petra Lang (PharmaCo Research)

Data Provider flow

Hospital / clinic / registry publishes datasets for secondary use.

  1. Register: POST /api/participants creates a did:web identity
  2. Publish: POST /api/catalog adds a HealthDCAT-AP dataset
  3. Policy: PUT /api/assets/:id attaches an ODRL policy
  4. HDAB approval: GET /api/compliance tracks approval state
  5. Accept: GET /api/negotiations shows incoming contract offers
  6. Deliver: the data plane pushes FHIR/OMOP bundles to the consumer

Role: DATA_HOLDER · Persona: Dr. Klaus Weber (AlphaKlinik Berlin)
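
Both flows are ordered sequences of HTTP calls, so they can be sketched as data plus a small runner. This is an illustration under stated assumptions, not the project's client. Request bodies and authentication are not documented in this guide and are omitted. The names Step, Send, consumerFlow, providerFlow, and runFlow are hypothetical, and the transport is injected so the sketch makes no claim about a particular HTTP library.

```typescript
type Method = "GET" | "POST" | "PUT";
interface Step { name: string; method: Method; path: string; }
interface HttpResult { ok: boolean; status: number; }
// Injected transport; in a real client this would wrap an HTTP call.
type Send = (method: Method, url: string) => Promise<HttpResult>;

// Consumer steps 1-6. Step 6 uses GET /api/analytics; POST /api/nlq is the documented alternative.
const consumerFlow = (transferId: string): Step[] => [
  { name: "Discover", method: "GET", path: "/api/catalog" },
  { name: "Inspect", method: "GET", path: "/api/assets" },
  { name: "Negotiate", method: "POST", path: "/api/negotiations" },
  { name: "Attest", method: "POST", path: "/api/credentials/present" },
  { name: "Transfer", method: "GET", path: `/api/transfers/${encodeURIComponent(transferId)}` },
  { name: "Analyse", method: "GET", path: "/api/analytics" },
];

// Provider steps 1-5. Step 6 (Deliver) is performed by the data plane, not by an API call.
const providerFlow = (assetId: string): Step[] => [
  { name: "Register", method: "POST", path: "/api/participants" },
  { name: "Publish", method: "POST", path: "/api/catalog" },
  { name: "Policy", method: "PUT", path: `/api/assets/${encodeURIComponent(assetId)}` },
  { name: "HDAB approval", method: "GET", path: "/api/compliance" },
  { name: "Accept", method: "GET", path: "/api/negotiations" },
];

// Runs the steps strictly in order and stops at the first non-OK response.
async function runFlow(baseUrl: string, steps: Step[], send: Send): Promise<string[]> {
  const completed: string[] = [];
  for (const step of steps) {
    const res = await send(step.method, baseUrl + step.path);
    if (!res.ok) throw new Error(`${step.name} failed with HTTP ${res.status}`);
    completed.push(step.name);
  }
  return completed;
}
```

In practice the transfer id would come from the negotiation response rather than being known up front, so a real client would thread results between steps instead of building the whole list in advance.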

API Reference

38 Next.js API routes proxy to Neo4j and EDC-V services. In the static export the routes are disabled and mock data is served from ui/public/mock/*.json instead.

| Route | Methods | Description |
|---|---|---|
| /api/graph | GET | Knowledge graph nodes & relationships |
| /api/graph/node, /expand, /validate | GET/POST | Node details, expansion, schema validation |
| /api/catalog | GET/POST/DELETE | HealthDCAT-AP dataset catalog |
| /api/analytics | GET | OMOP cohort analytics aggregates |
| /api/patient/* | GET | Patient profile, insights, research programmes |
| /api/eehrxf | GET | EEHRxF profile alignment data |
| /api/compliance, /tck | GET | EHDS compliance status, DSP TCK results |
| /api/credentials/* | GET/POST | Verifiable credential management |
| /api/negotiations/* | GET/POST | DSP contract negotiation lifecycle |
| /api/participants/* | GET | Participant registry and profiles |
| /api/transfers/* | GET | Data transfer history and status |
| /api/assets | GET | EDC-V asset registry |
| /api/tasks | GET | Transfer task queue |
| /api/nlq | POST | Natural language query (via proxy) |
| /api/federated | POST | Federated cross-participant query |
| /api/trust-center | GET | Trust center configuration |
| /api/health | GET | Health check endpoint (public) |
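
Since API routes are unavailable in the static export, client code needs some way to choose between a live route and a JSON fixture. The helper below sketches one possible approach. The fixture naming rule (for example /api/catalog mapping to /mock/catalog.json) is an assumption made for illustration, not the project's confirmed mapping, and dataUrl is a hypothetical name.

```typescript
// Maps an API route to the URL the UI should load.
// Assumption: fixtures are named after the route with "/" replaced by "-".
// Files under ui/public/mock/ are served by Next.js from /mock/.
function dataUrl(route: string, staticExport: boolean): string {
  if (!route.startsWith("/api/")) {
    throw new Error(`Not an API route: ${route}`);
  }
  if (!staticExport) return route;

  const name = route
    .slice("/api/".length)
    .replace(/\/+$/, "")  // drop trailing slashes
    .replace(/\//g, "-"); // flatten nested segments
  return `/mock/${name}.json`;
}

const liveUrl = dataUrl("/api/catalog", false);
const mockUrl = dataUrl("/api/catalog", true);
```

In the application the flag would presumably be derived from the NEXT_PUBLIC_STATIC_EXPORT environment variable that the Pages build sets to true.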

Testing

Unit Tests (Vitest)

  • 1,613 tests across 80+ files
  • 93.8% statement / 94.7% line coverage
  • MSW for API mocking, Testing Library for components
  • View Test Report

npm test               # Run once
npm run test:watch     # Watch mode
npm run test:coverage  # With v8 coverage

E2E Tests (Playwright)

  • 19 spec files (J001–J260 journeys)
  • WCAG 2.2 AA accessibility audit (27-wcag-accessibility)
  • OWASP/BSI security & pentest (28-security-pentest)
  • Playwright Report

npm run test:e2e       # Headless (chromium)
npm run test:e2e:ui    # Interactive UI

# Against JAD stack
PLAYWRIGHT_BASE_URL=http://localhost:3003 \
  npm run test:e2e

DSP 2025-1 TCK

Validates that the EDC connector implements the Dataspace Protocol correctly: catalog queries, contract negotiations, transfer processes.

./scripts/run-dsp-tck.sh

DCP v1.0

Verifies the Decentralized Claims Protocol: DID resolution, credential presentation, trust framework.

./scripts/run-dcp-tests.sh

EHDS Domain

Health domain compliance — FHIR R4 bundles, OMOP transformation, HDAB approval chains, patient rights.

./scripts/run-ehds-tests.sh

EHDS User Journey

The full 8-step EHDS secondary-use journey with sequence diagrams, persona mappings, and E2E test coverage:

View Full User Journey

Quality Gates

Four-stage quality pipeline aligned with BSI C5, OWASP Top 10, EHDS regulation, and WCAG 2.2 AA.

  • 15 pre-commit hooks
  • 2 pre-push gates
  • 13 CI jobs
  • 3 compliance suites

View full Quality Gates documentation →

CI/CD Pipeline

CI/CD workflow — test.yml (8 jobs), compliance.yml (3 suites), pages.yml (deploy)

test.yml — Every Push

  • UI Tests (Vitest) + coverage upload
  • Neo4j Proxy Tests (Vitest)
  • ESLint lint check
  • Secret scan (gitleaks v8.27.2, SHA-256 verified)
  • Dependency audit (npm audit --audit-level=high)
  • Trivy security scan (v0.69.3, CVE-2026-33634 safe)
  • Kubescape K8s posture (NSA + CIS frameworks)
  • E2E + WCAG 2.2 AA + Security pentest (main only)

pages.yml — Deploy to GitHub Pages

  1. Run full Vitest suite with coverage
  2. Build Next.js for E2E, run Playwright
  3. Run WCAG 2.2 AA accessibility audit
  4. Run OWASP/BSI security tests
  5. Disable API routes (mv src/app/api /tmp/api_disabled)
  6. Build static export (NEXT_PUBLIC_STATIC_EXPORT=true)
  7. Copy test reports to output
  8. Deploy to GitHub Pages

compliance.yml — Weekly + Push to Main

Runs 3 protocol compliance suites against the full JAD stack: DSP 2025-1 TCK, DCP v1.0, and EHDS domain tests. Scheduled: Monday 06:00 UTC.

deploy-azure.yml — Azure Deployment

Deploys 13 Container Apps + 3 jobs to Azure via OIDC federation. Includes E2E smoke tests against the live Azure environment.

reset-demo.yml — Nightly Reset

Scheduled at 02:00 UTC daily. Restarts stateful services, re-bootstraps Vault/Keycloak, reseeds data, and runs smoke tests. Ensures GDPR data minimisation.
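
The two schedules mentioned above (Mondays at 06:00 UTC for compliance.yml, daily at 02:00 UTC for reset-demo.yml) can be expressed as cron strings, and the next run time worked out with plain date arithmetic. The cron strings below are what those times look like in GitHub Actions syntax; whether the workflow files use exactly these strings is an assumption, and the helper names are hypothetical.

```typescript
// GitHub Actions cron fields: minute hour day-of-month month day-of-week (UTC, 0 = Sunday).
const COMPLIANCE_CRON = "0 6 * * 1"; // Mondays 06:00 UTC
const RESET_CRON = "0 2 * * *";      // every day 02:00 UTC

// Next occurrence of hourUtc:00 UTC strictly after "now".
function nextDailyRun(now: Date, hourUtc: number): Date {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hourUtc),
  );
  if (next.getTime() <= now.getTime()) next.setUTCDate(next.getUTCDate() + 1);
  return next;
}

// Next occurrence of hourUtc:00 UTC on the given weekday (0 = Sunday .. 6 = Saturday).
function nextWeeklyRun(now: Date, weekdayUtc: number, hourUtc: number): Date {
  const next = nextDailyRun(now, hourUtc);
  while (next.getUTCDay() !== weekdayUtc) next.setUTCDate(next.getUTCDate() + 1);
  return next;
}

// 2024-01-01 was a Monday; at 07:00 UTC that week's compliance run has already passed.
const now = new Date("2024-01-01T07:00:00Z");
const nextCompliance = nextWeeklyRun(now, 1, 6).toISOString(); // "2024-01-08T06:00:00.000Z"
const nextReset = nextDailyRun(now, 2).toISOString();          // "2024-01-02T02:00:00.000Z"
```

GitHub does not guarantee that scheduled workflows start exactly on time, so these values are best read as the earliest expected start.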

Data Flow

Data pipeline: Synthea → Neo4j → Proxy → UI

Conventions

  • Commit messages: Conventional Commits format (feat:, fix:, docs:, chore:)
  • Branch strategy: Feature branches → PR to main
  • Pre-commit: 15 hooks — Prettier, ESLint, TypeScript, gitleaks, broken links, screenshot guard
  • Pre-push: Full Vitest suite (--bail), npm audit (HIGH+)
  • Cypher: UPPER_SNAKE_CASE relationships, PascalCase labels, camelCase properties, always MERGE
  • TypeScript: Strict mode, no any, @/* path alias → ui/src/*
  • Fictional orgs only: AlphaKlinik Berlin, PharmaCo Research AG, MedReg DE, Limburg Medical Centre, Institut de Recherche Santé
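
As a concrete reading of the commit message convention, the sketch below parses a commit header and accepts only the four types named above. The full Conventional Commits specification allows more types and the project's own hooks may as well, so treat the type list and the name parseCommitHeader as assumptions.

```typescript
interface ParsedCommit {
  type: string;
  scope?: string;    // optional "(scope)" after the type
  breaking: boolean; // true when a "!" precedes the colon
  subject: string;
}

// Accepts headers such as "feat: ...", "fix(ui): ...", or "chore!: ...".
// Returns null when the header does not follow the convention.
function parseCommitHeader(header: string): ParsedCommit | null {
  const m = /^(feat|fix|docs|chore)(?:\(([^()\s]+)\))?(!)?: (\S.*)$/.exec(header);
  if (m === null) return null;
  return { type: m[1], scope: m[2], breaking: m[3] === "!", subject: m[4] };
}

const parsed = parseCommitHeader("feat(ui): add persona switcher");
const invalid = parseCommitHeader("Fixed stuff");
```

Here parsed carries type feat and scope ui, while invalid is null because the header has no recognised type prefix.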
