Developer Guide
Technical documentation for developing, testing, and deploying the Health Dataspace v2 platform — an EHDS regulation reference implementation built on Eclipse Dataspace Components.
Onboarding
New to the project? Follow this quick-start path to get productive within your first day.
1. Get Running
- Clone the repository
- docker compose up -d (Neo4j)
- Seed schema & data (cypher-shell)
- cd ui && npm install && npm run dev
- Open http://localhost:3000
2. Explore the Platform
- Switch personas via the User Menu
- Explore the Graph Explorer (center node)
- Browse the Data Catalog
- Check Patient Portal (PATIENT role)
- Review ODRL policies (HDAB role)
3. Key Concepts
- DSP: Dataspace Protocol (sovereign data exchange)
- DCP: Decentralized Claims Protocol (DID + VC identity)
- FHIR R4: Clinical data standard (EHR)
- OMOP CDM: Analytics layer for research
- EHDS: EU regulation for health data sharing
JAD Stack Architecture
The full JAD (Java Application Deployment) stack runs 19 Docker services orchestrated via docker-compose.yml + docker-compose.jad.yml. Services are grouped into five layers:
| Service | Port | Traefik | Purpose | Depends On |
|---|---|---|---|---|
| Traefik | :80 / :8090 | traefik.localhost | API gateway & reverse proxy | — |
| PostgreSQL 17 | :5432 | — | Runtime store (8 databases) | — |
| Vault | :8200 | vault.localhost | Secret management (dev mode) | — |
| Keycloak | :8080 | keycloak.localhost | OIDC SSO (realm: edcv, 7 users) | PostgreSQL |
| NATS | :4222 / :8222 | — | Async event mesh (JetStream) | — |
| Control Plane | :11003 | cp.localhost | DSP protocol + management API | PG, Vault, NATS, KC |
| Data Plane FHIR | :11002 | dp-fhir.localhost | FHIR PUSH transfer type | PG, Vault, CP |
| Data Plane OMOP | :11012 | dp-omop.localhost | OMOP PULL transfer type | PG, Vault, CP |
| Identity Hub | :11005 | ih.localhost | DCP v1.0 — DID + VC store | PG, Vault, KC |
| Issuer Service | :10013 | issuer.localhost | VC issuance + DID:web | PG, Vault, KC |
| Tenant Manager | :11006 | tm.localhost | CFM tenant lifecycle | PG, KC |
| Provision Manager | :11007 | pm.localhost | CFM resource provisioning | PG, KC, CP |
| Neo4j 5 | :7474 / :7687 | — | Knowledge graph (APOC + n10s) | — |
| Neo4j Proxy | :9090 | proxy.localhost | Express bridge: UI ↔ Neo4j | CP |
| Next.js UI | :3000 / :3003 | — | Application frontend | Neo4j |
Plus 4 background CFM agents (keycloak, edcv, registration, onboarding) and 1 one-shot seed container. Vault-bootstrap runs as a sidecar.
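The Depends On column implies a startup order. As a rough sketch, that order can be derived with a depth-first topological sort. The short keys (`pg`, `kc`, `cp`, and so on) and the `startupOrder` helper are illustrative names chosen here, not identifiers taken from the compose files.

```typescript
// Dependency map transcribed from the "Depends On" column above.
// The short keys are illustrative, not the actual compose service names.
const dependsOn: Record<string, readonly string[]> = {
  traefik: [],
  pg: [],
  vault: [],
  nats: [],
  neo4j: [],
  kc: ["pg"],
  cp: ["pg", "vault", "nats", "kc"],
  dpFhir: ["pg", "vault", "cp"],
  dpOmop: ["pg", "vault", "cp"],
  identityHub: ["pg", "vault", "kc"],
  issuer: ["pg", "vault", "kc"],
  tenantManager: ["pg", "kc"],
  provisionManager: ["pg", "kc", "cp"],
  neo4jProxy: ["cp"],
  ui: ["neo4j"],
};

// Depth-first topological sort: each service is emitted after its dependencies.
function startupOrder(graph: Record<string, readonly string[]>): string[] {
  const sorted: string[] = [];
  const visitState = new Map<string, "visiting" | "done">();

  const visit = (service: string): void => {
    const current = visitState.get(service);
    if (current === "done") return;
    if (current === "visiting") throw new Error(`Dependency cycle at ${service}`);
    visitState.set(service, "visiting");
    for (const dep of graph[service] ?? []) visit(dep);
    visitState.set(service, "done");
    sorted.push(service);
  };

  for (const service of Object.keys(graph)) visit(service);
  return sorted;
}

const serviceOrder = startupOrder(dependsOn);
```

Any order that satisfies the dependencies is valid; this sketch simply returns the first one the traversal finds.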
Prerequisites
Required
- Node.js 20+ — runtime for UI and proxy
- Docker Desktop — with Docker Compose V2
- 8 GB Docker RAM — required for full JAD stack
- Git — with pre-commit hooks enabled
Optional
- Python 3.11+ — for Synthea FHIR data loading
- gitleaks: local secret scanning (brew install gitleaks)
- lychee: broken link checker (brew install lychee)
- Playwright browsers: for E2E tests (npx playwright install)
Port Requirements
The JAD stack requires these ports to be free: 80, 3000, 3003, 4222, 5432, 7474, 7687, 8080, 8090, 8200, 8222, 9090, 10013, 11002, 11003, 11005, 11006, 11007, 11012
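A small helper can flag clashes before you start the stack. The sketch below assumes you have already collected the ports currently in use on your machine by some other means; `findPortConflicts` is a hypothetical name and not part of the repository.

```typescript
// Ports listed above as required by the JAD stack.
const REQUIRED_PORTS: readonly number[] = [
  80, 3000, 3003, 4222, 5432, 7474, 7687, 8080, 8090, 8200,
  8222, 9090, 10013, 11002, 11003, 11005, 11006, 11007, 11012,
];

// Returns the required ports that are already taken, in ascending order.
// `listening` is whatever collection of busy ports you gathered beforehand.
function findPortConflicts(listening: Iterable<number>): number[] {
  const busy = new Set(listening);
  return REQUIRED_PORTS.filter((port) => busy.has(port)).sort((a, b) => a - b);
}
```

For example, `findPortConflicts([5432, 22, 3000])` reports 3000 and 5432 and ignores port 22, which the stack does not need.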
Quick Start — Minimal Stack
Run Neo4j + Next.js UI with synthetic data. No JAD services needed.
1. Start Neo4j & load schema
    docker compose up -d

    # Initialize schema (idempotent, safe to re-run)
    cat neo4j/init-schema.cypher | \
      docker exec -i health-dataspace-neo4j \
      cypher-shell -u neo4j -p healthdataspace

    # Load synthetic data (127 patients, 5300+ nodes)
    cat neo4j/insert-synthetic-schema-data.cypher | \
      docker exec -i health-dataspace-neo4j \
      cypher-shell -u neo4j -p healthdataspace
2. Start the UI
    cd ui
    npm install
    npm run dev    # → http://localhost:3000
    npm test       # Run 1,613 unit tests
    npm run lint   # ESLint (max 55 warnings)
Quick Start — Full JAD Stack
The bootstrap script starts all 19 services with health checks, initializes Vault secrets, imports the Keycloak realm, and runs the 7-phase seed pipeline.
1. Bootstrap everything
    # Full stack (takes ~3-5 min on first run)
    ./scripts/bootstrap-jad.sh

    # Check status & endpoints
    ./scripts/bootstrap-jad.sh --status
2. Seed the dataspace
    # Run all 7 seed phases (sequential, strict order)
    ./jad/seed-all.sh

    # Resume from a specific phase
    ./jad/seed-all.sh --from 3

    # Run only one phase
    ./jad/seed-all.sh --only 5
3. Access the platform
    # Live UI (production build)
    open http://localhost:3003

    # Keycloak Admin Console
    open http://keycloak.localhost    # admin / admin

    # Neo4j Browser
    open http://localhost:7474        # neo4j / healthdataspace

    # Traefik Dashboard
    open http://traefik.localhost
Common operations
    ./scripts/bootstrap-jad.sh --ui-only   # Rebuild UI only (fast)
    ./scripts/bootstrap-jad.sh --seed      # Re-run seed pipeline
    ./scripts/bootstrap-jad.sh --pull      # Pull latest images
    ./scripts/bootstrap-jad.sh --down      # Stop all services
    ./scripts/bootstrap-jad.sh --reset     # Stop + remove volumes
Data Seeding Pipeline
The 7-phase seed pipeline populates the dataspace with tenants, credentials, policies, assets, and contracts. Phases must run in strict order — each depends on the previous.
| Phase | Script | Target Service | What It Does |
|---|---|---|---|
| 1 | seed-health-tenants.sh | Tenant Manager | Create 5 participant tenants via CFM |
| 2 | seed-ehds-credentials.sh | Issuer Service | Register EHDS credential types |
| 3 | seed-ehds-policies.sh | Control Plane | Create ODRL policies for all participants |
| 4 | seed-data-assets.sh | Control Plane | Register data assets + contracts |
| 5 | seed-contract-negotiation.sh | Control Plane | PharmaCo ↔ AlphaKlinik negotiations + data planes |
| 6 | seed-federated-catalog.sh | Control Plane | MedReg ↔ LMC federated catalog negotiation |
| 7 | seed-data-transfer.sh | Data Plane | Verify EDR tokens and data plane transfers |
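To make the `--from` and `--only` semantics of `seed-all.sh` concrete, here is a minimal model of phase selection. It is a sketch of the documented behaviour, not the script's actual implementation. `SeedOptions` and `selectPhases` are hypothetical names, and letting `only` win when both options are given is an assumption.

```typescript
// Script names from the table above, in their required order.
const SEED_PHASES: readonly string[] = [
  "seed-health-tenants.sh",
  "seed-ehds-credentials.sh",
  "seed-ehds-policies.sh",
  "seed-data-assets.sh",
  "seed-contract-negotiation.sh",
  "seed-federated-catalog.sh",
  "seed-data-transfer.sh",
];

// Hypothetical option shape mirroring the --from / --only flags.
interface SeedOptions {
  from?: number; // 1-based phase to resume from
  only?: number; // 1-based single phase to run
}

function selectPhases(options: SeedOptions = {}): string[] {
  const { from, only } = options;
  const assertValid = (phase: number): void => {
    if (!Number.isInteger(phase) || phase < 1 || phase > SEED_PHASES.length) {
      throw new RangeError(`Phase must be between 1 and ${SEED_PHASES.length}, got ${phase}`);
    }
  };
  if (only !== undefined) {
    assertValid(only);
    return SEED_PHASES.slice(only - 1, only); // exactly one phase
  }
  if (from !== undefined) {
    assertValid(from);
    return SEED_PHASES.slice(from - 1); // this phase and everything after it
  }
  return [...SEED_PHASES]; // default: all phases in order
}
```

Because each phase depends on the previous one, `--from 3` is only meaningful when phases 1 and 2 have already completed against the same running stack.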
Important: Vault secrets are lost on Docker restart (in-memory dev mode). Re-run ./scripts/bootstrap-jad.sh --seed after any docker compose down.
Project Structure
    ├── .github/workflows/          # CI/CD (test.yml, pages.yml, compliance.yml)
    ├── connector/                  # EDC-V connector (Gradle multi-module)
    │   ├── controlplane/           # DSP + Management API
    │   ├── dataplane/              # FHIR + OMOP data planes
    │   └── identityhub/            # DCP Identity Hub
    ├── docs/                       # Architecture docs, journeys, ADRs, reports
    ├── jad/                        # JAD infrastructure configs
    │   ├── keycloak-realm.json     # Keycloak realm (edcv, 7 users, 6 roles)
    │   ├── edcv-assets/            # Contract definitions & ODRL policies
    │   ├── seed-*.sh               # 7-phase seed scripts
    │   └── openapi/                # OpenAPI specifications
    ├── k8s/                        # Kubernetes / OrbStack manifests
    ├── neo4j/                      # Cypher scripts & data
    │   ├── init-schema.cypher      # Constraints, indexes, vector indexes
    │   ├── insert-synthetic-schema-data.cypher
    │   └── fhir-to-omop-transform.cypher
    ├── scripts/                    # Automation (bootstrap, synthea, compliance)
    ├── services/neo4j-proxy/       # Express bridge (Neo4j ↔ UI)
    ├── ui/                         # Next.js 14 application
    │   ├── src/app/                # 16 pages, 36 API routes
    │   ├── src/components/         # Shared React components
    │   ├── src/lib/                # auth.ts, api.ts, graph-constants.ts
    │   ├── __tests__/unit/         # Vitest unit tests
    │   ├── __tests__/e2e/          # Playwright specs (journeys/)
    │   └── public/mock/            # 38 JSON fixtures for static export
    ├── docker-compose.yml          # Minimal stack (Neo4j + UI)
    └── docker-compose.jad.yml      # Full JAD stack (19 services)
Neo4j Graph Schema
The 5-layer knowledge graph spans 27 node labels with 70+ indexes and 3 vector indexes for GraphRAG. Schema defined in neo4j/init-schema.cypher (idempotent — safe to re-run).
5 Semantic Layers
- L1 Marketplace: Participant, DataProduct, Contract, HDABApproval, OdrlPolicy
- L2 HealthDCAT-AP: Catalogue, HealthDataset, Distribution, DataService
- L3 FHIR R4: Patient, Encounter, Condition, Observation, MedicationRequest
- L4 OMOP CDM: OMOPPerson, ConditionOccurrence, DrugExposure, Measurement
- L5 Ontology: SnomedConcept, ICD10Code, RxNormConcept, LoincCode
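The layer assignment above can be captured as a typed lookup, which may be handy for colouring or filtering nodes by layer. This sketch covers only the labels listed here, not all 27 labels in the schema, and `layerOf` is a hypothetical helper name.

```typescript
// Node labels per semantic layer, as listed above (a subset of the full schema).
const LAYERS = {
  L1: ["Participant", "DataProduct", "Contract", "HDABApproval", "OdrlPolicy"],
  L2: ["Catalogue", "HealthDataset", "Distribution", "DataService"],
  L3: ["Patient", "Encounter", "Condition", "Observation", "MedicationRequest"],
  L4: ["OMOPPerson", "ConditionOccurrence", "DrugExposure", "Measurement"],
  L5: ["SnomedConcept", "ICD10Code", "RxNormConcept", "LoincCode"],
} as const;

type Layer = keyof typeof LAYERS;

// Returns the layer a label belongs to, or undefined for labels not listed above.
function layerOf(label: string): Layer | undefined {
  for (const layer of Object.keys(LAYERS) as Layer[]) {
    const labels: readonly string[] = LAYERS[layer];
    if (labels.includes(label)) return layer;
  }
  return undefined;
}
```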
Key Conventions
- Labels: PascalCase
- Relationships: UPPER_SNAKE_CASE
- Properties: camelCase
- Always MERGE, never CREATE
- Constraints use IF NOT EXISTS
- 3 fulltext indexes (clinical, catalog, ontology search)
- 3 vector indexes (384-dim, cosine — for GraphRAG)
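The three naming conventions are easy to check mechanically. The following is a minimal sketch of such a check; the regular expressions are one reasonable reading of PascalCase, UPPER_SNAKE_CASE, and camelCase, and `followsConvention` is a hypothetical helper rather than something shipped in the repository.

```typescript
// Labels: start with an uppercase letter and contain at least one lowercase letter.
const PASCAL_CASE = /^[A-Z](?=[A-Za-z0-9]*[a-z])[A-Za-z0-9]*$/;
// Relationships: uppercase words separated by single underscores.
const UPPER_SNAKE_CASE = /^[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*$/;
// Properties: start with a lowercase letter, no separators.
const CAMEL_CASE = /^[a-z][A-Za-z0-9]*$/;

type IdentifierKind = "label" | "relationship" | "property";

function followsConvention(kind: IdentifierKind, identifier: string): boolean {
  switch (kind) {
    case "label":
      return PASCAL_CASE.test(identifier);
    case "relationship":
      return UPPER_SNAKE_CASE.test(identifier);
    case "property":
      return CAMEL_CASE.test(identifier);
  }
}
```

For instance, `followsConvention("label", "HealthDataset")` passes, while a made-up relationship name such as `hasCondition` fails the relationship check and would need to be written `HAS_CONDITION`.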
PostgreSQL Schema
PostgreSQL serves as the runtime store for all JAD services — EDC-V state machines, Keycloak identity, and CFM tenant metadata. Neo4j holds the health knowledge graph. This split follows ADR-1.
| Database | Service | Contents |
|---|---|---|
| controlplane | EDC-V Control Plane | Contract negotiations, transfer processes, asset definitions, policy store |
| dataplane_fhir | DCore FHIR | FHIR data plane state, EDR tokens, transfer tracking |
| dataplane_omop | DCore OMOP | OMOP data plane state, EDR tokens, transfer tracking |
| identityhub | Identity Hub | DID documents, verifiable credential store, key pairs |
| issuerservice | Issuer Service | Credential definitions, attestation records, issued VCs |
| keycloak | Keycloak | Users, roles, realm config, sessions, client scopes |
| cfm_tenant | Tenant Manager | Tenant records, VPA (Virtual Participant Agents) |
| cfm_provision | Provision Manager | Provisioning tasks, resource allocation records |
Neo4j vs PostgreSQL Split
- Neo4j: Health knowledge graph (FHIR, OMOP, ontologies), graph traversal queries, semantic search, GraphRAG vectors
- PostgreSQL: EDC-V runtime state machines, OIDC sessions, tenant metadata, credential storage — transactional ACID workloads
- Rationale: Graph queries for clinical relationships are orders of magnitude faster in Neo4j; EDC-V requires PostgreSQL for its state machine persistence
Integration Flows
Two canonical flows cover 90% of real-world EHDS integrations. Pick the one that matches your role, then use the Scalar API Reference to try each endpoint from the browser.
Data Consumer flow
Researcher / pharma / HTA body discovers and requests cross-border health data.
- Discover → GET /api/catalog returns HealthDCAT-AP datasets
- Inspect → GET /api/assets for access policies (ODRL)
- Negotiate → POST /api/negotiations opens a DSP 2025-1 contract
- Attest → POST /api/credentials/present proves role via DCP VC
- Transfer → GET /api/transfers/:id monitors the data plane
- Analyse → GET /api/analytics or POST /api/nlq
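The consumer steps can be written down as data, which makes the call sequence easy to script or log. This is a sketch only: request payloads are omitted because they are not specified here, and `FlowStep`, `consumerFlow`, and `toUrl` are hypothetical names.

```typescript
type HttpMethod = "GET" | "POST";

interface FlowStep {
  step: string;
  method: HttpMethod;
  path: string;
}

// The six consumer steps in order. ":id" is filled in with the given transfer id.
function consumerFlow(transferId: string): FlowStep[] {
  return [
    { step: "Discover", method: "GET", path: "/api/catalog" },
    { step: "Inspect", method: "GET", path: "/api/assets" },
    { step: "Negotiate", method: "POST", path: "/api/negotiations" },
    { step: "Attest", method: "POST", path: "/api/credentials/present" },
    { step: "Transfer", method: "GET", path: `/api/transfers/${encodeURIComponent(transferId)}` },
    { step: "Analyse", method: "GET", path: "/api/analytics" },
  ];
}

// Resolves a step against a base URL such as http://localhost:3000.
function toUrl(baseUrl: string, flowStep: FlowStep): string {
  return new URL(flowStep.path, baseUrl).toString();
}
```

The Analyse step uses the `GET /api/analytics` variant; `POST /api/nlq` is the alternative named in the list.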
DATA_USER · Persona: Dr. Petra Lang (PharmaCo Research)
Data Provider flow
Hospital / clinic / registry publishes datasets for secondary use.
- Register → POST /api/participants creates a did:web identity
- Publish → POST /api/catalog adds a HealthDCAT-AP dataset
- Policy → PUT /api/assets/:id attaches an ODRL policy
- HDAB approval → GET /api/compliance tracks approval state
- Accept → GET /api/negotiations shows incoming contract offers
- Deliver → data plane pushes FHIR/OMOP bundles to the consumer
DATA_HOLDER · Persona: Dr. Klaus Weber (AlphaKlinik Berlin)
API Reference
38 Next.js API routes proxy to Neo4j and EDC-V services. In the static export the routes are disabled and mock data is served from ui/public/mock/*.json instead.
| Route | Methods | Description |
|---|---|---|
| /api/graph | GET | Knowledge graph nodes & relationships |
| /api/graph/node, /expand, /validate | GET/POST | Node details, expansion, schema validation |
| /api/catalog | GET/POST/DELETE | HealthDCAT-AP dataset catalog |
| /api/analytics | GET | OMOP cohort analytics aggregates |
| /api/patient/* | GET | Patient profile, insights, research programmes |
| /api/eehrxf | GET | EEHRxF profile alignment data |
| /api/compliance, /tck | GET | EHDS compliance status, DSP TCK results |
| /api/credentials/* | GET/POST | Verifiable credential management |
| /api/negotiations/* | GET/POST | DSP contract negotiation lifecycle |
| /api/participants/* | GET | Participant registry and profiles |
| /api/transfers/* | GET | Data transfer history and status |
| /api/assets | GET | EDC-V asset registry |
| /api/tasks | GET | Transfer task queue |
| /api/nlq | POST | Natural language query (via proxy) |
| /api/federated | POST | Federated cross-participant query |
| /api/trust-center | GET | Trust center configuration |
| /api/health | GET | Health check endpoint (public) |
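Since the static export serves fixtures instead of live routes, client code needs some way to pick between the two. The sketch below shows one possible approach. The fixture naming scheme (route segments joined with hyphens) is purely an assumption and may not match the files in ui/public/mock/. `mockPathFor` and `resolveEndpoint` are hypothetical helpers.

```typescript
// Assumption: fixture names are derived from the route path, e.g.
// /api/graph/node -> /mock/graph-node.json. The real naming may differ.
function mockPathFor(route: string): string {
  const fixtureName = route
    .replace(/^\/api\//, "") // drop the /api/ prefix
    .replace(/\/+$/, "") // drop trailing slashes
    .split("/")
    .join("-");
  return `/mock/${fixtureName}.json`;
}

// Picks the live API route, or its static fixture when running as a static export.
function resolveEndpoint(route: string, staticExport: boolean): string {
  return staticExport ? mockPathFor(route) : route;
}
```

In the application, the flag would presumably come from the `NEXT_PUBLIC_STATIC_EXPORT` variable that the Pages build sets to `true`.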
Testing
Unit Tests (Vitest)
- 1,613 tests across 80+ files
- 93.8% statement / 94.7% line coverage
- MSW for API mocking, Testing Library for components
- View Test Report
    npm test                 # Run once
    npm run test:watch       # Watch mode
    npm run test:coverage    # With v8 coverage
E2E Tests (Playwright)
- 19 spec files (J001–J260 journeys)
- WCAG 2.2 AA accessibility audit (27-wcag-accessibility)
- OWASP/BSI security & pentest (28-security-pentest)
- Playwright Report
    npm run test:e2e       # Headless (chromium)
    npm run test:e2e:ui    # Interactive UI

    # Against JAD stack
    PLAYWRIGHT_BASE_URL=http://localhost:3003 \
      npm run test:e2e
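The `PLAYWRIGHT_BASE_URL` override suggests the test configuration reads its target from the environment. A minimal sketch of such a lookup follows. `resolveBaseUrl` is a hypothetical helper, and falling back to the dev server on port 3000 when the variable is unset is an assumption.

```typescript
// Hypothetical helper: choose the URL the E2E suite targets.
// Assumption: the dev server on port 3000 is the default when the variable is unset.
function resolveBaseUrl(env: Record<string, string | undefined>): string {
  const raw = env["PLAYWRIGHT_BASE_URL"]?.trim();
  const base = raw !== undefined && raw.length > 0 ? raw : "http://localhost:3000";
  return base.replace(/\/+$/, ""); // drop trailing slashes
}
```

In a Playwright config, the result of `resolveBaseUrl(process.env)` could be passed as `use.baseURL`.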
DSP 2025-1 TCK
Validates that the EDC connector implements the Dataspace Protocol correctly: catalog queries, contract negotiations, and transfer processes.
./scripts/run-dsp-tck.sh
DCP v1.0
Verifies the Decentralized Claims Protocol: DID resolution, credential presentation, and the trust framework.
./scripts/run-dcp-tests.sh
EHDS Domain
Checks health domain compliance: FHIR R4 bundles, OMOP transformation, HDAB approval chains, and patient rights.
./scripts/run-ehds-tests.sh
EHDS User Journey
The full 8-step EHDS secondary-use journey with sequence diagrams, persona mappings, and E2E test coverage:
View Full User Journey
Quality Gates
Four-stage quality pipeline aligned with BSI C5, OWASP Top 10, EHDS regulation, and WCAG 2.2 AA.
CI/CD Pipeline
test.yml — Every Push
- UI Tests (Vitest) + coverage upload
- Neo4j Proxy Tests (Vitest)
- ESLint lint check
- Secret scan (gitleaks v8.27.2, SHA-256 verified)
- Dependency audit (npm audit --audit-level=high)
- Trivy security scan (v0.69.3, CVE-2026-33634 safe)
- Kubescape K8s posture (NSA + CIS frameworks)
- E2E + WCAG 2.2 AA + Security pentest (main only)
pages.yml — Deploy to GitHub Pages
- Run full Vitest suite with coverage
- Build Next.js for E2E, run Playwright
- Run WCAG 2.2 AA accessibility audit
- Run OWASP/BSI security tests
- Disable API routes (mv src/app/api /tmp/api_disabled)
- Build static export (NEXT_PUBLIC_STATIC_EXPORT=true)
- Copy test reports to output
- Deploy to GitHub Pages
compliance.yml — Weekly + Push to Main
Runs 3 protocol compliance suites against the full JAD stack: DSP 2025-1 TCK, DCP v1.0, and EHDS domain tests. Scheduled: Monday 06:00 UTC.
deploy-azure.yml — Azure Deployment
Deploys 13 Container Apps + 3 jobs to Azure via OIDC federation. Includes E2E smoke tests against the live Azure environment.
reset-demo.yml — Nightly Reset
Scheduled at 02:00 UTC daily. Restarts stateful services, re-bootstraps Vault/Keycloak, reseeds data, and runs smoke tests. Ensures GDPR data minimisation.
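The two schedules mentioned above (Monday 06:00 UTC for compliance, 02:00 UTC daily for the demo reset) map onto standard five-field cron expressions. The constants below are how those times are conventionally written, not a copy of the workflow files. `nextWeeklyRunUtc` is a hypothetical helper for working out when the next weekly run is due.

```typescript
// Conventional cron encodings of the schedules described above (UTC).
// Assumption: the workflow files may spell these differently.
const COMPLIANCE_CRON = "0 6 * * 1"; // Mondays at 06:00 UTC
const RESET_DEMO_CRON = "0 2 * * *"; // every day at 02:00 UTC

// Next weekly run strictly after `now`. weekday: 0 = Sunday ... 6 = Saturday.
function nextWeeklyRunUtc(now: Date, weekday: number, hour: number): Date {
  // Start from today's date at the target hour, in UTC.
  const next = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hour));
  // Move forward to the requested weekday (0 days if it is already today).
  const daysAhead = (weekday - next.getUTCDay() + 7) % 7;
  next.setUTCDate(next.getUTCDate() + daysAhead);
  // If that moment is not in the future, the next run is a week later.
  if (next.getTime() <= now.getTime()) next.setUTCDate(next.getUTCDate() + 7);
  return next;
}
```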
Latest Reports
Test Coverage Report
Vitest coverage report with per-file breakdown
Playwright E2E Report
E2E test results with screenshots and traces
CI Workflow Runs
GitHub Actions test suite history
Compliance Runs
DSP TCK, DCP, EHDS compliance results
Security Advisories
Trivy SARIF findings and dependency alerts
GitHub Pages Deploy
Static export build and deployment history
Data Flow
Release Notes
Azure Container Apps deployment, SIMPL-Open gap analysis, nightly demo reset, 14 ADRs, quality gates documentation
WCAG 2.2 AA zero violations, ODRL policy enforcement, OWASP/BSI security pentest, 778 Playwright assertions
Initial release: 5-layer Neo4j knowledge graph, 127 synthetic patients, DSP 2025-1 + DCP v1.0, 7 personas, GitHub Pages demo
Planning & Issues
Planning Document
Project roadmap, phase tracking, cross-cutting concerns, and ADR index
GitHub Issues
Feature requests, bug reports, and task tracking
CI/CD Workflows
test.yml, pages.yml, compliance.yml, deploy-azure.yml, reset-demo.yml
SIMPL-Open Gap Analysis
Comprehensive SIMPL vs EHDS gap analysis — DID/SSI rationale, supply chain transparency
Onboarding Guide
New developer onboarding — first day checklist, key concepts, common tasks
Conventions
- Commit messages: Conventional Commits format (feat:, fix:, docs:, chore:)
- Branch strategy: Feature branches → PR to main
- Pre-commit: 15 hooks — Prettier, ESLint, TypeScript, gitleaks, broken links, screenshot guard
- Pre-push: Full Vitest suite (--bail), npm audit (HIGH+)
- Cypher: UPPER_SNAKE_CASE relationships, PascalCase labels, camelCase properties, always MERGE
- TypeScript: Strict mode, no any, @/* path alias → ui/src/*
- Fictional orgs only: AlphaKlinik Berlin, PharmaCo Research AG, MedReg DE, Limburg Medical Centre, Institut de Recherche Santé
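The commit message convention lends itself to a quick automated check. The sketch below validates only the header line. Restricting the type to the four prefixes named above is an assumption, since the Conventional Commits format itself permits further types. `isConventionalCommit` is a hypothetical helper, not one of the repository's hooks.

```typescript
// Types named in the conventions above. Treating this list as exhaustive
// is an assumption made for the sketch.
const COMMIT_TYPES = ["feat", "fix", "docs", "chore"] as const;

// type, optional (scope), optional "!", then ": " and a non-empty description.
const COMMIT_HEADER = new RegExp(
  `^(?:${COMMIT_TYPES.join("|")})(?:\\([\\w./-]+\\))?!?: \\S.*$`,
);

// Checks only the first line of the message; body and footers are ignored.
function isConventionalCommit(message: string): boolean {
  const header = message.split("\n", 1)[0] ?? "";
  return COMMIT_HEADER.test(header);
}
```

For example, `feat: add catalog filter` and `fix(ui)!: handle empty graph` pass, while `Update stuff` and `feat:missing space` do not.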