StaticFDP Ecosystem

Andra Waagmeester  ·  Amsterdam UMC  ·  ORCID 0000-0001-9773-4008

FAIR Data Points on static hosting

Three open-source reference implementations — FDP, FDP Index, and Virtual Platform — deployable with no dedicated server. This deck explains how each layer works and what you need to deploy one.

staticfdp

Publishes a DCAT-conformant FDP from form submissions. Forms → Issues → RDF → Pages.

staticfdp-index

Harvests registered FDPs and serves a discovery catalog as static RDF + HTML.

staticfdp-vp

Aggregates multiple FDP Indexes into a federated discovery hub.

github.com/StaticFDP  ·  codeberg.org/StaticFDP

FDP · FDP Index · Virtual Platform

Three distinct layers. Each can be deployed independently — together they form a fully federated FAIR data infrastructure.

Layer 1: FAIR Data Point. Publishes datasets as machine-readable DCAT metadata at a stable URL.
  • One per project / institution
  • Metadata hierarchy: MetadataService → Catalog → Dataset → Distribution
  • Serves Turtle + JSON-LD via static files
  • StaticFDP: GitHub/Codeberg Pages
Live examples:
· fdp.dcc.sib.swiss — SIB/SPHN (Switzerland)
· catalogus.healthdata.nl — Health-RI (Netherlands)
· fdp.semscape.org/ga4gh-rare-disease-trajectories/fdp/ — this session
Between layers 1 and 2: the FDP registers (ping / YAML); the index harvests catalog.ttl.
Layer 2: FDP Index. Central registry that discovers and harvests metadata from all registered FDPs.
  • One per community / domain
  • Stores: FDP IRI, health state, harvested titles
  • Exposes an index catalog as RDF
  • StaticFDP: staticfdp-index
Live examples:
· index.vp.erdera.org — ERDERA rare disease index
· index.fairdatapoint.org — public FDP index (reference impl)
Between layers 2 and 3: the index registers its index URL; the Virtual Platform aggregates index.ttl.
Layer 3: Virtual Platform. Federated discovery hub that aggregates multiple indexes; data stays at source.
  • One per federation / consortium
  • No data centralisation — only pointers
  • Serves a federation graph as RDF
  • StaticFDP: staticfdp-vp
Live example:
· vp.erdera.org — ERDERA RD Discovery Portal
170+ partners, 37 countries — Horizon Europe grant N°101156595

What is a FAIR Data Point?

A stable URL that returns machine-readable metadata about datasets — written in RDF, structured as a DCAT hierarchy.

Three roles in one

  • Metadata API — serves RDF at stable IRIs (Turtle, JSON-LD)
  • Catalogue — organises datasets into a hierarchy
  • FAIR gateway — carries provenance, licence, access links
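
A static host such as GitHub Pages generally serves one file per URL rather than negotiating on the Accept header, so a client usually picks the serialization by file name. The sketch below illustrates that idea under one assumption: each resource is published as sibling files with .ttl and .jsonld extensions. The `static_rdf_url` helper and the `catalog` base name are illustrative and not part of StaticFDP.

```python
# Hypothetical helper: map a desired RDF media type to a static file URL.
EXTENSIONS = {
    "text/turtle": "ttl",
    "application/ld+json": "jsonld",
}

def static_rdf_url(base_url: str, name: str, media_type: str = "text/turtle") -> str:
    """Return the URL of the static file holding `name` in the given serialization."""
    try:
        ext = EXTENSIONS[media_type]
    except KeyError:
        raise ValueError(f"unsupported media type: {media_type}") from None
    return f"{base_url.rstrip('/')}/{name}.{ext}"

print(static_rdf_url("https://example.org/fdp/", "catalog"))
print(static_rdf_url("https://example.org/fdp/", "catalog", "application/ld+json"))
```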

Standards

DCAT 3  ·  Dublin Core  ·  FDP-O ontology  ·  SHACL profiles  ·  schema.org  ·  re3data

Specification & reference implementation

Spec: specs.fairdatapoint.org
Ontology: FDP-O, https://w3id.org/fdp/fdp-o#
Docs: docs.fairdatapoint.org

Reference impl: Java 17 + Spring Boot + MongoDB + Blazegraph
FAIRDataTeam/FAIRDataPoint

Metadata hierarchy

fdp:MetadataService (the FDP root)
  dcat:Catalog (thematic collection)
    dcat:Dataset (one cohort, study, registry…)
      dcat:Distribution (TTL, CSV, FHIR, API…)
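# FDP root (Turtle)
@prefix fdp:     <https://w3id.org/fdp/fdp-o#> .
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix r3d:     <http://www.re3data.org/schema/3-0#> .
@prefix :        <https://example.org/fdp/> .   # example namespace for local names

<https://example.org/fdp/>
    a fdp:MetadataService, r3d:Repository ;
    dcterms:title "My FAIR Data Point"@en ;
    dcterms:license <https://creativecommons.org/licenses/by/4.0/> ;
    fdp:hasCatalog :catalog .

:catalog a dcat:Catalog ;
    dcat:dataset :my-dataset .

:my-dataset a dcat:Dataset ;
    dcat:distribution :dist-ttl .

:dist-ttl a dcat:Distribution ;
    dcterms:format "text/turtle" ;
    dcat:downloadURL <https://example.org/fdp/data.ttl> .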
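
To make the hierarchy concrete, the sketch below walks the four levels over a handful of triples held as plain Python tuples. It mirrors the Turtle example but is only an illustration: the `TRIPLES` list and the `objects` and `download_urls` helpers are hypothetical, and a real client would parse the Turtle with an RDF library instead.

```python
# Triples from the example FDP, written as (subject, predicate, object) tuples.
TRIPLES = [
    ("https://example.org/fdp/", "fdp:hasCatalog", ":catalog"),
    (":catalog", "dcat:dataset", ":my-dataset"),
    (":my-dataset", "dcat:distribution", ":dist-ttl"),
    (":dist-ttl", "dcat:downloadURL", "https://example.org/fdp/data.ttl"),
]

def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def download_urls(fdp_root, triples=TRIPLES):
    """Walk MetadataService -> Catalog -> Dataset -> Distribution -> download URL."""
    urls = []
    for catalog in objects(fdp_root, "fdp:hasCatalog", triples):
        for dataset in objects(catalog, "dcat:dataset", triples):
            for dist in objects(dataset, "dcat:distribution", triples):
                urls.extend(objects(dist, "dcat:downloadURL", triples))
    return urls

print(download_urls("https://example.org/fdp/"))
```

Each loop corresponds to one level of the hierarchy, which is why a client needs only the FDP root IRI to reach every distribution.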

staticfdp

Layer 1 — FAIR Data Point

Form submissions become GitHub / Forgejo Issues. A CI pipeline converts them to RDF Turtle. GitHub / Codeberg Pages serves the FDP.

How it works

👤 ORCID login + web form
↓ POST to GitHub / Forgejo Issues
☁ Cloudflare Worker (or Deno / Hetzner)
↓ issue created (dual-write optional)
⚙ GitHub Actions / Woodpecker CI
↓ runs issues_to_datasets.py
🐢 RDF Turtle + JSON-LD → docs/fdp/
↓ git commit + push
🌐 GitHub Pages / Codeberg Pages → FDP URL
Key insight: GitHub Issues = structured, ORCID-attributed data store. Git = version history. Pages = FDP server. No new infrastructure needed.
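
The conversion step can be pictured with a small sketch. It assumes the issue body follows the layout that GitHub issue forms produce (a `### Label` heading followed by the answer). The field labels, the `parse_issue_form`, `turtle_literal` and `dataset_to_turtle` helpers, and the example IRI are illustrative and are not taken from issues_to_datasets.py.

```python
import re

def parse_issue_form(body: str) -> dict:
    """Split an issue-form body into {label: answer} pairs."""
    fields = {}
    # Each field is rendered as "### Label", a blank line, then the answer.
    pattern = r"^### ([^\n]+)\n+(.*?)(?=^### |\Z)"
    for match in re.finditer(pattern, body, re.M | re.S):
        fields[match.group(1).strip()] = match.group(2).strip()
    return fields

def turtle_literal(text: str) -> str:
    """Quote a string as a Turtle literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'

def dataset_to_turtle(iri: str, fields: dict) -> str:
    """Render one dcat:Dataset in Turtle (prefix declarations omitted)."""
    return (
        f"<{iri}> a dcat:Dataset ;\n"
        f"    dcterms:title {turtle_literal(fields['Title'])}@en ;\n"
        f"    dcterms:description {turtle_literal(fields['Description'])}@en .\n"
    )

body = "### Title\n\nExample cohort\n\n### Description\n\nA \"toy\" dataset\n"
fields = parse_issue_form(body)
print(dataset_to_turtle("https://example.org/fdp/dataset/1", fields))
```

Escaping the literal matters here because form text is free-form user input; an unescaped quote would produce invalid Turtle.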

What you need to deploy

  1. Fork StaticFDP/staticfdp on GitHub or Codeberg
  2. Run bash scripts/setup.sh: choose GitHub, Codeberg, or both; sets up fdp-config.yaml
  3. Register an ORCID Member API app at orcid.org/developer-tools to get a Client ID + Secret
  4. Deploy the Cloudflare Worker: cd worker && npx wrangler deploy
     (or run infra/hetzner-deploy.sh on a Hetzner VPS for EU hosting)
  5. Set 4 secrets (5 with dual-write) via wrangler secret put:
     GITHUB_TOKEN  ·  ORCID_CLIENT_ID  ·  ORCID_CLIENT_SECRET  ·  SESSION_SECRET
     + FORGEJO_TOKEN (if dual-write to Codeberg)
  6. Enable GitHub Pages in repo Settings → Pages → Branch: main, Path: /docs
  7. ✓ Done: every form submission creates a GitHub Issue; CI converts it to RDF within ~60 s

Minimal cost

  • GitHub / Codeberg — free
  • Cloudflare Workers free tier: 100k requests/day
  • Hetzner CX11 (EU alt.) — €3.79 / month
  • ORCID Member API — free for researchers

staticfdp-index

Layer 2 — FDP Index

FDPs register by opening an Issue or adding a YAML file. A CI pipeline harvests their catalogs daily and publishes a DCAT index catalog as static RDF + HTML.

How it works

📋 FDP operator opens "Register FDP" Issue
↓ YAML added to registered-fdps/
⏰ Daily schedule (or on new issue)
↓ CI triggers harvest_fdps.py
🔍 Fetches catalog.ttl from each FDP
↓ validates DCAT, extracts titles
🐢 Writes docs/fdp-index/index.ttl + index.jsonld
↓ git commit + push
🌐 Static index live on GitHub / Codeberg Pages
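
A harvest pass along these lines can be sketched as follows. This is not the code of harvest_fdps.py: the `harvest` function, the `ok` / `unreachable` health labels, and the regex-based title extraction are assumptions chosen to keep the example dependency-free. The fetcher is injectable so the logic can be tried without network access.

```python
import re
import urllib.request

def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download a document and decode it as UTF-8."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")

# Naive pattern for the first dcterms:title literal; a real harvester would
# parse the Turtle with an RDF library instead.
TITLE_RE = re.compile(r'dcterms:title\s+"((?:[^"\\]|\\.)*)"')

def harvest(registrations, fetch=fetch_text):
    """Return one status record per registered FDP."""
    results = []
    for reg in registrations:
        record = {"catalog_url": reg["catalog_url"], "health": "ok", "title": None}
        try:
            turtle = fetch(reg["catalog_url"])
        except OSError:
            # urllib.error.URLError and socket timeouts are subclasses of OSError.
            record["health"] = "unreachable"
        else:
            match = TITLE_RE.search(turtle)
            record["title"] = match.group(1) if match else reg.get("title")
        results.append(record)
    return results

def fake_fetch(url):
    """Stand-in fetcher for the demo: one FDP answers, one is down."""
    if "down" in url:
        raise OSError("simulated outage")
    return '<> a dcat:Catalog ; dcterms:title "Demo catalog"@en .'

print(harvest(
    [{"catalog_url": "https://example.org/fdp/catalog.ttl"},
     {"catalog_url": "https://down.example.org/catalog.ttl", "title": "Offline FDP"}],
    fetch=fake_fetch,
))
```

Recording a health state instead of aborting on the first failure keeps one unreachable FDP from blocking the rest of the daily run.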

What the index catalog contains

  • One dcat:Dataset per registered FDP
  • Title, description, landing page, catalog URL
  • Last harvest timestamp
  • dcat:Distribution pointing to each FDP's catalog.ttl
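
One such entry can be sketched as Turtle generated from a harvest record. The property choices (dcterms:modified for the harvest timestamp, dcat:landingPage, dcat:downloadURL), the `#dist` fragment, and the `index_entry` helper are assumptions for illustration; the vocabulary actually emitted by staticfdp-index may differ.

```python
from datetime import datetime, timezone

def index_entry(fdp_iri, title, description, landing_page, catalog_url, harvested_at):
    """Render one dcat:Dataset describing a registered FDP (prefixes omitted).

    Assumes title and description contain no quotes or backslashes.
    """
    stamp = harvested_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"<{fdp_iri}> a dcat:Dataset ;\n"
        f'    dcterms:title "{title}"@en ;\n'
        f'    dcterms:description "{description}"@en ;\n'
        f"    dcat:landingPage <{landing_page}> ;\n"
        f'    dcterms:modified "{stamp}"^^xsd:dateTime ;\n'
        f"    dcat:distribution <{fdp_iri}#dist> .\n"
        f"<{fdp_iri}#dist> a dcat:Distribution ;\n"
        f"    dcat:downloadURL <{catalog_url}> .\n"
    )

print(index_entry(
    "https://example.org/fdp/", "My FAIR Data Point", "Example FDP",
    "https://example.org/", "https://example.org/fdp/catalog.ttl",
    datetime(2026, 1, 1, 4, 25, tzinfo=timezone.utc),
))
```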

What you need to deploy

  1. Fork StaticFDP/staticfdp-index on GitHub or Codeberg
  2. Run bash scripts/setup.sh: choose platform, set title + publisher
  3. Enable GitHub Pages (branch: main, path: /docs)
     and / or Codeberg Pages in repo settings
  4. Set GITHUB_TOKEN secret for the Actions runner to commit generated files
     (fine-grained PAT: Contents read + write)
  5. Add FDPs to register: create registered-fdps/my-fdp.yaml with catalog_url, title, description
  6. ✓ Done: harvest runs daily at 04:25 UTC; trigger manually via workflow_dispatch any time
# registered-fdps/ga4gh-rdp.yaml
title: "GA4GH Rare Disease Trajectories"
catalog_url: "https://fdp.semscape.org/…/fdp/catalog.ttl"
landing_page: "https://fdp.semscape.org/…/"
description: "Rare disease cases from GA4GH BYOD 2026"
contact: "0000-0001-9773-4008"
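
Before harvesting, a registration file like this can be checked for the fields the deployment steps name (catalog_url, title, description). The sketch below is an illustration under stated assumptions: `parse_flat_yaml` handles only flat `key: "value"` lines (a real pipeline would use a YAML library), and `REQUIRED` and `missing_fields` are hypothetical names.

```python
REQUIRED = ("catalog_url", "title", "description")

def parse_flat_yaml(text: str) -> dict:
    """Parse flat `key: value` lines, ignoring comments and blank lines."""
    entry = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            entry[key.strip()] = value.strip().strip('"')
    return entry

def missing_fields(entry: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED if not entry.get(key)]

sample = '''
# registered-fdps/my-fdp.yaml
title: "My FAIR Data Point"
catalog_url: "https://example.org/fdp/catalog.ttl"
'''
print(missing_fields(parse_flat_yaml(sample)))  # description is missing
```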
No server needed. The index is rebuilt by CI and served as static files — same principle as the FDP itself.

Real-world FDP Indexes

  • index.vp.erdera.org — ERDERA rare disease community index (Horizon Europe)
  • index.fairdatapoint.org — public index, reference implementation
  • Health-RI National Catalogue harvests from per-institution FDPs across the Netherlands

staticfdp-vp

Layer 3 — Virtual Platform

Multiple FDP Indexes are aggregated into one federation graph. Data stays at source — the VP only holds pointers. Rebuilt by CI on a schedule.

How it works

📋 Index operator opens "Register FDP Index" Issue
↓ YAML added to registered-indexes/
⏰ Daily schedule (or on new issue)
↓ CI triggers build_vp.py
🔍 Fetches index.ttl from each FDP Index
↓ counts FDPs, merges catalogs
🐢 Writes docs/vp/federation.ttl + federation.jsonld
↓ git commit + push
🌐 Federation graph live on GitHub / Codeberg Pages
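
The merge-and-count step can be sketched without any RDF tooling by treating each harvested index as a list of FDP IRIs. The `build_federation` function and its input shape are assumptions for illustration and do not reflect the internals of build_vp.py.

```python
def build_federation(indexes: dict) -> dict:
    """Merge {index_url: [fdp_iri, ...]} into a federation summary."""
    per_index = {}
    sources = {}  # fdp_iri -> index URLs that list it
    for index_url, fdp_iris in indexes.items():
        unique = sorted(set(fdp_iris))
        per_index[index_url] = len(unique)
        for iri in unique:
            sources.setdefault(iri, []).append(index_url)
    return {
        "fdp_count": len(sources),  # each FDP counted once across indexes
        "per_index": per_index,
        "fdps": sources,            # pointers only; no dataset content is copied
    }

federation = build_federation({
    "https://example.org/rd-index/index.ttl": [
        "https://a.example/fdp/",
        "https://b.example/fdp/",
    ],
    "https://example.org/other-index/index.ttl": ["https://b.example/fdp/"],
})
print(federation["fdp_count"], federation["per_index"])
```

Keying the result by FDP IRI means an FDP registered with several indexes appears once in the federation, with every index that lists it recorded as a source.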

In practice (rare disease example)

  • Hospital A FDP registers with Rare Disease Index
  • Registry B FDP registers with Rare Disease Index
  • Rare Disease Index registers with the VP
  • Researcher queries VP federation.ttl → gets all FDP IRIs
  • Downloads distributions directly from A and B — no central warehouse

What you need to deploy

  1. Fork StaticFDP/staticfdp-vp on GitHub or Codeberg
  2. Run bash scripts/setup.sh: choose platform, set title + publisher
  3. Enable GitHub Pages (branch: main, path: /docs)
     and / or Codeberg Pages
  4. Set GITHUB_TOKEN secret (Contents read + write)
  5. Add FDP Indexes: create registered-indexes/my-index.yaml with index_url, title, description
  6. ✓ Done: aggregation runs daily at 04:40 UTC
# registered-indexes/rare-disease.yaml
title: "Rare Disease Community FDP Index"
index_url: "https://example.org/fdp-index/index.ttl"
landing_page: "https://example.org/fdp-index/"
description: "Index for rare disease FDPs worldwide"
contact: "maintainer@example.org"
ERDERA Virtual Platform — the largest live deployment:
vp.erdera.org aggregates rare disease FDPs across 170+ partners in 37 countries.
Funded by Horizon Europe (grant N°101156595), coordinated by INSERM.
Successor to the EJP-RD Virtual Platform.

Real-world VPs

  • vp.erdera.org — ERDERA RD Discovery Portal (EU, Horizon Europe)
  • ELIXIR national nodes federation (Health-RI, BBMRI-ERIC, EMBL-EBI)
  • GA4GH Data Connect + Beacon v2 share the same "data stays at source" principle

The Full Ecosystem Stack

Three independent deployments that compose into a federated infrastructure — each with its own GitHub + Codeberg mirror.

Layer 1: FAIR Data Points
  • FDP (ref. impl): SIB/SPHN, fdp.dcc.sib.swiss
  • FDP (ref. impl): Health-RI NL, catalogus.healthdata.nl
  • FDP (any impl): your institution (fork staticfdp)
  • staticfdp: GA4GH session, fdp.semscape.org
↓ register + harvest
Layer 2: FDP Index (ref. impl / staticfdp-index)
  • ERDERA Rare Disease Index, index.vp.erdera.org: harvests hundreds of FDPs across ERNs
  • index.fairdatapoint.org
  • … other community indexes
↓ register + aggregate
Layer 3: Virtual Platform (ref. impl / staticfdp-vp)
  • vp.erdera.org: ERDERA RD Discovery Portal  ·  170+ partners, 37 countries, Horizon Europe
GitHub: Pages + Actions  ·  github.com/StaticFDP
Codeberg: Pages + Woodpecker CI  ·  codeberg.org/StaticFDP

StaticFDP vs Reference Implementation

Same metadata model — different infrastructure philosophy. Choose based on your needs.

When to use StaticFDP

  • No server / DevOps budget
  • Community-contributed data (forms + ORCID auth)
  • Want Git-native version history
  • Short-term event / session FDP
  • EU or US jurisdiction choice matters
  • Biohackathon / research group

When to use reference impl

  • Need dynamic REST API
  • Data steward web UI required
  • Real-time SHACL validation
  • Built-in SPARQL endpoint
  • Production hospital / national node
Feature                | Reference impl (FAIRDataTeam) | staticfdp          | staticfdp-index | staticfdp-vp
DCAT + FDP-O metadata  |                               |                    |                 |
Content negotiation    | Dynamic                       | Static files       | Static files    | Static files
SHACL validation       | On write                      | On CI              | On CI           | On CI
SPARQL endpoint        | Built-in                      | External demo      |                 |
Data submission UI     | Admin only                    | Public + ORCID     | Issue template  | Issue template
Version history        | Internal DB                   | Git                | Git             | Git
FDP Index registration | Auto on startup               | GitHub Action      | Is the index    | n/a
Hosting cost           | Server required               | Free               | Free            | Free
EU hosting option      | Deploy anywhere               | Codeberg / Hetzner | Codeberg        | Codeberg

Live reference deployment

Deploy your own & access the data

Fork any of the three repos, run setup.sh, and your FAIR Data Point infrastructure is live in minutes.

Fork on GitHub:
  • staticfdp: Static FAIR Data Point
  • staticfdp-index: Static FDP Index
  • staticfdp-vp: Static Virtual Platform
Access the data:
  • RDF Turtle: catalog.ttl. Load with rdflib, Jena, or any triple store
  • Per-disease datasets: diseases/index. One TTL + JSON-LD per disease
  • Interactive: SPARQL demo. Live query, no setup needed
# Python: load the reference deployment
from rdflib import Graph

g = Graph()
# State the serialization explicitly instead of relying on format guessing.
g.parse("https://fdp.semscape.org/ga4gh-rare-disease-trajectories/fdp/catalog.ttl", format="turtle")
print(len(g), "triples")