FAIR Data Points on static hosting
Three open-source reference implementations — FDP, FDP Index, and Virtual Platform —
deployable with no dedicated server. This deck explains how each layer works
and what you need to deploy one.
staticfdp
Publishes a DCAT-conformant FDP from form submissions. Forms → Issues → RDF → Pages.
staticfdp-index
Harvests registered FDPs and serves a discovery catalog as static RDF + HTML.
staticfdp-vp
Aggregates multiple FDP Indexes into a federated discovery hub.
FDP · FDP Index · Virtual Platform
Three distinct layers. Each can be deployed independently —
together they form a fully federated FAIR data infrastructure.
Layer 1
FAIR Data Point
Publishes datasets as machine-readable DCAT metadata at a stable URL
- One per project / institution
- Metadata hierarchy: MetadataService → Catalog → Dataset → Distribution
- Serves Turtle + JSON-LD via static files
- StaticFDP: GitHub/Codeberg Pages
FDP → FDP Index: registers (ping / YAML)
FDP ← FDP Index: harvests catalog.ttl
Layer 2
FDP Index
Central registry that discovers and harvests metadata from all registered FDPs
- One per community / domain
- Stores: FDP IRI, health state, harvested titles
- Exposes an index catalog as RDF
- StaticFDP: staticfdp-index
FDP Index → Virtual Platform: registers index URL
FDP Index ← Virtual Platform: aggregates index.ttl
Layer 3
Virtual Platform
Federated discovery hub — aggregates multiple indexes, data stays at source
- One per federation / consortium
- No data centralisation — only pointers
- Serves a federation graph as RDF
- StaticFDP: staticfdp-vp
Live example: vp.erdera.org (ERDERA RD Discovery Portal)
170+ partners, 37 countries — Horizon Europe grant N°101156595
What is a FAIR Data Point?
A stable URL that returns machine-readable metadata about datasets — written in RDF, structured as a DCAT hierarchy.
Three roles in one
- Metadata API — serves RDF at stable IRIs (Turtle, JSON-LD)
- Catalogue — organises datasets into a hierarchy
- FAIR gateway — carries provenance, licence, access links
Standards
DCAT 3
Dublin Core
FDP-O ontology
SHACL profiles
schema.org
re3data
Metadata hierarchy
- fdp:MetadataService: the FDP root
- dcat:Catalog: thematic collection
- dcat:Dataset: one cohort, study, registry…
- dcat:Distribution: TTL, CSV, FHIR, API…
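# FDP root (Turtle)
@prefix : <https://example.org/fdp/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix fdp: <https://w3id.org/fdp/fdp-o#> .
@prefix r3d: <http://www.re3data.org/schema/3-0#> .

<https://example.org/fdp/>
    a fdp:MetadataService, r3d:Repository ;
    dcterms:title "My FAIR Data Point"@en ;
    dcterms:license <https://creativecommons.org/licenses/by/4.0/> ;
    fdp:hasCatalog :catalog .

:catalog a dcat:Catalog ;
    dcat:dataset :my-dataset .

:my-dataset a dcat:Dataset ;
    dcat:distribution :dist-ttl .

:dist-ttl a dcat:Distribution ;
    dcterms:format "text/turtle" ;
    dcat:downloadURL <https://example.org/fdp/data.ttl> .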
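To make the four-level hierarchy concrete, here is a minimal Python sketch that models it with dataclasses and walks it from the root down to the download URLs. The class names mirror the RDF types above, but the structure and the `download_urls` helper are illustrative assumptions, not part of any StaticFDP code.

```python
from dataclasses import dataclass, field


@dataclass
class Distribution:
    iri: str
    media_type: str
    download_url: str


@dataclass
class Dataset:
    iri: str
    distributions: list[Distribution] = field(default_factory=list)


@dataclass
class Catalog:
    iri: str
    datasets: list[Dataset] = field(default_factory=list)


@dataclass
class MetadataService:
    iri: str
    title: str
    catalogs: list[Catalog] = field(default_factory=list)


def download_urls(service: MetadataService) -> list[str]:
    """Walk MetadataService -> Catalog -> Dataset -> Distribution."""
    return [
        dist.download_url
        for catalog in service.catalogs
        for dataset in catalog.datasets
        for dist in dataset.distributions
    ]


# Illustrative IRIs only, modelled on the Turtle example.
fdp = MetadataService(
    iri="https://example.org/fdp/",
    title="My FAIR Data Point",
    catalogs=[
        Catalog(
            iri="https://example.org/fdp/catalog",
            datasets=[
                Dataset(
                    iri="https://example.org/fdp/my-dataset",
                    distributions=[
                        Distribution(
                            iri="https://example.org/fdp/dist-ttl",
                            media_type="text/turtle",
                            download_url="https://example.org/fdp/data.ttl",
                        )
                    ],
                )
            ],
        )
    ],
)

print(download_urls(fdp))
```

A harvester only needs this top-down walk: it starts at the FDP root and follows each link until it reaches the distributions.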
staticfdp
Layer 1 — FAIR Data Point
Form submissions become GitHub / Forgejo Issues. A CI pipeline converts them to RDF Turtle. GitHub / Codeberg Pages serves the FDP.
How it works
👤 ORCID login + web form
↓ POST to GitHub / Forgejo Issues
☁ Cloudflare Worker (or Deno / Hetzner)
↓ issue created (dual-write optional)
⚙ GitHub Actions / Woodpecker CI
↓ runs issues_to_datasets.py
🐢 RDF Turtle + JSON-LD → docs/fdp/
↓ git commit + push
🌐 GitHub Pages / Codeberg Pages → FDP URL
Key insight: GitHub Issues = structured, ORCID-attributed data store.
Git = version history. Pages = FDP server. No new infrastructure needed.
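The conversion step in the middle of this pipeline can be sketched as follows. This is not the actual issues_to_datasets.py; it assumes the issue body uses the `### Label` layout that issue forms typically produce, and the field names (`title`, `description`), the `dataset/<issue number>` IRI pattern, and both function names are hypothetical.

```python
import re


def parse_issue_form(body: str) -> dict[str, str]:
    """Split an issue body of '### Label' headings followed by values into a dict."""
    # re.split with one capture group yields [preamble, label1, value1, label2, ...]
    parts = re.split(r"^###[ \t]+(.+?)[ \t]*$", body, flags=re.MULTILINE)
    fields = {}
    for label, value in zip(parts[1::2], parts[2::2]):
        fields[label.strip().lower()] = value.strip()
    return fields


def escape_literal(text: str) -> str:
    """Escape characters that are not allowed raw inside a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def dataset_to_turtle(issue_number: int, fields: dict[str, str], base: str) -> str:
    """Render one dcat:Dataset; the IRI pattern is an assumption for illustration."""
    subject = f"<{base}dataset/{issue_number}>"
    lines = [
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"{subject} a dcat:Dataset ;",
        f'    dcterms:title "{escape_literal(fields["title"])}"@en ;',
        f'    dcterms:description "{escape_literal(fields["description"])}"@en .',
    ]
    return "\n".join(lines) + "\n"


sample_body = (
    "### Title\n\nRare disease cohort\n\n"
    "### Description\n\nCases collected at the session.\n"
)
fields = parse_issue_form(sample_body)
turtle = dataset_to_turtle(7, fields, "https://example.org/fdp/")
print(turtle)
```

Because each issue maps to one file, the Git history of the generated Turtle doubles as the change log for the dataset metadata.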
What you need to deploy
- 1. Fork StaticFDP/staticfdp on GitHub or Codeberg
- 2. Run bash scripts/setup.sh: choose GitHub, Codeberg, or both; the script sets fdp-config.yaml
- 3. Register an ORCID Member API app at orcid.org/developer-tools → get Client ID + Secret
- 4. Deploy the Cloudflare Worker: cd worker && npx wrangler deploy (or run infra/hetzner-deploy.sh on a Hetzner VPS for EU hosting)
- 5. Set 4 secrets via wrangler secret put: GITHUB_TOKEN · ORCID_CLIENT_ID · ORCID_CLIENT_SECRET · SESSION_SECRET (5 with FORGEJO_TOKEN, needed only for dual-write to Codeberg)
- 6. Enable GitHub Pages in repo Settings → Pages → Branch: main, Path: /docs
- 7. ✓ Done: every form submission creates a GitHub Issue; CI converts it to RDF within ~60 s
Minimal cost
- GitHub / Codeberg — free
- Cloudflare Workers free tier: 100k requests/day
- Hetzner CX11 (EU alt.) — €3.79 / month
- ORCID Member API — free for researchers
staticfdp-index
Layer 2 — FDP Index
FDPs register by opening an Issue or adding a YAML file. A CI pipeline harvests their catalogs daily and publishes a DCAT index catalog as static RDF + HTML.
How it works
📋 FDP operator opens "Register FDP" Issue
↓ YAML added to registered-fdps/
⏰ Daily schedule (or on new issue)
↓ CI triggers harvest_fdps.py
🔍 Fetches catalog.ttl from each FDP
↓ validates DCAT, extracts titles
🐢 Writes docs/fdp-index/index.ttl + index.jsonld
↓ git commit + push
🌐 Static index live on GitHub / Codeberg Pages
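The harvest loop above can be sketched like this. It is not the actual harvest_fdps.py: the `state` values, the entry layout, and the regex-based title extraction are assumptions for illustration, and a real harvester would more likely parse the Turtle with an RDF library. The fetch function is passed in so the loop can be exercised without network access.

```python
import re
import urllib.request
from datetime import datetime, timezone
from typing import Callable

# Crude stand-in for RDF parsing: pulls the string value of dcterms:title triples.
TITLE_RE = re.compile(r'dcterms:title\s+"((?:[^"\\]|\\.)*)"')


def fetch_url(url: str) -> str:
    """Download a catalog over HTTP(S); used when running against live FDPs."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def harvest(registrations: list[dict], fetch: Callable[[str], str]) -> list[dict]:
    """Fetch each registered catalog and record a health state plus titles."""
    harvested_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    entries = []
    for reg in registrations:
        entry = {
            "catalog_url": reg["catalog_url"],
            "title": reg["title"],
            "harvested_at": harvested_at,
        }
        try:
            turtle = fetch(reg["catalog_url"])
        except Exception as exc:  # one unreachable FDP must not abort the run
            entry.update(state="unreachable", error=str(exc), harvested_titles=[])
        else:
            entry.update(state="active", harvested_titles=TITLE_RE.findall(turtle))
        entries.append(entry)
    return entries


def fake_fetch(url: str) -> str:
    """Offline stand-in for fetch_url, used only for this demonstration."""
    if "down" in url:
        raise OSError("connection refused")
    return '<https://example.org/fdp/> dcterms:title "My FAIR Data Point"@en .'


entries = harvest(
    [
        {"title": "Example FDP", "catalog_url": "https://example.org/fdp/catalog.ttl"},
        {"title": "Offline FDP", "catalog_url": "https://down.example.org/fdp/catalog.ttl"},
    ],
    fake_fetch,
)
for entry in entries:
    print(entry["title"], entry["state"], entry["harvested_titles"])
```

Recording a failed fetch as a state instead of raising keeps the daily run going and gives the index the health information it stores per FDP.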
What the index catalog contains
- One dcat:Dataset per registered FDP
- Title, description, landing page, catalog URL
- Last harvest timestamp
- dcat:Distribution pointing to each FDP's catalog.ttl
What you need to deploy
- 1. Fork StaticFDP/staticfdp-index on GitHub or Codeberg
- 2. Run bash scripts/setup.sh: choose platform, set title + publisher
- 3. Enable GitHub Pages (branch: main, path: /docs) and / or Codeberg Pages in repo settings
- 4. Set GITHUB_TOKEN secret for the Actions runner to commit generated files (fine-grained PAT: Contents read + write)
- 5. Add FDPs to register: create registered-fdps/my-fdp.yaml with catalog_url, title, description
- 6. ✓ Done: harvest runs daily at 04:25 UTC; trigger manually via workflow_dispatch any time
# registered-fdps/ga4gh-rdp.yaml
title: "GA4GH Rare Disease Trajectories"
catalog_url: "https://fdp.semscape.org/…/fdp/catalog.ttl"
landing_page: "https://fdp.semscape.org/…/"
description: "Rare disease cases from GA4GH BYOD 2026"
contact: "0000-0001-9773-4008"
No server needed. The index is rebuilt by CI and served as static files —
same principle as the FDP itself.
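A registration file like the one above can be checked before it is harvested. The sketch below is a hypothetical pre-flight check, not part of staticfdp-index: it treats catalog_url, title, and description as required (the three keys named in the deploy steps) and uses a tiny parser that only understands flat `key: "value"` lines, where a real pipeline would probably use a YAML library.

```python
import re

REQUIRED_KEYS = ("catalog_url", "title", "description")
LINE_RE = re.compile(r'^([A-Za-z_][A-Za-z0-9_]*):\s*"?(.*?)"?\s*$')


def parse_flat_yaml(text: str) -> dict[str, str]:
    """Parse flat key/value lines; blank lines and # comments are skipped."""
    data = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match:
            data[match.group(1)] = match.group(2)
    return data


def validate_registration(data: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    problems = [f"missing required key: {key}" for key in REQUIRED_KEYS if not data.get(key)]
    url = data.get("catalog_url", "")
    if url and not url.startswith(("https://", "http://")):
        problems.append("catalog_url must be an http(s) URL")
    return problems


# Illustrative file content with placeholder URLs.
sample = '''# registered-fdps/my-fdp.yaml
title: "My FAIR Data Point"
catalog_url: "https://example.org/fdp/catalog.ttl"
landing_page: "https://example.org/fdp/"
description: "Example registration"
'''

registration = parse_flat_yaml(sample)
print(validate_registration(registration))
```

Running a check like this in CI on every pull request would reject malformed registrations before the daily harvest sees them.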
Real-world FDP Indexes
- index.vp.erdera.org — ERDERA rare disease community index (Horizon Europe)
- index.fairdatapoint.org — public index, reference implementation
- Health-RI National Catalogue harvests from per-institution FDPs across the Netherlands
staticfdp-vp
Layer 3 — Virtual Platform
Multiple FDP Indexes are aggregated into one federation graph. Data stays at source — the VP only holds pointers. Rebuilt by CI on a schedule.
How it works
📋 Index operator opens "Register FDP Index" Issue
↓ YAML added to registered-indexes/
⏰ Daily schedule (or on new issue)
↓ CI triggers build_vp.py
🔍 Fetches index.ttl from each FDP Index
↓ counts FDPs, merges catalogs
🐢 Writes docs/vp/federation.ttl + federation.jsonld
↓ git commit + push
🌐 Federation graph live on GitHub / Codeberg Pages
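The "counts FDPs, merges catalogs" step can be sketched as below. This is not the actual build_vp.py; it assumes each index has already been fetched and reduced to a list of FDP IRIs, and the function name and output layout are hypothetical. Only pointers are merged, so no dataset content is copied into the federation.

```python
def build_federation(indexes: dict[str, list[str]]) -> dict:
    """Merge per-index FDP lists into one federation view.

    `indexes` maps an index URL to the FDP IRIs harvested from that index.
    """
    fdp_sources: dict[str, list[str]] = {}
    fdp_counts: dict[str, int] = {}
    for index_url, fdp_iris in indexes.items():
        unique_iris = list(dict.fromkeys(fdp_iris))  # de-duplicate, keep order
        fdp_counts[index_url] = len(unique_iris)
        for iri in unique_iris:
            fdp_sources.setdefault(iri, []).append(index_url)
    return {
        "index_count": len(indexes),
        "fdp_counts": fdp_counts,    # FDPs per index
        "fdp_total": len(fdp_sources),  # unique FDPs across the federation
        "fdps": fdp_sources,         # FDP IRI -> indexes that list it
    }


# Illustrative input: two indexes that share one FDP.
indexes = {
    "https://example.org/rd-index/index.ttl": [
        "https://hospital-a.example.org/fdp/",
        "https://registry-b.example.org/fdp/",
    ],
    "https://example.org/other-index/index.ttl": [
        "https://registry-b.example.org/fdp/",
        "https://lab-c.example.org/fdp/",
    ],
}

federation = build_federation(indexes)
print(federation["fdp_total"], "unique FDPs across", federation["index_count"], "indexes")
```

Keying the merged view by FDP IRI means an FDP registered with several indexes appears once in the federation, with all of its source indexes recorded.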
In practice (rare disease example)
- Hospital A FDP registers with Rare Disease Index
- Registry B FDP registers with Rare Disease Index
- Rare Disease Index registers with the VP
- Researcher queries VP federation.ttl → gets all FDP IRIs
- Downloads distributions directly from A and B — no central warehouse
What you need to deploy
- 1. Fork StaticFDP/staticfdp-vp on GitHub or Codeberg
- 2. Run bash scripts/setup.sh: choose platform, set title + publisher
- 3. Enable GitHub Pages (branch: main, path: /docs) and / or Codeberg Pages
- 4. Set GITHUB_TOKEN secret (Contents read + write)
- 5. Add FDP Indexes: create registered-indexes/my-index.yaml with index_url, title, description
- 6. ✓ Done: aggregation runs daily at 04:40 UTC
# registered-indexes/rare-disease.yaml
title: "Rare Disease Community FDP Index"
index_url: "https://example.org/fdp-index/index.ttl"
landing_page: "https://example.org/fdp-index/"
description: "Index for rare disease FDPs worldwide"
contact: "maintainer@example.org"
ERDERA Virtual Platform, the largest live deployment: vp.erdera.org aggregates rare disease FDPs across 170+ partners in 37 countries.
Funded by Horizon Europe (grant N°101156595), coordinated by INSERM. Successor to the EJP-RD Virtual Platform.
Real-world VPs
- vp.erdera.org — ERDERA RD Discovery Portal (EU, Horizon Europe)
- ELIXIR national nodes federation (Health-RI, BBMRI-ERIC, EMBL-EBI)
- GA4GH Data Connect + Beacon v2 share the same "data stays at source" principle
The Full Ecosystem Stack
Three independent deployments that compose into a federated infrastructure —
each with its own GitHub + Codeberg mirror.
- FDP (ref. impl): SIB/SPHN, fdp.dcc.sib.swiss
- FDP (ref. impl): Health-RI NL, catalogus.healthdata.nl
- FDP (any impl): your institution, fork staticfdp
- staticfdp: GA4GH session, fdp.semscape.org
↓ register + harvest
FDP Index (ref. impl / staticfdp-index)
- ERDERA Rare Disease Index: index.vp.erdera.org, harvests 100s of FDPs across ERNs
- index.fairdatapoint.org
- … other community indexes
↓ register + aggregate
Virtual Platform (ref. impl / staticfdp-vp)
- vp.erdera.org: ERDERA RD Discovery Portal · 170+ partners, 37 countries, Horizon Europe
GitHub: Pages + Actions · github.com/StaticFDP
Codeberg: Pages + Woodpecker CI · codeberg.org/StaticFDP
StaticFDP vs Reference Implementation
Same metadata model — different infrastructure philosophy. Choose based on your needs.
When to use StaticFDP
- No server / DevOps budget
- Community-contributed data (forms + ORCID auth)
- Want Git-native version history
- Short-term event / session FDP
- EU or US jurisdiction choice matters
- Biohackathon / research group
When to use reference impl
- Need dynamic REST API
- Data steward web UI required
- Real-time SHACL validation
- Built-in SPARQL endpoint
- Production hospital / national node
| Feature | Reference impl (FAIRDataTeam) | staticfdp | staticfdp-index | staticfdp-vp |
| --- | --- | --- | --- | --- |
| DCAT + FDP-O metadata | ✓ | ✓ | ✓ | ✓ |
| Content negotiation | Dynamic | Static files | Static files | Static files |
| SHACL validation | On write | On CI | On CI | On CI |
| SPARQL endpoint | Built-in | External demo | none | none |
| Data submission UI | Admin only | Public + ORCID | Issue template | Issue template |
| Version history | Internal DB | Git | Git | Git |
| FDP Index registration | Auto on startup | GitHub Action | Is the index | n/a |
| Hosting cost | Server required | Free | Free | Free |
| EU hosting option | Deploy anywhere | Codeberg / Hetzner | Codeberg | Codeberg |
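The "Content negotiation: Static files" row means a static host cannot pick a serialisation from the Accept header, so the client has to choose the file itself. A minimal sketch of that client-side choice follows; the mapping reuses the .ttl and .jsonld file names mentioned for the index, while the helper names and the Turtle default are assumptions.

```python
# Media type -> file extension, matching the static files published by CI.
STATIC_FORMATS = {
    "text/turtle": ".ttl",
    "application/ld+json": ".jsonld",
}


def parse_accept(header: str) -> list[str]:
    """Return media types from an Accept header, highest q-value first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by original position.
        ranked.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def static_url(base_url: str, accept: str, default: str = "text/turtle") -> str:
    """Map the preferred supported media type to a static file URL."""
    for media_type in parse_accept(accept):
        if media_type in STATIC_FORMATS:
            return base_url + STATIC_FORMATS[media_type]
    return base_url + STATIC_FORMATS[default]


base = "https://example.org/fdp-index/index"  # illustrative base URL
print(static_url(base, "application/ld+json;q=0.9, text/turtle"))
print(static_url(base, "application/ld+json"))
```

A dynamic FDP does this selection on the server; with static files the same logic moves into the client or into a thin proxy in front of Pages.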
Live reference deployment
Deploy your own & access the data
Fork any of the three repos, run setup.sh, and your FAIR Data Point infrastructure
is live in minutes.
# Python — load the reference deployment
from rdflib import Graph
g = Graph()
g.parse("https://fdp.semscape.org/ga4gh-rare-disease-trajectories/fdp/catalog.ttl")
print(len(g), "triples")