This project will deliver scientific use cases demonstrating how Earth Science collection data can be integrated into the EOSC Federation through the GeoCASe service and the EOSC Data Terra Node. Natural history collections curate extensive geological and palaeontological specimens that document past climates, environmental change, natural hazards, and biodiversity dynamics across deep time. Despite their relevance for contemporary environmental research, these datasets remain underrepresented in EOSC and insufficiently connected to Earth Observation infrastructures.
The project will enhance the FAIRness and interoperability of selected GeoCASe datasets by improving key metadata fields, including stratigraphic age, georeferencing, and provenance, and by aligning them with recognised community standards. Two cross-domain use cases will demonstrate integration: (1) palaeoclimate reconstruction using fossil pollen archives, and (2) the use of stratigraphically constrained geological specimens in hazard and environmental change research.
By linking deep-time specimen records with Data Terra’s present-day environmental observations, the project will extend the temporal depth of Earth Observation systems. This integration supports improved environmental monitoring, calibration of climate and ecosystem models, and more robust hazard assessment within EOSC.
CETAF (Consortium of European Taxonomic Facilities) is a European network of natural history collection-holding institutions across 27 countries. Through its Earth Science Group (ESG), CETAF coordinates collaboration among institutions holding geological and palaeontological collections of high scientific and societal relevance. One of its core services, the Geoscience Collection Access Service (GeoCASe), aggregates geological and palaeontological specimen-level data from participating institutions, enabling cross-institutional discovery.
However, these datasets are not yet fully interoperable with EOSC services and remain largely disconnected from modelling and Earth Observation workflows. Key descriptors such as geological age, stratigraphic terminology, georeferencing, and taxonomic attribution are unevenly recorded across institutions, limiting reuse and cross-domain interoperability. This project addresses that gap directly.
Working in collaboration with the Data Terra Node, the project deploys two scientific use cases demonstrating how Earth Science collection data can be integrated into the EOSC Federation, with a focus on palaeoenvironmental and stratigraphically constrained material that provides crucial temporal depth to modern Earth Observation data.
The first use case centres on Holocene fossil pollen records (roughly 11,700 years BP to present), aggregated and quality-controlled by Tallinn University of Technology (TalTech). These well-dated records from central and northern Europe document plant community composition and dispersal through time at sub-centennial resolution. Because fossil taxa can be linked to their Nearest Living Relatives, the records can be compared directly with modern plant distributions and environmental datasets, enabling reconstruction of vegetation dynamics and derivation of quantitative climate parameters such as temperature and precipitation. This dataset contributes to the EU Horizon Europe “Past2Future” project, which aims to improve climate model performance by complementing modern observations with high-resolution palaeoecological data.
The second use case focuses on stratigraphically constrained mineral, rock, or fossil material relevant to solid Earth applications, including specimens documenting past hazard events or geological processes. The final dataset will be defined in consultation with Data Terra partners to ensure alignment with solid Earth priorities.
Across both use cases, the project will enhance metadata fields critical to environmental interpretation, including chronological information, stratigraphic descriptors, and refined locality data. These fields will be aligned with community standards such as ABCD-EFG and assessed against the FAIR principles to improve machine-readability and interoperability with the Data Terra Node.
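As an illustration of how such a metadata assessment might be automated, the following minimal Python sketch reports which of the critical fields are missing from a specimen record and how complete each field is across a dataset. The record structure, the field names, and the choice of required fields are hypothetical and chosen for readability; they are not ABCD-EFG element names and do not represent the project's actual workflow.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SpecimenRecord:
    """Hypothetical, simplified specimen record (not an ABCD-EFG structure)."""
    specimen_id: str
    stratigraphic_age: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    locality: Optional[str] = None
    provenance: Optional[str] = None


# Assumed set of fields treated as critical for environmental interpretation.
REQUIRED_FIELDS = ("stratigraphic_age", "latitude", "longitude",
                   "locality", "provenance")


def missing_fields(record: SpecimenRecord) -> List[str]:
    """Return the names of required fields that are absent or blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = getattr(record, name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing


def has_valid_coordinates(record: SpecimenRecord) -> bool:
    """Check that both coordinates are present and within valid ranges."""
    if record.latitude is None or record.longitude is None:
        return False
    return (-90.0 <= record.latitude <= 90.0
            and -180.0 <= record.longitude <= 180.0)


def completeness(records: List[SpecimenRecord]) -> Dict[str, float]:
    """Return, per required field, the share of records that fill it."""
    if not records:
        return {}
    total = len(records)
    return {
        name: sum(name not in missing_fields(r) for r in records) / total
        for name in REQUIRED_FIELDS
    }
```

A report of this kind, run before and after curation, would give a simple indication of where enhancement effort is most needed. It covers only field presence and coordinate ranges; checks against controlled stratigraphic vocabularies or the full FAIR criteria would need additional logic.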
Key outputs include documented EOSC-aligned use cases integrated with Data Terra; an analysis of metadata requirements for modelling and environmental research; a demonstrated linkage of specimen data with Data Terra domain datasets; practical recommendations for standards alignment; the positioning of GeoCASe as a service within the Data Terra Node; and training materials contributed openly to the EOSC Academy.
Capacity building runs throughout the project, with workshops and webinars on metadata enhancement, standards application, and FAIR alignment, delivered in collaboration with CETAF DEST and DiSSCo. All documentation and workflows will be openly published to support sustainable integration of Earth Science collections within the EOSC Federation.
