DDI-CDI: Optimising Your Data Description for Integration and Reuse

The goal of this workshop is to explain the mechanism employed by DDI-CDI and how it can most easily be leveraged to enhance the reusability of research data. DDI-CDI is a model-based, platform- and technology-independent specification designed to supplement the metadata holdings of data disseminators, archives, and producers. By allowing for an expression of structural metadata, with references to external controlled vocabularies and ontologies, and by connecting metadata records intended for discovery, provenance, and process description, it can act as a connector format which is independent of domain standards. Typically, it can be produced in a programmatic fashion from existing metadata records held in more domain-specific models, although it can also be used as a stand-alone specification. It supports granular, machine-actionable description of a wide variety of data, from traditional wide data files to event/streaming data to key-value (“big”) data and multidimensional cubes.
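
As a rough illustration of why a structure-independent description is useful, the sketch below shows the same observations held in a wide layout and in a key-value layout. It is plain Python and does not use any DDI-CDI syntax or tooling; the variable names (`person_id`, `age`, `income`) and both helper functions are hypothetical and chosen only for this example.

```python
# Hypothetical records in a traditional "wide" layout: one row per unit,
# one column per variable. The names are illustrative assumptions only.
wide_rows = [
    {"person_id": 1, "age": 34, "income": 52000},
    {"person_id": 2, "age": 41, "income": 61000},
]


def wide_to_key_value(rows, identifier):
    """Unpivot wide rows into (unit, variable name, value) triples."""
    triples = []
    for row in rows:
        for name, value in row.items():
            if name != identifier:
                triples.append((row[identifier], name, value))
    return triples


def key_value_to_wide(triples, identifier):
    """Rebuild wide rows from (unit, variable name, value) triples."""
    rows = {}
    for unit, name, value in triples:
        rows.setdefault(unit, {identifier: unit})[name] = value
    return list(rows.values())


key_value = wide_to_key_value(wide_rows, "person_id")

# The two layouts carry the same information, so the round trip is lossless.
assert key_value_to_wide(key_value, "person_id") == wide_rows
```

In both layouts `age` and `income` are the same variables and `person_id` plays the same identifying role; only the physical arrangement differs. A description that records those roles separately from the layout can therefore cover either form.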

This workshop will present an overview followed by a series of worked examples, each exploring different types of implementation and features of the standard. The intent is to go beyond an overview and help participants understand not only what DDI-CDI is intended to do, but also how it works to complement other popular metadata models and standards. Different syntax representations of the standard will be discussed.

DDI has long published metadata standards for the social, economic and behavioural sciences. These are widely used among data producers and archives, including those in the CESSDA network, such as the Swedish National Data Service, the UK Data Archive, Sciences Po, GESIS, Sikt (the Norwegian Agency for Shared Services in Education and Research) and many more. DDI-CDI represents an evolution reflecting the growing importance of cross-disciplinary research and the requirement for data services to describe new types of data coming from other domains. The result is a specification which can describe any data in a domain-agnostic fashion and is useful within domains for which other DDI specifications are not relevant. Because of this domain independence, it has become central to the WorldFAIR project work on the Cross-Domain Interoperability Framework.

The workshop is divided into two parts. Each part comprises three topics, each structured around a presentation and discussion.

Target Audience:

This workshop is intended to be useful to both technical and operational staff working in organisations which produce, archive, integrate, and disseminate quantitative research data, regardless of domain orientation. It is intended to address questions about what the practical implementation of systems supporting the FAIR principles will look like, and will appeal to infrastructure players who are concerned with broadening and deepening the reusability of their data holdings through enhanced data and provenance description.

Part One: FAIR Functional Drivers and Requirements

13:00-13:30: The Variable Cascade: concepts, measures and observations.
13:30-14:00: Data Structures: the roles of concepts and variables.
14:00-14:30: Provenance: connecting data through process.

14:30-15:00 UTC: Break

Part Two: System Functions and Supporting Standards

15:00-15:30: Data integration across domains and structures.
15:30-16:00: Process description and alignment with PROV.
16:00-16:30: DDI-CDI as the connection point for a set of related specifications (CDIF example).

Organisational Note:

The workshop will be recorded and the recordings will be made available via CODATA Vimeo. If you plan to attend the event virtually, kindly note the Data Statement for CODATA Zoom at: https://drive.google.com/file/d/1QdZMRNs9h3Md4ArIiJepR15f3MLOYros/view?usp=sharing

All attendees, onsite and online are expected to comply with the CODATA Code of Conduct: https://codata.org/about-codata/codata-policies-and-guidelines/code-of-conduct/