Relevance
This paper is one of the deliverable reports from the EOSC Association’s Long-Term Data Retention (LTDR) Task Force, focusing on the current needs and challenges related to long-term data retention. It provides an overview of the main issues faced by a wide range of stakeholders and communities regarding data retention, appraisal, and reappraisal, along with proposed solutions or recommendations addressing each identified challenge.
Scope
In total, nine key challenges are described in this report, based on the discussions and efforts of the Task Force. Several potential solutions or recommendations are provided for each identified challenge, encouraging relevant stakeholders and communities to consider them in their future practices.
Cross-links between challenges have also been observed, as some challenges may contribute to issues described in others, and certain solutions may apply to multiple challenge areas.
The target audience includes professionals working in data curation, retention, and long-term preservation across the research, education, government, and industry sectors.
Main highlights
Research data are often stored across a diverse and fragmented landscape of repositories and storage solutions, including institutional repositories, national archives, domain-specific repositories, and commercial cloud services. These repositories can differ significantly in their funding models, governance structures, and technological infrastructure, which may lead to inconsistencies in accessibility, metadata standards, data quality, and long-term sustainability. In this report, members of the LTDR Task Force have collaboratively identified nine key challenges. These, along with proposed solutions, are presented below:
- Challenge 1: Funding constraints
- Challenge 2: Lack of expertise
- Challenge 3: Discipline- and content-specific challenges
- Challenge 4: Lack of standards, guidelines and policies
- Challenge 5: Understanding the value of data holdings
- Challenge 6: Technological challenges
- Challenge 7: Legal challenges
- Challenge 8: Ethical challenges
- Challenge 9: Trust
The report offers potential solutions and recommendations for each of the nine identified challenges.
Key recommendations
We have identified several recurring factors across different issues. These include gaps in funding and policy, limited awareness and understanding of data retention, and the lack of clear standards and guidelines. These issues are often complex and not easily addressed through single, one-size-fits-all recommendations. Instead, they require further investment in understanding how to better engage the relevant stakeholders to support data preservation activities.
Recommendation for funding constraints:
Centralised funding mechanisms could be one potential solution, particularly for the preservation of datasets originally generated through public funding. Similarly, the development of centrally funded digital preservation infrastructures may further strengthen and sustain long-term data retention practices.
Recommendations for limited awareness and understanding of data retention:
Develop strategies and guidelines to encourage organisations and institutions to support and promote the work of future researchers by embedding data retention and stewardship practices into everyday research workflows.
Specify workflows and rules for the data life cycle, and help identify which data to retain and in which formats to maintain and store them.
Recommendation for the lack of standards and guidelines for data retention:
Specifying recommended retention times and defining discipline-specific reappraisal triggers help ensure that digital assets remain relevant, accessible, and cost-efficient over time. Establishing clear guidelines not only supports compliance with ethical and legal requirements but also helps institutions make informed decisions about storage, curation, and eventual deaccessioning. Regular reappraisal cycles further help preservation strategies adapt to evolving scientific priorities, technological changes, and community needs.
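As a purely illustrative sketch of how a recommended retention time and a regular reappraisal cycle might be written down in machine-checkable form, the Python example below defines a small rule record and two date checks. The class `RetentionRule`, the helper functions, and all numeric values are hypothetical and are not taken from the report; real retention periods and triggers would need to be set per discipline and per legal context. In this sketch, an expired retention period is treated only as a prompt for review, not as an automatic deaccessioning decision.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    """Hypothetical retention rule for one discipline (illustrative values only)."""
    discipline: str
    retention_years: int             # recommended minimum retention time
    reappraisal_interval_years: int  # length of the regular reappraisal cycle


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later, mapping 29 Feb to 28 Feb if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return d.replace(year=d.year + years, day=28)


def reappraisal_due(rule: RetentionRule, last_appraised: date, today: date) -> bool:
    """True once a full reappraisal interval has passed since the last appraisal."""
    return today >= add_years(last_appraised, rule.reappraisal_interval_years)


def retention_period_elapsed(rule: RetentionRule, deposited: date, today: date) -> bool:
    """True once the recommended retention time has passed since deposit.

    This only flags the dataset for review; it does not decide deaccessioning.
    """
    return today >= add_years(deposited, rule.retention_years)


if __name__ == "__main__":
    # Assumed example values, not recommendations from the report.
    rule = RetentionRule("example-discipline", retention_years=10,
                         reappraisal_interval_years=5)
    print(reappraisal_due(rule, date(2020, 1, 1), date(2025, 1, 1)))
    print(retention_period_elapsed(rule, date(2020, 1, 1), date(2025, 1, 1)))
```

Time-based checks like these cover only the regular cycle; discipline-specific triggers such as format obsolescence or changed legal requirements would need additional, separately defined conditions.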