Data are the new oil, and large volumes of new data are produced every day in research laboratories. Aligning how these data are managed raises challenges, because individual researchers, funders and research organisations have different needs and requirements. The ‘Practical Guide to the International Alignment of Research Data Management’, published by Science Europe, sets out the core requirements for the preparation of Data Management Plans (DMPs) and criteria for the selection of trustworthy repositories. The document was launched in Brussels on 29 January 2019.
“The guide is designed to aid researchers in complying with Research Data Management (RDM) requirements even when working with different research funders and research organisations. Alignment is especially important in light of the development of the European Open Science Cloud (EOSC) and an increasing tendency, in the research community, towards data sharing,” explained Professor Stan Gielen, president of the Netherlands Organisation for Scientific Research, member of the Science Europe Governing Board, and co-initiator of this project. The guide has been developed with contributions from several Science Europe Member Organisations and in consultation with relevant stakeholders. The goal, said Stephan Kuster, Secretary General of Science Europe, is to encourage the wider research community to use it as a basis for setting up their own DMP templates. “At a later stage it can also serve as reference document for the evaluation of DMPs,” he added.
The importance of the quality of data
High-quality research data are fundamental to building a robust framework that supports the re-use of data. The FAIR Data Principles (Findable, Accessible, Interoperable, Re-usable) have become the reference for the management of permanently, publicly, and freely available data. Access may be limited in some instances because of confidentiality requirements on specific projects or for privacy reasons. A balanced approach to the openness of research data is therefore needed, and Science Europe’s guidance offers further insight into how to structure a Research Data Management system.
Every funding or research organisation should consider the core requirements when building its own data management plan, as they help to address all aspects, from data generation and collection to storage, from the very beginning of the research project. The guidance also provides a detailed and reasoned DMP template as a way to build a shared vision on the issue. This is no minor point, given how fragmented data management policies are across the grant requirements and institutional policies of different funding organisations. The Annex to the guidance provides a table comparing the proposed criteria with the FAIR Data Principles.
The core data principles
Science Europe’s experts have developed a concise list of six core topics that should always be considered when setting up a DMP, each further detailed by questions that guide the analysis. The first is data description and the collection or re-use of existing data, followed by documentation and the quality measures connected with the generation and transmission of data. No less important are the storage and backup measures put in place to secure data during research activities, and the codes of conduct researchers have to follow to meet the legal and ethical requirements linked to the management of data. Re-use of research data also calls for a clear policy on data sharing and long-term preservation, which should cover the software used to store and access data, the choice of data repositories or archives, and the application of unique identifiers (e.g. the Digital Object Identifier). Finally, the DMP should clearly specify the responsibilities for data management and the resources (financial and time) available to run the system according to the FAIR principles.
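As an illustration only, the six topics could be turned into a simple completeness check for a draft DMP. The sketch below is not part of the Science Europe guide: the short keys, the `missing_topics` helper and the sample answers are hypothetical names chosen for this example, and the topic labels paraphrase the summary above.

```python
# Six core topics as summarised in this article. The short keys are
# hypothetical labels for this sketch, not official identifiers.
CORE_TOPICS = {
    "description": "Data description and collection or re-use of existing data",
    "documentation": "Documentation and data quality",
    "storage": "Storage and backup during the research process",
    "legal_ethical": "Legal and ethical requirements, codes of conduct",
    "sharing": "Data sharing and long-term preservation",
    "responsibilities": "Data management responsibilities and resources",
}


def missing_topics(dmp_answers):
    """Return the keys of core topics that have no non-empty answer."""
    return [
        key for key in CORE_TOPICS
        if not dmp_answers.get(key, "").strip()
    ]


# Hypothetical draft plan: two topics are still unanswered.
draft = {
    "description": "Survey data collected in 2019, CSV format.",
    "documentation": "README and codebook stored with the data.",
    "storage": "Institutional server with nightly backup.",
    "legal_ethical": "",
    "sharing": "Deposit in a certified repository with a DOI.",
}

for key in missing_topics(draft):
    print(f"Missing: {CORE_TOPICS[key]}")
```

A check of this kind only confirms that each topic has been addressed; judging whether the answers are adequate remains a matter for the guiding questions in the guide.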
How to choose a trustworthy repository
A prerequisite for the re-use of data is their storage in a trustworthy repository, which can be accessed under specific conditions by parties interested in using the data for further research. According to Science Europe, there is currently no generally accepted list of such repositories. More than 2,000 repositories are listed in general registries, and it can be difficult to assess their real maturity and trustworthiness because, for example, they have not been certified by a certification body. Many discipline-specific repositories use their own data management standards, typical of the discipline in question, while others are based on more general criteria.
The Science Europe guide points to the value of pursuing certification. A broadly recognised discipline-specific or certified repository should always be the first choice when deciding where to store data. The guide also provides a set of four criteria (and related guiding questions) to help identify trustworthy repositories.
The first criterion to be met is the provision of Persistent and Unique Identifiers (PIDs), which allow for the correct identification, search and retrieval of data and support data versioning. Metadata allow data and other related information to be found and referenced, and provide publicly available information also for non-published, protected, retracted, or deleted data. Metadata standards should be broadly accepted by the scientific community, and metadata should be machine-retrievable. Persistence of data and metadata should also be guaranteed so that the information is preserved. All policies and plans for data management should be transparent, including governance, financial sustainability, retention period, and continuity plan.
Access to data should be provided under well-specified conditions. It is important to always ensure data authenticity and integrity, while allowing for their retrieval. Information about licensing and permissions should always be provided, ideally in a machine-readable form. Needless to say, confidentiality and respect for the rights of data subjects and creators must always be guaranteed.
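The criteria described above lend themselves to a simple screening checklist. The following is a minimal sketch, assuming each criterion can be reduced to a yes/no answer, which is a simplification of the guide's guiding questions. The `RepositoryProfile` class, its field names and the example repository are hypothetical and do not come from the guide.

```python
from dataclasses import dataclass, fields


@dataclass
class RepositoryProfile:
    """Yes/no answers for one repository (hypothetical field names)."""
    name: str
    provides_persistent_identifiers: bool
    uses_accepted_machine_retrievable_metadata: bool
    guarantees_persistence_with_transparent_policies: bool
    states_access_conditions_and_licences: bool


def unmet_criteria(repo):
    """Return the names of the criteria the repository does not meet."""
    return [
        f.name for f in fields(repo)
        if f.name != "name" and not getattr(repo, f.name)
    ]


# Hypothetical repository used only to show the check in action.
candidate = RepositoryProfile(
    name="example-repository",
    provides_persistent_identifiers=True,
    uses_accepted_machine_retrievable_metadata=True,
    guarantees_persistence_with_transparent_policies=False,
    states_access_conditions_and_licences=True,
)

gaps = unmet_criteria(candidate)
if gaps:
    print(f"{candidate.name} needs review: {', '.join(gaps)}")
else:
    print(f"{candidate.name} meets all four criteria in this sketch")
```

A repository with no gaps in such a checklist is not thereby certified; the sketch only records where closer inspection is needed.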
GDPR and health scientific research
Data management is also on the agenda of the European Federation of Pharmaceutical Industries and Associations (EFPIA), which in October 2018 held a workshop on the impact of the General Data Protection Regulation (GDPR) on health scientific research. Widespread uncertainty about the rules to be applied was one of the main issues that emerged from the workshop. According to a post published on EFPIA’s website and signed by Brendan Barnes, the application of the GDPR is making clinical research unnecessarily complex, with the risk of missing opportunities to re-use data for valid research projects.
EFPIA thus asks for “clarity, direction and dialogue”. Central to the debate is the legal basis of the informed consent that patients should provide in order for data to be re-used in further, secondary research activities. In this regard, some call for consent to be dissociated from the legal basis for processing data obtained through clinical trials, so as to make their re-use simpler. In EFPIA’s view, this would not mean “renouncing accountability to research participants. Rather, the GDPR offers a new approach to accountability requiring anyone processing personal data to reflect on its basis, be purposeful, transparent, document and demonstrate compliance”. A supportive political context that would enable collective pan-European solutions for health research is still lacking, says Brendan Barnes in the post.