1.9 What legislation is relevant to data archiving?

In European data archives, the archiving process is regulated by both national laws and international regulations.

The main legal issues in European data archives are intellectual property rights (e.g., copyright law) and personal data.


Copyright is an internationally recognised form of intellectual property right, which arises automatically as a result of original work such as research. It does not need to be registered to apply to a piece of work.

The copyrighted output from the research could include spreadsheets (and other forms of originally selected and organised data), publications, reports and computer programs. Copyright does not cover the underlying facts, ideas or concepts, but only the particular way in which the research outputs have been expressed. The right lies with the author of the work or with their institution. Organisations have different policies on intellectual property; see Copyright (CESSDA Training Team 2017-2022).

One role of a data archivist is to ensure that rights issues are resolved before accepting the data and documentation. Often, the data depositor is the owner of all deposited materials. However, research that combines disciplines or approaches can produce a complex set of research materials with several authors and therefore several rights holders. In this case, all parties involved must permit the storing and sharing of the data and/or other research outputs.

A common misunderstanding is that material can be archived simply because it is openly available (e.g., on the internet). Archiving may be possible, but public material can still be under copyright, in which case the owner has to agree to the archiving of the data.

More information and useful tips are available in the chapter 'Copyright' of the Data Management Expert Guide (CESSDA Training Team 2017-2022).

Legal questions when handling personal data

The General Data Protection Regulation of the European Union (Regulation (EU) 2016/679, GDPR) (GDPR.eu, n.d.) includes rules that organisations must follow to protect the personal information they collect. Archives must adhere to data protection requirements when managing or sharing personal data. Personal data are defined within the legislation as ‘any information relating to an identified or identifiable natural person’, whereby the person can be identified directly or indirectly (GDPR.eu 2019). Moreover, the GDPR applies only to living persons.

Personal data must be processed in accordance with six principles (CESSDA Training Team 2017-2022): lawfulness, fairness and transparency; purpose limitation; data minimisation; accuracy; storage limitation; and integrity and confidentiality. Archives typically handle two types of personal data that need to be addressed legally: (1) direct or indirect identifiers of research participants within the research data itself and (2) personal data of users that are collected and processed when they access or use archival services (e.g. names of depositors and secondary users of research data).

Personal data in research data

The GDPR contains an exemption under which some of the six principles apply slightly differently when personal data are collected and processed for research purposes. This is called the 'research exemption'.

Since the GDPR applies only to personal data, the first question to always ask is: Are personal data being processed in the study? If the answer is no, then the GDPR does not apply.

Personal data according to the GDPR:

Personal data means any information relating to an identified or identifiable natural person ('data subject')

'any information':

  • Objective or subjective
  • Any format
  • Accurate or inaccurate

'relating to':

  • Directly relating (including name)
  • Indirectly relating (not primary aim)

'an identified or identifiable':

  • Identified: directly differentiating one from the other
  • Identifiable: indirectly identifying by combining multiple information sources

'natural person':

  • Alive
  • Not including 'legal persons' (i.e., companies)
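The four elements of the definition can be read as a simple screening question. The following is a minimal sketch in Python, assuming a hypothetical `InformationItem` record whose field names are invented here for illustration; it only mirrors the bullet points above and is not a substitute for a legal assessment.

```python
from dataclasses import dataclass


@dataclass
class InformationItem:
    """Hypothetical description of one piece of information in a study."""
    relates_to_natural_person: bool   # 'relating to' a person, directly or indirectly
    person_is_alive: bool             # 'natural person': alive, not a company
    identifies_directly: bool         # 'identified', e.g. by name
    identifiable_by_combination: bool # 'identifiable' by combining sources


def is_personal_data(item: InformationItem) -> bool:
    """Return True if the item matches the definition as summarised above."""
    identified_or_identifiable = (
        item.identifies_directly or item.identifiable_by_combination
    )
    return (
        item.relates_to_natural_person
        and item.person_is_alive
        and identified_or_identifiable
    )


def gdpr_applies(items: list[InformationItem]) -> bool:
    """The GDPR is relevant as soon as any item counts as personal data."""
    return any(is_personal_data(item) for item in items)
```

For example, a postcode combined with age and occupation may be flagged as `identifiable_by_combination` even though none of the three identifies a participant on its own.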

Depending on the nature and research design of a study, data that reach the archive can either be 'raw' or already curated (pseudonymised, anonymised, aggregated etc.). Since fully anonymised data may not always be very useful for secondary users, data archivists strive to follow the rule “As open as possible, as closed as necessary”, meaning that they aim for open data sharing as far as data protection laws allow. Depending on their policies, data archives may therefore accept raw/sensitive or semi-anonymised research data in special cases. However, the distribution of this kind of data is limited by various access regulations (e.g. limited to a certain type of users, safeguarded access), because, according to the GDPR, the distribution of personal data is generally not allowed without a legal basis such as the prior consent of research participants. The data archive must have protocols and policies in place for receiving and storing data safely.

Personal data may be shared without restrictions when the researcher has received explicit consent from the research participants that allows the sharing of their personal data. (However, national laws vary, and this is not the case in, e.g., Germany.) The use of information sheets and especially consent forms is important in the light of the GDPR, as these define how the collected data may be handled. Archives usually request a template of the study’s consent form in order to determine how data have to be prepared for dissemination; see DMEG, Chapter 5 ‘Protect’ (CESSDA Training Team 2017-2022).

Personal data in administrative data

When a user uses archival services (e.g. depositing data, accessing data or other services), personal data are shared with the archive, making users data subjects who are entitled to the rights set out in the GDPR (GDPR.eu 2019). Personal data relating to the users of a data archive must be processed confidentially, transparently and lawfully, and data archivists must be sufficiently trained to handle personal data appropriately. A data archive must have a public data policy in place that gives its users sufficient detail about the collection and processing of personal data, such as (1) who the data controller or data protection officer is and who can be contacted in case of questions, (2) how and for what purposes personal data are being processed, and (3) what the rights of users of archival services are.

Basic legal questions to ask regarding research data

A data archivist needs information about:

  • whether the researcher has obtained permission from the funder to share the data,
  • whether formal (valid) consent for sharing research data has been obtained from research participants,
  • who the owner of the data is (IPR = Intellectual Property Rights),
  • whether copyrights have been resolved,
  • whether research data include (sensitive) personal data.
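An archive that records these answers at deposit time could keep them in a small structured checklist. The following is a minimal sketch in Python; the class `DepositLegalCheck`, its field names and the wording of the messages are assumptions made for illustration and do not describe any particular archive's system.

```python
from dataclasses import dataclass


@dataclass
class DepositLegalCheck:
    """Hypothetical record of the basic legal questions for one deposit."""
    funder_permission: bool        # funder allows the data to be shared
    participant_consent: bool      # valid consent for sharing was obtained
    owner_identified: bool         # owner of the data (IPR) is known
    copyright_resolved: bool       # copyright issues have been resolved
    contains_personal_data: bool   # data include (sensitive) personal data


def open_issues(check: DepositLegalCheck) -> list[str]:
    """List the points an archivist still has to clarify with the depositor."""
    issues = []
    if not check.funder_permission:
        issues.append("funder permission to share the data is missing")
    if not check.participant_consent:
        issues.append("consent of research participants for sharing is missing")
    if not check.owner_identified:
        issues.append("owner of the data (IPR) is unclear")
    if not check.copyright_resolved:
        issues.append("copyright has not been resolved")
    if check.contains_personal_data:
        # Not a blocker in itself, but it affects how the data may be accessed.
        issues.append("data include personal data: review access conditions")
    return issues


if __name__ == "__main__":
    check = DepositLegalCheck(True, True, True, False, True)
    for issue in open_issues(check):
        print("-", issue)
```

An empty list means that none of the five questions needs follow-up; the presence of personal data is reported separately because it influences the access conditions rather than blocking the deposit outright.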


Find out more about your archive

Here are some questions you can ask yourself to learn more about your own archive:

  • Does your archive provide support regarding privacy-sensitive data, complying with GDPR?
  • Are there national laws and codes of conduct in research that are important for your work, in addition to GDPR?
  • How does your archive handle ‘raw’ and processed (pseudonymised, anonymised, aggregated etc.) data?


Expert tips

Watch this video about the GDPR and research (Summers et al. 2019).