1.6 How does an archive make data available?
To be made available and reusable in the best way, data needs to be prepared adequately.
How an archive can make data available depends on the characteristics of the deposited data. Data archives set requirements for deposited data and documentation that a data producer needs to follow. Archives often help the data producer prepare the data by:
- using file formats according to a preferred formats policy (see also What is archived?),
- ensuring data is stored and shared according to the consent from the participants and anonymised as defined by (inter)national legislation,
- using suitable license agreements for data sharing,
- determining the access category,
- providing proper documentation using suitable metadata standards.
Characteristics of the data determine how the data can be made accessible and available for reuse.
Licences
A data producer decides under what license the archive can share data for reuse. Depending on the license, the data can be reused by a more broadly or narrowly defined group of users, for broader or narrower purposes. An archive should clearly state which licenses it supports, so that data producers can see whether their needs can be met. For an overview of the process of assigning licenses and the choice of licenses, see ‘DMEG - Chapter 6 Archive & Publish - Licensing your data’ (CESSDA Training Team 2017-2022).
Access categories
The license that a depositor assigns to a data set determines the level of access and the availability for reuse that can be offered by the data archive. Many data archives offer the following categories of access:
- open access: no restrictions on access,
- access for registered users only,
- restricted access: limited access upon request, for example in the case of sensitive data or if the data can be used for specific purposes only,
- under embargo: access only after a certain period of time.
Even if the data itself is available only under restricted access, the metadata can be made openly available, so that the data remains easily discoverable.
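The access categories above can be pictured as a small decision rule. The following Python sketch is illustrative only: the names `AccessCategory`, `Dataset` and `can_access_data` are hypothetical, and it assumes that data under embargo becomes accessible once the embargo date has passed, which an individual archive may handle differently. Metadata is not modelled here, since it can be openly available regardless of the category.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class AccessCategory(Enum):
    """The four access categories listed above (hypothetical naming)."""
    OPEN = "open access"
    REGISTERED = "registered users only"
    RESTRICTED = "restricted access"
    EMBARGO = "under embargo"


@dataclass
class Dataset:
    title: str
    category: AccessCategory
    embargo_end: Optional[date] = None  # only meaningful for EMBARGO


def can_access_data(dataset, *, registered=False, request_approved=False, today=None):
    """Return True if a user may access the data files of a dataset.

    Assumption: after the embargo date the data is treated as accessible.
    """
    today = today or date.today()
    if dataset.category is AccessCategory.OPEN:
        return True
    if dataset.category is AccessCategory.REGISTERED:
        return registered
    if dataset.category is AccessCategory.RESTRICTED:
        # Access is granted only after the archive approved a request.
        return request_approved
    if dataset.category is AccessCategory.EMBARGO:
        return dataset.embargo_end is not None and today >= dataset.embargo_end
    raise ValueError(f"Unknown access category: {dataset.category}")


# Usage with made-up datasets
survey = Dataset("Sensitive survey", AccessCategory.RESTRICTED)
panel = Dataset("Panel study", AccessCategory.EMBARGO, embargo_end=date(2030, 1, 1))

print(can_access_data(survey))                              # False
print(can_access_data(survey, request_approved=True))       # True
print(can_access_data(panel, today=date(2029, 12, 31)))     # False
```

Keeping the rule in one function makes it easy to see that the category, not the user interface, decides who may obtain the data.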
Metadata standards
In the social sciences, data is often described and documented using the Data Documentation Initiative (DDI) standards (DDI Alliance n.d.) and codebooks. See also ‘DMEG - Chapter 2 Organize & Document - Documentation and metadata’ (CESSDA Training Team 2017-2022). Using a standard metadata schema and terminology for common elements is crucial to describe the data in a meaningful way, and it is the first step towards making the data discoverable.
Archives offer search functionality over the metadata provided. Furthermore, the discoverability of the data is greatly improved by making the metadata available for harvesting by data portals. Common frameworks for this are OAI-PMH and ResourceSync, which allow the metadata to be harvested by services such as the CESSDA Data Catalogue and the ‘Explore dashboard’.
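As a rough illustration of metadata harvesting, the sketch below builds an OAI-PMH `ListRecords` request and reads the identifier and title from a Dublin Core (`oai_dc`) response. The endpoint `https://archive.example.org/oai` and the sample response are made up for illustration. A real archive documents its own base URL and the metadata formats it supports, and a real harvester would also fetch the URL over the network and handle errors.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 and Dublin Core
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # A follow-up request carries only the token from the previous page.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Made-up response, shortened to the elements read above
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:study-001</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example survey 2020</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://archive.example.org/oai")
records, token = parse_list_records(SAMPLE_RESPONSE)
print(url)
print(records, token)
```

A harvester would repeat the request with the returned resumption token until no token is left, collecting the metadata records of all data sets page by page.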
Promoting the reuse of data
Apart from offering search functionalities on their data and providing the metadata for harvesting, archives actively promote the discoverability and reuse of the data.
Examples of how archives can promote the use of data are:
- publishing information on the data sets in their news channels,
- promoting data citation by providing guidance for researchers on using persistent identifiers (PIDs) for person names, projects and organisations in articles and in the metadata of the data; see for example the video ‘What are Persistent Identifiers and why to use them?’ by FAIRsFAIR EU (2022), which explains the principle, and ‘DMEG - Chapter 6 Archive & Publish - Citing your data’ (CESSDA Training Team 2017-2022),
- organising events such as the ‘Dutch Data Prize’ (RDNL),
- organising webinars on the reuse of specific data; see the planned events of the UKDS (UK Data Service),
- creating statistics on the usage of the data (OpenAIRE).
For more information on promoting data, see ‘DMEG - Chapter 6 Archive & Publish - Promoting your data’ (CESSDA Training Team 2017-2022).
Find out more about your archive
Here are some questions that you can ask yourself to learn more about your archive:
- Which licenses and access categories does your archive support?
- Are there agreements that depositors have to sign when they deposit data in the archive?
- Is the data in your archive harvested by other data repositories? What metadata service endpoint does your archive provide to facilitate this?
- What activities does your data archive carry out to promote the use of the data that is stored?
Expert tips
Watch webinars from the UKDS on the reuse of data, for example “Key issues in reusing data” (UK Data Service 2020).