1.2 What does an archive look like and what does it do?

What does a social science data archive look like?

Image: colleagues in discussion at a computer in an office. Archives nowadays mostly work with digital assets.

Social science data archives differ considerably between European countries. They vary in size, organisational position, the services they provide and the community they serve.

An archive:

can be an independent institution, but it can also be part of an institution of higher education or other research institution,
will most likely have a physical visiting address,
as well as a director and employees,
provides research data to a variety of users via the Internet.

Size can be measured by the number of employees or the number of archived datasets. An example of a smaller archive is the Czech Social Science Data Archive (CSDA), while an example of a larger archive is the UK Data Service (UKDS). For more information, you can browse the archives' websites, or refer to 'DMEG - Chapter 7 Discover' (CESSDA Training Team, 2017 - 2022).

Other differences can be due to the position of an archive. An archive embedded in or connected to a (higher education) institute can have a different workflow and focus than independent archives. The services provided by an archive can also differ based on its characteristics, as well as due to the national context (e.g., legislation and mandates). Different archives can also have different designated communities, or audiences. Some archives focus exclusively on certain disciplines or domains, whereas other archives are more general in their focus.

What does a social science data archive do?

An archive acquires data and documentation from data producers, checks their quality, prepares them for sharing and makes them available to users. In countries where archiving data from publicly funded research projects is not mandatory, an archive can actively approach data owners and ask them to deposit their data. An archive also stores data and documentation, maintains and updates databases and file formats, and provides services to its users. Archives also provide training for data users (researchers, students, data journalists), data producers and the archiving community, e.g. the employees of other social science data archives.

The actual tasks of a social science data archive can be very broad and extensive. They vary based on the characteristics of an archive and its context, and will be developed and updated over time.

Curation of data

Data curation is one of the main activities of data archives. It is a chain of processes necessary for the long-term preservation of research data. Data curation means that datasets that are ingested, stored and shared are examined for consistency, authenticity, integrity, long-term quality and relevance over time. Implicit in data curation are the tasks of data preservation and data management (Palmer et al. 2013), both of which are broad terms that encompass many different tasks. Depending on the division of roles and responsibilities in an archive, these tasks can be carried out by one person or by a large team.
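One common way to check the integrity of a stored file is a checksum comparison, often called a fixity check. The following is a minimal sketch in Python, not a description of how any particular archive works: it assumes the archive recorded a SHA-256 digest when the file was ingested, and the function names `sha256_of_file` and `fixity_ok` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """Compare the file's current digest with the one recorded at ingest.

    Assumption: recorded_digest is a SHA-256 hex string stored by the
    archive when the file was first deposited.
    """
    return sha256_of_file(path) == recorded_digest.lower()
```

If `fixity_ok` returns `False`, the file has changed since the digest was recorded, which would normally prompt a closer look rather than an automatic repair.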

Making data available

Another core purpose of an archive is to make data available (e.g., for reuse). Data can be made available via an online data catalogue or specialised online software, such as Dataverse. Find out more about making data available in the section 'How does an archive make data available?'

Providing sustainable links to information: Persistent Identifiers (PIDs)

A persistent identifier (PID) is a long-lasting reference to a document, file, web page, author, organisation or other object. While URLs can change or become unavailable, PIDs are meant to continue to point to a digital resource for the long term. By assigning PIDs to datasets, archives create long-lasting links that others can use to cite the dataset. A common PID used for datasets is the DOI (Digital Object Identifier).
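In practice, a DOI is usually turned into a clickable link by appending it to the public resolver at https://doi.org/. The sketch below illustrates this under simple assumptions: the helper names `normalise_doi` and `doi_to_url` are hypothetical, the validity check is deliberately rough (a DOI starts with "10." and contains a slash), and the DOI in the usage note is a made-up placeholder rather than a real dataset.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"

# Prefixes that people commonly paste in front of a bare DOI.
_KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw: str) -> str:
    """Strip whitespace and a leading resolver URL or 'doi:' label."""
    doi = raw.strip()
    for prefix in _KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Rough sanity check only, not a full syntax validation.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"does not look like a DOI: {raw!r}")
    return doi


def doi_to_url(raw: str) -> str:
    """Build a resolver link; unsafe characters are percent-encoded."""
    return DOI_RESOLVER + quote(normalise_doi(raw), safe="/")
```

For example, `doi_to_url("doi:10.1234/example-dataset")` builds a link of the form `https://doi.org/10.1234/example-dataset`. The link stays the same even if the archive later moves the dataset's landing page, because the resolver is updated instead of the citation.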

What are the core job titles and what do they do?

There are typically several core roles in a data archive, such as data manager, data curator, policy advisor, privacy officer and information system engineer. The type and naming of roles and positions, the number of people holding them and the responsibilities assigned to them vary across archives, but they often relate to the functions that a data archive must provide according to the Open Archival Information System (OAIS). Find out more about OAIS in the section 'What is the overall process of archiving from beginning to end?'

Find out more about your archive

Here are some questions that you can ask yourself to learn more about your archive:

To what institution is your data archive affiliated?
Is your archive relatively small or large?
What kind of training activities does your archive offer?
What are the core positions in your archive? What are the responsibilities of these positions?
How does your archive make data available to its users?
