1.4 What is data acquisition? 

Acquiring data from producers

The image shows a data archivist receiving several different digital objects (documentation, data files and the like) that she will need to review, archive, and publish.

Data acquisition is the process of acquiring data from data producers, who are referred to in this context as data depositors.

In some countries, research funding organisations require data sharing, i.e. data producers are obliged to make their data available. They usually do so in cooperation with data archives. Data acquisition is thus initiated by the producers, and the archives are “passive” ingestors of data.

In other countries, where data producers are not obliged to share their data, archives must actively search for potential depositors and convince them to store their data in the repository. Archives usually contact potential depositors and inform them about the benefits of depositing data in the archive (see the section Why is archiving important?). A submission agreement that specifies the requirements for storing and accessing the data is then negotiated, and the data are ingested by the archive. In the case of regular depositors (research institutions, universities), the acquisition process can be formalised through long-term agreements which set out the principles of collaboration.

In all countries, whether or not the archive initiates the acquisition, what follows is a similar process of data ingestion (you can find more information on the pre-ingest process in Chapter 3 and on ingest and curation in Chapter 4). Once the data are obtained from the depositor, the archive inspects the datasets and documentation (metadata). If the data do not satisfy the archive’s data quality requirements, then depending on archive policies (such as a data collection policy) the data may be rejected, corrected, or resubmitted following feedback. All the important steps of acquisition should be evaluated and recorded in the archive’s internal records.
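
For readers who support acquisition with scripts, the inspection step described above can be sketched roughly as follows. This is only an illustrative sketch in Python, not the procedure of any particular archive: the names Submission, AcquisitionLog, inspect and review, the in_collection_scope flag, and the two simple checks (data files present, documentation present) are all hypothetical assumptions. Real quality requirements and decision rules come from the archive's own policies.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REQUEST_RESUBMISSION = "request_resubmission"
    REJECT = "reject"


@dataclass
class Submission:
    # Hypothetical minimal description of what a depositor hands over.
    depositor: str
    data_files: list[str]
    documentation_files: list[str]
    in_collection_scope: bool  # assumed result of a collection-policy check


@dataclass
class AcquisitionLog:
    # Stands in for the archive's internal acquisition records.
    entries: list[dict] = field(default_factory=list)

    def record(self, submission: Submission, decision: Decision, issues: list[str]) -> None:
        self.entries.append({
            "depositor": submission.depositor,
            "decision": decision.value,
            "issues": list(issues),
            "recorded_at": datetime.now(timezone.utc).isoformat(),
        })


def inspect(submission: Submission) -> list[str]:
    """Return a list of problems found; an empty list means no problems."""
    issues = []
    if not submission.data_files:
        issues.append("no data files supplied")
    if not submission.documentation_files:
        issues.append("no documentation (metadata) supplied")
    return issues


def review(submission: Submission, log: AcquisitionLog) -> Decision:
    """Decide on a submission and record the decision in the log."""
    if not submission.in_collection_scope:
        issues = ["outside the collection policy"]
        decision = Decision.REJECT
    else:
        issues = inspect(submission)
        # Problems lead to feedback and a resubmission request, not rejection.
        decision = Decision.ACCEPT if not issues else Decision.REQUEST_RESUBMISSION
    log.record(submission, decision, issues)
    return decision


if __name__ == "__main__":
    log = AcquisitionLog()
    incomplete = Submission("Example University", ["survey.csv"], [], True)
    print(review(incomplete, log))
    print(log.entries[0]["issues"])
```

The point of the sketch is that every decision, including a rejection, leaves an entry in the log, which mirrors the recommendation to record all important acquisition steps.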

Archives can set up tools which support self-archiving by researchers. Self-archiving is, in principle, a process in which researchers archive data themselves using online tools, so that the data become available to other researchers. For these “self-archiving services”, the level of curation and checking carried out by the archive can differ significantly, but put simply, the depositor has the active role here. Examples include ReShare at the UK Data Service (n.d. [Accessed July 31, 2022d]) and SowiDataNet|datorium (GESIS n.d.), hosted by GESIS. Both offer their expertise, archiving tools, procedures and curation services to researchers. See also the section What are the main tools used by archives?

 

Find out more about your archive

Here are some questions that you can ask yourself to learn more about your own archive:

  • What requirements or mandates does your country have regarding data sharing and archiving?
  • Does your archive have a formal acquisition policy?
  • Does your archive have restrictions regarding the data producers allowed to deposit data (e.g. only institutions are allowed to deposit data, not individual researchers)?
  • Which of your colleagues is responsible for administering acquisitions? How do they do this?
  • Does your archive have a self-deposit tool?

 

Expert tips

A large amount of data is usually found not only in the “core social sciences” (sociology, psychology) but also in related research areas (e.g. health sciences, education research), so it can be fruitful to include these areas in the acquisition policy.

It is useful to have different acquisition strategies for new depositors and for regular depositors, as is done, for example, at the UK Data Service (n.d. [Accessed July 31, 2022c]).