1.11 What tools do archives use?

What systems, software and technology do archives use in archiving research data?

[Image: elements of the data archiving process, such as curation, documentation and versioning, and the tools that support them.]

Different types of tools in different phases of the archiving process

When a data archive receives research data, the data are quality checked to ensure that they meet the requirements set by the archive and/or by the agreement between the data archive and the data producer.
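
To make this concrete, a quality check at ingest often includes verifying checksums against a manifest supplied by the depositor and flagging files that are not in a preferred format. The sketch below illustrates the idea in Python; the folder layout, manifest format and accepted-format list are illustrative assumptions, not the workflow of any particular archive.

```python
"""Minimal sketch of an ingest-time quality check: verify fixity against a
checksum manifest and flag files outside an accepted-format list.
File names, manifest layout and the format list are assumptions."""
import csv
import hashlib
from pathlib import Path

DEPOSIT_DIR = Path("incoming/deposit_001")         # assumed deposit folder
MANIFEST = DEPOSIT_DIR / "manifest-sha256.csv"     # assumed rows: "filename,sha256"
ACCEPTED_FORMATS = {".csv", ".txt", ".pdf", ".sav", ".dta", ".xml"}  # example policy

def sha256(path: Path) -> str:
    """Compute the SHA-256 checksum of a file in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

problems = []
with MANIFEST.open(newline="") as fh:
    for filename, expected in csv.reader(fh):
        path = DEPOSIT_DIR / filename
        if not path.exists():
            problems.append(f"missing file: {filename}")
        elif sha256(path) != expected:
            problems.append(f"checksum mismatch: {filename}")
        elif path.suffix.lower() not in ACCEPTED_FORMATS:
            problems.append(f"format not in preferred list: {filename}")

print("\n".join(problems) if problems else "Deposit passed basic checks")
```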

Some archives offer solutions where data producers can upload their data directly into an archive, so-called "self-deposit" solutions, which complement the main deposit workflow. Here are some examples of deposit and self-deposit technologies from some archives:

| Archive | Deposit | Self-Deposit | Software/Technology |
| --- | --- | --- | --- |
| DANS | Machine-to-machine automatic deposit (SWORD protocol) | DataverseNL + EASY self-deposit interface | SWORD protocol; Dataverse + in-house developed software |
| NSD | Archiving portal | - | In-house developed software, Colectica |
| ADP | Archiving portal | - | Dataverse |
| CSDA | No tool; communication via email or in person | - | - |
| AUSSDA | Deposit via FileSender | Service under development | AUSSDA Dataverse |
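
The DANS row above mentions machine-to-machine deposit via the SWORD protocol, a standard HTTP-based interface for pushing packages into a repository. The sketch below shows what such a deposit can look like in Python; the endpoint URL, API token and package name are placeholders, and the packaging formats an archive accepts will differ from service to service.

```python
"""Minimal sketch of a SWORD v2 binary deposit, as used for machine-to-machine
ingest. The collection URL, token and package name are placeholders."""
import hashlib
from pathlib import Path

import requests  # third-party HTTP client

COLLECTION_URL = "https://archive.example.org/swordv2/collection/my-collection"  # assumed
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"  # placeholder credential
PACKAGE = Path("dataset.zip")                        # data + metadata bundled as a zip

payload = PACKAGE.read_bytes()
response = requests.post(
    COLLECTION_URL,
    data=payload,
    auth=(API_TOKEN, ""),  # many SWORD servers accept a token via HTTP Basic auth
    headers={
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={PACKAGE.name}",
        "Packaging": "http://purl.org/net/sword/package/SimpleZip",
        "Content-MD5": hashlib.md5(payload).hexdigest(),  # fixity check on arrival
        "In-Progress": "true",  # leave the deposit open for further edits
    },
    timeout=60,
)
response.raise_for_status()
print("Deposit accepted, HTTP status:", response.status_code)
```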

Curation, administration, documentation, upgrades and versioning

The main function of a data archive is to curate research data so that they retain their value for the research community throughout long-term archiving. The archive also needs to make sure that the archived research data fulfill the FAIR principles.

The tools and software that different archives use for these tasks vary. Some examples are presented here:

| Archive | Administrative | Curation | Software/Technology |
| --- | --- | --- | --- |
| NSD | In-house / Office 365 | NESSTAR | Nesstar + in-house development |
| DANS | DataverseNL + EASY | Various software applications and scripts to export data to preferred file formats | DataverseNL + in-house developed software + scripting in e.g. Python + software applications such as Microsoft Office, Adobe Creative Cloud, SPSS, STATtransfer, IrfanView, ArcGIS, QGIS, MapInfo, ffmpeg, ... |
| CSDA | Office | Nesstar, SPSS, in-house development | Nesstar, SPSS, in-house development |
| AUSSDA | Project management tool, ticket system | - | Microsoft Office, Stata, STATtransfer, SPSS, Python |
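
Several of the curation entries above involve scripted conversion of deposited files into preferred, open formats. The following sketch shows one such step in Python, converting an SPSS file to CSV and writing a simple variable list to support documentation work; the file names are assumptions, and pandas.read_spss requires the pyreadstat package to be installed.

```python
"""Minimal sketch of a scripted curation step: convert a proprietary
statistical file to an open format and list its variables.
File names are illustrative assumptions."""
from pathlib import Path

import pandas as pd  # read_spss needs the pyreadstat package

SOURCE = Path("deposit/survey.sav")  # assumed SPSS file from the depositor
TARGET_DIR = Path("curated")         # assumed output folder
TARGET_DIR.mkdir(exist_ok=True)

df = pd.read_spss(SOURCE)                           # read the proprietary format
df.to_csv(TARGET_DIR / "survey.csv", index=False)   # open format for long-term access

# A minimal variable overview to support documentation work
with open(TARGET_DIR / "survey_variables.txt", "w", encoding="utf-8") as fh:
    for name, dtype in df.dtypes.items():
        fh.write(f"{name}\t{dtype}\n")
```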

Dissemination

To enhance the value of archived data, the data should be made FAIR, i.e. research data should be findable, accessible, interoperable and reusable. Data archives therefore need tools that make archived data easily findable and searchable for students, researchers and others.

Data archives also need tools that provide access management for the data, saving time and reducing administrative work while ensuring the security of data that need protection. This is especially important in order to avoid breaching agreements with data producers or regulations such as the GDPR.

At present, the two main tools are Dataverse and Nesstar.
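
As an illustration of the "findable" side, Dataverse installations expose a public Search API that lets users and scripts query published datasets by keyword. A minimal sketch in Python, with the host name and query as placeholder assumptions:

```python
"""Minimal sketch of a keyword search against a Dataverse installation's
Search API. The host and query are placeholders."""
import requests

HOST = "https://dataverse.example.org"  # assumed installation
QUERY = "social survey"                  # example search term

resp = requests.get(
    f"{HOST}/api/search",
    params={"q": QUERY, "type": "dataset", "per_page": 10},
    timeout=30,
)
resp.raise_for_status()

# Print persistent identifier and title for each hit
for item in resp.json()["data"]["items"]:
    print(item.get("global_id", ""), "-", item.get("name", ""))
```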

| Archive | Publication tool | Access management tool |
| --- | --- | --- |
| NSD | NESSTAR | In-house development |
| DANS | DataverseNL + EASY + international portals harvesting metadata from EASY (Europeana, ARIADNE, ...) | Dataverse + in-house developed software + OAI-PMH harvesting |
| ADP | NESSTAR / Dataverse in the near future | - |
| CSDA | NESSTAR | In-house development |
| AUSSDA | Dataverse | Dataverse + ticket system |
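
The DANS row above mentions OAI-PMH harvesting, the standard protocol that portals such as Europeana use to collect metadata from archives. The sketch below harvests Dublin Core records from an OAI-PMH endpoint in Python; the base URL is a placeholder, and any compliant endpoint would work the same way.

```python
"""Minimal sketch of OAI-PMH metadata harvesting. The base URL is a placeholder."""
import xml.etree.ElementTree as ET

import requests

BASE_URL = "https://archive.example.org/oai"  # assumed OAI-PMH endpoint
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
while True:
    root = ET.fromstring(requests.get(BASE_URL, params=params, timeout=30).content)
    for record in root.iter(f"{OAI}record"):
        identifier = record.find(f"{OAI}header/{OAI}identifier").text
        title = record.find(f".//{DC}title")
        print(identifier, "-", title.text if title is not None else "(no title)")
    # OAI-PMH paginates with resumption tokens; keep requesting until none is returned
    token = root.find(f"{OAI}ListRecords/{OAI}resumptionToken")
    if token is None or not (token.text or "").strip():
        break
    params = {"verb": "ListRecords", "resumptionToken": token.text}
```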

Tools relating to the wider topics of FAIR and research data management are covered in Chapter 5 [in progress].

These tools alone are not enough to facilitate quality archiving and curation. Archives need experts to employ and update these tools and to perform other archival tasks that tools cannot or do not yet cover.

Find out more about your archive

Here are some questions you can ask yourself to learn more about your own archive:

  • Which tool(s) does your archive use for:

    • the ingest phase?
    • curating data?
    • administration, covering both data acquisition and dissemination?
    • documenting data?
    • handling upgrades and versioning of datasets?