5.1 Introduction to FAIR

The good care and management of data and other digital objects has been the core objective of data archives since the beginning of their existence. As described in Chapters 3 and 4 of the DAG, these processes and practices are often based on the OAIS Reference Model. However, as more and more data were produced in the digital age of science and new data types came into existence, more general-purpose archives emerged to host these data. These types of archives (or ‘repositories’, as explained in section 1.1 in Chapter 1) generally provide fewer services to connect and manage objects, since their holdings vary so much. As data became more diverse, less visible, and less connected, obstacles to data discovery and reuse became more apparent. It also became clear that what constitutes “good” data management wasn’t defined clearly enough, while more stakeholders in the scientific community (e.g., funders, research performing organisations, and researchers) developed a stronger focus on data sharing and reuse. There was a need for community-wide guidance to overcome these obstacles and make sure all data were cared for in a unified way by all relevant stakeholders.

5.1.1 The FAIR Guiding Principles

This all led to the conception of the FAIR Guiding Principles: 15 principles to make data Findable, Accessible, Interoperable, and Reusable (Wilkinson et al. 2016). This set of guiding principles was defined to help all stakeholders in science overcome obstacles to data discovery and reuse, and to foster the awareness that digital objects deserve more care and attention than most are given. The ultimate goal of the creators of these principles was to make all data easier to discover, understand, and reuse, for both people and machines. The FAIR principles are a way to involve everyone in data management, not just archives adhering to the Open Archival Information System (OAIS) model, and to have everyone speak the same language when discussing the topic.

The FAIR Guiding Principles:

To be Findable

  1. (meta)data are assigned a globally unique and persistent identifier
  2. Data are described with rich metadata 
  3. Metadata clearly and explicitly include the identifier of the data it describes
  4. (meta)data are registered or indexed in a searchable resource

To be Accessible

  1. (meta)data are retrievable by their identifier using a standardised communications protocol
  2. The protocol is open, free, and universally implementable
  3. The protocol allows for an authentication and authorization procedure, where necessary
  4. Metadata are accessible, even when the data are no longer available

To be Interoperable

  1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
  2. (meta)data use vocabularies that follow FAIR principles
  3. (meta)data include qualified references to other (meta)data

To be Reusable

  1. (meta)data are richly described with a plurality of accurate and relevant attributes
  2. (meta)data are released with a clear and accessible data usage licence
  3. (meta)data are associated with detailed provenance
  4. (meta)data meet domain-relevant community standards

Source: Wilkinson et al. 2016

The principles are high-level, domain-independent, and, as the name says, meant as guidance. They make no specific suggestions for any one way to implement each principle. These choices are still left to the scientific sub-communities themselves to make. This is done intentionally to make the barrier to start implementing FAIR as low as possible. Data archives can adhere to the FAIR principles in any capacity, combination, and degree, as ‘FAIRness’ is not an all-or-nothing concept. 
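To illustrate that ‘FAIRness’ is a matter of degree, the following is a minimal sketch in Python of a per-letter check on a metadata record. The record structure, the field names, and the checks themselves are assumptions made for this example only; they are crude proxies and not an official FAIR assessment, and the identifier is made up.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """Hypothetical, simplified metadata record; field names are illustrative only."""
    identifier: str = ""        # e.g. a DOI (Findable)
    title: str = ""
    description: str = ""
    access_protocol: str = ""   # e.g. "https" (Accessible)
    vocabulary: str = ""        # controlled vocabulary in use (Interoperable)
    licence: str = ""           # data usage licence (Reusable)
    provenance: str = ""        # how the data came to be (Reusable)


# Each check is a rough stand-in for one FAIR letter, not an official test.
CHECKS = {
    "Findable": lambda r: bool(r.identifier) and bool(r.title and r.description),
    "Accessible": lambda r: r.access_protocol.lower() in {"http", "https"},
    "Interoperable": lambda r: bool(r.vocabulary),
    "Reusable": lambda r: bool(r.licence) and bool(r.provenance),
}


def fair_profile(record: MetadataRecord) -> dict[str, bool]:
    """Return, per FAIR letter, whether the record passes the illustrative check."""
    return {letter: check(record) for letter, check in CHECKS.items()}


record = MetadataRecord(
    identifier="doi:10.1234/example",  # made-up identifier
    title="Example survey 2020",
    description="Illustrative record for a social science survey.",
    access_protocol="https",
    licence="CC BY 4.0",
)
profile = fair_profile(record)
for letter, passed in profile.items():
    print(f"{letter}: {'met' if passed else 'not yet met'}")
```

In this sketch the record passes the Findable and Accessible checks but not the Interoperable and Reusable ones, because no vocabulary or provenance is given. An archive could use a profile like this to see where to improve, rather than treating FAIR as a single pass or fail.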

Since the first workshop on this topic in 2014 (‘Jointly Designing a Data FAIRPORT’ 2014), much work has been done to translate these high-level guiding principles into practical applications in science. For example, researchers are now often urged or required to deposit their data in an archive, or to make a data management plan (DMP) (see 'DMEG - Chapter 1 Plan', CESSDA Training Team, 2017 - 2020). For archives, this means that more and more data will be deposited with them, and that these data might be in better shape than they used to be. In 2018, the European Commission published the report ‘Turning FAIR into Reality’, which contains recommendations and an action plan to advance the uptake of FAIR in the scientific ecosystem (Directorate-General for Research and Innovation (European Commission), 2018). This report remains the baseline for many initiatives and a way to measure the progress made (for example, the FAIR-IMPACT Synchronisation Force (‘Synchronisation Force’ n.d.)).

 

5.1.2 What is a FAIR-enabling organisation?

The uptake of FAIR requires not only a cultural change but also a technical infrastructure. FAIR data can only exist in a ‘FAIR ecosystem’ that consists of FAIR-enabling organisations and services. The term ‘FAIR-enabling’ indicates that an organisation or service can influence the level of FAIRness of a digital object, but cannot be FAIR itself, because the FAIR principles cannot be applied to it. The degree to which an organisation is FAIR-enabling can also be referred to as its ‘maturity’. This term is used especially often in assessment frameworks (Section 5.2).

Archives are an important component of the FAIR ecosystem, as they facilitate the findability and accessibility of data, and can offer services to facilitate interoperability and reuse. The introduction of FAIR has brought archives two challenges: making and keeping data FAIR over time, and integrating the principles into their pre-existing workflows. It is important to keep in mind that many aspects of FAIR were already standard practice for most archives long before the term was coined. In these cases there is no need for major changes to existing workflows; the focus is rather on bringing together FAIR-enabling and existing archival practices. It is important that archives communicate transparently and explicitly about which FAIR practices they implement or support, in order to inform and educate the broader scientific community. This allows other stakeholders to make informed choices about how to interact with the archive. For example, a researcher can make an informed choice on whether to deposit their data in the archive, which is especially useful when they need to justify their decision in a DMP.

FAIR is one term of many, and it is important to recognise the overlaps and distinctions between the different frameworks and phenomena in order to integrate these concepts seamlessly in the archive. Some comparisons are detailed below.

 

FAIR and OAIS

Figure: the OAIS concept, starting with a data steward guiding researchers through pre-ingest.

These two concepts are both high-level guiding principles to structure data management processes. They have considerable common ground, but also unique aspects. Chapters 3 and 4 of the DAG, which cover pre-ingest, ingest, and curation, have already discussed many practices that impact one or more of the FAIR principles, e.g. data sharing, data access levels, file formats, documentation, metadata, reuse, persistent identifiers, and dissemination. What FAIR adds to these processes is that the metadata should be aligned with those of other digital objects in the same discipline, which may require some additional curation checks to make sure (domain-specific) standards and vocabularies are implemented. On the other hand, the FAIR principles don’t cover long-term digital preservation (although Wilkinson et al. 2016 do); the goal of the principles is to have FAIR data, but there is no expectation as to how long these data will be FAIR or even how long the data will exist altogether. This does not mean, however, that long-term preservation isn’t an essential part of archival work.

 

FAIR and Open

FAIR data can be restricted data, but have metadata available.

A common misconception is that FAIR data must be open. While data can be both open and FAIR, the FAIR principles do not stipulate that data must be open in order to be FAIR-compliant. The concept of Accessibility means that data access conditions should be well-defined. Data can have any access category and still be considered Accessible, as long as this is clearly communicated in a data usage licence. Another way to be FAIR when you cannot be open is to accompany a restricted data file with openly accessible metadata. Chapter 4, section 2 covers the special conditions of data restriction and embargo, and describes how to approach this.
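The idea of restricted data with openly accessible metadata can be sketched as follows. This is a minimal illustration in Python, not a description of how any particular archive works: the Dataset structure, the access categories, the identifier, and the URL are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical dataset entry; names and access categories are illustrative."""
    identifier: str
    title: str
    access_category: str  # assumed values: "open", "restricted" or "embargoed"
    licence: str
    data_url: str


def public_view(ds: Dataset) -> dict[str, str]:
    """What an anonymous visitor sees: metadata always, the data link only if open."""
    view = {
        "identifier": ds.identifier,
        "title": ds.title,
        "access_category": ds.access_category,
        "licence": ds.licence,
    }
    if ds.access_category == "open":
        view["data_url"] = ds.data_url
    else:
        # The data stay closed, but the conditions for access are stated openly.
        view["access_note"] = "Data available on request under the stated conditions."
    return view


restricted = Dataset(
    identifier="doi:10.1234/restricted-example",  # made-up identifier
    title="Interview transcripts (sensitive)",
    access_category="restricted",
    licence="Custom end-user licence",
    data_url="https://archive.example.org/files/123",  # made-up URL
)
view = public_view(restricted)
print(view)
```

Here the identifier, title, access category, and licence remain visible to everyone, while the link to the data file is withheld. The dataset can still be found and its access conditions understood, even though it is not open.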

 

FAIR and CARE

Illustration of FAIR and CARE principles

The CARE principles for Indigenous Data Governance (Collective benefit, Authority to control, Responsibility, and Ethics) specifically concern Indigenous data and govern the control and (re)use of such data under the principles of open science (Carroll et al. 2020). For instance, the Collective Benefit principle states that data ecosystems should be designed and function in ways that enable Indigenous Peoples to derive benefit from the data, e.g. for improved government and citizen engagement. The CARE principles are complementary to the FAIR principles, under the motto ‘be FAIR and CARE’ (Carroll et al. 2020).

 

FAIR and TRUST

The TRUST principles (Transparency, Responsibility, User focus, Sustainability, and Technology) were designed to facilitate discussion and implementation of best practices with regard to digital preservation (Lin et al. 2020). Again, these principles complement FAIR, as they concern aspects that FAIR does not. For instance, the Sustainability principle addresses the long-term perspective that the FAIR principles do not cover. The TRUST principles, and the concept of archival trustworthiness, are strongly related to the OAIS model, but extend it with assessable elements to demonstrate the trustworthiness of the service provided by the archive. These principles are the basis of the CoreTrustSeal certification. This topic will be further explored in section 4 of this Chapter.

 

Increase your understanding

Find out more about your archive

Here are some questions that you can ask yourself to learn more about your own archive:

  • Did your archive exist before the conception of the FAIR principles (2016)?
    • If so, did it change its workflows to incorporate FAIR? How?
    • If not, did your archive choose to have a focus on FAIR from the start? Why (not)?
  • Is the term FAIR used anywhere on your archive’s website or in documentation?
  • What FAIR practices does your archive implement? Are there any missing or insufficient?

 

Expert tips

Use the FAIR-Aware tool to assess your knowledge of the FAIR principles and their related practices (Akerman et al. 2021). This tool introduces you to the FAIR principles and explores what researchers, data stewards, and other data professionals are encouraged to look for in an archive.