Most researchers struggle to make their data compliant with the FAIR Data principles. For example, a researcher may not know which criteria to meet first, or how. Jacob et al. (2020) proposed an approach that can serve as a model for research data management: it lets researchers disseminate their data in compliance with the FAIR principles without facing insurmountable barriers.
Having structural metadata allows datasets to achieve a higher level of interoperability and greatly facilitates functional interconnection and analysis in a broader context.
Let’s consider the figure above (Jacob et al. 2020):
A) “The first 2 tables of data from the experiment, namely, the individuals (plants.txt) followed by the samples (samples.txt). The data should be well organized, i.e., each variable forms a column, each observation forms a row, and each table is relative to an entity, i.e., the same type of observational unit (e.g., plants, samples), and a file should have only 1 data table. In this example, each sample is linked to the plant from which it comes.”
B) “It requires the observation of dependent variables resulting from the effects of certain controlled independent variables. Also, the objects of study usually each have an ‘identifier,’ and the variables can be quantitative or qualitative. Thus, each of the columns within a table can be associated with 1 of the 4 categories: identifier, factor, quantitative, qualitative.”
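Points A and B can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file names plants.txt and samples.txt come from the figure, but every column name, value, and helper function below is hypothetical and not taken from Jacob et al. (2020). The sketch shows two tidy tables, the link from each sample to its plant, and how a category per column lets a program pick the right columns without hard-coding them.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical contents of plants.txt and samples.txt (tab-separated).
# Column names and values are invented for illustration only.
PLANTS_TXT = (
    "plant_id\ttreatment\theight_cm\n"
    "P1\tcontrol\t41.0\n"
    "P2\tdrought\t33.5\n"
)
SAMPLES_TXT = (
    "sample_id\tplant_id\torgan\tdry_weight_mg\n"
    "S1\tP1\tleaf\t12.0\n"
    "S2\tP1\troot\t8.0\n"
    "S3\tP2\tleaf\t9.0\n"
)

# One of the four categories for every column of every table.
CATEGORIES = {
    "plants": {
        "plant_id": "identifier",
        "treatment": "factor",
        "height_cm": "quantitative",
    },
    "samples": {
        "sample_id": "identifier",
        "plant_id": "identifier",  # link back to the plants table
        "organ": "qualitative",
        "dry_weight_mg": "quantitative",
    },
}


def read_table(text):
    """Parse one tab-separated table into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def columns_of(table, category):
    """Return the columns of a table that carry the given category."""
    return [col for col, cat in CATEGORIES[table].items() if cat == category]


def mean_by_factor(plants, samples, factor, variable):
    """Average a sample-level variable per level of a plant-level factor."""
    plant_by_id = {p["plant_id"]: p for p in plants}
    groups = defaultdict(list)
    for s in samples:
        level = plant_by_id[s["plant_id"]][factor]  # follow sample -> plant
        groups[level].append(float(s[variable]))
    return {level: mean(values) for level, values in groups.items()}


plants = read_table(PLANTS_TXT)
samples = read_table(SAMPLES_TXT)

# The categories, not hard-coded column names, drive the analysis.
factor = columns_of("plants", "factor")[0]
variable = columns_of("samples", "quantitative")[0]
result = mean_by_factor(plants, samples, factor, variable)
print(result)
```

Because the factor and the quantitative variable are looked up by category, the same code would keep working if the columns were renamed, as long as the category annotations were updated.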
Associating a category with each column makes it much easier for machines to run subsequent statistical analyses. All structural metadata can be grouped into two files.
C) “The first metadata file associates with each data table a key concept corresponding to the main entity of the data table. It also specifies for each table the link with the table from which it comes (purple arrow). These links can be interpreted as ‘is obtained from.’”
D) “The second metadata file annotates each attribute (concept/variable) with minimal but relevant metadata, such as its category defined above, its description with its unit, the data type. In each of these 2 files (entities and attributes), it is possible to annotate each of the terms with unambiguous definitions (CV terms) through links to accessible (standard) definitions based on ontologies.”
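The two metadata files described in C and D could be modeled roughly as follows. This is a minimal sketch, not the file format used by Jacob et al. (2020): the field names (`entity`, `obtained_from`, `cv_term`, and so on), the example rows, and both helper functions are assumptions made for illustration. The `cv_term` fields are left empty because a real annotation would link to an actual ontology term.

```python
ALLOWED_CATEGORIES = {"identifier", "factor", "quantitative", "qualitative"}

# Entities file (hypothetical layout): one row per data table, with its
# key concept and the table it "is obtained from".
ENTITIES = [
    {"entity": "plants", "file": "plants.txt", "identifier": "plant_id",
     "obtained_from": None, "cv_term": None},
    {"entity": "samples", "file": "samples.txt", "identifier": "sample_id",
     "obtained_from": "plants", "cv_term": None},
]

# Attributes file (hypothetical layout): one row per column, with its
# category, description, unit, and data type.
ATTRIBUTES = [
    {"entity": "plants", "attribute": "plant_id", "category": "identifier",
     "description": "Plant identifier", "unit": None, "type": "string",
     "cv_term": None},
    {"entity": "plants", "attribute": "treatment", "category": "factor",
     "description": "Applied treatment", "unit": None, "type": "string",
     "cv_term": None},
    {"entity": "samples", "attribute": "sample_id", "category": "identifier",
     "description": "Sample identifier", "unit": None, "type": "string",
     "cv_term": None},
    {"entity": "samples", "attribute": "dry_weight_mg",
     "category": "quantitative", "description": "Dry weight of the sample",
     "unit": "mg", "type": "numeric", "cv_term": None},
]


def lineage(entity):
    """Follow the 'is obtained from' links from an entity up to its root."""
    by_name = {e["entity"]: e for e in ENTITIES}
    chain = [entity]
    while by_name[chain[-1]]["obtained_from"] is not None:
        parent = by_name[chain[-1]]["obtained_from"]
        if parent in chain:
            raise ValueError(f"circular link at {parent!r}")
        chain.append(parent)
    return chain


def check_metadata():
    """Return a list of consistency problems between the two files."""
    problems = []
    names = {e["entity"] for e in ENTITIES}
    declared = {(a["entity"], a["attribute"]): a for a in ATTRIBUTES}
    for e in ENTITIES:
        if e["obtained_from"] is not None and e["obtained_from"] not in names:
            problems.append(f"{e['entity']}: unknown parent {e['obtained_from']}")
        key = declared.get((e["entity"], e["identifier"]))
        if key is None or key["category"] != "identifier":
            problems.append(f"{e['entity']}: identifier column not declared")
    for a in ATTRIBUTES:
        if a["entity"] not in names:
            problems.append(f"{a['attribute']}: unknown entity {a['entity']}")
        if a["category"] not in ALLOWED_CATEGORIES:
            problems.append(f"{a['attribute']}: bad category {a['category']}")
    return problems


print(lineage("samples"))
print(check_metadata())
```

Keeping entities and attributes in two flat structures mirrors the two-file layout, and a check like `check_metadata` is one way to catch a broken link or a misspelled category before the data are shared.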