The FAIR data principles (Findable, Accessible, Interoperable, Reusable), first published in Scientific Data in 2016, are a set of characteristics that data objects (metadata and data) must have in order to be findable and reusable by humans and machines.
Open data and FAIR data:
Open data can be freely used, reused and redistributed by anyone. They are subject, at most, to the requirement to attribute and to share alike. Source: Open Data Handbook
However, for reasons of security, privacy, protection of personal data or commercial/industrial exploitation, and following the European Commission's principle "as open as possible, as closed as necessary", FAIR data do not have to be open, but their metadata must be, under a CC0 license or equivalent, to the extent that legitimate interests or limitations are safeguarded.
Agents involved
Research staff, repository managers, and data curators are key to making these principles possible.
To comply with the FAIR principles, deposit your research data in a trusted repository such as CORA-RDR and fill in as much metadata as possible about your dataset so that it can be easily found.
A1 Metadata can be retrieved by its identifier using a standardized communication protocol
A1.1 Protocols must be open, free, and universally implementable.
A1.2 When necessary for reasons of privacy, security or commercial interests, the protocol should allow for an authentication and authorization procedure.
A2 Metadata must be accessible even when data is no longer available
R1 Metadata are described with multiple accurate and relevant attributes
R1.1 Publish data and metadata with a clear and accessible usage license
R1.2 Use provenance information (creation, attribution, and version history) to associate metadata with data throughout their lifecycle
R1.3 The data and metadata standards used comply with the common standards in the area of knowledge to which the data relates.
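As an illustration of principle A1, a persistent identifier such as a DOI can be resolved over HTTPS, an open and universally implementable protocol, while asking for the metadata in a machine-readable format. The sketch below only builds the request and does not send it. The DOI is a placeholder, the function name is hypothetical, and the media type is the CSL JSON type used in DOI content negotiation; which formats are actually returned depends on the registration agency and the repository.

```python
from urllib.request import Request


def build_metadata_request(doi: str) -> Request:
    """Build (but do not send) an HTTPS request for the metadata of a DOI."""
    url = f"https://doi.org/{doi}"
    # Ask the resolver for machine-readable metadata instead of a landing page.
    headers = {"Accept": "application/vnd.citationstyles.csl+json"}
    return Request(url, headers=headers)


# Hypothetical DOI, used only to show the shape of the request.
request = build_metadata_request("10.1234/example-dataset")
print(request.full_url)
print(request.get_header("Accept"))
```

To retrieve the metadata, the request object could be passed to urllib.request.urlopen and the response parsed as JSON.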
Tips for complying with R1
This concept is related to the F1 principle but focused on how the user searches to decide if the data is of interest to them.
Describe the scope of your data: for what purpose was it collected?
Mention limitations on data that others should know about
Specify the date of data generation or collection, the laboratory conditions, who prepared the data, and the name and version of the software used
Is the data raw or processed?
Tips for complying with R1.2
For others to reuse your data, they need to know where the data came from, in addition to the information indicated for R1
State who should be cited and/or how you want to be acknowledged. Include a description of the workflow that led to your data: who generated or collected it? How has it been processed? Has it been published before?
Does it contain data from another person that you have transformed or completed?
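The tips for R1 and R1.2 can be turned into a simple checklist that is run over a metadata record before deposit. The following is a minimal sketch: the field names and the sample record are illustrative assumptions and do not correspond to the metadata schema of CORA-RDR or UPCommons.

```python
# Hypothetical field names derived from the tips above; not a repository schema.
REUSE_FIELDS = (
    "purpose",           # for what purpose the data were collected
    "limitations",       # known limitations others should know about
    "collection_date",   # date of generation or collection
    "prepared_by",       # who prepared the data
    "software",          # name and version of the software used
    "processing_level",  # "raw" or "processed"
    "license",           # clear usage license (R1.1)
    "how_to_cite",       # who to cite or how to acknowledge
    "workflow",          # how the data were generated and processed
)


def missing_reuse_fields(record: dict) -> list:
    """Return the checklist fields that are absent or blank in the record."""
    missing = []
    for field in REUSE_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


# Illustrative record with invented values.
record = {
    "purpose": "Calibration of a temperature sensor",
    "limitations": "   ",
    "collection_date": "2024-03-15",
    "prepared_by": "Example Research Group",
    "software": "ExampleTool 1.2",
    "processing_level": "raw",
    "license": "CC BY 4.0",
}
print(missing_reuse_fields(record))
```

A blank value is treated the same as a missing one, so a field filled only with spaces is still reported.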
UPC research staff may publish FAIR research datasets in accordance with the guidelines of the EOSC.
CSUC and CORA-RDR are official service providers within the EOSC, actively contributing to the advancement of research, driving data reuse and promoting the values of open science for the benefit of the wider scientific community.
Before publishing your first dataset in CORA-RDR, you need to register by following these instructions and notify info.biblioteques@upc.edu so that you can be assigned publishing permissions.
Publication is not immediate. The data curators of the UPC libraries review the metadata before the dataset is published.
Before publishing, verify that the data follows the FAIR principles
Use the quick guide to check that your data include all the information needed to follow the FAIR principles.
UPCommons
Depositing datasets in UPCommons is recommended only if your DMP names UPCommons as the repository or if the datasets belong to a project that already has datasets in UPCommons.
The Research data collection in UPCommons allows you to publish, share, describe and license FAIR research data linked to a publication or research project:
The data published in UPCommons must be produced by the UPC scientific community.
UPCommons allows you to host data in any format. Following the institutional policy on free software, we recommend that you use open formats whenever possible.
In addition to the handle, UPCommons assigns a DOI to each dataset.
Files can be up to 2 GB in size. For larger files or to publish multiple files, contact info.biblioteques@upc.edu.
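Before depositing, it can be convenient to check locally which files exceed the size limit mentioned above. The sketch below assumes that "2 GB" means 2 GiB (2 × 1024³ bytes); the repository may count the limit differently, and the function name is hypothetical.

```python
from pathlib import Path

# Assumption: the 2 GB limit is interpreted as 2 GiB.
MAX_FILE_BYTES = 2 * 1024**3


def oversized_files(paths, limit=MAX_FILE_BYTES):
    """Return the paths whose size on disk exceeds the limit, in input order."""
    return [Path(p) for p in paths if Path(p).stat().st_size > limit]
```

Calling oversized_files with the list of files you plan to deposit returns those for which you would need to contact the library first; an empty list means every file is within the assumed limit.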
How to publish datasets in UPCommons:
In the Research data community, select Deposit research data.
Enter metadata:
Author(s) of the data.
Department(s) or research group(s) of the author(s).
Title, description and keywords of the dataset.
Year of creation
Software needed to view the data, if applicable.
Funder's project code or title of the publication associated with the data.
Information on licenses.
With each dataset:
Add a readme file with information about the research data. The file must be saved as "Readme_title of dataset.txt".
To download the templates, right-click and choose "Save link as".
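The naming convention for the readme file can be sketched as a small helper. The removal of characters that are invalid in common file systems is an assumption added here for safety and is not part of the stated convention; the function name and the sample title are illustrative.

```python
import re


def readme_filename(dataset_title: str) -> str:
    """Build a readme file name of the form 'Readme_<title of dataset>.txt'."""
    # Assumption: drop characters that many file systems do not accept.
    cleaned = re.sub(r'[\\/:*?"<>|]', "", dataset_title).strip()
    if not cleaned:
        raise ValueError("dataset title is empty after cleaning")
    return f"Readme_{cleaned}.txt"


# Illustrative dataset title.
print(readme_filename("Wind tunnel measurements 2023"))
```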
Publication is not immediate. Librarians review the metadata before the data are made public.
Before publishing research data:
Check that you have the rights to disseminate the data. If not, you need permission from the rights holders to reuse the data.
Make sure the data you want to publish are not subject to any privacy or copyright restrictions. If the data refer to people (surveys, etc.), they must be anonymized before publication or published with the explicit consent of the people who participated. For more information, consult the sections: Copyright and Licenses.
If you cannot find a repository that fits your project, you can use multidisciplinary repositories:
Zenodo: research data repository developed by CERN in the framework of the OpenAIRE project.
EUDAT (European Data Infrastructure): multi-institutional project funded by the EU H2020 programme.
During the research, we recommend that you organize and document the data you generate and preserve them for the period you establish, for example, between 5 and 10 years. Sharing data with other researchers by depositing them in open access repositories brings benefits:
For the researcher and their institution:
Safe storage in the long term.
Be able to demonstrate the results of the research.
Make the data visible and be able to cite them.
Allow reuse of data.
Increase citations and therefore the impact of research.
Establish collaborations on related topics.
For funding bodies:
Be able to locate the data from the research they fund.
Avoid duplication in data collection.
Make more efficient use of research funded with public funds.
Increase return on investment by promoting reuse of data.
For science and society:
Maximize transparency.
Improve quality in verification, replication and trust.
According to the General Model Grant Agreement (Annex 5, Article 17 - Communication, dissemination, Open Science and visibility), the research staff benefiting from a project must:
Manage research data responsibly and in accordance with the FAIR principles: the data must be findable, accessible, interoperable and reusable
Ensure open access to data via a repository, in accordance with the principle "as open as possible, as closed as necessary" and under a CC BY or CC0 license or equivalent
Provide information to the repository on research results and any other tools needed to validate the data
Research data metadata must be open and under a CC0 license or equivalent
Within the framework of Horizon 2020, in 2015 the European Commission launched the Open Research Data Pilot, which required projects in specific areas to develop a data management plan and to publish their data in open access.
Since 2017, the Open Research Data Pilot has been extended to all areas of projects financed by H2020 and therefore requires open publication of the data in all projects. In addition, the data must be FAIR, that is, findable, accessible, interoperable and reusable.
Clause 29.3 of the Model Grant Agreement details the legal requirements that projects must fulfil:
Deposit the research data in a repository as soon as possible so that anyone can access, mine, exploit, reproduce and disseminate them under an appropriate Creative Commons license
In the same repository where the data are published, provide information about the tools and instruments (software, etc.) needed to validate the results and, when possible, provide the tools themselves.
The costs associated with the data, including the creation of the data management plan (DMP), are considered eligible expenses in the project.
Exceptions: Following the Commission's principle "as open as possible, as closed as necessary", the open dissemination of project results may be excluded for reasons of security, privacy, protection of personal data or commercial/industrial exploitation.
A data management plan, which must be deposited in institutional, national or international repositories once the project has been completed and the period established in the corresponding calls has elapsed, always respecting all situations in which the data must be protected for reasons of confidentiality, security or protection, or when necessary for the commercial exploitation of the results obtained.
The data associated with the article (supplementary data) in a thematic or multidisciplinary data repository such as CORA-RDR. More information about how to publish datasets in CORA-RDR.
"The raw information or data, such as a demographic data list, a set of weather logs or a UTM coordinate relationship, even if incorporated or represented by a database, or of a plan, or its presentation and interpretation in the framework of a research work deserve protection by the added value that these materials or studies provide. Rights to these products or studies are recognized, but not to the information that has served as base to elaborate them ".
The "sui generis" right in a database protects the substantial investment, evaluated qualitatively or quantitatively, made by its maker, whether through financial means or the use of time, effort, energy or other means of a similar nature, in obtaining, verifying or presenting its content.
By virtue of this right, the maker of a database may prohibit the extraction and/or reuse of all or a substantial part of its content, provided that obtaining, verifying or presenting that content represents a substantial investment from a quantitative or qualitative point of view. This right may be transferred, assigned or licensed.
Nor is the repeated or systematic extraction and/or reuse of non-substantial parts of the content of the database permitted when it involves acts contrary to the normal exploitation of the database or that cause unjustified damage to the legitimate interests of its maker.
The "sui generis" right in the database applies without prejudice to any existing rights over its content (copyright in the works included, or others).
Therefore, reusing your own or third-party data requires considering the following aspects:
Who owns the data?
Are the data in a database protected?
Do you have permissions to preserve the data and allow it to be reused?
Are there restrictions on third party data?
Is there any embargo period that limits open access to data?
What licenses will you use to facilitate the reuse of your own data?
"as far as possible, projects must then take measures to enable for third parties to access, mine, exploit, reproduce and disseminate (free of charge for any user) this research data. One straightforward and effective way of doing this is to attach Creative Commons Licence (CC-BY or CC0 tool) to the data deposited (http://creativecommons.org/licenses/). "
It should be noted that public domain dedications are the way to make data available as openly as possible, since the licensor waives all rights (as far as applicable law allows).
It should also be considered that version 4.0 of the Creative Commons licenses introduces some improvements that may be of interest for research data:
Databases: coverage of the "sui generis" database right, unless explicitly excluded by the licensor
Attribution: improvement of the procedure by which an author can request that their authorship not be mentioned, both in reproductions of their work and in works derived from it. In addition, users of the works can credit the authorship of the works used through a link to a web page where this information is listed.
Interoperability: maximization of interoperability between CC licenses and other licenses.
Alex Ball, a member of the Digital Curation Centre, has developed the guide How to License Research Data, a document that covers several aspects to consider when licensing research data:
Most projects can use standard licenses such as Creative Commons or Open Data Commons, but you can also draft a custom license suited to the particular circumstances of the data, provided that you have professional advice.
In cases where none of the existing licenses is fully satisfactory, the data can be offered under several licenses ("multiple licensing").
Creative Commons licenses treat datasets and databases as a whole, but not individual data included (differentiated from the databases or collection). This can be difficult in some complex cases, such as the collections of several copyrighted works.
The databases are included among the works that can be offered in the public domain by means of the CC0 license. In relation to the remaining Creative Commons licenses, it is recommended to use the 4.0 version.
There are other licenses, as part of the Open Data Commons project, which are specific to databases:
Open Data Commons Attribution License (ODC-BY): A license that allows third parties to copy, distribute and use the database, as well as use it to create new content, databases or database collections (as long as the original database is cited).
Open Data Commons Open Database License (ODbL): It is the same license as the ODC-BY but, if new derivative databases are created (not database collections or other possible derivative contents), they must be released under the same license as the original database. It also allows the application of Digital Rights Management (DRM) technology, both in the original database and in the derivative, as long as an unrestricted copy of the database is offered as an alternative.
Open Data Commons Public Domain Dedication and License (PDDL): License similar to CC0, but written specifically for databases. It allows you to copy, distribute and use the database, as well as create derivative works and databases, without any other restrictions.
ODC-BY and ODbL only cover the rights in the database, not in its contents (images, audiovisual material, etc.). In this case, licensors must use ODbL in conjunction with other licenses.
However, PDDL can be used for databases or their contents (data), both jointly and individually.
Ethical aspects affect which data can be shown, for how long, and the anonymity of the people involved, respecting their dignity and integrity in order to guarantee privacy and confidentiality.