RESEARCHERS

FAIR research data

According to the Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020, the European Commission defines research data as factual or numerical information collected to be examined and used as a basis for reasoning, discussion or calculation.

Research data can be statistics, results of experiments, measurements, observations resulting from fieldwork, surveys or interviews, and images.

 

Principles of FAIR

The FAIR data principles (Findable, Accessible, Interoperable, Reusable), first published in Scientific Data in 2016, are a set of characteristics that data objects (metadata and data) must have in order to be findable and reusable by humans and machines.

Open data and FAIR data:

  • Open data can be used, reused and redistributed freely by anyone. They are subject, at most, to the requirement to attribute and to share alike. Source: Open Data Handbook
  • However, for reasons of security, privacy, protection of personal data or commercial / industrial exploitation, and following the European Commission principle "as open as possible, as closed as necessary", FAIR data do not have to be open. Their metadata, however, must be open under a CC0 license or equivalent, to the extent that legitimate interests or limitations are safeguarded.

Agents involved

  • Research staff, repository managers, and data curators are key to making these principles possible.
  • To comply with the FAIR principles, deposit your research data in a trusted repository such as CORA-RDR and fill in as much metadata as possible about your dataset so that it can be easily found.

More information: 

 

How to make research data Findable

  • F1 Metadata must have a globally unique and persistent identifier (handle, DOI, ...)
  • F2 The data is described with rich metadata
  • F3 Metadata explicitly includes the persistent identifier of the data it describes
  • F4 Metadata is registered or indexed in a searchable resource

Tips for meeting F2

  • Include descriptive information about the context and / or characteristics of the data. More information
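
The F1 to F3 items above can be sketched as a quick check on a metadata record. This is only a minimal sketch: the record is assumed to be a plain Python dictionary, and the field names (`identifier`, `data_identifier`, `title`, and so on) and the sample DOI are hypothetical, not the schema of CORA-RDR or any other repository. F4 depends on the repository being indexed, so it cannot be checked from the record itself.

```python
# Hypothetical field names; a real repository defines its own metadata schema.
RICH_FIELDS = ("title", "description", "keywords", "authors", "year")


def findability_gaps(record: dict) -> list[str]:
    """Return the F principles that the record does not appear to satisfy."""
    gaps = []
    identifier = record.get("identifier", "")
    # F1: a persistent identifier such as a DOI or a handle.
    if not identifier.startswith(("doi:", "hdl:")):
        gaps.append("F1")
    # F2: rich metadata, read here as "every descriptive field is filled in".
    if any(not record.get(field) for field in RICH_FIELDS):
        gaps.append("F2")
    # F3: the metadata names the identifier of the data it describes.
    if not record.get("data_identifier"):
        gaps.append("F3")
    return gaps


example = {
    "identifier": "doi:10.1234/example",  # made-up DOI for illustration
    "data_identifier": "doi:10.1234/example",
    "title": "Soil moisture survey",
    "description": "Hourly soil moisture readings from three plots.",
    "keywords": ["soil", "moisture"],
    "authors": ["A. Researcher"],
    "year": 2023,
}
print(findability_gaps(example))
```

An empty list means that no gap was detected by this simplified reading of the principles; it does not replace the review carried out by the data curators.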

How to make research data Accessible

    • A1 Metadata can be retrieved by its identifier using a standardized communication protocol
      • A1.1 The protocol must be open, free, and universally implementable.
      • A1.2 When necessary for reasons of privacy, security or commercial interests, the protocol should allow for an authentication and authorization procedure.
    • A2 Metadata must remain accessible even when the data is no longer available

    More information
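
Principle A1 can be illustrated with DOIs, which resolve over HTTPS, an open and universally implementable protocol. The following is a minimal sketch, assuming the DOI is registered with an agency that supports content negotiation (for example DataCite or Crossref). The DOI is a made-up placeholder and the function name `metadata_request` is hypothetical. The request is only built, not sent.

```python
from urllib.request import Request


def metadata_request(doi: str) -> Request:
    """Build an HTTPS request asking the DOI resolver for machine-readable metadata."""
    return Request(
        f"https://doi.org/{doi}",
        # Content negotiation: ask for CSL JSON instead of the HTML landing page.
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )


request = metadata_request("10.1234/example")  # placeholder DOI, not a real dataset
print(request.full_url)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would send it. For a restricted dataset, the same protocol can carry authentication (A1.2) while the metadata stays retrievable (A2).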

    How to make the data Interoperable

    • I1 Metadata uses a formal, accessible, shared, and widely applicable language for the representation of knowledge.
    • I2 Metadata uses vocabularies that follow the FAIR principles (controlled vocabularies)
    • I3 Metadata includes qualified references to other metadata (publications and related materials)

    More information

    How to make the data Reusable

    • R1 Metadata is richly described with a plurality of accurate and relevant attributes
      • R1.1 Publish data and metadata with a clear and accessible usage license
      • R1.2 Use provenance information (creation, attribution, and version history) to associate metadata with data throughout its lifecycle
      • R1.3 The data and metadata standards used comply with the common standards in the area of knowledge to which the data relates.

    Tips for meeting R1

    This principle is related to F2, but it focuses on how users decide whether the data is of interest to them.

    • Describe the scope of your data: for what purpose was it collected?
    • Mention any limitations of the data that others should know about
    • Specify the date of data generation / collection, the laboratory conditions, who prepared the data, and the name and version of the software used
    • State whether the data is raw or processed

    Tips for meeting R1.2

    • For others to reuse your data, they need to know where it came from. Include this information as part of R1
    • State who to cite and / or how you want to be acknowledged. Include a description of the workflow that led to your data: who generated or collected it? How has it been processed? Has it been published before?
    • Does it contain data from another person that you have transformed or completed?
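
The R1 and R1.2 tips above amount to a short set of questions whose answers can be kept as structured text alongside the data. The following is a minimal sketch of that idea. The class `ReuseNotes`, its field names and the sample values are hypothetical and are not taken from the UPC readme templates.

```python
from dataclasses import dataclass


@dataclass
class ReuseNotes:
    """Answers to the R1 / R1.2 questions for one dataset (illustrative fields)."""
    purpose: str        # why the data was collected
    limitations: str    # what others should know before reusing it
    collected_on: str   # date of generation / collection
    prepared_by: str    # who prepared the data
    software: str       # name and version of the software used
    processing: str     # "raw" or "processed"
    provenance: str     # who generated it, how it was processed, prior publication

    def to_text(self) -> str:
        # One "Label: value" line per field, in declaration order.
        return "\n".join(
            f"{name.replace('_', ' ').capitalize()}: {value}"
            for name, value in vars(self).items()
        )


notes = ReuseNotes(
    purpose="Calibrate a soil moisture sensor",
    limitations="Plot 3 has gaps in August",
    collected_on="2023-06-01 to 2023-09-30",
    prepared_by="A. Researcher",
    software="Python 3.11",
    processing="processed",
    provenance="Collected by the research group; not published before",
)
print(notes.to_text())
```

Keeping the answers in one place makes it easier to copy them into the metadata form and the readme file of the dataset.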

    More information

    Want to know if your data is FAIR?

    • Use the tool developed by DANS (Data Archiving and Networked Services) and answer the 12 questions on the form to find out if your data is FAIR

    SATIFYD: FAIR data self-assessment tool

    Publish research data

    CORA. Research Data Repository   

    • It is recommended to deposit research data in CORA-RDR. It is a trusted repository specifically for data.
    • CORA-RDR is the data repository of Catalan universities and CERCA centers.
    • Metadata published in CORA-RDR are indexed and therefore findable in:
    • The research staff of the UPC may publish FAIR research datasets in accordance with the guidelines of the EOSC.
    • CSUC and CORA-RDR are official service providers within the EOSC, actively contributing to the advancement of research, driving data reuse and promoting the values of open science for the benefit of the wider scientific community.
    • Before publishing your first dataset in CORA-RDR, you need to register by following these instructions and notify info.biblioteques@upc.edu so that you can be assigned publishing permissions.
    • Publication is not immediate. The data curators of the UPC libraries review the metadata before publishing the dataset.
    • Contact info.biblioteques@upc.edu to publish datasets larger than 10GB.

    With each dataset:

    • Add a readme file with information about the research data. The file must be saved as "Readme_title of dataset.txt".

    To download the templates (use the one in the language of the data), right-click and choose "Save link as".
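
The naming rule for the readme file can be sketched as a small helper. This is an illustration only: the function name `readme_filename` is hypothetical, and dropping characters that are unsafe in file names is an assumption, not a documented CORA-RDR requirement.

```python
import re


def readme_filename(dataset_title: str) -> str:
    """Build a file name of the form 'Readme_<title of dataset>.txt'."""
    # Assumption: characters that most file systems reject are simply dropped.
    safe_title = re.sub(r'[\\/:*?"<>|]', "", dataset_title).strip()
    return f"Readme_{safe_title}.txt"


print(readme_filename("Soil moisture: 2020 survey"))
```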

    Before publishing, verify that the data follows the FAIR principles 

    • You can use the quick guide to check whether your data includes all the information needed to follow the FAIR principles.

     

    UPCommons

    It is recommended to deposit datasets in UPCommons only if you have named UPCommons as a repository in your DMP or if the datasets are part of a project that already has datasets in UPCommons.

    The Research data collection of UPCommons allows you to publish, share, describe and license FAIR research data linked to a publication or research project:

    • The data published in UPCommons must be produced by the scientific community of the UPC.
    • UPCommons allows you to host data in any format. Following the institutional policy on free software, we recommend that you use open formats whenever possible.
    • In addition to the handle, UPCommons assigns a DOI to each dataset.
    • Files can be up to 2GB in size. For larger files or to publish multiple files, contact info.biblioteques@upc.edu.
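
Before uploading, the 2GB limit can be checked locally. The following is a minimal sketch, assuming that "2GB" means decimal gigabytes (2 × 10^9 bytes); the function names `file_sizes` and `oversized` and the sample file names are hypothetical.

```python
from pathlib import Path

MAX_BYTES = 2 * 1000**3  # assumption: 2 GB read as decimal gigabytes


def file_sizes(folder: Path) -> dict[str, int]:
    """Map each file under `folder` to its size in bytes."""
    return {str(p): p.stat().st_size for p in folder.rglob("*") if p.is_file()}


def oversized(sizes: dict[str, int], limit: int = MAX_BYTES) -> list[str]:
    """Return the names of the files whose size exceeds `limit`."""
    return [name for name, size in sizes.items() if size > limit]


# Sizes are given by hand here so the example does not depend on local files.
print(oversized({"readings.csv": 10_000, "images.zip": 3_000_000_000}))
# → ['images.zip']
```

In practice, `oversized(file_sizes(Path("my_dataset")))` would list the files that need to be arranged with the library before publishing.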

     

    How to publish datasets in UPCommons:

    Enter metadata:

    • Author(s) of the data.

    • Department(s) and research group(s) of the author(s).

    • Title, description and keywords of the dataset.

    • Year of creation.

    • Software needed to view the data, if applicable.

    • Code of the funding entity or title of the publication associated with the data.

    • Information on licenses.

    With each dataset:

    • Add a readme file with information about the research data. The file must be saved as "Readme_title of dataset.txt".

    To download the templates, right-click and choose "Save link as".

    Publication is not immediate. Librarians review the metadata before the datasets are made public.

     

    Before publishing research data:

    • Check that you have the rights to disseminate the data. If not, you need permission from the rights holders to reuse the data.

    • Make sure the data you want to publish is not subject to any privacy, confidentiality or copyright restrictions. If the data refers to people (surveys, ...), it must be published anonymously or with the explicit consent of the people who participated. For more information, consult the sections Copyright and Licenses.

     

    In addition to CORA-RDR and UPCommons, you can deposit the data in other repositories. Consider the following criteria:

    • The storage capacity is sufficient.
    • It allows you to deposit the desired format and different versions of the same file.
    • You can link the data to the associated publications.
    • It has a preservation policy: backups, retention period, etc.
    • It ensures interoperability with OpenAIRE, if the data is from a European project.
    • It assigns a persistent and unique identifier (DOI or URN) to each dataset.
    • The repository follows data quality guidelines and certifications such as ISO, DINI or the Data Seal of Approval.
    • Data is easily retrievable.
    • It allows you to choose between different usage licenses.
    • It allows access to the data to be restricted (closed, restricted or subject to an embargo period).
    • The costs associated with using the repository.
    • The discipline of the project data.

    The directory re3data.org allows you to select from many repositories according to the thematic area, the type of data, etc. 

    You can also search for data repositories according to the main fields of knowledge of the UPC:
    • Aeronautics and space
    • Building Engineering
    • Chemical engineering
    • Architecture
    • Energies
    • Telecommunication engineering
    • Physics
    • Health Sciences
    • Agri-food engineering
    • Materials engineering
    • Technical support
    • Sciences of vision
    • Biomedical engineering
    • Electrical engineering
    • Maths
    • Economy and organization of companies
    • Civil engineering
    • Mechanical engineering
    • Planning

     

    • Zenodo: multidisciplinary research data repository developed by CERN within the framework of the OpenAIRE project. If you cannot find a repository that fits your project, you can use a multidisciplinary repository such as this one.
    • EUDAT (European Data Infrastructure): multi-institutional project funded by the EU H2020 programme.

     

    During the research, we recommend that you organize and document the data you generate and preserve it for the period you establish, for example between 5 and 10 years. Sharing data with other researchers by depositing it in open access repositories brings benefits for:

    The researcher and their institution:

    •    Safe storage in the long term.
    •    Be able to demonstrate the results of the research.
    •    Make the data visible and be able to cite them.
    •    Allow reuse of data.
    •    Increase citations and therefore the impact of research.
    •    Establish collaborations on related topics.

    For funding bodies:

    •    Have the data from the funded research located.
    •    Avoid duplication in data collection.
    •    Make more efficient use of research funded with public funds.
    •    Increase return on investment by promoting reuse of data.

    For science and society:

    •    Maximize transparency.
    •    Improve quality in verification, replication and trust.
    •    Promote innovation through new uses of data.
    •    Increase the social value of research.
    •    Meet the mandates in favor of open access.

    Funding entities and research data

    According to the General Model Grant Agreement (Annex 5, Article 17 - Communication, dissemination, Open Science and visibility), research staff who are beneficiaries of a project must:

    • Manage research data responsibly and in accordance with the FAIR principles, so that the data is findable, accessible, interoperable and reusable
    • Prepare a Data Management Plan (DMP) and update it periodically
    • Deposit the data in a trusted repository as soon as possible
    • Ensure open access to data via the repository, in accordance with the principle "as open as possible, as closed as necessary" and under a CC-BY or CC0 license or equivalent
    • Provide information via the repository about research results and any other tools needed to validate the data
    • Ensure that research data metadata is open, under a CC0 license or equivalent

    Within the framework of Horizon 2020, the European Commission launched the Open Research Data Pilot in 2015. It required projects in specific areas to develop a data management plan and to publish their data in open access.
    Since 2017, the Open Research Data Pilot has been extended to all areas of projects financed by H2020 and, therefore, requires the open publication of the data in all projects. In addition, the data must be FAIR, that is, findable, accessible, interoperable and reusable.
    Clause 29.3 of the Model Grant Agreement details the legal requirements that projects must fulfil:
    • Develop a Data Management Plan (DMP).
    • Deposit the research data in a repository as soon as possible, so that anyone can access, mine, exploit, reproduce and disseminate it, using an appropriate Creative Commons license.
    • In the same repository where the data is published, provide information about the tools and instruments (software, etc.) needed to validate the results and, when possible, provide these tools.
    The costs associated with the data, including the creation of the DMP, are considered eligible expenses in the project.

    More information: 

     

    For actions financed by the European Research Council (ERC) see:

    Exceptions: Following the Commission principle "as open as possible, as closed as necessary", the open dissemination of project results may be excluded for reasons of security, privacy, protection of personal data or commercial / industrial exploitation.

    More information:

    In order to promote access to the research data of funded R+D+i projects, the calls derived from the State Plan for Scientific, Technical and Innovation Research 2021-2023 include:

    A data management plan, which must be deposited in institutional, national or international repositories once the project has been completed and the period established in the corresponding calls has elapsed, always respecting all situations in which the data must be protected for reasons of confidentiality, security or protection, or when necessary for the commercial exploitation of the results obtained.

    Law 17/2022 of 5 September on science, technology and innovation, in its article 37, provides that research staff participating in national competitive projects must deposit:

    • The data associated with the article (supplementary data) in a thematic or multidisciplinary data repository such as CORA-RDR. More information about how to publish datasets in CORA-RDR.

    Research data rights and licenses

    As stated in the Recommendations from the educational content publications guide (UdG):

    "Raw information or data, such as a demographic data list, a set of weather logs or a list of UTM coordinates, are not protected in themselves. However, their incorporation into or representation in a database or a plan, or their presentation and interpretation within a research work, deserve protection because of the added value that these materials or studies provide. Rights over these products or studies are recognized, but not over the information that served as the basis for producing them".

    Although raw data is not copyrighted and therefore not subject to intellectual property, bear in mind that the databases in which it appears are. As set forth in article 133 of the Intellectual Property Law and in Law 5/1998 of 6 March, incorporating into Spanish law Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases:

    • The "sui generis" right on a database protects the substantial investment, evaluated qualitatively or quantitatively, made by its maker through financial means, time, effort, energy or other means of a similar nature, in obtaining, verifying or presenting its content.
    • By virtue of this right, the maker of a database may prohibit the extraction and / or reuse of all or a substantial part of its content, provided that obtaining, verifying or presenting that content represents a substantial investment from a quantitative or qualitative point of view. This right may be transferred, assigned or licensed.
    • Nor is the repeated or systematic extraction and / or reuse of insubstantial parts of the content of the database authorized when it involves acts contrary to the normal exploitation of the database or that cause unjustified harm to the legitimate interests of its maker.
    • The "sui generis" right on the database applies without prejudice to any existing rights over its content (copyright of the works included, or others).

    Therefore, reusing your own or third-party data involves considering the following aspects:

    • Who owns the data?
    • Is the data part of a protected database?
    • Do you have permissions to preserve the data and allow it to be reused?
    • Are there restrictions on third party data?
    • Is there any embargo period that limits open access to data?
    • What licenses will you use to facilitate the reuse of your own data?

    As the European Commission states in the document Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020:

    "as far as possible, projects must then take measures to enable for third parties to access, mine, exploit, reproduce and disseminate (free of charge for any user) this research data. One straightforward and effective way of doing this is to attach Creative Commons Licence (CC-BY or CC0 tool) to the data deposited (http://creativecommons.org/licenses/). "

         

     

    It should be noted that public domain licenses are the way to provide data as openly as possible, since the licensor waives all rights (as far as applicable law allows).

    It should also be considered that version 4.0 of the Creative Commons licenses includes some improvements that may be of interest for research data:

    • Databases: coverage of the "sui generis" database right, unless explicitly excluded by the licensor.
    • Attribution: improvement of the procedure by which authors can request that their authorship not be mentioned, both in reproductions of their work and in works derived from it. In addition, users of the works can acknowledge the authorship of the works used through a link to a web page where this information is listed.
    • Interoperability: maximization of interoperability between CC licenses and other licenses.
    • Others: "What's New in 4.0".

    Alex Ball, a member of the Digital Curation Centre, has developed the guide How to License Research Data, a document that covers several aspects to consider when licensing research data:

    • Most projects can use standard licenses such as Creative Commons or Open Data Commons, but you can also draw up a custom license suited to the particulars of the data, provided that you have professional advice.
    • In cases where none of the existing licenses is fully satisfactory, several licenses can be granted ("multiple licensing").
    • Creative Commons licenses treat datasets and databases as a whole, without distinguishing the individual data items from the database or collection. This can be problematic in some complex cases, such as collections of several copyrighted works.
    • Databases are included among the works that can be placed in the public domain by means of the CC0 license. For the remaining Creative Commons licenses, it is recommended to use version 4.0.
    • There are other licenses, part of the Open Data Commons project, which are specific to databases:
      • Open Data Commons Attribution License (ODC-BY): A license that allows third parties to copy, distribute and use the database, as well as use it to create new content, databases or database collections (as long as the original database is cited).
      • Open Data Commons Database License (ODbL): It is the same license as ODC-BY but, if new derivative databases are created (not database collections or other derivative content), they must be released under the same license as the original database. It also allows Digital Rights Management (DRM) technology to be applied to both the original and the derivative database, as long as an unrestricted copy of the database is also made available.
      • Open Data Commons Public Domain Dedication and License (PDDL): License similar to CC0, but written specifically for databases. It allows you to copy, distribute and use the database, as well as create derivative works and databases, without any other restrictions.

    As set forth in the preamble to these Open Data Commons licenses:

    • ODC-BY and ODbL only cover the rights over the database, not over its contents (images, audiovisual material, etc.). In this case, licensors must use ODbL in conjunction with other licenses.
    • However, PDDL can be used for databases or their contents (data), both jointly and individually.

    Ethical aspects and citation

    Ethical aspects affect which data can be displayed, for how long, and the anonymity of the people involved, respecting their dignity and integrity in order to guarantee privacy and confidentiality.

    You must consider:

    To cite research data correctly, follow the guidelines.

     

     


    Last update: 22 / 11 / 2023