Alongside the publication of your research findings in a peer-reviewed journal, it is recognized as good practice (and often required) to make the data that support your findings available to the research community. This allows other researchers to reproduce your findings and to reuse the data in subsequent research. When you finish your study, you may therefore want, or be required, to make your data accessible. This page provides information to help you publish and disseminate your data.
Research data are best published in repositories, which make it possible to publish the data together with their documentation and metadata and make them findable. Publishing data on a website is not recommended, because websites offer poor sustainability and discoverability.
Not all data can or may be shared publicly. Ethical and legal aspects must be considered and clarified in advance. It also does not always make sense to publish all the data from a research project. If only part of the data is suitable for subsequent use, a selection of the data or only certain datasets can be published.
Most research funders require that data be published in compliance with the FAIR principles. This implies that, in addition to documentation and metadata, the data must also be given a persistent identifier so that they can be cited. To make the data interoperable and reusable, they should be provided in common or open file formats and accompanied by licences for use (CC licences).
For data that cannot be published for legal reasons, at least the metadata should be made publicly available.
Research data repositories are specialised online platforms where research data can be uploaded and shared together with the related metadata. They have a search function that can be used to find the datasets. Most repositories assign persistent identifiers so that datasets can be cited and referenced. For data that cannot be shared publicly (e.g. personal data), there are repositories that allow data to be stored with restricted access or made available after an embargo period. In this case it is important to provide information on the conditions of access.
We distinguish three types of repositories:
- Disciplinary repositories store and provide data from specific subject areas. If there are no other requirements to follow, this type of repository is recommended. Firstly, the structure of the platform is adapted to the respective data types and subject-specific requirements; secondly, these platforms are known in the respective disciplines and the data can therefore be found easily.
- In generic repositories, data from a wide range of disciplines can be made available. The disadvantage is that the metadata structure is very general and is not adapted to different data types or subject-specific requirements. Examples of generic repositories are Zenodo and Dryad.
- The University of Basel maintains a community on Zenodo. Members of the University of Basel who publish data on Zenodo are invited to link them to this community.
- Institutional repositories are for data from researchers at a particular institution. Here, the institutional affiliation determines whether data can be published or not. The University of Basel does not have an institutional repository for research data.
Another distinction is made between commercial and non-commercial repository providers. Here it is important to consider the requirements of the research funding organisations, as data uploads to commercial repositories are not always funded. An example of a commercial repository is Figshare.
For certain domains there are national repositories in Switzerland, e.g.:
- DaSCH - Swiss National Data and Service Centre for the Humanities
- SWISSUbase for social sciences and linguistics
Finding research data repositories
The Swiss National Science Foundation (SNSF) provides a list of generalist, discipline-specific and institutional data repositories.
Another way to search for repositories is the Register of Research Data Repositories re3data.
See also:
- The ELIXIR list of Deposition Databases for Biomolecular Data.
- An overview of subject-related data repositories by the Open Access Directory.
Research data are becoming increasingly openly available through data repositories, data journals and supplementary material in scientific journals.
There are several ways to find research data:
- searching via data journals. Articles in data journals provide extensive information about a data set. Unlike "typical" scientific journals, they do not interpret the data but provide a rich description of a data set. As with other scientific publication formats, articles in a data journal are generally subject to a peer review process that ensures quality standards. The described data sets are usually linked in the data journal but not published directly in the article. A list of data journals is provided by forschungsdaten.org.
- searching via specific research data repositories. To find repositories you can use re3data.org and filter e.g. for subject-specific repositories.
- using meta search engines that query data records from various data repositories. Examples of meta search engines are DataCite Commons, OpenAIRE Explore, Google Dataset Search, Mendeley Data or b2find.eudat.eu.
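As a rough illustration of how such a meta search can be scripted, the sketch below builds a query URL for the public DataCite REST API, which exposes DataCite metadata records programmatically. The function name `build_dataset_search_url`, the example keywords and the chosen parameters (`query`, `resource-type-id`, `page[size]`) are assumptions made for this example; check them against the current DataCite API documentation before relying on them. The code only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Base endpoint of the DataCite REST API for DOI metadata records.
DATACITE_API = "https://api.datacite.org/dois"


def build_dataset_search_url(keywords, page_size=5):
    """Return a search URL that asks DataCite for dataset records matching the keywords."""
    params = {
        "query": keywords,              # free-text search terms
        "resource-type-id": "dataset",  # restrict results to datasets
        "page[size]": page_size,        # number of records per page
    }
    return f"{DATACITE_API}?{urlencode(params)}"


# Hypothetical search terms, used only for illustration.
url = build_dataset_search_url("alpine glacier mass balance")
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the response is a JSON document listing the matching records.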
Bioschemas aims to improve data interoperability in life sciences. It does this by encouraging people in life science to use schema.org markup, so that their websites and services contain consistently structured information. This structured information then makes it easier to discover, collate and analyse distributed data.
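To give an idea of what schema.org markup for a dataset can look like, the following sketch generates a JSON-LD description of type `Dataset` that could be embedded in a web page. It uses only generic schema.org properties (`name`, `description`, `identifier`, `license`, `creator`) and has not been validated against a specific Bioschemas profile, which may ask for additional properties. The helper `dataset_jsonld`, the dataset details, the creator name and the DOI `10.1234/example` are hypothetical placeholders.

```python
import json


def dataset_jsonld(name, description, doi, license_url, creators):
    """Build a schema.org Dataset description and return it as a JSON-LD string."""
    record = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": f"https://doi.org/{doi}",
        "license": license_url,
        "creator": [{"@type": "Person", "name": c} for c in creators],
    }
    return json.dumps(record, indent=2, ensure_ascii=False)


# All values below are placeholders for illustration only.
example = dataset_jsonld(
    name="Example protein expression measurements",
    description="Hypothetical dataset used only to illustrate the markup.",
    doi="10.1234/example",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    creators=["Jane Doe"],
)

# JSON-LD is usually embedded in a page inside a script element.
print(f'<script type="application/ld+json">\n{example}\n</script>')
```

Search engines and aggregators that understand schema.org can read such a block without having to parse the human-readable page.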
When you re-use a dataset, it is important, for reasons of transparency and good scientific practice, that you cite the data and give credit to the original data creator(s). The citation should include the following information about the dataset:
- Author(s)
- Title of the dataset
- Year of publication
- Repository
- Version (if indicated)
- Persistent Identifier
The way you arrange these elements depends on the citation style you use.
In many cases, when you find a dataset in a repository, you will be given a suggestion of how to cite the dataset and can sometimes also choose between different citation styles.
Example of a dataset citation:
Krasselt, J., & Dreesen, P. (2024). Topic model and n-grams for islamophobic blog PI-News (Version 1.0.0) [Data set]. LaRS - Language Repository of Switzerland. https://doi.org/10.48656/k36q-aq92.
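The way the elements listed above combine into a citation can be sketched in a few lines of code. The function below assembles them in an APA-like order, matching the arrangement of the example citation; the function names `join_authors` and `format_dataset_citation` are made up for this illustration, and other citation styles will order and punctuate the elements differently.

```python
def join_authors(authors):
    """Join author names in an APA-like way: 'A, & B' or 'A, B, & C'."""
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + ", & " + authors[-1]


def format_dataset_citation(authors, year, title, repository, doi, version=None):
    """Assemble a dataset citation from its elements (APA-like arrangement)."""
    # The version is optional and only included when the repository indicates one.
    version_part = f" (Version {version})" if version else ""
    return (
        f"{join_authors(authors)} ({year}). {title}{version_part} [Data set]. "
        f"{repository}. https://doi.org/{doi}"
    )


citation = format_dataset_citation(
    authors=["Krasselt, J.", "Dreesen, P."],
    year=2024,
    title="Topic model and n-grams for islamophobic blog PI-News",
    repository="LaRS - Language Repository of Switzerland",
    doi="10.48656/k36q-aq92",
    version="1.0.0",
)
print(citation)
```

In practice, reference managers and the citation suggestions offered by repositories do this formatting for you; the sketch only shows which elements go where.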
Persistent identifiers allow resource providers (e.g. journals and repositories) to uniquely reference persons, datasets or other types of publications. The most prevalent of these identifiers in the research community are the DOI and the ORCID.
- DOI: Digital Object Identifier (doi.org)
A digital object identifier (DOI) is a unique alphanumeric string assigned by a registration agency (e.g. DataCite, Crossref) to identify content and provide a link to its location on the Internet. It makes research data and publications persistently citable by ensuring that the same link always points to the current location of the publication.
- ORCID: Open Researcher and Contributor Identifier (orcid.org)
ORCID provides a personal identifier for authors to use with their name as they engage in research, scholarship, and innovation activities. For more details about this identifier, see https://info.orcid.org/researchers/#why. To register with ORCID, go to orcid.org/register.
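Both identifiers are easy to handle in scripts. The sketch below shows two small helpers: one turns a DOI written in different ways into a resolvable https://doi.org/ link, and one checks the final character of an ORCID iD, which is a check digit computed with the ISO 7064 MOD 11-2 algorithm. The function names are made up for this illustration. A passing check only means that the iD is well formed, not that it is registered or belongs to a particular person.

```python
import re


def doi_to_url(doi):
    """Turn a bare DOI, a 'doi:'-prefixed DOI, or a DOI URL into an https://doi.org/ link."""
    cleaned = re.sub(
        r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi.strip(), flags=re.IGNORECASE
    )
    return f"https://doi.org/{cleaned}"


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the format and check digit of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


print(doi_to_url("doi:10.48656/k36q-aq92"))   # https://doi.org/10.48656/k36q-aq92
print(is_valid_orcid("0000-0002-1825-0097"))  # True (a widely used sample iD)
print(is_valid_orcid("0000-0002-1825-0098"))  # False (wrong check digit)
```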