Selecting the best repository to house a dataset may be straightforward if there is already a well-established, subject-based repository in your discipline, or it may take some research to determine the best place for your data. Look for a research data repository that offers open licenses, which make your datasets more accessible (CC0 is the least restrictive option). The repository should also provide clear, persistent citations for datasets. Repositories offer a range of services to depositors (from data validation to peer review) and to users (from in-browser data exploration to visualization and analysis tools), and these may also influence your choice. The Digital Scholarship and Scholarly Communications Team is happy to assist you as you select an appropriate data repository.
There are several useful tools for finding data repositories that serve your field.
Harvard Dataverse is a repository for research data and code. “The Harvard Dataverse is open to all scientific data from all disciplines worldwide. It includes the world’s largest collection of social science research data. It is hosting data for projects, archives, researchers, journals, organizations, and institutions.”
If you would like to publish a dataset but cannot find an appropriate subject-based repository, you may want to consider using Figshare, “a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner.” The research outputs you can upload to Figshare include datasets, figures, papers, posters, and video. When you publish research materials on Figshare, they receive a Digital Object Identifier (DOI), providing a persistent citation. Figshare also supports version control, so that you can update or add to a dataset without confusing other researchers who may wish to cite it.
The Inter-university Consortium for Political and Social Research (ICPSR) archives data from any source. It has the world’s largest collection of social science data.
Data can be deposited for free, although there is a fee for curated deposits.
For more information about ICPSR, visit this research guide.
If you are using GitHub to manage a project, you can easily archive dataset releases using Zenodo. Zenodo assigns a Digital Object Identifier (DOI) to each archived release, making it easier to cite the dataset in publications.
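For GitHub users, one way to control how an archived release is described is to keep a small metadata file in the repository. The sketch below is a minimal illustration, assuming that Zenodo's GitHub integration reads a file named .zenodo.json from the repository root and that the field names shown match Zenodo's deposit metadata; please confirm both against Zenodo's current documentation. The function names and all example values are hypothetical placeholders.

```python
import json
from pathlib import Path


def build_zenodo_metadata(title, description, creators, keywords=()):
    """Return a metadata dict for a .zenodo.json file.

    `creators` is a sequence of (name, affiliation) pairs, with names
    written as "Family, Given". The field names are assumed to follow
    Zenodo's deposit metadata; verify them before relying on this.
    """
    return {
        "title": title,
        "description": description,
        "upload_type": "dataset",  # mark the release as a dataset
        "creators": [
            {"name": name, "affiliation": affiliation}
            for name, affiliation in creators
        ],
        "keywords": list(keywords),
    }


def write_zenodo_metadata(metadata, repo_root="."):
    """Write the metadata as .zenodo.json in repo_root and return its path."""
    path = Path(repo_root) / ".zenodo.json"
    path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder values only; replace them with your project's details.
    metadata = build_zenodo_metadata(
        title="Example survey dataset",
        description="Cleaned responses from an example survey.",
        creators=[("Doe, Jane", "Example University")],
        keywords=["survey", "example"],
    )
    print(write_zenodo_metadata(metadata))
```

Run the script from the top level of your repository and commit the resulting file before you create the release you want archived. A license entry is left out on purpose, because the accepted license identifiers should be checked with Zenodo directly.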
Questions? Contact us