Both governments and scholars have recognized the value of making datasets more openly available. Governments at all levels, from municipal and provincial to national and international, are releasing more of their data. Under so-called Government 2.0 or City 2.0 initiatives in particular, various datasets have been liberated and made available to citizens.
Academics have also recognized the value of making their research data much more widely available. Many who deposit articles or reports in institutional repositories are also depositing the associated datasets alongside them. This gives other scholars the opportunity to validate or corroborate the conclusions that the original researcher(s) drew from the data. Sharing data more openly also enables scholars to examine the data and perhaps make new discoveries. Many research granting agencies have either made depositing the resulting dataset a requirement of research grants or have started to indicate a preference for datasets derived from the research project or study being made widely available. There are complications in sharing data. Generally, these include balancing privacy issues with data integrity and dealing with organization and formatting issues in the data.
Increases visibility and impact of research
Data made publicly available through a data repository can increase the impact
of that research (e.g. citation rates).
Ensures compliance with funding agency policies
A growing number of funding agencies demand that researchers manage and
share their data upon completion of a research project.
Publicly available data allows researchers to collaborate with each other by sharing
data sets, research environments and tools.
Enables replication and verification of research results
When data are archived and shared, results can be replicated and the data can be
reanalyzed to back up the original research findings. The data may also be used to
expose errors or inconsistencies in the original data analysis.
“Researchers should ensure that the data obtained are stored with all the precautions appropriate to the sensitivity of the data. Data released should not contain names, initials or other identifying information. While it may be important to preserve certain types of identifiers (e.g., region of residence), these should be masked as much as possible using a standardized protocol before the data are released for research purposes.”
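The masking step described in the guidance quoted above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not a standardized protocol: the column names (`name`, `initials`, `email`, `postal_code`) are hypothetical, and truncating a postal code to its first three characters is just one possible way to coarsen a region of residence. Dropping direct identifiers alone does not guarantee that a dataset is anonymous.

```python
# Hypothetical column names holding direct identifiers; adjust per dataset.
DIRECT_IDENTIFIERS = {"name", "initials", "email"}


def mask_postal_code(code):
    """Coarsen a postal code by keeping only its first three characters."""
    cleaned = code.replace(" ", "").upper()
    return cleaned[:3]


def deidentify(rows):
    """Return copies of the rows with direct identifiers dropped.

    The hypothetical 'postal_code' column is replaced by a coarser
    'region' column so that some geographic information is preserved.
    """
    released = []
    for row in rows:
        # Keep every column that is not listed as a direct identifier.
        kept = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        if "postal_code" in kept:
            kept["region"] = mask_postal_code(kept.pop("postal_code"))
        released.append(kept)
    return released


# Example with made-up data.
raw_rows = [
    {"name": "Jane Doe", "initials": "JD", "postal_code": "K1A 0B1", "score": "42"},
]
print(deidentify(raw_rows))
```

In practice the list of identifying columns and the masking rule would come from the protocol adopted by the research team or required by the funder, rather than from a fixed list such as the one assumed here.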
The Canadian Institutes of Health Research (CIHR) requires grant recipients to deposit certain data types (bioinformatics, atomic, and molecular coordinate data) into the appropriate public database immediately upon publication of research results. CIHR also requires researchers to retain original data sets arising from CIHR-funded research for a minimum of five years after the end of the grant. The agency plans to review this policy in the near future and possibly expand it.
Social Sciences and Humanities Research Council (SSHRC)
SSHRC has had a Research Data Archiving Policy in place since 1990. The policy states: “All research data collected with the use of SSHRC funds must be preserved and made available for use by others within a reasonable period of time. SSHRC considers ‘a reasonable period’ to be within two years of the completion of the research project for which the data was collected.”
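The time limits mentioned in these policies reduce to simple date arithmetic. The sketch below assumes that "two years" and "five years" mean calendar years counted from the relevant date, which the policies as quoted here do not spell out; the function names are hypothetical and chosen only for illustration.

```python
from datetime import date


def add_years(start, years):
    """Return the date `years` calendar years after `start`.

    February 29 is mapped to February 28 when the target year
    is not a leap year.
    """
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Raised only when start is Feb 29 and the target year has no Feb 29.
        return start.replace(year=start.year + years, day=28)


def sshrc_sharing_deadline(project_completed):
    """Last day of the two-year 'reasonable period' after project completion."""
    return add_years(project_completed, 2)


def cihr_retention_end(grant_end):
    """Earliest date on which the five-year minimum retention period has elapsed."""
    return add_years(grant_end, 5)


# Example with made-up dates.
print(sshrc_sharing_deadline(date(2013, 6, 30)))
print(cihr_retention_end(date(2013, 3, 31)))
```

A researcher holding grants from more than one agency could compute each date separately and plan around the earliest sharing deadline and the latest retention date.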
Natural Sciences and Engineering Research Council (NSERC)
NSERC has no general policy regarding research data. It does, however, have guidelines for researchers funded through the NSERC Strategic Networks Program. The guidelines state: “To encourage the sharing and dissemination of research data and its use by others within a reasonable period of time, an agreement regarding responsibility for the maintenance and preservation of large data sets must be in place at the outset of network activities.”
Scientific Data, from Nature Publishing Group, is a new open-access, online-only publication for descriptions of scientifically valuable datasets. It introduces a new type of content called the Data Descriptor designed to make your data more discoverable, interpretable and reusable. Scientific Data is currently calling for submissions, and will launch in May 2014.
"DataDryad.org is a curated general-purpose repository that makes the data underlying scientific publications discoverable, freely reusable, and citable. Dryad has integrated data submission for a growing list of journals; submission of data from other publications is also welcome."