Many researchers are interested in sharing data with one another but don’t know how to go about it. Differing research protocols and a variety of privacy concerns complicate the process, so it is not always easy for a researcher to decide what information to share or whom to share it with. Over time, this has created a culture in which many researchers do not share their data at all. This blog post explores a range of practices that can help make your collaborations with colleagues more successful.
Here’s a list of some best practices when collaborating on shared projects (listed in order from lowest impact to highest):
a) publish your data in a public repository
b) use a structured publishing platform for shared projects
c) make your raw data available to collaborators and grantees of your research project
d) release only anonymized or aggregated versions of the data you’re sharing with others.
e) share through an established system that others can access without any changes on their side, such as a general-purpose repository like Zenodo or Figshare (or a code hosting site like GitHub). This is different from making one specific file publicly accessible because it allows broad access under a stated license while still complying with copyright law. It also lets you change what’s open over time as confidentiality concerns develop. And because many people are familiar with GitHub, this route may be more likely to attract collaborators.
f) make a data management plan public, either on your own site or by publishing it in an institutional repository
g) share the raw data with someone who has requested it and is qualified to analyze it
h) release limited portions of your project’s dataset as a new open access publication that anyone can read without restriction
i) offer fellowships for research projects in which researchers are given full access to shared datasets, such as Google Scholar Awards Program Fellowships
j) publish in a journal if you’re releasing anonymized versions of your data only (e.g., through oRevo). This route may require changes to how you share, because journals typically require authors to sign a copyright agreement in order to publish the data, while open access journals or repositories may not.
k) release your dataset and code as a toolkit that is freely available for anyone else to use
l) share raw data with another researcher by giving them read-only permissions, so they can’t change the data but are free to analyze it and re-share their own research without charge. This is often done through an institutional repository, and it is a great option for collaborators who have different expertise but want to work together.
m) load data into a tool like OpenRefine, where it can be cleaned, explored, and transformed before it is shared with collaborators.
n) use a third-party collaboration platform that lets you share files with other researchers (e.g., Dropbox). One benefit of these platforms is that every participant works from the same set of files, which makes it easier to reproduce one another’s work
o) create an anonymized dataset by removing names and other identifying information before sharing, in order to protect privacy while still maximizing the potential for reuse
p) explain how others can reproduce your findings by including step-by-step instructions for your specific methods in the methods or discussion section of your paper.
q) use licenses like Creative Commons CC-BY (Attribution) and CC-BY-ND (Attribution, No Derivatives). These allow other researchers to share your work while requiring them to give credit where it is due; note that CC-BY-ND does not permit sharing adapted versions
r) set up or use a data repository that scientists can search for the specific datasets their research projects need. For example, Dryad (DataDryad.org), a nonprofit repository, provides free access to curated scientific data on topics as diverse as animal behavior studies and genetic sequencing
s) provide a public data repository for your work so that the results of your research remain available over the long term
t) make sure all code and scripts are archived, for example on GitHub (a code hosting site where users can publish their own software) or as ancillary files alongside a preprint on arXiv.org. Include documentation so that future researchers can read these files without having to guess at what they mean
u) release raw datasets from experiments so other people have access to them in many formats like Excel spreadsheets (.xls), comma-separated values (.csv), tab-delimited text tables (.txt), XML documents, images with metadata, and more
v) release data collected by a robot (e.g., sensor readings or video footage of an experiment). Robots can also share their own datasets with other robots to help them complete their assigned tasks
w) use Creative Commons licenses that allow reuse, remixing, and distribution of the material in any medium or format; these are outlined at creativecommons.org/licenses/. The CC-BY license allows anyone to copy text from a website into another document as long as they credit the source.
y) provide open access to research articles before publication so people who cannot afford journal subscriptions can still read them
z) make sure there’s a clear system in place for how to credit data collected by a robot.
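
Several of the practices above come down to the same step: stripping identifying information or releasing only aggregated values before a dataset leaves your hands. Here is a minimal Python sketch of that step, using only the standard library. The column names (name, email, site, score), the sample records, and the helper functions are all hypothetical and exist only for illustration. Keep in mind that dropping obvious identifiers is a starting point, not a guarantee of anonymity, since combinations of the remaining fields can sometimes still identify a person.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical column names; replace with the identifying fields in your own data.
IDENTIFYING_COLUMNS = {"name", "email"}


def anonymize(rows, identifying=IDENTIFYING_COLUMNS):
    """Return copies of the rows with the identifying columns removed."""
    return [{k: v for k, v in row.items() if k not in identifying} for row in rows]


def aggregate(rows, group_by, value):
    """Return one row per group with the group size and the mean of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(float(row[value]))
    return [
        {group_by: key, "n": len(vals), f"mean_{value}": round(mean(vals), 2)}
        for key, vals in sorted(groups.items())
    ]


def to_csv(rows):
    """Serialize a list of dicts to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample records, for illustration only.
sample = [
    {"name": "A. Person", "email": "a@example.org", "site": "north", "score": "4.0"},
    {"name": "B. Person", "email": "b@example.org", "site": "north", "score": "6.0"},
    {"name": "C. Person", "email": "c@example.org", "site": "south", "score": "5.0"},
]

shareable = anonymize(sample)  # row-level data without names or emails
summary = aggregate(shareable, group_by="site", value="score")  # one row per site
print(to_csv(summary))
```

The original records are left untouched, so you can keep the full dataset privately and regenerate the shareable version whenever your confidentiality requirements change. Writing the result as CSV keeps it readable in spreadsheets and in most analysis tools.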