Publishing your data doesn’t necessarily mean that your data must be available to anyone and everyone. If there are concerns that releasing your data openly could cause harm to someone or something, or have other negative consequences, then you can choose to apply specific restrictions on how people can gain access to your data.
The terms that we use to describe the different access levels (open, mediated, closed and embargoed) are widely used, but institutions and repositories may use slightly different terms or definitions. Before you publish, check how your chosen repository refers to the access levels that it provides to make sure you choose the right one.
With open access, there are no restrictions on access to the data; anyone can view and download a copy.
This ease of access makes your data more likely to be reused and makes it possible for others to verify the results of your research. Open access is the best choice for publishing data that aren’t sensitive, such as most non-human data. Provided consent has been obtained from participants, human research data may also be published through open access, usually in a non-identifiable form.
Under an embargo, a description of your dataset is published, including information such as the dataset title, who created it, and what the data are; however, the dataset is inaccessible until a specified period of time has elapsed. At the end of the embargo period, the data will become available through either open or mediated access, depending on the option that you’ve selected.
You might apply an embargo if you want to publish research based on the data before making the data accessible, or if you need to finalise a commercial benefit resulting from the data, such as a patent, before releasing the data.
With mediated access, a description of your dataset is published, including information such as the dataset title, who created it, and what the data are; however, others won’t be able to access the data until they have applied and had their application approved. Conditions of access are usually set by the owner or submitter of the data and may include providing proof that the requester is a genuine researcher and that they have ethical approval from their own institution to undertake the research.
Mediated access enables data to be shared and reused by other researchers while reducing the risk of any harm that might result from a wider release of the data. Mediated access is a good choice for sensitive data, such as identifiable human data, or data for which there is a significant risk of re-identification, although consent is still needed to publish human data through mediated access.
With closed access, a description of your dataset is published, including information such as the dataset title, who created it, and what the data are; however, the dataset is inaccessible and there is no process in place to allow others to apply for access to it.
Ensuring that a record of the dataset is made available informs others of research that has been done, and that the data exist, even if they are too sensitive to be shared beyond the original research team. Closed access is rarely used, but it might be an appropriate choice if: you need to securely archive sensitive data; you have published a version of the data via open or mediated access with sensitive information removed, and you would like a record of the original, unmodified version of the data; or you have worked on developing a dataset, but don’t have the right to publish the data.
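The way these access levels combine with an embargo can be summarised as a simple decision rule. The Python sketch below is illustrative only: the names AccessLevel, DatasetRecord and can_download are hypothetical and do not correspond to any particular repository’s software, and the treatment of the embargo end date (the data stay inaccessible up to and including that day) is an assumption. Your repository’s own definitions always take precedence.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    """Hypothetical labels for the access levels described in this guide."""
    OPEN = "open"
    MEDIATED = "mediated"
    CLOSED = "closed"


@dataclass
class DatasetRecord:
    """The published description of a dataset (illustrative fields only)."""
    title: str
    access_level: AccessLevel           # level that applies once any embargo has ended
    embargo_end: Optional[date] = None  # assumed to be the last day of the embargo


def can_download(record: DatasetRecord, today: date,
                 application_approved: bool = False) -> bool:
    """Return True if a requester could obtain the data themselves.

    The description of the dataset is published at every access level;
    this function only concerns access to the data.
    """
    # During an embargo the data are inaccessible, whatever level follows it.
    if record.embargo_end is not None and today <= record.embargo_end:
        return False
    if record.access_level is AccessLevel.OPEN:
        return True
    if record.access_level is AccessLevel.MEDIATED:
        # Access depends on an approved application.
        return application_approved
    # Closed access: there is no process for applying for access.
    return False
```

For example, a record with open access and an embargo would return False for every date up to the end of the embargo and True afterwards, while a closed record returns False even for a requester whose application has been approved elsewhere.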
Some repositories allow you to publish a record of your dataset that includes information describing the data but doesn’t directly provide access to the dataset itself. This is called a metadata-only record, and there are a variety of reasons why you may wish to publish one.
Data collected from or in collaboration with people or communities from some cultural backgrounds may have additional sensitivities or protocols regarding access and reuse. Before you publish any data related to Aboriginal and Torres Strait Islander peoples, for example, you must work with the community to identify any sensitive material and any local protocols for access used by the community. You must ensure that your published data follow any identified protocols and protect secret or sacred information. This may include applying specific restrictions on accessing the data, such as using mediated access so that only community members, or researchers who have been given community approval, are granted access. If you do restrict access to all or part of your data, it’s important to put a sustainable process in place to ensure that people who have the right to access the data, such as members of the community, will always be able to do so.
Mediated access, whereby you set conditions on when and how access is granted, is all about allowing your data to be reused in a safe way that protects your research participants from harm. If your data are sensitive, and the sensitive information can’t be removed without the dataset losing value, then mediated access is a good option to choose when publishing.
Scenario 1 – Identifiable domestic violence data
A researcher investigating domestic violence collects identifiable data from their research participants. The researcher obtains consent from the participants to share the identifiable data via mediated access for specific research purposes. After the research is published, other researchers who wish to access the data will need to meet two criteria. They must demonstrate that they will be using the data for the specified purposes and that they have gone through an appropriate ethics approval process. Researchers who prove that they meet both criteria will be granted access to the data.
Scenario 2 – Culturally sensitive data
Researchers working with a community collect culturally significant data that they hold in trust for that community. The researchers publish the data in a repository with specific conditions regarding who may access it. Community members can access the data at any time without going through an approval process. Other researchers can access the data only after they have gone through a community approval process and been given approval to use the data in their research.
Scenario 3 – Potentially re-identifiable health data
A team of medical researchers collect data on a rare health disorder from patients. The researchers obtain consent to publish a non-identifiable version of the data via open access. However, because the disorder is rare, a risk assessment of the dataset highlights a risk of re-identification if the data were to be used in linked data projects. To mitigate this risk, the dataset is instead published via mediated access, and researchers are only granted access if they sign an access agreement stating that they will not link the data to other datasets.
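The access conditions in the three scenarios above can each be written as a short rule. The following Python sketch is a hypothetical illustration, not a description of how any repository processes applications: the AccessRequest fields and the three function names are invented for this example, and in practice a person or committee would assess each application.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    """Facts about a requester. All field names are hypothetical."""
    purpose_matches_consent: bool = False      # use falls within the specified purposes
    has_ethics_approval: bool = False          # appropriate ethics approval obtained
    is_community_member: bool = False          # member of the community the data belong to
    has_community_approval: bool = False       # community has approved this research use
    signed_no_linkage_agreement: bool = False  # agreed not to link the data to other datasets


def scenario_1_allows(req: AccessRequest) -> bool:
    """Identifiable domestic violence data: both criteria must be met."""
    return req.purpose_matches_consent and req.has_ethics_approval


def scenario_2_allows(req: AccessRequest) -> bool:
    """Culturally sensitive data: community members always have access;
    other researchers need community approval."""
    return req.is_community_member or req.has_community_approval


def scenario_3_allows(req: AccessRequest) -> bool:
    """Potentially re-identifiable health data: a signed access agreement
    prohibiting data linkage is required."""
    return req.signed_no_linkage_agreement
```

Writing the conditions out this way can help when you draft the access conditions for your own dataset: each condition should be something a requester can demonstrate and that whoever manages access can check.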