Applying a licence to your research data helps other people understand how they can reuse the data, and what you do and don’t want them to do with it. More information about publishing research data and copyright can be found at Publishing your research.
Funders, journals, or your selected repositories may specify a licence that you must use. Make sure you check which licence to apply, and that you’re happy with the reuse that will be possible for others under that licence, before publishing your data.
If there are no licensing requirements for the data that you’re publishing, you can choose a licence based on how you want others to reuse your data. Creative Commons provide a set of six different licences that put various conditions on reuse. All Creative Commons licences require that anyone making use of your data attributes you. The Creative Commons Attribution 4.0 licence (CC BY 4.0) is often a good choice for licensing data, but you can apply whichever licence you feel best suits your data. Creative Commons have a handy licence picking tool to help you select the best licence based on how you want others to reuse your data.
Other licences and waivers exist, such as the Open Data Commons licences for data, GNU software and documentation licences, or CC0 Public Domain Dedication. If you wish to use any of these, or another alternative licence for your work, or if you want any assistance in selecting a licence for your data, you should contact the Library’s Copyright team.
There are a number of ways that you can apply a licence to your dataset. In almost all cases it’s a good idea to add a rights statement in a prominent place within your dataset, as well as at the location where your data will be hosted. The statement should include the name of the licence you’ve chosen and the URL at which the full text of the licence can be found. If you use the Creative Commons licence picker, you’ll find that it creates a rights statement based on the licence you’ve chosen that you can copy to add to your dataset.
Good, prominent places to include a rights statement in your dataset are: in the top level of your folder structure, either as part of your readme text file or as a separate licence.txt plain text file; on the first page of your data dictionary; or as part of the standard template for your interview transcripts.
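As an illustration of the licence.txt option, the following is a minimal Python sketch that builds a plain-text rights statement and writes it to the top level of a dataset folder. The function names, the example title and creator, and the wording of the statement are assumptions made for this example. The Creative Commons licence picker produces its own wording, which you may prefer to copy instead.

```python
import tempfile
from pathlib import Path


def build_rights_statement(title: str, creator: str,
                           licence_name: str, licence_url: str) -> str:
    """Return a plain-text rights statement naming the licence and its URL."""
    # The wording here is illustrative only, not an official template.
    return (
        f"{title} by {creator} is licensed under the {licence_name}.\n"
        f"Full licence text: {licence_url}\n"
    )


def write_licence_file(dataset_dir: Path, statement: str) -> Path:
    """Write the statement to licence.txt in the top level of the dataset folder."""
    dataset_dir.mkdir(parents=True, exist_ok=True)
    licence_path = dataset_dir / "licence.txt"
    licence_path.write_text(statement, encoding="utf-8")
    return licence_path


if __name__ == "__main__":
    statement = build_rights_statement(
        title="Example interview dataset",          # hypothetical title
        creator="A. Researcher",                    # hypothetical creator
        licence_name="Creative Commons Attribution 4.0 International licence (CC BY 4.0)",
        licence_url="https://creativecommons.org/licenses/by/4.0/",
    )
    # A temporary folder stands in for the real dataset folder.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_licence_file(Path(tmp) / "example_dataset", statement)
        print(path.read_text(encoding="utf-8"))
```

The same statement text can be pasted into a readme file or data dictionary, so that the licence name and URL appear identically wherever they are shown.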
If you’re publishing your data via a repository or journal, you will usually be asked to specify licence information when you submit your data. This allows the repository or journal to display a rights statement on your dataset’s record. The rights statement is usually published in a machine-readable form as well, making your data more discoverable and reusable.
If you are publishing your data on a website, you should make sure to include the rights statement on the page where people will be able to download your dataset. The Creative Commons licence picker also creates the HTML code for the rights statement of your chosen licence, allowing you to easily add the statement and licence icon to your website.
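To give a sense of what such an HTML rights statement involves, here is a minimal Python sketch that generates one. It is not the exact output of the Creative Commons licence picker: the function name and the wording are assumptions for this example, and no licence icon is included. The rel="license" attribute on the link is a standard HTML way of marking a link as pointing to the licence for the page's content.

```python
from html import escape


def rights_statement_html(title: str, creator: str,
                          licence_name: str, licence_url: str) -> str:
    """Return an HTML paragraph linking to the chosen licence."""
    # escape() guards against characters such as & or < in titles and names.
    return (
        f"<p>{escape(title)} by {escape(creator)} is licensed under "
        f'<a rel="license" href="{escape(licence_url)}">'
        f"{escape(licence_name)}</a>.</p>"
    )


if __name__ == "__main__":
    print(rights_statement_html(
        title="Example interview dataset",   # hypothetical title
        creator="A. Researcher",             # hypothetical creator
        licence_name="CC BY 4.0",
        licence_url="https://creativecommons.org/licenses/by/4.0/",
    ))
```

The resulting paragraph would sit on the download page next to the link to the dataset itself.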
Releasing data under a licence that allows for reuse means that anyone with a copy of your data can continue to make use of and distribute the data under the original conditions of the licence. This can be a problem if you later decide that you want to restrict certain reuse activities that were allowed under the original licence, or if you no longer wish to allow your data to be reused. You can apply new, more restrictive licence conditions to a new version of your data, but any copies distributed under the original licence will still be usable under the original licence conditions.
Changing to a more permissive licence for your data is less problematic, as you will then be publishing a version of your data that allows additional types of reuse beyond what was permitted under the licence you originally selected. For example, you could publish a dataset under a Creative Commons Attribution Non-Commercial licence (CC BY-NC 4.0), and then later decide that you are happy for people to use the data commercially. You could then change the licence to a Creative Commons Attribution licence (CC BY 4.0). Anyone who wanted to use the data for commercial purposes would then be able to use the new version of the data, and all previously allowed uses of the data would still be permitted.
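The difference between moving to a more permissive or a more restrictive licence can be sketched in a few lines of Python. This is a deliberate simplification: it treats each of the six Creative Commons 4.0 licences as nothing more than the set of conditions in its name (BY, NC, ND, SA) and calls a change "more permissive" when the new licence drops conditions without adding any. The table and function names are assumptions for this example, and the model does not capture the full legal terms of the licences.

```python
# Conditions attached to each of the six Creative Commons 4.0 licences.
# BY = attribution, NC = non-commercial, ND = no derivatives, SA = share-alike.
CONDITIONS = {
    "CC BY 4.0": frozenset({"BY"}),
    "CC BY-SA 4.0": frozenset({"BY", "SA"}),
    "CC BY-NC 4.0": frozenset({"BY", "NC"}),
    "CC BY-ND 4.0": frozenset({"BY", "ND"}),
    "CC BY-NC-SA 4.0": frozenset({"BY", "NC", "SA"}),
    "CC BY-NC-ND 4.0": frozenset({"BY", "NC", "ND"}),
}


def is_more_permissive(new_licence: str, old_licence: str) -> bool:
    """True if the new licence only removes conditions from the old one."""
    # "<" on sets is the strict-subset test.
    return CONDITIONS[new_licence] < CONDITIONS[old_licence]


if __name__ == "__main__":
    # The example from the text: CC BY-NC 4.0 changed to CC BY 4.0.
    print(is_more_permissive("CC BY 4.0", "CC BY-NC 4.0"))
    # The reverse change adds a condition, so it is not more permissive.
    print(is_more_permissive("CC BY-NC 4.0", "CC BY 4.0"))
```

In this model, a change that swaps one condition for another (for example CC BY-SA 4.0 to CC BY-NC 4.0) is neither more nor less permissive, which is a useful prompt to seek advice before making it.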