People are increasingly making the datasets they have created and compiled available for others to use. Finding the data most useful to you can be a challenge, because people release their data in many different ways. This guide runs through the three steps of conducting an efficient search for data:
Step 1 - Identify your dataset needs
Step 2 - Search
Step 3 - Assess what you find
If you want any advice or assistance with your search, please contact researchdatasupport@sydney.edu.au or your Academic Liaison Librarian.
Make your search for data as efficient as possible by figuring out a few things before you get started. Think about and answer the following questions:
Knowing the answers to these questions will help you perform an effective search by identifying the keywords, filter conditions and data sources that are most appropriate for the data that you want to find.
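As a rough illustration of how those answers can be written down before searching, here is a minimal Python sketch. The class name `DatasetNeeds`, its fields, and the example values are all hypothetical and are not part of this guide; the sketch only builds a search string and does not contact any search service.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetNeeds:
    """Hypothetical record of what you decided before searching."""
    keywords: list[str]
    filters: dict[str, str] = field(default_factory=dict)
    sources: list[str] = field(default_factory=list)

    def query(self) -> str:
        """Join the keywords into one search string, quoting multi-word phrases."""
        return " ".join(f'"{k}"' if " " in k else k for k in self.keywords)


# Example values are made up for illustration only.
needs = DatasetNeeds(
    keywords=["rainfall", "New South Wales"],
    filters={"format": "CSV", "licence": "open"},
    sources=["Google Dataset Search", "data repositories and archives"],
)
print(needs.query())
```

Keeping the keywords, filter conditions and data sources together in one place makes it easier to reuse the same plan across each search strategy and to check a candidate dataset against it later.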
There are a number of different strategies you can use to search across the various places that people make data available. The strategies are ordered from the simplest to those that require more effort or may turn up less relevant material.
Strategy 1 (Simplest) - Google Dataset Search
Strategy 2 - Data repositories and archives
Strategy 3 - Library and other University-subscribed data sources
Strategy 4 - General internet search
Once you've found a dataset that might be useful, assess it for relevance and quality so that you don't waste time trying to analyse data that doesn't meet your needs.
Use the metadata associated with the dataset to make sure that it meets all of the criteria that you established before you started searching. Double check that:
Look for readme files or other documentation that describes the dataset. To be understandable and usable, a dataset must include:
After reading the documentation, you should be able to understand what information is contained in the dataset and what can and cannot be done with the data.
Consider the trustworthiness of the data and the data source. Ask yourself: