Data Organization in Spreadsheets (recorded 31 January 2024)
This webinar is for any researcher or HDR who is about to commence a research project where data will be in a spreadsheet.
Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start.
We organize data in spreadsheets in the ways that we as humans want to work with the data, but computers require that data be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too!
In this lesson, you will learn:
Good data entry practices - formatting data tables in spreadsheets
How to avoid common formatting mistakes
Approaches for handling dates in spreadsheets
Basic quality control and data manipulation in spreadsheets
Exporting data from spreadsheets
In this lesson, however, you will NOT learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.
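As a rough illustration of two of the topics listed above (handling dates and exporting data), the sketch below stores each date both as an unambiguous ISO 8601 string and as separate year, month and day columns, then writes the table out as CSV text. This is one common approach, not necessarily the one taught in the webinar. The records and the field names (plot_id, collected, count) are hypothetical and exist only for this example.

```python
import csv
import io
from datetime import date

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"plot_id": "A1", "collected": date(2024, 1, 31), "count": 12},
    {"plot_id": "B2", "collected": date(2024, 2, 7), "count": 7},
]


def to_rows(records):
    """Expand each date into an ISO 8601 string plus year/month/day columns."""
    rows = []
    for rec in records:
        d = rec["collected"]
        rows.append({
            "plot_id": rec["plot_id"],
            "date_iso": d.isoformat(),  # e.g. 2024-01-31, unambiguous across locales
            "year": d.year,
            "month": d.month,
            "day": d.day,
            "count": rec["count"],
        })
    return rows


def to_csv_text(rows):
    """Serialize the rows as plain CSV text with a single header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = to_rows(records)
csv_text = to_csv_text(rows)
print(csv_text)
```

Keeping the date components in separate columns means the values survive a round trip through spreadsheet software that might otherwise reinterpret a date cell, and the CSV output can be read directly by R or Python.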
OpenRefine for Data (recorded 7 February 2024)
This webinar is suitable for any researcher or HDR who has collected research data in spreadsheets and is preparing to analyse the data.
Before you can analyze data, you need to clean it.
Data cleaning identifies errors and corrects formatting to create consistent data. This step must be taken with extreme care and attention because without clean data the results of analysis may be false and non-reproducible.
OpenRefine is a powerful, free and open-source tool for working with messy tabular data: cleaning it and transforming it from one format into another without changing your raw data files.
This hands-on workshop will teach you to use OpenRefine to clean and format data effectively and automatically track any changes that you make. Many people comment that this tool saves them months of work that they would otherwise spend making these edits by hand.
Examples of OpenRefine functions are:
Making the data machine readable by removing whitespace
Finding and correcting spelling mistakes by clustering and counting names in a column
Prior to this workshop, you will need to install OpenRefine. OpenRefine is a free, open-source Java application that runs locally on your computer and is used through your web browser.
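To give a feel for the two example functions listed above, here is a minimal Python sketch of the same ideas outside OpenRefine. It is not OpenRefine's own code: the fingerprint function is a simplified approximation of key-based clustering, and the function names and the sample species names are assumptions made for this example.

```python
import re
import string
from collections import Counter, defaultdict


def clean_whitespace(value):
    """Strip leading/trailing whitespace and collapse internal runs to one space."""
    return re.sub(r"\s+", " ", value).strip()


def fingerprint(value):
    """Simplified clustering key: lowercase, drop punctuation, sort unique tokens."""
    lowered = clean_whitespace(value).lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(no_punct.split())))


def cluster(values):
    """Group values sharing a fingerprint and count each distinct spelling."""
    groups = defaultdict(Counter)
    for v in values:
        groups[fingerprint(v)][clean_whitespace(v)] += 1
    # Keep only keys where more than one distinct spelling was seen.
    return {key: counts for key, counts in groups.items() if len(counts) > 1}


# Hypothetical column of names with inconsistent spacing, case and punctuation.
names = ["Acacia  dealbata", " acacia dealbata", "Acacia dealbata.", "Eucalyptus regnans"]
clusters = cluster(names)
for key, counts in clusters.items():
    print(key, dict(counts))
```

A key-based approach like this only groups variants that differ in case, spacing, punctuation or word order. Genuine misspellings (for example a doubled letter) need a distance-based comparison, which this sketch does not attempt.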
Research Data Management Planning (recorded 7 March 2024)
This 60-minute webinar is suitable for researchers or HDRs who are starting a new research project that will generate research data.
This webinar will present an overview of the considerations that should be taken into account when dealing with research data before, during and after the active research phase to ensure that the data will be secure, useable and policy compliant.
This session will cover:
Data Management Plans
Policy
Data storage and security
Sensitive data including Indigenous data principles
Documentation and metadata
Tidy data
Data retention in a repository including licences
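As a small illustration of the "Tidy data" topic listed above, the sketch below reshapes a wide table (one column per year) into a long table where each row is a single observation and each variable has its own column. The table contents and the names site, year and count are hypothetical, and this is only one way to express the idea; the webinar may present it differently.

```python
# Hypothetical wide table: one row per site, one column per survey year.
wide = [
    {"site": "A", "2022": 5, "2023": 8},
    {"site": "B", "2022": 3, "2023": 4},
]


def to_long(rows, id_field, var_name, value_name):
    """Turn every non-identifier column into its own (id, variable, value) row."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_field:
                continue  # the identifier is repeated on each output row instead
            long_rows.append({id_field: row[id_field], var_name: key, value_name: value})
    return long_rows


tidy = to_long(wide, "site", "year", "count")
for row in tidy:
    print(row)
```

In the long form, the year becomes a value in a column rather than part of a column name, which is generally easier to filter, group and plot in tools such as R or Python.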