What is OpenRefine?
OpenRefine is a standalone, open-source desktop application that enables the collection, manipulation, transformation and standardisation of incomplete or inconsistent data without affecting the data’s original structure. Although it looks like a spreadsheet, and works with standard spreadsheet file formats, it operates like a database and can be used to quickly sort, arrange and contextualise data.
What is OpenRefine useful for?
OpenRefine offers a wide variety of functions to collect, normalise and transform data, including:
- standardising inconsistencies such as date formats or misspelled values
- resolving or removing duplicates within columns
- splitting or combining cell values
- fetching and combining data from multiple sources, including spreadsheets and Web services
- working with large datasets
OpenRefine is also used for exploration and discovery within datasets, using facets and filters based on textual, numeric or date information in the data. Transformed datasets can be exported, and the actions taken on one dataset can be saved and reapplied to other datasets.
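To make these operations concrete, here is a minimal Python sketch of the kind of clean-up described above: standardising values and date formats, splitting a cell, removing duplicates, and counting values the way a text facet would. It is an illustration only and is not how OpenRefine works internally. The sample rows, the column names and the helper functions `normalise_date` and `clean` are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows, invented for illustration only.
rows = [
    {"name": "Alice Smith", "city": "Melbourne ", "joined": "2021-03-05"},
    {"name": "Bob Jones", "city": "melbourne", "joined": "05/03/2021"},
    {"name": "Alice Smith", "city": "Melbourne", "joined": "5.3.2021"},
]

# Date layouts we assume may appear in the data (day-first where ambiguous).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def normalise_date(text):
    """Return an ISO 8601 date string, or the input unchanged if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return text


def clean(row):
    """Split the name cell, tidy the city value and standardise the date."""
    first, _, last = row["name"].partition(" ")
    return {
        "first_name": first,
        "last_name": last,
        "city": row["city"].strip().title(),
        "joined": normalise_date(row["joined"]),
    }


cleaned = [clean(row) for row in rows]

# Remove exact duplicate rows while keeping the original order.
seen = set()
unique = []
for row in cleaned:
    key = tuple(row.items())
    if key not in seen:
        seen.add(key)
        unique.append(row)

# Count distinct values in one column, loosely analogous to a text facet.
city_counts = Counter(row["city"] for row in unique)

for row in unique:
    print(row)
print(city_counts)
```

In OpenRefine itself, equivalent steps are carried out through the interface (facets, clustering and column transforms) rather than by writing a script like this one, and the recorded steps can be reapplied to other datasets.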
Where can I get it?
OpenRefine is available for free from http://openrefine.org/
OpenRefine events run regularly throughout the year and include the following workshops:
- Introduction to OpenRefine
Registering for a workshop
Workshop registration is available through myDevelopment. This aligns with myPlan and training records for staff, and enables more streamlined processes for assigning credit towards the Monash Doctoral Program for Graduate Research students.
To search for upcoming Data Fluency workshops, log in to myDevelopment from your myMonash portal. Type "Data Fluency" or the software you are interested in (e.g. "python") into the search box to find a list of available workshops.
No access to myDevelopment?
If you do not have access to myDevelopment, please complete the relevant form:
- External staff and users (without Monash ID): requires Monash approver name and email address.
- Monash students (non-Graduate Research): requires current Monash student ID.
Staff Development will validate and approve access. This process will take one business day to complete. An email will be sent to users confirming access details.
Where else to find training
You can find more online training materials for this tool via the Library. Visit LinkedIn Learning or Safari to access a range of videos, eBooks and online courses, or try using Library Search to find other resources to help you master this tool.
If you're still not sure where to start, use the details below to get in touch with the Data Fluency community.