The power of R: Professor Dianne Cook elected to esteemed R Foundation
We spoke to Professor Dianne Cook about R, the language of data analysis.
R is in use all around us, although many people are completely unaware of it. But if you have a bank account, read a newspaper or use social media, you have certainly come into contact with it.
R is the language of data analysis: an open source programming language for tasks such as statistical modelling and data analytics. It is the basis of popular tools used by statisticians and data miners around the world to make sense of data, and it is used in many organisations, including giants such as ANZ, Google, Facebook and Twitter.
“R makes data analysis accessible to everyone,” says Professor Dianne Cook from Monash Business School’s Department of Econometrics and Business Statistics.
“It is the most powerful software available for data science today and it is free. Anyone with a laptop, some math skills and a willingness to script code in R can take data from many sources, plot it and model it to understand what is going on around them.”
Professor Cook has recently been elected to the inner circle of the global R Foundation. This body is charged with supporting the R project and with holding and administering the copyright of the R software and documentation. She is the second Australian, after founding member Bill Venables, and the third woman globally to be elected.
New ordinary members are selected based on their non-monetary contributions (such as code and effort) to the R project. Professor Cook is one of 37 invitation-only members, and will participate in key decision-making around the code base.
Begun in the 1990s by two New Zealand researchers to teach first-year statistics, the R project was adopted by an international team within a few years and developed into a collaborative open source project. This means it encourages people across the globe to create, develop and upload new packages for different aspects of data handling, management, visualisation and modelling.
The number of contributed packages has grown enormously. In March this year, around 7000 packages were available; by November the number had grown to more than 9000. R is now the dominant software used globally for data analysis. Another open source project, Python, has developed rapidly in recent years and has been adapted to data analysis partly by using the model provided by the R project.
The R code base is regularly updated to fix bugs and accommodate new technological advances.
“I use R to pull data from public sources and use its analytical tools to better understand our world,” says Professor Cook. She adds that “developing R skills among the students helps to give them a competitive edge in the job market.”
Professor Cook has recently created a package called Eechidna (Exploring Election and Census Highly Informative Data Nationally for Australia), which pulls data from the 2013 Australian Federal Election and 2011 Australian Census to allow demographic mapping of the Australian electorate. Functions in the package can be used to pull the most recent election and Census data, and future changes.
Many R packages are now used by large corporations such as Google, Facebook and Twitter as well as government departments. One of the Department of Econometrics and Business Statistics' PhD students is currently working with Australian Border Security and Control to report to importers which of their suppliers are regularly providing problematic shipments.
Another project Professor Cook is working on is based on data from the City of Melbourne. The project analyses data on Melbourne pedestrian traffic to show which locations are used predominantly by commuters rather than shoppers. It also reveals the impact of Christmas and events such as White Night on pedestrian traffic. The research is designed to illustrate how the city can develop commercial infrastructure to better support the community.
Professor Cook has also been involved in the recent establishment of R Ladies Melbourne, a spin-off of R Ladies Global, which is designed to promote women’s involvement in the R project.
“The R community is still very male dominated,” says Professor Cook. “Now the number of women who have contributed packages to the R open source is at 15%. We now have over 200 members of our R Ladies Melbourne group.”
By Elizabeth Byrne