Learning under Non-stationary Distributions
The sheer volume and ubiquity of data in the information age demand increasingly effective technologies for data analysis. Most online data sources are non-stationary: factors bearing on their composition change over time, as do relations among those factors. But nearly all machine-learning algorithms assume invariance. We are conducting a comprehensive investigation of emerging technologies for learning from non-stationary distributions, guided by the insight that subgroups change in different ways, at different times, and at different speeds.
Outcomes will include robust, tested, and reliable data analytics for non-stationary data, enabling far more efficient use of big data across countless real-world applications.
Learning from Large or Complex Data
Effective extraction of information from massive and complex data stores is increasingly problematic as data quantities continue to grow rapidly and data become more heterogeneous. Quite simply, effective techniques for learning from small and simple datasets do not scale to either size or complexity. However, the problem is even worse than this. Big data contain more information than the small datasets on which most state-of-the-art learning algorithms were developed. For small data, overly detailed classifiers will overfit the data and so should be avoided. In contrast, big data provide fine detail that new types of learner can capture and exploit. We are creating novel learners that are not only capable of capturing this detail, but do so with the efficiency required to process terabytes of data.
Bayesian Non-parametric Methods
Non-parametric methods allow machine-learning approaches to adapt their model complexity to the data and to work with arbitrary structures and hierarchies; Bayesian techniques for doing so have been developed extensively within statistics. We are adapting techniques from the Bayesian non-parametric community in statistics to the computationally demanding and representationally rich tasks explored in machine learning.
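One standard non-parametric building block is the Chinese Restaurant Process, under which the number of clusters grows with the data rather than being fixed in advance. The following is only an illustrative sketch (the function name and toy setup are ours, not any particular project's code):

```python
import random

def crp_partition(n, alpha, seed=0):
    """Sample a partition of n items from a Chinese Restaurant Process.

    The number of clusters is not fixed in advance: item i joins an
    existing cluster with probability proportional to its size, or
    starts a new cluster with probability proportional to alpha.
    """
    rng = random.Random(seed)
    sizes = []       # sizes[k] = number of items in cluster k
    assignment = []  # cluster index for each item
    for i in range(n):
        # total weight is i + alpha: existing sizes plus alpha for a new cluster
        r = rng.random() * (i + alpha)
        acc, choice = 0.0, len(sizes)
        for k, w in enumerate(sizes + [alpha]):
            acc += w
            if r < acc:
                choice = k
                break
        if choice == len(sizes):
            sizes.append(1)  # open a new cluster
        else:
            sizes[choice] += 1
        assignment.append(choice)
    return assignment

print(crp_partition(10, alpha=1.0))
```

Larger `alpha` yields more clusters on average; as `alpha` shrinks, the process concentrates the data into fewer clusters.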
Information Theoretic Approaches to Data Analysis
Monash was home to Chris Wallace (1933-2004), the originator of the Minimum Message Length (MML) framework, and remains the world centre of MML research.
MML is a computational method for operationalizing the scientific principle of Occam's Razor. More specifically, it has statistical invariance and statistical consistency properties which guarantee that it is not affected by the framing of a problem and that, as the amount of data increases, it will accurately quantify the noise and converge arbitrarily closely to any true underlying model. The relationship between MML and (two-part) Kolmogorov complexity also means that it can model any arbitrarily general computable problem, i.e., it can be applied to all machine learning problems.
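The two-part message idea can be made concrete with a toy model-selection problem: first encode a hypothesis, then encode the data given that hypothesis, and prefer whichever total message is shortest. The sketch below compares a fixed fair-coin model against a fitted coin whose bias must itself be transmitted. It is only an informal illustration with hypothetical function names; in particular, the uniform parameter grid is a simplification of MML's optimal parameter quantisation:

```python
import math

def data_codelength(heads, tails, p):
    """Codelength (in bits) of the observed flips under Bernoulli(p)."""
    return -(heads * math.log2(p) + tails * math.log2(1 - p))

def two_part_length(heads, tails, precision_bits=6):
    """Two-part message: state p to `precision_bits` bits of precision,
    then encode the data using the stated p."""
    n = heads + tails
    grid = 2 ** precision_bits
    # quantise the fitted bias onto the grid, staying strictly inside (0, 1)
    p = max(1, round(heads / n * grid)) / grid
    p = min(p, (grid - 1) / grid)
    return precision_bits + data_codelength(heads, tails, p)

heads, tails = 80, 20
fair = data_codelength(heads, tails, 0.5)  # one part: fair coin assumed, nothing to state
fitted = two_part_length(heads, tails)     # two parts: bias stated, then data
print(f"fair coin: {fair:.1f} bits, fitted coin: {fitted:.1f} bits")
```

For this skewed sample the fitted coin wins despite paying the extra bits to state its parameter; for a near-balanced sample the parameter's cost would outweigh its benefit, which is Occam's Razor operating automatically.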
Specific problems of interest include developing and refining the statistical theory behind MML, further developing and refining theories of intelligence, and applications to data from fields as diverse as infectious disease and sports.
Bayesian Approaches to Learning and Knowledge Discovery
Bayesian statistics provide a powerful framework for acquiring knowledge from data. This flagship theme delivers techniques for knowledge discovery and automated data analysis. Our research into Bayesian classification learning develops methods for analysing historical data in order to make effective decisions about future situations. A particular focus is learning from big data, for which we are developing highly efficient (linear-time) algorithms that exploit the very detailed information about multivariate interactions that big data embody, with the promise of learning algorithms of unsurpassed efficiency and accuracy.
Bayesian networks have become important representations for both prediction and modelling. Naïve Bayesian networks and their semi-naïve variants (such as tree-augmented networks) are widely used in data-mining applications and research. Finding more efficient ways of learning them, and finding more expressive variations of them to learn, engages top researchers around the world.
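To make the naïve Bayes idea concrete, here is a minimal illustrative classifier for discrete features with Laplace smoothing. The toy dataset and function names are our own for illustration, not any particular system described here; note that training is a single counting pass, which is what makes these learners attractive at big-data scale:

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(X, y):
    """One pass over the data: class counts plus per-class, per-feature
    value counts. Assumes discrete feature values."""
    class_counts = Counter(y)
    # feat_counts[c][j][v] = count of value v for feature j within class c
    feat_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # observed value set for each feature
    for xs, c in zip(X, y):
        for j, v in enumerate(xs):
            feat_counts[c][j][v] += 1
            values[j].add(v)
    return class_counts, feat_counts, values

def predict(model, xs):
    """Pick the class maximising log P(c) + sum_j log P(x_j | c),
    using Laplace (add-one) smoothing for unseen value/class pairs."""
    class_counts, feat_counts, values = model
    n = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for c, cc in class_counts.items():
        lp = math.log(cc / n)  # log prior
        for j, v in enumerate(xs):
            num = feat_counts[c][j][v] + 1
            den = cc + len(values[j] | {v})
            lp += math.log(num / den)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

X = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
y = ["no", "no", "yes", "yes"]
model = train_naive_bayes(X, y)
print(predict(model, ("rain", "mild")))  # → yes
```

Tree-augmented variants relax the independence assumption by allowing each feature one additional feature parent, at the cost of learning that tree structure.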
More general Bayesian networks, and causal Bayesian networks in particular, are becoming ever more widely employed for understanding and planning with complex systems, as in environmental management, medicine and public policy. Causal discovery via MML (CaMML) provides methods for automating the learning of causal Bayesian networks from observational and experimental data using Bayesian inference.