Application areas

Turning machine learning principles and theories into technology that is useful in real life is challenging. Our faculty hosts world-class researchers in applied machine learning, who bridge the gap between theory and practice. In a nutshell, our applied machine learning researchers are working on problems in the following areas:

Computational Biology

Many devices used in biological research, such as sequencing machines and X-ray crystallography equipment, generate large quantities of data. The need to interpret this data effectively has given rise to a highly interdisciplinary research area called computational biology. Our researchers apply machine learning techniques to various problems in this area, including protein structural biology, RNA and RNA-RNA secondary structure prediction, and computational cancer genomics.

Information Retrieval (IR) and Web Mining

The Web is an enormous source of data that can be mined for many different purposes. Examples of our work in this area include:

  • technology for personalising the retrieval of web content using hierarchical Bayesian models;
  • the analysis of social network data (such as Twitter feeds) for identifying trends and making predictions; and
  • improving the recommendation of content (such as movies and music) to users based on their individual interests.

Medical Informatics

Analysis of medical records. Researchers from the Faculty of Information Technology (FIT) and the Alfred Hospital are applying advanced data analytics to administrative and clinical data to augment administrative records. Currently, much clinical information, such as medical diagnoses, is not captured, making it impossible to determine the rates of different medical conditions. Advanced data analytics make it possible to infer diagnoses from clinical treatment, allowing records to be enriched with this missing information. To this end, we are developing new technologies and approaches to the data enrichment problem.

Surgical process modelling. More than half a million surgeries are performed every day worldwide, which makes surgery one of the most important components of global health care. Competing demands are motivating a better understanding of surgical processes: the number of patients is continuously growing, surgical procedures are becoming more complex, residents now have to be trained while performing fewer procedures, surgical interventions have to be increasingly well justified, and procedures have to cost less. A better understanding of surgical practice is key to addressing these issues, and identifying the characteristics of surgical procedures and analysing surgical behaviours is critical to improving health care. We are working on this topic in collaboration with the French Institute for Health and Medical Research (INSERM), in order to provide support and automated decision-making pre-, peri- and post-surgery.

Monitoring and assistance systems

These systems help elderly people remain safely in their homes: the system learns a resident's patterns of behaviour from sensor observations and issues alerts to the resident or to off-site carers.
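As a rough illustration of the learn-then-alert idea, the sketch below builds a per-hour activity profile from past sensor counts and flags observations that deviate strongly from it. The function names, the hourly activity counts, and the z-score threshold are all assumptions made for this example; they do not describe the actual system.

```python
from statistics import mean, stdev

def learn_hourly_profile(history):
    """Learn a simple profile of normal behaviour.

    `history` (hypothetical format) maps an hour of day (0-23) to a list of
    activity counts observed at that hour on previous days. Each list needs
    at least two values so that a standard deviation can be computed.
    """
    return {hour: (mean(counts), stdev(counts)) for hour, counts in history.items()}

def should_alert(profile, hour, observed, threshold=3.0):
    """Return True if `observed` is unusually far from the learned profile.

    The z-score rule and the default threshold are illustrative assumptions.
    """
    mu, sigma = profile[hour]
    if sigma == 0:
        # No variation was seen in the past, so any change counts as unusual.
        return observed != mu
    return abs(observed - mu) / sigma > threshold

# Hypothetical data: motion-sensor counts at 8 am over five days.
profile = learn_hourly_profile({8: [10, 12, 11, 9, 13]})
quiet_morning = should_alert(profile, 8, 0)    # no activity at all
usual_morning = should_alert(profile, 8, 11)   # typical activity
```

A deployed system would need a far richer model of behaviour; the point here is only the separation between learning a profile and testing new observations against it.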

Natural Language Processing (NLP) and Text Mining

Language technology is the enabling component of many text-based computerised technologies. In particular, we are working on:

  • core NLP problems and enabling technologies, e.g. computational semantics and statistical parsing;
  • machine translation, where a computer program 'automatically' learns how to translate from one language to another based on example translations;
  • large-scale probabilistic models (e.g. graphical models and non-parametric Bayesian models) for text processing;
  • cross-lingual sentiment analysis and document classification; and
  • dialogue systems: this includes computational semantics and speech processing.

Time-Series Analysis/Classification

Many kinds of data describe temporal processes. These processes may be either stationary, with statistical properties that do not change over time (in the simplest case the data are independently and identically distributed, i.i.d.), or non-stationary, with the processes generating the data themselves changing over time. We are interested in both kinds of time series and in learning models of them from data; some of the early Minimum Message Length (MML) work was on time series. Many kinds of models may be learned to represent these processes. Our research includes:

  • inference of segmented models, i.e. cut points for a sequence of models, for binomial and other kinds of data;
  • inference of dynamic Bayesian networks, with repeating static networks (stationary time series) and with varying static networks (non-stationary time series);
  • classification of time series; and
  • analysis of spatio-temporal series, such as series of images (in particular those sensed from satellites).
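To make the idea of a cut point concrete, the sketch below finds the single best place to split a binary sequence into two segments, each with its own success probability. It uses plain maximum likelihood rather than MML, and the function names are hypothetical; it is a minimal illustration under those assumptions, not the method used in our research.

```python
import math

def bernoulli_log_likelihood(successes, trials):
    """Maximised log-likelihood of a segment with `successes` ones in `trials` bits."""
    if trials == 0 or successes == 0 or successes == trials:
        return 0.0  # the fitted probability is 0 or 1, so the likelihood is 1
    p = successes / trials
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)

def best_cut_point(bits):
    """Return (cut, gain) for the best two-segment model of `bits`.

    `cut` is the index at which the second segment starts, and `gain` is the
    improvement in log-likelihood over modelling the whole sequence as one
    segment.
    """
    n = len(bits)
    if n < 2:
        raise ValueError("need at least two observations to place a cut")
    total = sum(bits)
    single = bernoulli_log_likelihood(total, n)
    best_cut, best_ll = None, -math.inf
    left = 0  # number of ones in bits[:cut]
    for cut in range(1, n):
        left += bits[cut - 1]
        ll = (bernoulli_log_likelihood(left, cut)
              + bernoulli_log_likelihood(total - left, n - cut))
        if ll > best_ll:
            best_cut, best_ll = cut, ll
    return best_cut, best_ll - single

# A sequence whose success rate jumps from 0 to 1 halfway through.
cut, gain = best_cut_point([0] * 10 + [1] * 10)
```

Because splitting a sequence never lowers the maximised likelihood, the gain is never negative, so maximum likelihood alone cannot decide whether a cut is justified. A criterion that also charges for the extra parameters and for stating the cut position, as MML does, is needed for that decision.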