Access to a safe and secure Natural Language Processing (NLP) framework is a game changer for research outcomes
NCHA Healthy Ageing Data Platform leading the way in AI and NLP to advance research outcomes in the healthcare environment.
The National Centre for Healthy Ageing (NCHA) Data Platform, led by Associate Professor Richard Beare and Associate Professor Nadine Andrew, is making ground-breaking progress in creating a Natural Language Processing (NLP) framework allowing researchers to access information that may not otherwise be accessible, especially information that is either poorly coded in standard datasets or not coded at all.
We have already used NLP to show its value in identifying dementia, a condition that is not readily picked up clinically, from routine electronic health data, and are now extending our work to other health conditions associated with ageing, such as frailty and delirium. We are also using it to find information on other factors relevant to health. “An example of this might be housing or social connections, which are currently not well recorded in structured fields. Is someone homeless? Do they live alone? Do they have help at home? Do they smoke? Can we apply some kind of AI tool that lets us check thousands of records and return that level of detail?” says A/Prof Beare. “Our aim is to create a framework that will allow us to rapidly create and validate automated tools for extraction of project-specific items from unstructured text data.”
What is Natural Language Processing (NLP)?
As A/Prof Richard Beare explains, “Natural Language Processing (NLP) is a family of computational tools for automated analysis of narrative or unstructured text, giving computers the ability to interpret, manipulate, and comprehend human language. In essence, it’s a way to unlock critical information within free-text fields that wouldn’t otherwise be accessible to researchers.”

A/Prof Richard Beare, A/Prof Nadine Andrew
“We are particularly excited by our work at the Healthy Ageing Data Platform. Having the ability to extract concepts not contained in structured data fields, at scale, will allow us to develop more comprehensive statistical models by controlling for a greater breadth of confounders or explanatory factors. This will increase our ability to draw more accurate insights from the data to improve and better understand health and social outcomes.”
Modern AI tools for processing unstructured data, such as large language models like ChatGPT, have received high levels of investment, publicity and hype in the last two years. Delivering improvements in healthcare using this technology is extremely difficult, given the lack of transparency about the data used to train the models and the sensitivity of patient data. The AI component of the NCHA Healthy Ageing Data Platform allows the team to experiment with large language models in a safe and secure way for research applications.
The Data Platform team is confident about the future use of this technology to advance research outcomes, with A/Prof Beare adding, “This will build experience using AI tools in the healthcare environment, assisting with the development and deployment of future AI tools that address operational and clinical problems.”
Examples of NLP in health research – 2025 NHMRC Ideas Grant funding
NLP helping Government make informed decisions around aged care policies
A/Prof Nadine Andrew was recently awarded a National Health and Medical Research Council (NHMRC) Ideas Grant to lead work on ‘Novel methods to enhance the use of routinely collected linked data for the evaluation of government policies related to healthy ageing’.
The new Aged Care Act, starting 1 July 2025, has redefined aged care as support and care to help people maintain independence as they age, with a greater emphasis on in-home care programmes and enhancement of primary healthcare.
A raft of new policies is expected in the wake of this change. However, policy makers do not currently have access to robust evidence of the effectiveness of past and current in-home care policies, making it impossible to reliably forecast the impact of future policies.
Recent advances in data linkage, causal inference methods (e.g. target trial emulation) and the application of artificial intelligence (AI) to electronic health records have the potential to provide new and more robust approaches to evaluating government-funded policies captured in administrative datasets. This is where A/Prof Nadine Andrew and her team at the NCHA Healthy Ageing Data Platform are making great progress.
“This project represents the first application of NLP for the extraction of important variables from clinical text, supported by input from experts and consumers, for incorporation into linked data models. Utilising AI in this way will allow us to draw stronger causal inferences from the data than was previously achievable,” explains A/Prof Nadine Andrew.
"While randomised controlled trials (RCTs) offer the best evidence for policy decisions, they’re not always feasible. Target trial emulation using existing electronic health record (EHR) data can approximate RCTs, with NLP automating the extraction of key variables from EHRs to create richer, higher-quality datasets. Automation is crucial for achieving the necessary sample size cost-effectively."
Learn more about the NCHA Healthy Ageing Data Platform: https://www.monash.edu/medicine/national-centre-for-healthy-ageing/data-platform
Or contact us at: ncha-enquiries@monash.edu
Using NLP to access important free-text documentation in Electronic Health Records
Dr Ting Xia and Dr Tina Lam at the Monash Addiction Research Centre (Monash University) were also recently awarded a National Health and Medical Research Council (NHMRC) Ideas Grant to lead work on 'Transforming opioid poisoning surveillance through novel technologies'. The NCHA Healthy Ageing Data Platform will play an integral role in this work by using Natural Language Processing (NLP) to enable better coding of opioid involvement in ED cases.
Most opioid-related harms, like overdoses, are handled in emergency departments (EDs) without patients being admitted to the hospital. Despite being a rich resource, ED data remain underutilised, in part due to the quality of the routinely collected structured data. For example, condition codes (ICD-10 codes) are often filled out incorrectly or not at all, and about half of all ED opioid poisoning cases are not correctly documented. However, a lot of important details are recorded in the free-text sections of Electronic Health Records. In cases of opioid poisoning, clinicians often write down specifics about the substances involved (e.g., “took Panadeine Forte,” “took methadone tablets,” or “intentional OD”).
NLP is a cost-effective way to tap into this unstructured text without adding extra work for clinicians or changing current practices. NLP algorithms can pull out and code the details in free text, making the data more accurate and detailed. This innovative approach to ED data opens the door to entirely new questions about policy performance, providing valuable insights to guide future policy decisions.
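To make the idea of coding free text concrete, here is a minimal sketch of a rule-based matcher applied to notes like the examples quoted above. It is an illustration only and not the method used by the project team: the term list, the function name code_note and the output fields are all hypothetical, and a real system would need a validated vocabulary and handling of negation, misspellings and context.

```python
import re

# Hypothetical term list for illustration only; not a clinical vocabulary.
# Maps a phrase seen in notes to the opioid it implies.
OPIOID_TERMS = {
    "panadeine forte": "codeine",
    "methadone": "methadone",
    "oxycodone": "oxycodone",
    "morphine": "morphine",
}

# Matches "intentional OD" or "intentional overdose", case-insensitively.
INTENT_PATTERN = re.compile(r"\bintentional\s+(?:od|overdose)\b", re.IGNORECASE)


def code_note(note: str) -> dict:
    """Return simple structured flags extracted from one free-text note."""
    text = note.lower()
    substances = sorted(
        {
            drug
            for term, drug in OPIOID_TERMS.items()
            if re.search(r"\b" + re.escape(term) + r"\b", text)
        }
    )
    return {
        "opioid_involved": bool(substances),
        "substances": substances,
        "intentional": bool(INTENT_PATTERN.search(note)),
    }


if __name__ == "__main__":
    # Made-up example notes, modelled on the phrases quoted in the article.
    notes = [
        "Pt took Panadeine Forte, intentional OD",
        "took methadone tablets",
        "fall at home, no medications",
    ]
    for note in notes:
        print(note, "->", code_note(note))
```

A matcher this simple would wrongly flag a note such as "denies taking methadone", which is one reason automated tools of this kind need to be validated against manually reviewed records before use.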
Read more: NHMRC funding for innovative AI tech that will transform hospital data coding
Find out more about MARC research on opioid prescribing in Australia and its policy impact: Enabling evidence-informed policy to address Australia's opioid crisis
References: Peninsula Health Research Report 2024, Artificial Intelligence enabled Healthy Ageing Data Platform supporting research, pages 18-19, https://www.peninsulahealth.org.au/wp-content/uploads/PH-Research-Report-2024-Digital.pdf
Learn more about the National Centre for Healthy Ageing (NCHA) here: ncha.org.au