Solving challenges in designing the NCHA Healthy Ageing Data Platform
The development, design and capabilities of the NCHA’s Healthy Ageing Data Platform were presented at international and national symposia this year, including an invited symposium that the team led at the Australian Association of Gerontology (AAG) Conference in November 2023.
The symposium, ‘The National Centre for Healthy Ageing, Healthy Ageing Data Platform: developing a data ecosystem for studying the complexities of ageing’, highlighted some gaps in current Electronic Health Record (EHR) data for research, and provided details about key activities undertaken to address these gaps including:
- data validation using aged care residency as a case study
- improving diagnosis coding using Artificial Intelligence
- data linkage for comprehensive data capture
- engaging older consumers in the use and governance of their data
- the acceptability of patient reported outcome data collection from an older person's perspective.
Attendees at the symposium were provided with insights into the strengths and limitations of using EHR data for research relevant to an ageing population and how routinely collected data can be used to improve the health and wellbeing of older people.

The NCHA Data Platform team at AAG 2023.
Image (L-R): Tanya Ravipati, Dr Emily Parker, Dr Kim Naude, Lucy Marsh, Assoc Prof Richard Beare, Assoc Prof Nadine Andrew, Dr Alison Carver.
The presentations at AAG included:
- The National Centre for Healthy Ageing data platform: using data linkage for comprehensive data capture within a geographic region - Associate Professor Nadine Andrew

- Improving diagnosis coding using Artificial Intelligence - Associate Professor Richard Beare

- Electronic Health Record data validation and optimisation, using aged care residency as a case study - Ms Tanya Ravipati

- Understanding the routine collection of EQ-5D-5L data for patients aged 60 years or older - Dr David Snowdon / Ms Lucy Marsh

- The NCHA data platform: engaging older consumers in the use and governance of their data for health research - Dr Emily Parker

- Developing a health service organisation-wide patient-reported outcome (PROM) collection system: a multi-stakeholder co-design process with older Australians - Dr Kim Naude

Why is the NCHA Healthy Ageing Data Platform unique?
Our world-leading Healthy Ageing Data Platform uses the power of large data to inform and influence national solutions for health and care related to ageing.
What makes the NCHA Data Platform unique is its linking of electronic health record (EHR) data across multiple health and care sectors in a geographical population, providing an excellent opportunity to support innovative research.
Below are summaries of the presentations delivered at the AAG Conference:
The National Centre for Healthy Ageing data platform: using data linkage for comprehensive data capture within a geographic region.
Introduction: Older people frequently have complex health problems and access multiple services, with their health data siloed between systems. We described the establishment of a linked geographic cohort within the Australian National Centre for Healthy Ageing (NCHA).
Methods: The cohort draws on Peninsula Health (NCHA partner) Electronic Health Record (EHR) systems (4 hospitals, >10 community services) within a geographic region. Data were linked for all residents aged ≥60 years to Medicare, medication dispensing, aged care and death data through the Australian Institute of Health and Welfare (AIHW), and to state-wide hospital data through the Centre for Victorian Data Linkage (CVDL). Project-specific linkages were also undertaken with General Practice (GP), environmental and local council data.
Results: Data for >800,000 patients collected over a 10-year period have been curated within the Platform’s research data warehouse from 13 relevant datasets. Linked AIHW and CVDL data were obtained for the 222,057 residents aged ≥60 years (median age 72.2 years, 52.2% female, 1.8% regional residence) for the period Jan 2010-May 2021. During the study period, 91% presented to a Victorian hospital (1.6 million [M] inpatient episodes, 1.7M outpatient episodes), with >54M Medicare claims and 58M pharmacy dispensings. Environmental data such as greenery and traffic pollution have been incorporated, and linkages to local GP practices are in progress.
Conclusion: We demonstrated the feasibility of linking EHR data across a geographic region to establish a linked data Platform that is underpinning a range of research activities related to healthy ageing.
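The linkage idea in this abstract can be illustrated with a minimal sketch. It assumes a shared linkage key (`link_id`) already exists across sources. That is a simplification, since the abstract does not describe how the AIHW or CVDL match records. All field names and records below are hypothetical.

```python
# Hypothetical records; "link_id" is an assumed shared linkage key.
hospital = [
    {"link_id": "p1", "age": 72, "episodes": 3},
    {"link_id": "p2", "age": 58, "episodes": 1},
    {"link_id": "p3", "age": 81, "episodes": 5},
]
medicare = [
    {"link_id": "p1", "claims": 40},
    {"link_id": "p3", "claims": 12},
]
dispensing = [
    {"link_id": "p1", "scripts": 25},
]


def link_cohort(base, sources, min_age=60):
    """Attach rows from each named source to base records aged >= min_age."""
    cohort = {r["link_id"]: dict(r) for r in base if r["age"] >= min_age}
    for name, rows in sources.items():
        for row in rows:
            person = cohort.get(row["link_id"])
            if person is not None:  # ignore rows for people outside the cohort
                payload = {k: v for k, v in row.items() if k != "link_id"}
                person.setdefault(name, []).append(payload)
    return cohort


linked = link_cohort(hospital, {"medicare": medicare, "dispensing": dispensing})
```

In this toy data, `p2` is excluded by the age threshold, and `p3` has Medicare rows but no dispensing rows, mirroring how coverage can differ between linked sources.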
Improving diagnosis coding using artificial intelligence
Background: The use of health data at scale for research has traditionally been dependent on coded administrative datasets. The limitations of these datasets stem from their core purpose which is to support operations or funding. We described the deployment of a natural language processing (NLP) tool within the National Centre for Healthy Ageing Data Platform and its application to research using dementia diagnosis as a case study.
Methods: The tool, CogStack, automatically annotates documents from clinical databases. The annotations can be used to derive information that is not traditionally coded in administrative data sets but is critical to research questions relating to ageing and wellbeing of older people. Examples include presence of comorbidities that are not the primary reason for admission to hospital and social factors that are important to understanding an admission context. The approach, using a combination of automated annotations produced by CogStack and statistical classifiers, has been used to detect patients with a diagnosis of dementia based on the unstructured documents in the inpatient record.
Results: Classification accuracy was tested using simple keywords, logistic regression, naïve Bayes, support vector machines, random forest, and a fine-tuned clinical BERT. The highest accuracy was 91.9 per cent using random forests.
Conclusion: We have demonstrated the feasibility of improving the accuracy of coding for problematic diagnoses like dementia. We plan to expand this application to a range of other diagnoses and social concepts (e.g. social isolation) that are known to be poorly recorded in coded datasets.
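To make the "simple keywords" baseline concrete, here is a minimal sketch of keyword matching and an accuracy calculation. It is not the CogStack pipeline or any of the classifiers reported above. The term list, the notes and the labels are all made up for illustration.

```python
import re

# Assumed keyword list; the study's actual terms are not given in the abstract.
DEMENTIA_TERMS = re.compile(r"\b(dementia|alzheimer'?s?)\b", re.IGNORECASE)


def keyword_flag(note):
    """Return True if the note mentions any of the assumed dementia terms."""
    return bool(DEMENTIA_TERMS.search(note))


def accuracy(predictions, labels):
    """Proportion of predictions that agree with the reference labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up notes paired with made-up reference labels.
notes = [
    ("Known Alzheimer's disease, lives with daughter.", True),
    ("Admitted with hip fracture after fall.", False),
    ("No evidence of dementia on cognitive screen.", False),
    ("Vascular dementia, on donepezil.", True),
]
preds = [keyword_flag(text) for text, _ in notes]
acc = accuracy(preds, [label for _, label in notes])
```

The third note is flagged even though it negates the diagnosis. Errors of this kind are one reason a keyword baseline can fall short of classifiers that use richer annotations.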
Electronic Health Record data validation and optimisation, using aged care residency as a case study
Introduction: Electronic Health Record (EHR) data collection processes for aged care residents accessing public hospital services contain inconsistencies due to multiple systems and entry points. In this study, we described the validation and optimisation process undertaken at the National Centre for Healthy Ageing (NCHA) to identify aged care residents within EHR systems.
Methods: In a cohort of individuals aged ≥60 years (January 2017-December 2022), data items that are routinely submitted to the Department of Health and commonly used to identify aged care residency were compared to a derived index. The index was created by mapping residential addresses against the Australian Geocoded National Address File (G-NAF) and then against the Australian Institute of Health and Welfare (AIHW) aged care services list. We calculated sensitivity, specificity, and Positive Predictive Values (PPVs) separately for admission and emergency data.
Results: Data for 65,367 admitted and 77,692 emergency patients aged ≥60 years from three health service datasets within the NCHA Data Platform were analysed. Based on routinely submitted items, only 13.9% of admitted and 8.5% of emergency patients were identified as aged care residents. Compared to our derived index, sensitivity was 73.3% for admission data and 91.7% for emergency data. Specificity was 74.8% for admission data and 70.9% for emergency data. The positive predictive value was high for emergency data at 85.6% and excellent for admission data at 97.6%. Use of our derived index allowed an additional 12% of residents to be captured in the cohort.
Conclusion: This case study highlights the need for optimising EHR data to ensure maximum cohort inclusion for research activities.
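For readers less familiar with the validation metrics in this abstract, the sketch below shows how sensitivity, specificity and PPV are computed when a routinely submitted flag is compared to a reference index. The function name and the toy data are illustrative only and are unrelated to the study data.

```python
def diagnostic_metrics(flagged, reference):
    """Compare a routine flag against a reference index (both lists of bools).

    Assumes each denominator is non-zero; real data would need guarding.
    """
    tp = sum(f and r for f, r in zip(flagged, reference))
    fp = sum(f and not r for f, r in zip(flagged, reference))
    fn = sum(not f and r for f, r in zip(flagged, reference))
    tn = sum(not f and not r for f, r in zip(flagged, reference))
    return {
        "sensitivity": tp / (tp + fn),  # reference residents the flag found
        "specificity": tn / (tn + fp),  # non-residents correctly not flagged
        "ppv": tp / (tp + fp),          # flagged patients who are residents
    }


# Toy data: True means "aged care resident" according to each source.
routine_flag = [True, True, True, False, False, False, False, True]
derived_index = [True, True, True, True, False, False, False, False]
metrics = diagnostic_metrics(routine_flag, derived_index)
```

Here the derived index plays the role of the reference standard, matching the phrase "compared to our derived index" in the results.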
Understanding the routine collection of EQ-5D-5L data for patients aged 60 years or older
Introduction: Patient reported outcome measures (PROMs) capture the perspectives of consumers about their current state of health and are underrepresented in routinely collected data. However, routine collection of PROMs requires patients to value these outcomes. We aimed to establish patient acceptability of completing a PROM in their routine care, and to compare acceptability between those aged ≥60 years and those <60 years.
Methods: A cross-sectional survey design was used to explore the acceptability of the EQ-5D-5L (i.e. a PROM measuring health-related quality of life) from the perspective of patients who attended community health services within a publicly-funded organisation. Likert scale items explored the level of acceptability of using the EQ-5D-5L, and open-ended questions determined how well the EQ-5D-5L reflects experience of illness. Acceptability was compared between age groups using the chi-square test, and open-ended questions were analysed using content analysis.
Results: The majority of the 304 patients (mean age: 70 years, SD: 16) agreed that the EQ-5D-5L was easy to use/understand (n=301, 99%), improved communication with their therapist (n=275, 90%), and made them feel more in control of their health (n=276, 91%). Those aged ≥60 years reported lower agreement with these items and were less likely to want to continue using the EQ-5D-5L than those aged <60 years (χ²=10.46; df=3; p=.015). These patients felt the EQ-5D-5L did not adequately capture their experience of illness related to fatigue, balance/falls, cognition, and sleep.
Conclusion: The EQ-5D-5L is acceptable for routine use in patient care but may not capture all aspects of health relevant to patients aged ≥60 years.
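The chi-square comparison reported above can be sketched as follows. The example assumes a two-row contingency table (age group by four response categories), which is one layout consistent with df=3. The abstract does not give the actual table, so the counts below are made up. The p-value helper uses the closed-form survival function for three degrees of freedom only.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Made-up counts: rows are age groups, columns are four response categories.
toy_table = [[30, 25, 10, 5], [90, 80, 40, 24]]
toy_stat, toy_df = chi_square_statistic(toy_table)

# Consistency check on the reported statistic (10.46 with df=3).
reported_p = chi2_sf_df3(10.46)
```

Evaluating `chi2_sf_df3(10.46)` gives approximately 0.015, which agrees with the p-value reported in the abstract.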
The NCHA data platform: engaging older consumers in the use and governance of their data for health research
Introduction: Electronic Health Record (EHR) data are increasingly being used for research. Older people are the greatest users of health services. However, they are rarely given the opportunity to contribute to how their health data are governed. We described how the National Centre for Healthy Ageing (NCHA) has engaged older people in the governance of their health data for research.
Methods: A review was conducted to investigate how similar data platforms engage consumers in governance processes. Health service consumers were invited to participate in two face-to-face workshops to provide opinions on their attitudes to sharing health data and on how consumers could be directly involved in the governance and use of their health data. Digital consumer-focussed solutions were also scoped to support consumer-based engagement.
Results: Five EHR-derived data platforms were identified from three countries; none specifically integrated consumers into their processes. Sixteen consumers (8 male, 8 female) aged ≥65 years participated in face-to-face workshops. Consumers expressed the need for greater transparency around health data usage, resulting in the inclusion of a consumer review process for data access requests to the platform. Partnering with the UK’s National Innovation Centre for Ageing (NICA), the NCHA is implementing the Voice™ Consumer Engagement Platform to capture the unmet needs, priorities and aspirations of our community. This digital consumer engagement platform will allow expansion of consumer engagement activities and ongoing monitoring of attitudes towards the use of health data.
Conclusion: We demonstrated the feasibility and importance of engaging older consumers about the use of their health data in research, and provided examples of how to address the challenges of engaging this community in a transparent and inclusive way.