Towards a Global Sleep Observatory: validating sleep/wake time inference from granular internet activity measurements

Project supervisors

A/Prof Simon Angus, Faculty of Business and Economics (Main Supervisor)
Prof Shantha Wilson Rajaratnam, Faculty of Medicine, Nursing and Health Sciences

PhD project abstract

Sleep is widely recognised as critical to human health, with poor sleep quality and sleep deficiency linked to major causes of death, impaired cognitive performance, diminished workplace productivity, and an increased risk of harmful accidents. [1] Economic losses in the hundreds of billions of dollars have been attributed to insufficient sleep in developed economies, including Australia. [2,3] However, obtaining accurate, timely data on sleep deficiency at the population level is costly, with current methods requiring either laborious retrospective sleep diarisation or the deployment of wearable technology, which faces challenges at scale. [4] Understandably, the sleep science community has called for a ‘broad data-collection strategy’ to ‘transform our understanding of sleep’. [5]

Building on successful proof-of-principle prior work, and establishing a key collaboration with an internationally leading sleep research program, this PhD project will make significant advances towards the establishment of a Global Sleep Observatory by undertaking fundamental research that will provide broad validation for inferring sleep duration and quality from remotely observed, granular internet activity data.

Specifically, it has been demonstrated for a universe of 81 US cities that average city-level sleep and wake times from self-reported time-use survey data can be accurately inferred with modern machine-learning methods from remotely and passively observed, intra-diurnal internet activity traces. [6] The Monash IP Observatory, housed at SoDa Laboratories in the Monash Business School [7], has the unique capability to obtain equivalent activity traces for over 1000 cities globally, making 3 billion observations of over 400 million internet-connected end-point devices daily, and researchers at SoDa Laboratories have already produced the proof-of-principle study. The Observatory is fully GDPR compliant and does not collect personally identifiable information (PII) from any end-point.

By gathering new ground-truth sleep duration and quality data from representative individuals across a variety of cities, and combining these with remotely and passively measured internet activity observations from the same individuals, the project will undertake granular statistical inference to validate internet activity measurement as an accurate proxy for sleep duration and quality. Additionally, by leveraging follow-up sub-sample studies of particular sleep pathologies and of shift workers, further models will be developed to identify these pathologies from remotely observed internet activity data. Together, these studies will micro-validate the association between granular, remote, passive internet measurement and sleep duration and quality, forming the foundation of a Global Sleep Observatory using internet observation.

[1] Medic, G., Wille, M., & Hemels, M. E. H. (2017). Short- and long-term health consequences of sleep disruption. Nature and Science of Sleep (Vol. 9, pp. 151–161).
[2] Hillman, D., Mitchell, S., Streatfeild, J., Burns, C., Bruck, D., & Pezzullo, L. (2018). The economic cost of inadequate sleep. Sleep, 41(8), 1–13.
[3] Hafner, M., Stepanek, M., Taylor, J., Troxel, W., & Stolk, C. (2017). Why sleep matters -- the economic costs of insufficient sleep: A cross-country comparative analysis (Vol. 6, Issue 4). RAND Corporation.
[4] Adams, R. J., Appleton, S. L., Taylor, A. W., Gill, T. K., Lang, C., McEvoy, R. D., & Antic, N. A. (2017). Sleep health of Australian adults in 2016: results of the 2016 Sleep Health Foundation national survey. Sleep Health, 3(1), 35–42.
[5] Roenneberg, T. (2013). Chronobiology: The human sleep project. Nature, 498(7455), 427–428.
[6] Ackermann, K., Angus, S. D., & Raschky, P. A. (2020). Estimating Sleep & Work Hours from Alternative Data by Segmented Functional Classification Analysis (SFCA).
[7] Monash IP Observatory, SoDa Laboratories, Monash Business School,

Areas of research

Population health; Sleep medicine; Big Data; Machine learning; Data-science

Project description

The PhD project will undertake the fundamental research needed to micro-validate the proof-of-principle prior work, which was conducted at the city-aggregate level, by shifting focus to the individual. In a first stage, the project will generate new, ground-truth sleep and internet activity datasets from representative samples of individuals across a variety of contexts. In a second stage, the project will apply advanced machine learning methods to refine generalisable tools for timely, accurate sleep estimation at global scale.

Participants (n=1000; healthy, with no diagnosed sleep disorders) will be recruited from 10 representative cities across the United States and Australia, using our existing partnership with Qualtrics. These participants will be required to log their bedtime, wake time, sleep latency, and subjective sleep quality measures into the Qualtrics platform each day for 2 weeks. Follow-up studies will include subsamples with common sleep disorders, such as obstructive sleep apnea and insomnia, and those with disrupted sleep, such as shift workers. At commencement of the study period, participants will provide their Internet Protocol (IP) address to the research team, enabling remote, frequent measurement of their internet activity.
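
As a rough illustration of how one night's diary entry could be turned into a sleep duration, the sketch below assumes that sleep onset is bedtime plus sleep latency and that clock times are logged as HH:MM. The function name and input formats are hypothetical and are not taken from the Qualtrics instrument or the study protocol.

```python
from datetime import datetime, timedelta


def sleep_duration_hours(bedtime: str, waketime: str, latency_min: float) -> float:
    """Estimate hours asleep for one diary night from HH:MM clock times.

    Assumption (not from the study protocol): sleep onset is taken to be
    bedtime plus the self-reported sleep latency.
    """
    bed = datetime.strptime(bedtime, "%H:%M")
    wake = datetime.strptime(waketime, "%H:%M")
    # A wake time at or before the bedtime is assumed to fall on the next day.
    if wake <= bed:
        wake += timedelta(days=1)
    onset = bed + timedelta(minutes=latency_min)
    if onset > wake:
        raise ValueError("sleep latency extends past the reported wake time")
    return (wake - onset).total_seconds() / 3600.0
```

For example, a bedtime of 23:30, a wake time of 07:00 and a latency of 20 minutes would give a little over 7 hours of sleep. Night-time awakenings, which the project also intends to infer, are ignored in this simplification.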

By combining these data, advanced machine learning models will be built on top of the proof-of-principle work to infer sleep start/stop times, including interruptions, first at the individual level, then at the city aggregate by combining the representative samples. Model validation will proceed at the city level by training on individuals in all but one city, and then inferring sleep characteristics in the unseen city. In an additional stage, the project will explore the feasibility of inferring common sleep disorders from internet patterns at the individual level, given the known associations between digital technology and sleep quality. This research will open the door to passive, remote sleep health alerts and management at the individual scale, presenting a very low-cost and scalable avenue for health professionals to better manage and treat sleep disorders.
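
The city-level validation described above (train on all but one city, test on the unseen city) can be sketched as follows. This is a minimal illustration only: the record fields `city` and `sleep_hours`, the `fit`/`predict` callables, and the error metric are all hypothetical placeholders, and the project's actual models and features are not specified here.

```python
from collections import defaultdict
from statistics import mean


def leave_one_city_out(records):
    """Yield (held_out_city, train_records, test_records) for each city in turn."""
    by_city = defaultdict(list)
    for rec in records:
        by_city[rec["city"]].append(rec)
    for city in sorted(by_city):
        # Training data is every individual from every other city.
        train = [r for c, recs in by_city.items() if c != city for r in recs]
        yield city, train, by_city[city]


def city_level_errors(records, fit, predict):
    """Absolute error of the predicted city-average sleep hours, per held-out city.

    `fit(train)` returns a model; `predict(model, record)` returns predicted
    sleep hours for one individual. Both are supplied by the caller.
    """
    errors = {}
    for city, train, test in leave_one_city_out(records):
        model = fit(train)
        predicted = mean(predict(model, r) for r in test)
        observed = mean(r["sleep_hours"] for r in test)
        errors[city] = abs(predicted - observed)
    return errors
```

Any model can be plugged in through `fit` and `predict`. A trivial baseline that always predicts the training-set mean, for instance, gives a floor against which models using internet activity features could be compared.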

The project is fundamentally interdisciplinary in nature, combining state-of-the-art knowledge and methods from the clinical sciences (sleep medicine), psychology (psychometrics and psychological assessment), internet measurement (computer science), and advanced statistical machine learning methodologies. Monash University is the only institution with the combined world-class strengths in sleep medicine, internet measurement and advanced machine learning to undertake this project.

The project fits strongly within, and ties together, two world-leading programs of research at Monash University. Prof Rajaratnam heads an internationally leading sleep research program at Monash with deep connections nationally and internationally as chair of the Australasian Sleep Association, the Sleep Health Foundation, and the Monash Sleep Network. He brings extensive experience in sleep medicine and sleep measurement, with a focus on disrupted sleep, shift work, and circadian rhythms. A/Prof Simon Angus is the co-founder and Director of the Monash IP Observatory, and Technical Director of SoDa Laboratories. He leads a number of programs that handle very large, alternative data sets, and apply them to empirical and computational social science questions, with a strong focus on modern statistical methodologies. As such, this PhD project reflects the pursuit of a clearly synergistic, interdisciplinary research opportunity.

The overall aim of the joint research program is to establish a Global Sleep Observatory built on remote, consistently measured human–internet interaction at global scale. The project is targeted at the fundamental research needed to build granularly validated methodologies to infer sleep patterns from remote, passively obtained data. If successful, the potential impact of the program is enormous. A global, passive, consistently obtained, and micro-validated sleep measurement platform would place Monash at the centre of sleep measurement worldwide. A Global Sleep Observatory would enable a wide and exciting range of medical research, including leading indicators of sleep deficiency, critical event impact assessment, and circadian rhythm analysis, together with economic and social analysis and behavioural change policy assessment. Given the major economic and social importance of sleep to human health and wellbeing, this fundamental research will enable impactful societal change across the globe. Given the increasing prevalence of internet-connected populations, its impact and opportunity will only grow over time.

PhD student role description

Are you passionate about making a real-world impact with your science? Do you enjoy rich collaboration with leading experts in cross-domain applications? Would you like to work with data from a unique data generating instrument operating in the commercial cloud, with global reach and scale? This project is all about combining deep domain knowledge across boundaries to bring about global impact. Our team is excited about this opportunity and wants to share it with a candidate who is ready to bring their domain knowledge and appetite for learning to the team.

Whether you have relevant experience and training in sleep science, and a growing capability in statistical machine learning and data science, or you have formal qualifications in statistics, machine learning or computer science, and are passionate about applying your capabilities to interesting, wide-reaching problems, this project is for you.

You will be responsible for managing data acquisition and on-boarding from field surveys and our global internet instrument. You will develop data pipelines, build features, and undertake statistical machine learning model development and testing. You will communicate your methodologies and findings, and build up a deep contextual understanding of human sleep, its disorders, and survey-based methods of sleep measurement. Our team will provide guidance and support to grow your skills where needed, in a supportive, lab-based research environment. Collaboration will be ‘by design’, and platform-level support will come from the respective teams to enhance your training and the quality of your work.

By combining world-class capabilities and knowledge in population sleep measurement, with unique big data access on internet activity, and advanced machine-learning capabilities, this project provides the prospective PhD student with a unique opportunity at the intersection of human health, machine learning, and big data.

Required skills and experience

The PhD applicant should have a prior qualification and experience in one of the key domains of the project (sleep science or modern statistical methods) and demonstrated interest or capability in the other domain. The ideal applicant would have strengths across human biology and advanced statistical methods. In addition, applicants must be able to demonstrate capability with modern data analysis techniques and software (e.g. R, Python, Matlab, or equivalent), and ideally with large datasets. Finally, the applicant should be able to demonstrate strong fundamental scientific skills, including well-organised, clear and impactful written and verbal communication, capability with research data organisation, a creative and collaborative spirit, and experience with independent research.