AI-enabled Smart Glasses for People with Severe Vision Impairments

Project supervisors

Prof Kim Marriott, Faculty of IT (Main Supervisor)
Dr Thanh-Toan Do, Faculty of IT
A/Prof Nicholas Price, Faculty of Medicine, Nursing and Health Sciences
Prof Arthur Lowery, Faculty of Engineering

PhD project abstract

This interdisciplinary project will investigate the use of AI-based computer vision technologies to help people with severe vision impairments (SVIs). The aim is to develop and evaluate the use of real-time image enhancement and object recognition in “smart glasses” for people with low vision. These will enhance the scene that the person is viewing by, for instance, enlarging or clarifying text and emphasising doors and stairs, and will provide additional information in natural language about the objects and people in the field of view as the viewer moves their gaze. The research will also feed into the design of the Gennaris bionic vision system for the totally blind, currently under development at Monash University. This is groundbreaking research that will require developing innovative user interfaces and computer vision technologies. It has the potential to dramatically improve access to educational materials and workplace participation, and to remove current barriers to independent travel and participation in sports and cultural activities by people with SVIs.

Areas of research

Computer vision, assistive technologies, HCI, bionic vision

Project description

Approximately 2.2 billion people worldwide have some kind of vision impairment. More severe vision impairments (SVIs) lead to difficulty accessing educational materials, low workplace participation, fear of travelling independently and restricted involvement in sports and cultural activities.

AI technologies, specifically computer vision, have the potential to remove many of the barriers currently facing people with SVIs and significantly improve their lives. A notable example is the Seeing AI app (https://www.microsoft.com/en-us/ai/seeing-ai) developed by Microsoft, which reads text and/or provides an audio description of the scene in view of a mobile phone camera. There has also been research into the use of image enhancements such as magnification or contrast enhancement with augmented reality applications on mobile phones and head-mounted displays.

The aim of this project is to explore the combination of these two approaches. Specifically, the project will develop and trial, with people with SVIs, real-time computer vision technologies that are integrated into a head-mounted display and that: (a) enhance the scene that the person is viewing by, for instance, enlarging or clarifying text and emphasising doors and stairs, and (b) recognise people and objects in the field of view and provide this additional information in natural language as the viewer moves their gaze. A key innovation in the project will be to tightly integrate these two components so that, for example, features are highlighted in the display as they are described and the user can control which objects are described using eye movement.

The technology will be deployed in two distinct ways. In the case of people with limited eyesight, we will utilise head-mounted augmented reality displays such as the HoloLens. In the case of people who are totally blind, we plan to incorporate the technology into the Gennaris bionic vision system currently under development at Monash University (https://www.monash.edu/industry/success-stories/bionic-eye).

This interdisciplinary project brings together researchers in assistive technology and augmented reality (Prof Kim Marriott), AI and computer vision (Dr Thanh-Toan Do), visual perception (A/Prof Nicholas Price) and the bionic vision group (Prof Arthur Lowery). It is groundbreaking research that has the potential to revolutionise the day-to-day lives of people with SVIs.

PhD student role description

This is an opportunity to develop new computer vision techniques and to work with the world-renowned Monash Vision Group on the Gennaris bionic vision system. Not only will the research be cutting edge, it will also make a difference to the lives of the millions of people worldwide who have severe vision impairments (SVIs).

The student will be expected to design, implement and evaluate deep-learning techniques for both image enhancement and recognition. The three key innovations in the project will be to (1) design and evaluate a user interface that naturally integrates image enhancement with audio description; (2) develop deep-learning techniques that can run on the less powerful computing devices found in head-mounted displays; and (3) evaluate the use of the resulting system with people with SVIs in real-world settings for real-world tasks.

The project team has strong links with organisations supporting people with SVIs, and the detailed project goals and evaluations will be developed together with people with SVIs to ensure that the project is addressing real needs. It is envisaged that the PhD student will work closely with three people with representative vision impairments and develop prototype image enhancement/recognition applications running on a head-mounted augmented reality display that will assist them when travelling, when interacting with other people and with daily activities.

This interdisciplinary project will lead to publications in premier conferences and journals. It is also expected to lead to international collaboration with institutions such as Microsoft Research. It is the perfect preparation for a career either in academia or as a researcher in an industrial IT lab.

Required skills and experience

Strong programming skills and some experience of deep learning techniques and/or computer vision.

Potential start date

31 January 2022

Submit your Expression of Interest