When integrated with conventional hearing aid technology, the system could help tackle the so-called ‘cocktail party effect’, a common shortcoming of traditional hearing aids.
Hearing aids assist hearing-impaired people by amplifying all ambient sounds around them, but in noisy situations the hearing aids’ broad spectrum of amplification can make it difficult for users to focus on specific sounds, like conversation with a particular person.
One potential solution to the cocktail party effect is to make ‘smart’ hearing aids, which combine conventional audio amplification with a second device to collect additional data from reading lips.
There has been success in using cameras to aid with lip reading, but collecting video footage of people without their consent raises privacy concerns. Cameras are also unable to read lips through masks, a regular challenge for people who wear face coverings for cultural or religious purposes and a broader issue in the COVID-19 era.
In a paper published in Nature Communications, the Glasgow University-led team outline how they set out to harness advanced sensing technology to read lips. Their system is said to preserve privacy by only collecting radio-frequency data, with no accompanying video footage.
To develop the system, the researchers asked male and female volunteers to repeat the five vowel sounds, first while unmasked and then while wearing a surgical mask.
As the volunteers repeated the vowel sounds, their faces were scanned using radio-frequency signals from a dedicated radar sensor and a Wi-Fi transmitter. Their faces were also scanned while their lips remained still.
Then, the 3,600 samples of data collected during the scans were used to ‘teach’ machine learning and deep learning algorithms how to recognise the characteristic lip and mouth movements associated with each vowel sound.
Because the radio-frequency signals can easily pass through the volunteers’ masks, the algorithms could also learn to read masked users’ vowel formation.
The system proved to be capable of correctly reading the volunteers’ lips most of the time. Wi-Fi data was correctly interpreted by the learning algorithms up to 95 per cent of the time for unmasked lips, and 80 per cent of the time for masked lips. The radar data was interpreted correctly up to 91 per cent of the time without a mask, and 83 per cent of the time with a mask.
In a statement, lead author Dr Qammer Abbasi, of Glasgow University’s James Watt School of Engineering, said: “With this research, we have shown that radio-frequency signals can be used to accurately read vowel sounds on people’s lips, even when their mouths are covered. While the results of lip-reading with radar signals are slightly more accurate, the Wi-Fi signals also demonstrated impressive accuracy.
“Given the ubiquity and affordability of Wi-Fi technologies, the results are highly encouraging which suggests that this technique has value both as a standalone technology and as a component in future multimodal hearing aids.”
Researchers from Glasgow University and Edinburgh Napier University in the UK contributed to the paper, along with colleagues from the University of Engineering and Technology Lahore, Pakistan and Southeast University, Nanjing in China.
The team’s paper, titled ‘Pushing the Limits of Remote RF Sensing by Reading Lips Under the Face Mask’, is published in Nature Communications. The research was supported by funding from EPSRC.