Summary:
University of Washington researchers have developed AI-powered “proactive hearing assistant” headphones that automatically isolate conversation partners in noisy environments by detecting turn-taking speech rhythms.
Key Takeaways:
1. The system uses two AI models to identify who is speaking and suppress all non-participant voices and background noise in real time.
2. Initial testing showed users rated filtered audio more than twice as favorably as unfiltered baseline audio.
3. The open-source technology could eventually be integrated into hearing aids, earbuds, and smart glasses to provide hands-free, intent-aware sound filtering.
Holding a conversation in a crowded room often leads to the frustrating “cocktail party problem”: the challenge of separating the voices of conversation partners from the surrounding hubbub. It’s a mentally taxing situation that can be exacerbated by hearing impairment.
As a solution to this common conundrum, researchers at the University of Washington have developed smart headphones that proactively isolate all of the wearer’s conversation partners in a noisy soundscape. The headphones are powered by one AI model that detects the cadence of a conversation and another that mutes any voices that don’t follow that pattern, along with other unwanted background noise. The prototype uses off-the-shelf hardware and can identify conversation partners using just two to four seconds of audio.
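The models’ internals aren’t described in the article, but the core idea can be sketched in a few lines. The snippet below is a simplified, hypothetical illustration (the function names, frame rate, and threshold are assumptions, not the released UW code): it scores each candidate speaker by how well their voice activity alternates with the wearer’s over a few seconds of audio, then keeps only the speakers whose activity fits that turn-taking pattern.

```python
# Hypothetical sketch, not the UW implementation: scores candidate speakers
# by how well their activity alternates with the wearer's.
import numpy as np

FRAME_RATE = 50        # 20 ms frames (assumed)
WINDOW_SECONDS = 3     # "two to four seconds of audio"

def turn_taking_score(wearer_vad: np.ndarray, speaker_vad: np.ndarray) -> float:
    """Score how well a candidate speaker alternates turns with the wearer.

    High score: the speaker mostly talks while the wearer is silent (turn-taking).
    Low score: the speaker talks over the wearer or independently of them.
    """
    overlap = np.mean(wearer_vad * speaker_vad)           # both talking at once
    complement = np.mean((1 - wearer_vad) * speaker_vad)  # speaker fills the wearer's pauses
    return float(complement - overlap)

def select_partners(wearer_vad, candidate_vads, threshold=0.1):
    """Keep only candidate speakers whose activity follows the wearer's rhythm."""
    return [i for i, vad in enumerate(candidate_vads)
            if turn_taking_score(wearer_vad, vad) > threshold]

if __name__ == "__main__":
    n = FRAME_RATE * WINDOW_SECONDS
    rng = np.random.default_rng(0)
    wearer = (np.arange(n) % 100 < 50).astype(float)      # wearer speaks in bursts
    partner = 1.0 - wearer                                 # partner fills the gaps
    background = rng.integers(0, 2, n).astype(float)       # unrelated background voice
    print(select_partners(wearer, [partner, background]))  # -> [0]: only the partner is kept
```

In the actual system, trained models play both roles and work directly on the audio; the sketch only captures the turn-taking intuition that drives them.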
The system’s developers think the technology that recognizes turn-taking rhythms could one day help users of hearing aids, earbuds, and smart glasses to filter their soundscapes without the need to manually direct the AI’s “attention.”
The Prototype System
The team presented the technology Nov. 7 in Suzhou, China, at the Conference on Empirical Methods in Natural Language Processing. The underlying code is open-source and available for download.
“Existing approaches to identifying who the wearer is listening to predominantly involve electrodes implanted in the brain to track attention,” said senior author Shyam Gollakota, a UW professor in the Paul G. Allen School of Computer Science & Engineering. “Our insight is that when we’re conversing with a specific group of people, our speech naturally follows a turn-taking rhythm. And we can train AI to predict and track those rhythms using only audio, without the need for implanting electrodes.”
The prototype system, dubbed “proactive hearing assistants,” activates when the person wearing the headphones begins speaking. From there, one AI model tracks the turn-taking rhythm of the conversation to identify the wearer’s partners, while the second model suppresses voices that don’t follow that rhythm, along with other background noise.
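As a rough illustration of that control flow (again a hypothetical sketch under assumed parameters, not the authors’ code), a streaming version might stay idle until it detects the wearer’s own voice, buffer two to four seconds of audio to identify the conversation partners, and then filter every subsequent frame in real time:

```python
# Hypothetical control-flow sketch; the two placeholder methods stand in for
# the UW team's trained models, which are not reproduced here.
import numpy as np

class ProactiveHearingSketch:
    def __init__(self, frame_rate=50, enroll_seconds=3.0, vad_threshold=0.01):
        self.enroll_frames = int(enroll_seconds * frame_rate)  # 2-4 s of audio
        self.vad_threshold = vad_threshold
        self.state = "idle"
        self.buffer = []
        self.partner_ids = None

    def wearer_is_speaking(self, mic_frame: np.ndarray) -> bool:
        # Crude energy-based stand-in for a self-voice detector.
        return float(np.mean(mic_frame ** 2)) > self.vad_threshold

    def identify_partners(self, enrollment_audio: np.ndarray):
        # Placeholder for the turn-taking model that picks out partners.
        return []

    def extract_partner_audio(self, mic_frame: np.ndarray) -> np.ndarray:
        # Placeholder for the suppression model; a real one would pass only
        # the identified partners' voices and mute everything else.
        return mic_frame

    def process(self, mic_frame: np.ndarray) -> np.ndarray:
        if self.state == "idle" and self.wearer_is_speaking(mic_frame):
            self.state = "enrolling"            # the wearer started talking
        if self.state == "enrolling":
            self.buffer.append(mic_frame)
            if len(self.buffer) >= self.enroll_frames:
                self.partner_ids = self.identify_partners(np.concatenate(self.buffer))
                self.state = "filtering"
        if self.state == "filtering":
            return self.extract_partner_audio(mic_frame)
        return mic_frame                        # pass audio through until ready

if __name__ == "__main__":
    assistant = ProactiveHearingSketch()
    rng = np.random.default_rng(1)
    for _ in range(200):                        # roughly four seconds of 20 ms frames
        frame = 0.2 * rng.standard_normal(320)  # fake 16 kHz microphone frame
        assistant.process(frame)
    print(assistant.state)                      # "filtering" once enrollment finishes
```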




