Bruce Ryan, Director of Engineering, Harman Embedded Audio
08.11.21
As voice technology continues to develop into a more powerful interface, its application to a broad array of devices and use cases is bound to unlock new channels for performance and fill important consumer needs. One of these channels undoubtedly is the intersection of senior care and healthcare.
In 2017, a pilot study by the Front Porch Center for Innovation introduced 50 senior residents in a retirement home to various devices with Amazon Alexa voice technology. The study found that 100% of seniors felt the voice agent made their life easier overall. With the elderly audience, voice plays a crucial role in improving access to smart technology. Many members of the elderly segment are not familiar with touch screens and therefore not comfortable using them, but voice enables users to obtain the same information without ever having to touch a device.
The benefits of voice agents in healthcare extend beyond comfort, too. Imagine surgeons in an operating room where sterility is of the utmost importance. Hands-free smart technology in the operating theater can enable surgeons to access information such as medical images without the need for direct contact with a separate device.
1. Voice as a Companion
One of the most common concerns communities and families face regarding their seniors is isolation. As seniors’ worlds grow smaller, they can easily feel alone, which can lead to other challenges from loneliness and depression to mobility concerns. But what if members of the elderly community could rely on a companion, available 24/7, dependable for answering questions, providing reminders and contacting doctors and loved ones?
Advancing voice agents from an ask-and-answer platform to a virtual companion that enables a two-way conversation makes it much more similar to a real human interaction. The virtual companion can both reduce isolation for the elderly group and open the door for digital health capabilities. Using AI and 5G, advanced voice agents can interact with users when they are ill to identify symptoms and next steps, such as ‘see a doctor’ versus ‘fill this prescription.’
Additionally, like real humans, companions should be able to respond at all times, which is why Harman is innovating to enable voice in offline conditions rather than requiring a WiFi connection to operate off of a cloud-based system.
Voice Activity Detection (VAD) and Direction of Arrival (DOA) estimation are two technologies that are critical to human-like interaction. VAD lets the system sleep while still listening, which lowers both overall power usage and processing demands. Once an audio signal that crosses a predefined threshold (loudness or voice) is detected, the system resumes full power and processes the audio normally. VAD can also detect when a person starts and finishes talking, which enables proper two-way communication and removes the need to repeat a wake word within a given conversation. DOA tells the system where in the room the speaker is located, which helps it focus on the intended input and filter out unwanted audio content. Imagine your companion as a robot: you would want it to look back at you when you are talking to it.
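As a rough illustration of the loudness-based wake-up described above, the sketch below marks where speech starts and ends by comparing the level of short audio frames with a threshold. It is a minimal energy detector, not Harman's implementation; the function names, the frame length, the threshold and the "hangover" count are all assumptions chosen for the example.

```python
import math

def frame_rms(frame):
    """Root-mean-square level of one frame of samples (floats in [-1, 1])."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def detect_speech_segments(samples, frame_len=160, threshold=0.05, hangover=3):
    """Return (first_frame, last_frame) pairs whose RMS level reaches threshold.

    hangover is the number of consecutive quiet frames tolerated before a
    segment is closed, so short pauses do not end the conversation.
    All default values are illustrative assumptions.
    """
    segments = []
    start = None   # index of the first active frame of the open segment
    quiet = 0      # consecutive quiet frames seen since the last active one
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                # Close the segment at the last frame that was active.
                segments.append((start, i - quiet))
                start = None
                quiet = 0
    if start is not None:
        segments.append((start, n_frames - 1 - quiet))
    return segments
```

A production detector would typically look at spectral features or use a trained model rather than loudness alone, since a slammed door is loud but is not speech.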
2. Wearable Audio Tech
With the ability to access voice agents anywhere, on or offline and in or out of the house, the next iteration is wearable audio technology. For seniors, wearable audio offers significant benefits, from the ability to initiate an emergency call to the use of the audio signal as a sensor for medical applications, such as measuring breathing or heart rate.
Many emergency systems rely on someone pressing an emergency call button. In real life, people are unlikely to fall or need emergency help right next to where the button is located. Audio and voice can help significantly in this case, since an emergency call can be triggered by voice. Microphone systems equipped with far-field technology could detect the call request from even a few meters away. Another option is a wearable microphone fitted directly on the person and capable of detecting the specific noise of a fall (ideally combined with an accelerometer sensor) to automatically generate an emergency call.
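To make the microphone-plus-accelerometer idea concrete, here is a minimal sketch that flags a possible fall only when both sensors agree. The function names and both thresholds are hypothetical values picked for illustration; a real device would need thresholds tuned on recorded data and would likely use a trained classifier.

```python
import math

def accel_magnitude(sample):
    """Magnitude of one (x, y, z) accelerometer reading, in g."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def looks_like_fall(accel_samples, audio_peak,
                    accel_threshold_g=2.5, audio_threshold=0.6):
    """Return True when an impact-like spike shows up on both sensors.

    accel_samples: (x, y, z) readings in g from a short time window.
    audio_peak: peak absolute microphone level (0..1) in the same window.
    Both thresholds are assumptions, not measured values.
    """
    peak_g = max(accel_magnitude(s) for s in accel_samples)
    return peak_g >= accel_threshold_g and audio_peak >= audio_threshold
```

Requiring agreement between the two sensors is a simple way to cut false alarms: a dropped book is loud but does not jolt the wearer, and sitting down hard jolts the wearer but is usually quiet.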
A microphone and wearable electronic device can record and transmit heart rate and breathing sound to a remote listener. Think about it like a doctor with a stethoscope -- similarly, wearable tech can provide remote sensing for a doctor visit on a video conference call.
The addition of a wide range of health sensors enables a remote medical professional to perform a remote electronic physical of the patient, or provides data collected over a period of time for analysis. Current medical sensors cover heart rate (pulse), blood oxygen, blood pressure, blood sugar levels, breathing, skin temperature, body position and motion. With these sensors, as well as a microphone for voice interaction, the physician is able to provide an accurate evaluation of a remote patient. All of these sensors are now wirelessly enabled and can be either worn on the body or placed on the body as needed.
3. Addressing Hearing Loss in Voice Technology
A major challenge presented by audio technology for seniors is hearing loss. With approximately one in three people between the ages of 65 and 74 experiencing it, it’s a challenge that cannot be ignored when innovating solutions for healthcare and the senior population. Harman is building out a personalized audio response system in its software that will evaluate hearing loss in the user. The personalized audio response system will evaluate the specific type or amount of hearing loss a user is facing, and then the level of sound or the frequency of the audio is adjusted accordingly.
Personalized audio playback allows the listener to compensate for some hearing loss due to age or injury. Drawing on years of research and measurements, the Harman team has developed this technique into a software package that can be applied to any audio device to improve speech clarity or enhance music playback.
The personal “tuning” involves a quick hearing test using a known headphone. The results are analyzed and stored as the listener’s preference. This tuning can then be transported to any connected device via the Harman database.
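One way to picture how a hearing-test result turns into an adjustment is a per-frequency gain table. The sketch below uses the classic "half-gain rule" from hearing-aid fitting (boost each band by roughly half the measured loss) as a stand-in; it is not Harman's algorithm, and the function names, the 20 dB cap and the sample profile are assumptions for illustration only.

```python
def band_gains_db(hearing_loss_db, max_gain_db=20.0):
    """Map per-band hearing loss (dB HL) to a playback boost in dB.

    hearing_loss_db: {band_center_hz: measured loss in dB}.
    Applies the half-gain rule and caps the boost to avoid excessive
    loudness; the cap value is an assumption.
    """
    return {band: min(loss / 2.0, max_gain_db)
            for band, loss in hearing_loss_db.items()}

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)

# Hypothetical stored profile: mild loss at 500 Hz, stronger loss at 4 kHz.
profile = {500: 10.0, 4000: 50.0}
gains = band_gains_db(profile)
multipliers = {band: db_to_linear(g) for band, g in gains.items()}
```

Because the profile is just a small table of numbers, it is easy to store once and apply on any connected device, which matches the portability described above.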
4. Voice in the Operating Room
Voice enablement is impacting more than just senior living -- think about the healthcare professionals serving seniors and beyond. Using hands-free voice agents, doctors can listen to music while operating or employ voice-to-text to record and transcribe the events and conversations taking place in an operating room. With music proven to stimulate areas of the brain involved with emotion, memory and even physical movement, voice in the operating room could benefit more than just the doctors.
Voice control is critical in this situation, as it can be faster than a touch interface. Surgeons and staff usually have their hands full, and voice control reduces the number of surfaces they touch, which lowers the risk of transmitting viruses and bacteria. The operating theater is usually a very noisy environment, filled with beeping machines, HVAC systems, respirators and other air-pumping mechanisms. Noise reduction algorithms are therefore very important to remove unwanted signals and focus on the operator's voice. There are also many people in these rooms, so focusing on the right person giving the voice commands is key. Beamforming or source separation technologies can be used to target the actual user who needs speech commands executed. Voice-based medical transcription will also significantly improve the handling of paperwork.
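The simplest form of beamforming is delay-and-sum: shift each microphone's signal so that sound from the chosen direction lines up, then average the channels so that the target adds up while sound from other directions partly cancels. The sketch below assumes a linear microphone array, a distant talker and whole-sample delays; the function names and the array geometry are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

def arrival_lags(mic_positions_m, angle_deg, sample_rate):
    """Samples by which each mic hears a distant source later than the
    earliest mic. Mics sit on a line at mic_positions_m (meters);
    angle_deg is measured from broadside, positive toward larger positions.
    """
    theta = math.radians(angle_deg)
    raw = [-p * math.sin(theta) / SPEED_OF_SOUND * sample_rate
           for p in mic_positions_m]
    earliest = min(raw)
    return [round(r - earliest) for r in raw]

def delay_and_sum(channels, lags):
    """Undo each channel's lag and average the aligned samples."""
    n = min(len(ch) - lag for ch, lag in zip(channels, lags))
    return [sum(ch[i + lag] for ch, lag in zip(channels, lags)) / len(channels)
            for i in range(n)]
```

Real arrays usually need fractional delays and adaptive weighting, and the steering angle itself would come from a DOA estimate; this version only shows the core align-then-average step.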
Ultimately, as the applications of voice technology continue to expand to the healthcare space, people will continue to see more opportunities to connect with others, provide accessible and consistent care and improve the healthcare community with sound.
Bruce Ryan is the director of Engineering at Harman Embedded Audio.