New Study Shows Kardome Outperforms Standard Speech Recognition Algorithms
A recent study shows Kardome's voice user interface technology achieves more than 90% accuracy in challenging acoustic conditions.
Download the study here
Many factors affect the performance of Automatic Speech Recognition (ASR) systems, including background noise, reverberation, echo, and the distance between the person speaking and the device.
To stay competitive, voice-enabled device manufacturers and Original Equipment Manufacturers (OEMs) must overcome these challenges.
Kardome's speech recognition enhancement technology is a software-based solution that enables voice and speech recognition devices to perform with higher accuracy. It utilizes state-of-the-art signal processing techniques to achieve superior performance in noisy environments and over distances.
A vital element of the technology is separating a single speaker’s voice from other voices and background noises in the environment and focusing on it for highly accurate ASR.
To show the accuracy of our speech recognition technology, as compared to other commercial technologies, we conducted an unbiased study in a real-life environment.
Our team of engineers analyzed the performance of Kardome in various environments against that of standard speech recognition algorithms such as those used in Alexa and Google Home.
We conducted the study using a smart speaker placed in a typical living room environment. We wanted to see how well the smart speaker's ASR system would perform in various simulated scenarios.
We used loudspeakers to play different environmental noise sources, such as a television, a kitchen (with mixers, running water, and someone cooking), a fan, a vacuum cleaner, and babble noise (multi-speaker conversations). We played each type of sound at various volumes, allowing us to evaluate the accuracy of the ASR under different signal-to-noise ratio (SNR) conditions.
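For readers unfamiliar with signal-to-noise ratio, the relationship between playback volume and SNR can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study summary does not describe how SNR was measured, the function names (`mean_power`, `snr_db`, `noise_gain_for_target_snr`) are hypothetical, and the sample values are made up rather than taken from the study.

```python
import math


def mean_power(samples):
    """Average power of a signal given as a list of amplitude samples."""
    if not samples:
        raise ValueError("signal must contain at least one sample")
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels: 10 * log10(P_speech / P_noise)."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def noise_gain_for_target_snr(speech, noise, target_snr_db):
    """Amplitude gain to apply to the noise so the mix reaches the target SNR."""
    ratio = mean_power(speech) / mean_power(noise)
    return math.sqrt(ratio / (10.0 ** (target_snr_db / 10.0)))


# Made-up sample values, for illustration only (not data from the study).
speech = [0.5, -0.5, 0.5, -0.5]
noise = [0.1, -0.1, 0.1, -0.1]

gain = noise_gain_for_target_snr(speech, noise, target_snr_db=0.0)
louder_noise = [gain * n for n in noise]

print(f"original SNR: {snr_db(speech, noise):.2f} dB")
print(f"noise gain needed for 0 dB SNR: {gain:.2f}")
```

Raising the noise volume lowers the SNR, so sweeping the playback volume of each noise source is one way to cover a range of SNR conditions.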
Kardome engineers used the following industry-standard indicators to test the smart speaker's ASR performance:
- Wake Word False Rejection Rate (FRR): The percentage of cases in which the system fails to detect a wake word that was spoken.
- Wake Word False Alarm Rate (FAR): The percentage of cases in which the system detects a wake word when none was spoken.
- Response Accuracy Rate: The percentage of commands that are executed successfully.
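As a rough illustration, the three indicators above can be computed from simple trial counts. The sketch below assumes each rate is a plain percentage of the relevant trials; the `TrialCounts` structure, its field names, and the example numbers are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class TrialCounts:
    """Hypothetical tallies from a test session (not data from the study)."""
    wake_word_trials: int    # recordings in which the wake word was spoken
    wake_word_misses: int    # of those, recordings where it went undetected
    non_wake_trials: int     # recordings with no wake word present
    false_alarms: int        # of those, recordings that triggered a detection
    commands_issued: int     # commands spoken after a successful wake
    commands_executed: int   # of those, commands carried out correctly


def percentage(part, whole):
    """Return part / whole as a percentage, rejecting empty denominators."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * part / whole


def false_rejection_rate(c):
    return percentage(c.wake_word_misses, c.wake_word_trials)


def false_alarm_rate(c):
    return percentage(c.false_alarms, c.non_wake_trials)


def response_accuracy_rate(c):
    return percentage(c.commands_executed, c.commands_issued)


# Made-up counts, for illustration only.
counts = TrialCounts(
    wake_word_trials=200, wake_word_misses=14,
    non_wake_trials=500, false_alarms=5,
    commands_issued=180, commands_executed=171,
)

print(f"FRR: {false_rejection_rate(counts):.1f}%")
print(f"FAR: {false_alarm_rate(counts):.1f}%")
print(f"Response accuracy: {response_accuracy_rate(counts):.1f}%")
```

Under this definition, wake word detection accuracy is simply 100% minus the FRR, so a lower FRR means more spoken wake words are caught.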
Here's a summary of the results:
- Kardome achieved more than 90% wake word detection accuracy in FRR testing.
- Kardome outperformed standard algorithms by 60% to 80% across all environments, including conditions with higher noise levels.
- Kardome’s response accuracy rate outperformed standard algorithms by 80%.
Kardome’s technology advances the state of the art in speech recognition. As the world moves toward more voice-based communication models, better speech recognition technologies are essential to ensure accurate transcription and understanding.
Contact us to learn more about Kardome's VUI technology
Download the complete study here