The Technologies Involved in Speech Recognition Chips
Speech recognition chips are also called speech recognition ICs. Compared with traditional voice chips, their defining feature is that they can recognize speech. This lets a machine understand spoken commands and act on them, for example a smart doll blinking or opening its mouth. Speech recognition chips also provide high-quality, high-compression recording and playback, which enables man-machine dialogue. In this post, POROSVOC introduces the technologies involved in speech recognition chips.
Fig.1
The technologies involved in speech recognition chips include signal processing, pattern recognition, probability theory, information theory, the mechanisms of speech production and hearing, artificial intelligence, and more.
Depending on which speakers they can serve, speech recognition chips are divided into specific-person (speaker-dependent) chips and non-specific-person (speaker-independent, also called human-independent) chips.
Specific-Person Speech Recognition
A specific-person speech recognition chip recognizes only the speech of the person it has been trained for; other people's speech is not recognized. The user's reference speech samples must be stored as the comparison database, so the chip has to be trained before use. Training usually means following the machine's prompts and speaking each input twice.
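The train-then-compare workflow described above can be sketched with dynamic time warping (DTW) template matching. This is a minimal illustration, not the method of any particular chip: the source does not name the matching algorithm, and DTW is simply a classic choice for this kind of recognizer. The class name, the `reject_threshold` parameter, and the assumption that audio has already been converted to a sequence of per-frame feature vectors are all hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


class SpeakerDependentRecognizer:
    """Hypothetical recognizer: stores the user's own utterances as templates."""

    def __init__(self, reject_threshold):
        self.templates = {}  # command -> list of reference feature sequences
        self.reject_threshold = reject_threshold  # assumed tuning parameter

    def train(self, command, features):
        # Called once per training utterance (typically twice per command).
        self.templates.setdefault(command, []).append(features)

    def recognize(self, features):
        best_cmd, best_dist = None, float("inf")
        for cmd, refs in self.templates.items():
            for ref in refs:
                # Normalize by total length so long words are not penalized.
                d = dtw_distance(features, ref) / (len(features) + len(ref))
                if d < best_dist:
                    best_cmd, best_dist = cmd, d
        # Reject input that is too far from every stored template.
        return best_cmd if best_dist <= self.reject_threshold else None
```

Because the templates come from one user's voice, another speaker's utterance tends to land far from every template and is rejected, which matches the behavior described above.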
Human-Independent Speech Recognition
Human-independent speech recognition does not target a specific person: it works regardless of the speaker's age or gender, as long as the same language is used. The usual approach is to collect voice samples from about 200 people for the dozen or so voice interaction entries identified before the product is finalized. These samples are processed by an algorithm on a PC to obtain the voice model and feature database for the interaction entries, which are then burned into the chip. Machines using this chip (smart dolls, electronic pets, children's computers) gain interactive capabilities.
Some human-independent speech recognition applications are based on phoneme algorithms. In this mode, interactive recognition can be performed without collecting speech samples from many people, but the disadvantage is that the recognition rate is not high and the recognition performance is unstable.
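One way to picture the phoneme-based mode is a lexicon that spells each command as a phoneme sequence, so a new command can be added by writing its pronunciation rather than by recording many speakers. The sketch below is an assumption made for illustration, not a description of any chip's firmware: the ARPAbet-style phoneme symbols, the `LEXICON` contents, the `max_errors` tolerance, and the idea that an acoustic front end (not shown) has already produced a decoded phoneme sequence are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


# Hypothetical command lexicon: adding a command needs only its phoneme spelling.
LEXICON = {
    "open": ["OW", "P", "AH", "N"],
    "blink": ["B", "L", "IH", "NG", "K"],
}


def match_command(decoded, lexicon, max_errors=1):
    """Return the closest command, or None if no entry is close enough."""
    best = min(lexicon, key=lambda word: edit_distance(decoded, lexicon[word]))
    return best if edit_distance(decoded, lexicon[best]) <= max_errors else None
```

Any error in the decoded phonemes feeds straight into the match, which gives some intuition for the lower and less stable recognition rate mentioned above.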
According to the speaking mode, speech recognition chips can be divided into discontinuous (isolated-word) speech recognition and continuous speech recognition.
Discontinuous Speech Recognition
In discontinuous speech recognition, each spoken word is recognized separately, so the speaker must pause after each word.
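The pause requirement makes word boundaries easy to find, which a simple energy-based segmenter can illustrate. This is a rough sketch under stated assumptions: the input is a list of per-frame energy values computed elsewhere, and the function name, `energy_threshold`, and `min_pause_frames` are hypothetical names chosen for the example, not parameters of any specific chip.

```python
def split_on_pauses(frame_energies, energy_threshold, min_pause_frames):
    """Return (start, end) frame indices (end exclusive) for each spoken word."""
    segments = []
    start = None      # first frame of the word currently being tracked
    silence_run = 0   # consecutive quiet frames seen inside the current word
    for i, energy in enumerate(frame_energies):
        if energy >= energy_threshold:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_pause_frames:
                # The pause is long enough: close the word before the silence.
                segments.append((start, i - silence_run + 1))
                start = None
                silence_run = 0
    if start is not None:
        # The signal ended mid-word; drop any trailing quiet frames.
        segments.append((start, len(frame_energies) - silence_run))
    return segments
```

Short dips in energy inside a word are tolerated as long as they stay below `min_pause_frames`, so only a deliberate pause ends a segment. Each segment can then be passed to the recognizer on its own.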
Continuous Speech Recognition
Continuous speech recognition handles speech delivered in a natural, fluent manner, much as a human listener would. Good recognition results are hard to achieve, however, because adjacent words run together without clear boundaries.
This document is provided by Sekorm Platform for VIP exclusive service. The copyright is owned by Sekorm. Without authorization, no media outlet, website, or individual may reprint it. Authorized reprints must include a link to www.sekorm.com.