Automatic Error Correction for Repeated Words in Mandarin Speech Recognition
Xiangdong Wang 1, Hong Liu 1, Yueliang Qian 1, and Xinhui Li 1
1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China
2. Tencent Inc., China
Abstract—This paper proposes an approach for automatically correcting recognition errors in repeated words by exploiting the recognition results of preceding utterances. Words that might appear again in following utterances are collected from the recognition results of preceding utterances. Correcting each utterance involves four steps: 1) initial recognition, in which a character confusion network (CCN) is adopted as the result of initial Mandarin recognition so that errors in repeated words can be corrected; 2) detection of repeated words by computing the phonetic similarity between the collected words and the CCN; 3) automatic correction of recognition errors in the repeated words; and 4) extraction of new words from the recognition result of the current utterance. Experiments show that the proposed method reduces the character error rate (CER) by more than 5% absolute.
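Steps 2 and 3 of the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the CCN data layout, or the decision threshold, so everything below is an assumption made for illustration: a CCN is modeled as a list of slots holding (character, pinyin, posterior) candidates, similarity is the fraction of a cached word's syllables found in the aligned slots, and the names `phonetic_similarity`, `correct_repeated_words`, and `threshold` are hypothetical. Step 4 (extracting new words) is not covered.

```python
from typing import Dict, List, Tuple

# Assumed layout, not taken from the paper:
Candidate = Tuple[str, str, float]  # (character, pinyin syllable, posterior)
Slot = List[Candidate]              # competing candidates, best first
CCN = List[Slot]                    # one slot per character position


def best_path(ccn: CCN) -> str:
    """Initial recognition result: the top candidate of every slot."""
    return "".join(slot[0][0] for slot in ccn)


def phonetic_similarity(syllables: List[str], slots: List[Slot]) -> float:
    """Fraction of the word's syllables that occur among the candidates
    of the aligned slots (a simple stand-in for the paper's measure)."""
    hits = sum(
        1
        for syllable, slot in zip(syllables, slots)
        if any(candidate[1] == syllable for candidate in slot)
    )
    return hits / len(syllables)


def correct_repeated_words(
    ccn: CCN, cache: Dict[str, List[str]], threshold: float = 1.0
) -> str:
    """Slide each cached word over the CCN and overwrite the best path
    wherever the phonetic similarity reaches the (assumed) threshold."""
    chars = list(best_path(ccn))
    for word, syllables in cache.items():
        n = len(syllables)
        for start in range(len(ccn) - n + 1):
            if phonetic_similarity(syllables, ccn[start:start + n]) >= threshold:
                chars[start:start + n] = list(word)
    return "".join(chars)


# Word collected from a preceding utterance, with its pinyin syllables.
cache = {"王菲": ["wang", "fei"]}

# CCN for the current utterance; the recognizer preferred the homophone 飞.
ccn = [
    [("我", "wo", 0.9)],
    [("喜", "xi", 0.8)],
    [("欢", "huan", 0.8)],
    [("王", "wang", 0.7), ("网", "wang", 0.2)],
    [("飞", "fei", 0.6), ("非", "fei", 0.3)],
]

corrected = correct_repeated_words(ccn, cache)
```

In this toy example the best path is 我喜欢王飞, and the cached word 王菲 matches the last two slots phonetically, so the output becomes 我喜欢王菲 even though 菲 never appears as a candidate in the CCN. A practical system would likely use a graded similarity and a threshold below 1.0 to tolerate near-homophones.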
Index Terms—speech recognition, error correction, repeated word, confusion network
Cite: Xiangdong Wang, Hong Liu, Yueliang Qian, and Xinhui Li, "Automatic Error Correction for Repeated Words in Mandarin Speech Recognition," Journal of Automation and Control Engineering, Vol. 4, No. 2, pp. 153-158, April, 2016. doi: 10.12720/joace.4.2.153-158