语音搜索错误纠正的音素增强判别重评分
End-to-end (E2E) Automatic Speech Recognition (ASR) models are trained using paired audio-text samples that are expensive to obtain, since high-quality ground-truth data requires human annotators....
本文提出了一种针对E2E自动语音识别模型在新电影标题识别中不足的音素纠正方法。该方法通过音素搜索生成替代选项,并结合ASR模型的识别结果,显著提高了识别准确率,错误率降低了4.4%至7.6%。
