Research
Research projects, papers, books, and more
- Paper
The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains
- #Speech processing
- #Quality assessment
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
We present the second edition of the VoiceMOS Challenge, a scientific event that aims to promote the study of automatic prediction of the mean opinion score (MOS) of synthesized and processed speech. This year, we emphasize real-world and challenging zero-shot out-of-domain MOS prediction with three tracks for three different voice evaluation scenarios. Ten teams from industry and academia in seven different countries participated. Surprisingly, we found that the two sub-tracks of French text-to-speech synthesis had large differences in their predictability, and that singing voice-converted samples were not as difficult to predict as we had expected. Use of diverse datasets and listener information during training appeared to be successful approaches.