Continual Subjective Evaluation Method of Speech by Merging Sort-based Preference Tests Towards Ever-Expanding Corpus of Human Ratings

Authors: Yusuke Yasuda, Junichi Yamagishi, Tomoki Toda

#Speech processing
#Speech synthesis
#Quality assessment

13th edition of the Speech Synthesis Workshop

Mean Opinion Scores (MOS) are a widely used method for subjective evaluation of speech, and automatic quality assessment models can be trained to predict MOS from speech by using MOS as labels. However, MOS are not suitable training labels for such models because they are context-dependent and their data size is limited and fixed, which limits the reliability of automatic quality assessment. To overcome this limitation, this study defines continual subjective evaluation of speech, which keeps expanding the scores and systems in a subjective evaluation corpus by merging. The objective of continual subjective evaluation is to derive a ranking of systems in a situation where the number of systems increases over time. Continual subjective evaluation consists of a loop of two subproblems: sorting subsets of systems in order of quality, and merging the sorted subsets into a single ranking. We propose a preference test method integrated with sort- and merge-based online learning algorithms to solve continual subjective evaluation efficiently. Our experiments show that our method can realize continual subjective evaluation by deriving a ranking of 60 systems from 216 pairs with 65,460 preference scores.
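The sort-and-merge loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it uses a plain deterministic merge in place of the online learning algorithms, and the `QUALITY` table, the `prefer` function, and the system names are hypothetical stand-ins for the outcome of a real preference test between two systems.

```python
from functools import cmp_to_key
from typing import Callable, List

# Hypothetical latent quality values, used only to simulate listeners.
QUALITY = {"A": 4.1, "B": 3.2, "C": 2.5, "D": 3.8, "E": 2.9}


def prefer(a: str, b: str) -> bool:
    """Simulated, noise-free preference test: True if a is preferred over b."""
    return QUALITY[a] > QUALITY[b]


def sort_subset(systems: List[str], prefer: Callable[[str, str], bool]) -> List[str]:
    """Subproblem 1: sort a subset of systems from best to worst."""
    # A negative comparator value places a before b.
    return sorted(systems, key=cmp_to_key(lambda a, b: -1 if prefer(a, b) else 1))


def merge_rankings(
    left: List[str], right: List[str], prefer: Callable[[str, str], bool]
) -> List[str]:
    """Subproblem 2: merge two sorted subsets into a single ranking."""
    merged: List[str] = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Each step corresponds to one pairwise preference test.
        if prefer(left[i], right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# An existing ranking is extended when new systems arrive.
ranking = sort_subset(["A", "B", "C"], prefer)
new_systems = sort_subset(["E", "D"], prefer)
ranking = merge_rankings(ranking, new_systems, prefer)
print(ranking)
```

In this sketch, merging only compares systems across the two subsets, so the existing ranking is reused rather than re-evaluated when new systems are added. Real listener preferences are noisy, so each comparison would in practice aggregate many preference scores.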
