Research
Research projects, papers, books, etc.
- Technical report
[Domestic conference] Generating Segment-Level Foreign-Accented Synthetic Speech with Natural Speech Prosody
- #Speech processing
- #Speech synthesis
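Information Processing Society of Japan (IPSJ), Joint Research Meeting of the 118th SIG on Music and Computer (SIGMUS) and the 120th SIG on Spoken Language Processing (SIGSLP)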
We present a new application of deep-learning-based TTS, namely multilingual speech synthesis for generating controllable foreign accent. We train an acoustic model on non-accented multilingual speech recordings from the same speaker and interpolate quinphone linguistic features between languages to generate microscopic foreign accent. By copying pitch and durations from a pre-recorded utterance of the desired prompt, natural prosody is achieved. We call this paradigm “cyborg speech” as it combines human and machine speech parameters. Experiments on synthetic American-English-accented Japanese confirm the success of the approach.
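The interpolation step described in the abstract can be illustrated with a small sketch. The abstract does not say how the quinphone linguistic features are blended, so this assumes a plain linear interpolation between two equal-length feature vectors, controlled by a weight `alpha`. The names `interpolate_features` and `accent_utterance` and the toy two-dimensional vectors are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def interpolate_features(
    native: Sequence[float], foreign: Sequence[float], alpha: float
) -> List[float]:
    """Linearly blend two linguistic feature vectors of equal length.

    alpha = 0.0 returns the native-language features, alpha = 1.0 the
    foreign-language features; values in between give a partial accent.
    (Assumption: linear blending; the source does not specify the scheme.)
    """
    if len(native) != len(foreign):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * n + alpha * f for n, f in zip(native, foreign)]


def accent_utterance(
    native_seq: Sequence[Sequence[float]],
    foreign_seq: Sequence[Sequence[float]],
    alpha: float,
) -> List[List[float]]:
    """Apply the same blend to every phone-level vector in an utterance."""
    if len(native_seq) != len(foreign_seq):
        raise ValueError("sequences must contain the same number of phones")
    return [
        interpolate_features(n, f, alpha)
        for n, f in zip(native_seq, foreign_seq)
    ]


# Toy example: two phones, each described by a hypothetical 2-dim vector.
japanese = [[1.0, 0.0], [1.0, 0.0]]
english = [[0.0, 1.0], [0.0, 1.0]]
blended = accent_utterance(japanese, english, alpha=0.25)
print(blended)  # [[0.75, 0.25], [0.75, 0.25]]
```

In a pipeline like the one the abstract outlines, the blended vectors would then be passed to the trained acoustic model, while pitch and durations would be copied from the pre-recorded utterance rather than predicted. That stage is not shown here.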