One model for all sounds: fast and high-quality neural source-filter model for speech and non-speech waveform modeling

Author: Wang Xin (Principal Investigator)

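Period: September 2019 to March 2021
Grant program: Japan Society for the Promotion of Science (JSPS) Grants-in-Aid for Scientific Research (KAKENHI), Grant-in-Aid for Research Activity Start-up
Project number: 19K24371
URL: https://kaken.nii.ac.jp/ja/grant/KAKENHI-PROJECT-19K24371/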

Generating natural-sounding waveforms by computer is a fundamental topic in speech science. In this research, we plan to combine speech science with deep learning. We propose to combine a classical speech production model, the source-filter model, with a neural network, which results in a neural source-filter waveform model. Our model is expected to generate waveforms faster and with better quality; it is also expected to be applicable not only to speech but also to singing voice and non-speech sounds. Such a model will be useful in many applications, such as text-to-speech.
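
For readers unfamiliar with the source-filter idea, the following is a minimal sketch of the classical (non-neural) version only: an excitation signal driven by the fundamental frequency (F0) is passed through a filter that shapes its spectrum. It is not the project's model. The function names, the fixed two-pole resonator, and all numeric values (amplitudes, noise levels, 500 Hz center frequency, 16 kHz sampling rate) are illustrative assumptions; in a neural source-filter model, the hand-set filter would presumably be replaced by a learned neural network.

```python
import math
import random

def sine_source(f0_per_sample, sample_rate=16000, seed=0):
    """Hypothetical excitation: a sine that follows F0 where F0 > 0 (voiced),
    and Gaussian noise where F0 == 0 (unvoiced)."""
    rng = random.Random(seed)
    phase = 0.0
    out = []
    for f0 in f0_per_sample:
        if f0 > 0:
            # Accumulate phase so the sine stays continuous as F0 changes.
            phase += 2.0 * math.pi * f0 / sample_rate
            out.append(0.1 * math.sin(phase) + rng.gauss(0.0, 0.003))
        else:
            out.append(rng.gauss(0.0, 0.03))
    return out

def resonator_filter(source, center_hz=500.0, bandwidth_hz=100.0, sample_rate=16000):
    """Hypothetical filter: a fixed two-pole resonator standing in for the
    vocal-tract (or instrument-body) filter."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1 so stable
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / sample_rate)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in source:
        y = x + a1 * y1 + a2 * y2  # y[n] = x[n] + a1*y[n-1] + a2*y[n-2]
        out.append(y)
        y2, y1 = y1, y
    return out

# 0.5 s voiced at 120 Hz followed by 0.5 s unvoiced, at 16 kHz.
f0 = [120.0] * 8000 + [0.0] * 8000
waveform = resonator_filter(sine_source(f0))
```

Here `waveform` is a list of float samples of the same length as `f0`; changing the F0 contour changes the pitch of the source, while changing the filter changes the timbre, which is the separation the source-filter view relies on.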
