[Invited speaker] Can we ‘generate’ large, privacy-aware, unbiased, and fair datasets with speech-generative models?

#Generative models
#Speech processing
#Privacy

Speaker: Junichi Yamagishi
Conference: SynData4GenAI workshop (Satellite workshop of Interspeech 2024)
Organizer: Interspeech 2024
Location: Kos Island, Greece
Date: August 31, 2024
URL: https://interspeech2024.org/satellite/

The success of deep learning in speech and speaker recognition relies heavily on large datasets. However, ethical, privacy, and legal concerns arise when large speech datasets are collected from real human speakers. In particular, there are significant concerns about collecting speech data of many speakers from the web.

On the other hand, the quality of synthesized speech produced by recent generative models is very high. Can we ‘generate’ large, privacy-aware, unbiased, and fair datasets with speech-generative models? Such studies have started not only for speech datasets but also for facial image datasets.

In this talk, I will introduce our efforts to construct a synthetic VoxCeleb2 dataset called SynVox2 that is speaker-anonymised and privacy-aware. In addition to the procedures and methods used in its construction, I will discuss the challenges and problems of using synthetic data by showing the performance and fairness of a speaker verification system built using the SynVox2 dataset.
