Naturalness Improvement Algorithm for Reconstructed Glossectomy Patient's Speech Using Spectral Differential Modification in Voice Conversion

Murakami, Hiroki; Hara, Sunao; Abe, Masanobu; Sato, Masaaki; Minagi, Shogo

doi:10.21437/Interspeech.2018-1239

Permalink: https://ousar.lib.okayama-u.ac.jp/56199

ID	56199
Full-text URL	Proc_Interspeech_2018_2464.pdf (804 KB)
Authors	Murakami, Hiroki; Hara, Sunao (Graduate School of Natural Science and Technology, Okayama University); Abe, Masanobu (Graduate School of Natural Science and Technology, Okayama University); Sato, Masaaki (Graduate School of Medicine Dentistry and Pharmaceutical Sciences, Okayama University); Minagi, Shogo (Graduate School of Medicine Dentistry and Pharmaceutical Sciences, Okayama University)
Abstract	In this paper, we propose an algorithm to improve the naturalness of the reconstructed glossectomy patient's speech that is generated by voice conversion to enhance the intelligibility of speech uttered by patients with a wide glossectomy. While existing VC algorithms make it possible to improve intelligibility and naturalness, the result is still not satisfying. To solve the continuing problems, we propose to directly modify the speech waveforms using a spectrum differential. The motivation is that glossectomy patients mainly have problems in their vocal tract, not in their vocal cords. The proposed algorithm requires no source parameter extractions for speech synthesis, so there are no errors in source parameter extractions and we are able to make the best use of the original source characteristics. In terms of spectrum conversion, we evaluate with both GMM and DNN. Subjective evaluations show that our algorithm can synthesize more natural speech than the vocoder-based method. Judging from observations of the spectrogram, power in high-frequency bands of fricatives and stops is reconstructed to be similar to that of natural speech.
Keywords	voice conversion; speech intelligibility; glossectomy; spectral differential; neural network
Publication date	2018-09-02
Publication title	Proceedings of Interspeech 2018
Publisher	International Speech Communication Association
Start page	2464
End page	2468
ISSN	1990-9772
Resource type	Conference paper
Language	English
OAI-PMH Set	Okayama University
Paper version	publisher
DOI	10.21437/Interspeech.2018-1239
Related URL	isVersionOf https://doi.org/10.21437/Interspeech.2018-1239
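
The abstract's central idea is to filter the patient's waveform with a spectral differential instead of resynthesizing it from extracted source parameters. A minimal Python sketch of that idea follows. It is not the authors' implementation: the function name `apply_spectral_differential`, the STFT-based overlap-add filtering, and the frame settings are all assumptions made for illustration. The per-frame log-spectral differential, which in the paper would come from the GMM- or DNN-based spectrum conversion, is simply taken as an input here.

```python
import numpy as np

def apply_spectral_differential(wave, log_diff, frame_len=512, hop=128):
    """Filter `wave` frame by frame with a log-magnitude differential.

    Hypothetical helper for illustration only.
    log_diff has shape (n_frames, frame_len // 2 + 1) and is assumed to hold
    log(converted envelope) - log(source envelope) for each frame.
    """
    window = np.hanning(frame_len)
    out = np.zeros(len(wave))
    norm = np.zeros(len(wave))
    for i in range(log_diff.shape[0]):
        start = i * hop
        frame = wave[start:start + frame_len]
        if len(frame) < frame_len:
            break  # ignore a trailing partial frame
        spec = np.fft.rfft(frame * window)
        # Scale magnitudes only; the phase, and with it the original
        # excitation of the speaker, is left untouched.
        spec *= np.exp(log_diff[i])
        out[start:start + frame_len] += np.fft.irfft(spec, n=frame_len) * window
        norm[start:start + frame_len] += window ** 2
    # Weighted overlap-add normalization.
    return out / np.maximum(norm, 1e-8)
```

Because no F0 or aperiodicity is extracted and no excitation is regenerated, errors from source parameter extraction cannot arise in this scheme, which is the property the abstract emphasizes. A zero differential leaves the fully overlapped part of the signal unchanged.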