Voice Frequency Synthesis using VAW-GAN based Amplitude Scaling for Emotion Transformation

Hye-Jeong Kwon; Min-Jeong Kim; Ji-Won Baek; Kyungyong Chung

Vol. 16, No. 2, February 28, 2022

DOI: 10.3837/tiis.2022.02.018

Abstract

Artificial intelligence generally shows no definite change in emotion, which makes it hard for it to demonstrate empathy when communicating with humans. If frequency modification is applied to neutral-emotion speech, or if a different emotional frequency is added to it, artificial intelligence with emotions can be developed. This study proposes emotion conversion using voice frequency synthesis based on a Generative Adversarial Network (GAN). The proposed method extracts frequencies from the speech data of twenty-four actors and actresses; that is, it extracts the voice features of their different emotions, preserves the linguistic features, and converts only the emotion. It then generates a frequency with a variational auto-encoding Wasserstein generative adversarial network (VAW-GAN) in order to produce prosody while preserving linguistic information, which makes it possible to learn speech features in parallel. Finally, it corrects the frequency by employing Amplitude Scaling. Using spectral conversion on a logarithmic scale, the result is converted into a frequency that takes human hearing characteristics into account. Accordingly, the proposed technique provides emotion conversion of speech so that artificially generated voices can express emotions.
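
The last two steps named in the abstract, amplitude scaling and logarithmic-scale spectral conversion, can be illustrated with a minimal sketch. The abstract does not give the formulas used in the paper, so everything here is an assumption for illustration only: scaling is modeled as multiplying each magnitude bin by a constant gain, the logarithmic conversion is the common decibel form 20·log10(m), and the helper names `amplitude_scale` and `to_log_scale` are hypothetical.

```python
import math


def amplitude_scale(magnitudes, gain):
    """Multiply every magnitude bin by a constant gain.

    Hypothetical helper: a constant gain is an assumption,
    not the scaling rule from the paper.
    """
    return [m * gain for m in magnitudes]


def to_log_scale(magnitudes, floor=1e-10):
    """Convert linear magnitudes to decibels (20 * log10(m)).

    Values are floored first so that a zero bin does not raise
    a math domain error. The decibel form is an assumption.
    """
    return [20.0 * math.log10(max(m, floor)) for m in magnitudes]


# Toy magnitude spectrum with three bins (made-up values).
spectrum = [1.0, 0.5, 0.1]

scaled = amplitude_scale(spectrum, 2.0)
log_spectrum = to_log_scale(scaled)
```

On a logarithmic scale a constant gain becomes a constant offset, so doubling every amplitude shifts each bin by the same amount, 20·log10(2) dB. This is one reason a log-scale representation is often described as closer to how loudness is perceived.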

Cite this article

[IEEE Style]

H. Kwon, M. Kim, J. Baek, K. Chung, "Voice Frequency Synthesis using VAW-GAN based Amplitude Scaling for Emotion Transformation," KSII Transactions on Internet and Information Systems, vol. 16, no. 2, pp. 713-725, 2022. DOI: 10.3837/tiis.2022.02.018.

[ACM Style]

Hye-Jeong Kwon, Min-Jeong Kim, Ji-Won Baek, and Kyungyong Chung. 2022. Voice Frequency Synthesis using VAW-GAN based Amplitude Scaling for Emotion Transformation. KSII Transactions on Internet and Information Systems, 16, 2, (2022), 713-725. DOI: 10.3837/tiis.2022.02.018.

[BibTeX Style]

@article{tiis:25315,
  title="Voice Frequency Synthesis using VAW-GAN based Amplitude Scaling for Emotion Transformation",
  author="Hye-Jeong Kwon and Min-Jeong Kim and Ji-Won Baek and Kyungyong Chung",
  journal="KSII Transactions on Internet and Information Systems",
  DOI={10.3837/tiis.2022.02.018},
  volume={16},
  number={2},
  year="2022",
  month={February},
  pages={713-725}
}
