Speaker-Adaptive Speech Synthesis based on Fuzzy Vector Quantizer Mapping and Neural Networks


The Transactions of the Korea Information Processing Society (1994 ~ 2000), Vol. 4, No. 1, pp. 149-160, Jan. 1997
DOI: 10.3745/KIPSTE.1997.4.1.149

Abstract

This paper addresses speaker-adaptive speech synthesis using a mapped codebook designed by fuzzy mapping on FLVQ (Fuzzy Learning Vector Quantization). FLVQ is used to design both the input speaker's and the reference speaker's codebooks. The algorithm incorporates fuzzy membership functions into the LVQ (Learning Vector Quantization) network. Unlike the standard LVQ algorithm, it minimizes the network output errors, defined as the differences between the target and actual class membership values, and consequently minimizes the distances between training patterns and competing neurons. Speaker adaptation in speech synthesis proceeds as follows: the input speaker's codebook is mapped to the reference speaker's codebook using fuzzy concepts. The fuzzy VQ mapping replaces each codevector while preserving its fuzzy membership function. A codevector correspondence histogram is obtained by accumulating vector correspondences along the DTW optimal path. The fuzzy VQ mapping is then used to design a mapped codebook, defined as a linear combination of the reference speaker's codevectors, with each fuzzy histogram serving as a weighting function over the membership values. In the adaptive synthesis stage, the input speech is fuzzy vector-quantized with the mapped codebook, and FCM (fuzzy c-means) arithmetic is used to synthesize speech adapted to the input speaker. Speaker adaptation experiments were carried out using the speech of males in their thirties as input speech and that of a female in her twenties as reference speech. The sentences used in the experiments were /anyoung hasim nika/ and /good morning/. As a result, we obtained synthesized speech adapted to the input speaker.
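The mapped-codebook construction described above (fuzzy memberships, a correspondence histogram accumulated along the DTW path, and a weighted linear combination of reference codevectors) can be illustrated with a small NumPy sketch. All function names here are hypothetical, and the FCM-style membership with fuzziness exponent m = 2 is an assumption; the paper's FLVQ-trained memberships would take its place:

```python
import numpy as np

def fuzzy_memberships(frames, codebook, m=2.0, eps=1e-12):
    # FCM-style membership of each frame to each codevector:
    # u[t, k] is proportional to 1 / d(frame_t, c_k)^(2/(m-1)); rows sum to 1.
    # (Assumed form; the paper derives memberships from FLVQ training.)
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1) + eps
    w = d2 ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=1, keepdims=True)

def mapped_codebook(in_cb, ref_cb, in_frames, ref_frames, path):
    # Replace each input codevector by a linear combination of reference
    # codevectors, weighted by the fuzzy correspondence histogram
    # accumulated along the DTW optimal path (list of (t_in, t_ref) pairs).
    U = fuzzy_memberships(in_frames, in_cb)    # input-frame memberships
    V = fuzzy_memberships(ref_frames, ref_cb)  # reference-frame memberships
    H = np.zeros((in_cb.shape[0], ref_cb.shape[0]))
    for t_in, t_ref in path:                   # accumulate along the DTW path
        H += np.outer(U[t_in], V[t_ref])
    H /= H.sum(axis=1, keepdims=True)          # normalize each histogram row
    return H @ ref_cb                          # weighted combination of ref vectors
```

Because each histogram row is normalized, every mapped codevector is a convex combination of the reference speaker's codevectors, which is what lets fuzzy vector quantization with this codebook produce speech in the reference speaker's vector space.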




Cite this article
[IEEE Style]
L. J. Yi and L. K. Hyung, "Speaker-Adaptive Speech Synthesis based on Fuzzy Vector Quantizer Mapping and Neural Networks," The Transactions of the Korea Information Processing Society (1994 ~ 2000), vol. 4, no. 1, pp. 149-160, 1997. DOI: 10.3745/KIPSTE.1997.4.1.149.

[ACM Style]
Lee Jin Yi and Lee Kwang Hyung. 1997. Speaker-Adaptive Speech Synthesis based on Fuzzy Vector Quantizer Mapping and Neural Networks. The Transactions of the Korea Information Processing Society (1994 ~ 2000), 4, 1, (1997), 149-160. DOI: 10.3745/KIPSTE.1997.4.1.149.