KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

Speaker Adaptation Using i-Vector Based Clustering

Vol. 14, No. 7, July 31, 2020
DOI: 10.3837/tiis.2020.07.003

Abstract

We propose a novel speaker adaptation method based on acoustic model clustering. The similarity between speakers is defined by the cosine distance between their i-vectors (intermediate vectors), and several efficient clustering algorithms are applied to obtain speaker subsets with different characteristics. The speaker-independent model is then retrained on the training data of each speaker subset, as grouped by the clustering results, and an unknown utterance is recognized by the retrained model of the closest cluster. The proposed method is applied to a large-scale speech recognition system built on a hybrid hidden Markov model and deep neural network framework. Word error rates were evaluated in an experiment on the Resource Management database. Compared with the conventional speaker-independent speech recognition model, the proposed speaker adaptation method using i-vector based clustering improved performance by as much as 12.2% relative for the conventional fully connected neural network, and by as much as 10.5% relative for the bidirectional long short-term memory network.
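
The clustering and model-selection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithms used, so spherical k-means (k-means under cosine similarity) stands in as an assumption, and the function names `unit_rows`, `init_centroids`, `cluster_speakers`, and `select_cluster` are hypothetical. The i-vectors are toy 3-dimensional values, and i-vector extraction and acoustic model retraining are outside the scope of the sketch.

```python
import numpy as np


def unit_rows(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def init_centroids(x, k):
    """Deterministic farthest-point start (an illustrative choice, not from the paper)."""
    chosen = [0]
    while len(chosen) < k:
        # For every speaker, similarity to its most similar chosen centroid
        closest = (x @ x[chosen].T).max(axis=1)
        # Pick the speaker least similar to all centroids chosen so far
        chosen.append(int(np.argmin(closest)))
    return x[chosen].copy()


def cluster_speakers(ivectors, k, iters=20):
    """Spherical k-means: group speaker i-vectors by cosine similarity."""
    x = unit_rows(np.asarray(ivectors, dtype=float))
    centroids = init_centroids(x, k)
    for _ in range(iters):
        labels = np.argmax(x @ centroids.T, axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
        centroids = unit_rows(centroids)
    labels = np.argmax(x @ centroids.T, axis=1)
    return labels, centroids


def select_cluster(utterance_ivector, centroids):
    """Return the index of the cluster whose centroid is closest in cosine terms."""
    u = unit_rows(np.asarray(utterance_ivector, dtype=float)[None, :])
    return int(np.argmax(u @ centroids.T))


# Toy i-vectors for five training speakers (made-up values for illustration)
speakers = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [1.0, 0.0, 0.1],
]
labels, centroids = cluster_speakers(speakers, k=2)

# An unknown utterance is routed to the model retrained on the closest cluster
chosen = select_cluster([0.05, 1.0, 0.0], centroids)
print("speaker clusters:", labels.tolist(), "selected cluster:", chosen)
```

In a full system, each cluster index would map to a copy of the speaker-independent acoustic model retrained on that cluster's training data, and `select_cluster` would decide which copy decodes the incoming utterance.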



Cite this article

[IEEE Style]
M. Kim, G. Jang, J. Kim, and M. Lee, "Speaker Adaptation Using i-Vector Based Clustering," KSII Transactions on Internet and Information Systems, vol. 14, no. 7, pp. 2785-2799, 2020. DOI: 10.3837/tiis.2020.07.003.

[ACM Style]
Minsoo Kim, Gil-Jin Jang, Ji-Hwan Kim, and Minho Lee. 2020. Speaker Adaptation Using i-Vector Based Clustering. KSII Transactions on Internet and Information Systems, 14, 7, (2020), 2785-2799. DOI: 10.3837/tiis.2020.07.003.

[BibTeX Style]
@article{tiis:23714, title={Speaker Adaptation Using i-Vector Based Clustering}, author={Minsoo Kim and Gil-Jin Jang and Ji-Hwan Kim and Minho Lee}, journal={KSII Transactions on Internet and Information Systems}, DOI={10.3837/tiis.2020.07.003}, volume={14}, number={7}, year={2020}, month={July}, pages={2785--2799}}