KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

Vol. 7, No. 1, January 30, 2013
DOI: 10.3837/tiis.2013.01.006

Abstract

Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Because the likelihood of these models is intractable, training any topic model requires an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model is still computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of the parameters of a topic model. In traditional approximation algorithms, every random variable is sampled or estimated over a single predefined burn-in period; our method is instead based on the observation that the random variable nodes in a topic model converge after different numbers of iterations. During the iterative approximation process, the proposed method allows each random variable node to be terminated, or deactivated, as soon as it has converged. Compared to traditional approximation procedures, in which every node is usually deactivated at the same time, the proposed method therefore makes inference more efficient in terms of time and memory. We do not propose a new approximation algorithm, but a new process applicable to existing approximation algorithms. Through experiments, we show the time and memory efficiency of the method and discuss the tradeoff between the efficiency of the approximation process and parameter consistency.
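The idea of deactivating each node as it converges can be illustrated with a minimal sketch, here using collapsed Gibbs sampling for LDA on a toy corpus. The abstract does not state the convergence test the authors use, so the criterion below (a token's topic assignment stays unchanged for `patience` consecutive sweeps) is purely an assumption for illustration, as are the function name, the parameter names, and the toy data. It should not be read as the paper's actual procedure.

```python
import random


def gibbs_lda_with_deactivation(docs, n_topics, vocab_size, alpha=0.1,
                                beta=0.01, max_sweeps=200, patience=5, seed=0):
    """Collapsed Gibbs sampling for LDA with per-token deactivation.

    ASSUMPTION: a token counts as "converged" once its sampled topic has
    stayed the same for `patience` consecutive sweeps. This rule is a
    hypothetical stand-in; the source abstract does not specify one.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(n_topics)
            z[d].append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    stable = [[0] * len(doc) for doc in docs]  # unchanged-sweep counters
    active = [(d, i) for d, doc in enumerate(docs) for i in range(len(doc))]
    samples_drawn = 0

    for _ in range(max_sweeps):
        if not active:
            break  # every node has been deactivated
        still_active = []
        for d, i in active:
            w, old = docs[d][i], z[d][i]
            # Remove the token's current assignment from the counts.
            doc_topic[d][old] -= 1
            topic_word[old][w] -= 1
            topic_total[old] -= 1
            # Standard collapsed Gibbs conditional (unnormalized).
            weights = [
                (doc_topic[d][k] + alpha) * (topic_word[k][w] + beta)
                / (topic_total[k] + vocab_size * beta)
                for k in range(n_topics)
            ]
            new = rng.choices(range(n_topics), weights=weights)[0]
            samples_drawn += 1
            z[d][i] = new
            doc_topic[d][new] += 1
            topic_word[new][w] += 1
            topic_total[new] += 1
            # Deactivate the token once it has been stable long enough;
            # its last assignment stays in the count tables.
            stable[d][i] = stable[d][i] + 1 if new == old else 0
            if stable[d][i] < patience:
                still_active.append((d, i))
        active = still_active

    return z, topic_word, samples_drawn, len(active)


# Toy corpus of word ids (hypothetical data).
toy_docs = [[0, 1, 0, 1, 0], [2, 3, 2, 3, 2], [0, 1, 2, 3]]
z, topic_word, drawn, remaining = gibbs_lda_with_deactivation(
    toy_docs, n_topics=2, vocab_size=4)
print("samples drawn:", drawn, "tokens still active:", remaining)
```

In this sketch a conventional run would draw `max_sweeps` samples for every token, whereas deactivated tokens are skipped in later sweeps, so `drawn` can be much smaller. The cost is that a frozen token can no longer react to changes in the remaining nodes, which is one way to see the tradeoff between efficiency and parameter consistency mentioned in the abstract.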




Cite this article

[IEEE Style]
Young-Seob Jeong, Sou-Young Jin and Ho-Jin Choi, "Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model," KSII Transactions on Internet and Information Systems, vol. 7, no. 1, pp. 81-98, 2013. DOI: 10.3837/tiis.2013.01.006

[ACM Style]
Jeong, Y., Jin, S., and Choi, H. 2013. Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model. KSII Transactions on Internet and Information Systems, 7, 1, (2013), 81-98. DOI: 10.3837/tiis.2013.01.006