TY - JOUR
T1 - Dynamical analysis of contrastive divergence learning
T2 - Restricted Boltzmann machines with Gaussian visible units
AU - Karakida, Ryo
AU - Okada, Masato
AU - Amari, Shun-ichi
N1 - Publisher Copyright:
© 2016 Elsevier Ltd.
PY - 2016/7/1
Y1 - 2016/7/1
N2 - The restricted Boltzmann machine (RBM) is an essential constituent of deep learning, but it is hard to train using maximum likelihood (ML) learning, which minimizes the Kullback-Leibler (KL) divergence. Instead, contrastive divergence (CD) learning has been developed as an approximation of ML learning and is widely used in practice. To clarify the performance of CD learning, in this paper, we analytically derive the fixed points to which the ML and CDn learning rules converge in two types of RBMs: one with Gaussian visible and Gaussian hidden units, and the other with Gaussian visible and Bernoulli hidden units. In addition, we analyze the stability of the fixed points. As a result, we find that the stable points of the CDn learning rule coincide with those of the ML learning rule in a Gaussian-Gaussian RBM. We also reveal that the larger principal components of the input data are extracted at the stable points. Moreover, in a Gaussian-Bernoulli RBM, we find that both ML and CDn learning can extract independent components at one of the stable points. Our analysis demonstrates that the same feature components as those extracted by ML learning are extracted simply by performing CD1 learning. Expanding this study should elucidate the specific solutions obtained by CD learning in other types of RBMs or in deep networks.
AB - The restricted Boltzmann machine (RBM) is an essential constituent of deep learning, but it is hard to train using maximum likelihood (ML) learning, which minimizes the Kullback-Leibler (KL) divergence. Instead, contrastive divergence (CD) learning has been developed as an approximation of ML learning and is widely used in practice. To clarify the performance of CD learning, in this paper, we analytically derive the fixed points to which the ML and CDn learning rules converge in two types of RBMs: one with Gaussian visible and Gaussian hidden units, and the other with Gaussian visible and Bernoulli hidden units. In addition, we analyze the stability of the fixed points. As a result, we find that the stable points of the CDn learning rule coincide with those of the ML learning rule in a Gaussian-Gaussian RBM. We also reveal that the larger principal components of the input data are extracted at the stable points. Moreover, in a Gaussian-Bernoulli RBM, we find that both ML and CDn learning can extract independent components at one of the stable points. Our analysis demonstrates that the same feature components as those extracted by ML learning are extracted simply by performing CD1 learning. Expanding this study should elucidate the specific solutions obtained by CD learning in other types of RBMs or in deep networks.
KW - Component analysis
KW - Contrastive divergence
KW - Deep learning
KW - Restricted Boltzmann machine
KW - Stability of learning algorithms
UR - http://www.scopus.com/inward/record.url?scp=84964528780&partnerID=8YFLogxK
U2 - 10.1016/j.neunet.2016.03.013
DO - 10.1016/j.neunet.2016.03.013
M3 - Article
C2 - 27131468
AN - SCOPUS:84964528780
SN - 0893-6080
VL - 79
SP - 78
EP - 87
JO - Neural Networks
JF - Neural Networks
ER -