TY - JOUR
T1 - Improved method for correcting sample Mahalanobis distance without estimating population eigenvalues or eigenvectors of covariance matrix
AU - Kobayashi, Yasuyuki
N1 - Publisher Copyright:
© 2019, The Author(s).
PY - 2020/8/1
Y1 - 2020/8/1
N2 - The recognition performance of the sample Mahalanobis distance (SMD) deteriorates as the number of learning samples decreases. Therefore, it is important to correct the SMD for a population Mahalanobis distance (PMD) such that it becomes equivalent to the case of infinite learning samples. In order to reduce the computation time and cost for this main purpose, this paper presents a correction method that does not require the estimation of the population eigenvalues or eigenvectors of the covariance matrix. In short, this method only requires the sample eigenvalues of the covariance matrix, number of learning samples, and dimensionality to correct the SMD for the PMD. This method involves the summation of the SMD’s principal components (each of which is divided by its expectation obtained using the delta method), Lawley’s bias estimation, and the variances of the sample eigenvectors. A numerical experiment demonstrates that this method works well for various cases of learning sample number, dimensionality, population eigenvalues sequence, and non-centrality. The application of this method also shows improved performance of estimating a Gaussian mixture model using the expectation–maximization algorithm.
KW - Correction method
KW - Delta method
KW - Gaussian mixture model
KW - Lawley’s bias estimation
KW - Sample eigenvalues and eigenvectors
KW - Sample Mahalanobis distance
UR - https://www.scopus.com/pages/publications/85087955648
U2 - 10.1007/s41060-019-00201-4
DO - 10.1007/s41060-019-00201-4
M3 - Article
AN - SCOPUS:85087955648
SN - 2364-415X
VL - 10
SP - 121
EP - 134
JO - International Journal of Data Science and Analytics
JF - International Journal of Data Science and Analytics
IS - 2
ER -