Notes on the low rank matrix approximation of kernel
Hiroshi Tsukahara
This document discusses low-rank matrix approximation of kernel matrices for kernel methods in machine learning. It notes that kernel matrices often have low rank compared to their size, and that this property can be exploited to reduce the computational complexity of kernel methods. Specifically, it proposes approximating the kernel matrix as the product of two low-rank matrices. This allows the solution to be computed in terms of the low-rank matrices rather than the full kernel matrix, reducing the complexity from O(n³) to O(r²n), where r is the rank. Several algorithms for deriving the low-rank approximation are mentioned, including the Nyström approximation and incomplete Cholesky decomposition.
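To make the complexity reduction concrete, the sketch below builds a Nyström factor G (n x m, with m playing the role of r) so that K ≈ G Gᵀ, and then solves kernel ridge regression through the Woodbury identity so that only m x m systems are ever formed, giving the O(r²n)-type cost instead of O(n³). This is a minimal sketch under assumptions of my own: the RBF kernel, the kernel ridge regression setting, and all function names are illustrative choices, not taken from the document.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel between the rows of X and Y.
    d2 = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def nystrom_factor(X, m, gamma=1.0, jitter=1e-8):
    # Pick m landmark points and build G (n x m) so that K ~= G @ G.T.
    idx = np.random.choice(len(X), m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)          # n x m block of K
    W = C[idx]                                # m x m landmark kernel
    # W^{-1/2} via eigendecomposition (W is symmetric PSD; jitter for stability).
    vals, vecs = np.linalg.eigh(W + jitter * np.eye(m))
    return C @ vecs @ np.diag(vals ** -0.5) @ vecs.T

def krr_solve_lowrank(G, y, lam=1e-2):
    # Solve (G G^T + lam I) alpha = y via the Woodbury identity:
    # alpha = (y - G (lam I_m + G^T G)^{-1} G^T y) / lam, costing O(n m^2).
    m = G.shape[1]
    inner = np.linalg.solve(lam * np.eye(m) + G.T @ G, G.T @ y)
    return (y - G @ inner) / lam

# Toy usage: kernel ridge regression with the Nystrom factor standing in for K.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=2000)
G = nystrom_factor(X, m=100)
alpha = krr_solve_lowrank(G, y)
```

The full n x n kernel matrix is never formed; only the n x m block and m x m systems appear, which is where the claimed savings come from.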
The document proposes a new method called Sparse Isotropic Hashing (SIH) for learning compact binary codes for image retrieval. SIH imposes additional constraints of sparsity and isotropic variance on the hash functions to make the learning problem better posed. It formulates SIH as an optimization problem that balances orthogonality, isotropic variance, and sparsity, and develops an algorithm to solve it. Experiments on a landmark dataset show that SIH achieves retrieval accuracy comparable to the state-of-the-art method while learning the hash codes 20 times faster.
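As a rough illustration of the three terms the summary mentions, the sketch below evaluates an orthogonality penalty, a variance-equalization (isotropy) penalty, and an L1 sparsity penalty for a projection matrix W, and shows how the resulting sign-based binary codes would be ranked by Hamming distance at retrieval time. The concrete penalties, the sign binarization, and all names are generic hashing conventions assumed for illustration; the document's actual SIH objective and learning algorithm may differ.

```python
import numpy as np

def hash_codes(X, W):
    # Binary codes: project the (zero-centered) features and take the sign.
    return (X @ W > 0).astype(np.uint8)

def penalty_terms(X, W):
    # Illustrative versions of the three terms the summary mentions;
    # the exact objective in the document may be defined differently.
    P = X @ W
    orthogonality = np.linalg.norm(W.T @ W - np.eye(W.shape[1]), "fro") ** 2
    var = P.var(axis=0)
    isotropy = np.square(var - var.mean()).sum()   # equal variance per bit
    sparsity = np.abs(W).sum()                     # L1 norm of the projection
    return orthogonality, isotropy, sparsity

def hamming_rank(query_code, db_codes):
    # Retrieval: rank database items by Hamming distance to the query code.
    return np.argsort((db_codes != query_code).sum(axis=1))

# Toy usage with a random projection standing in for a learned W.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))
X -= X.mean(axis=0)
W = rng.normal(size=(64, 32))
codes = hash_codes(X, W)
ranking = hamming_rank(codes[0], codes)
```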