In recent years, clustering has become an increasingly important research topic in machine learning, pattern recognition, and related disciplines such as data mining, bioinformatics, computer vision, page ranking, and social network analysis. This talk covers the basic definition of a cluster and the conventional approaches for extracting clusters from data, with an emphasis on Non-negative Matrix Factorization (NMF), a recent advance for both the clustering problem and the dimension reduction task.
NMF has been studied and applied for various purposes, especially for high-dimensional data whose entries are non-negative. The goal of NMF is to provide a non-negative low-rank approximation of the original data. Imposing non-negativity constraints directly on both the basis factors and the reconstruction coefficients yields a low-rank approximation that is often more meaningful and interpretable than other low-rank approximations such as the singular value decomposition (SVD), even though the truncated SVD attains the smallest possible reconstruction error for a given rank. Furthermore, because clustering posed as a discrete optimization problem is NP-hard, NMF-based low-rank approximation provides a relaxation of the clustering problem, with variants that are equivalent to the K-means clustering family and to spectral clustering.
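To make the idea concrete, here is a minimal sketch of NMF used for clustering. It assumes the Frobenius-norm objective, minimizing ||X - WH||_F^2 subject to W >= 0 and H >= 0, and uses the multiplicative update rule of Lee and Seung, which is one common solver and not necessarily the one discussed in the talk. The function name `nmf`, its parameters, and the toy data matrix are illustrative assumptions. Columns of X are samples, and each sample is assigned to the factor with the largest coefficient in H.

```python
import numpy as np

def nmf(X, rank, n_iter=500, eps=1e-10, seed=0):
    """Approximate a non-negative matrix X (features x samples) as W @ H
    with W >= 0 and H >= 0, using multiplicative updates (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Element-wise updates; they preserve non-negativity of W and H.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: 4 features, 6 samples forming two groups.
rng = np.random.default_rng(1)
X = np.array([
    [5.0, 5.0, 5.0, 0.0, 0.0, 0.0],
    [4.0, 4.0, 4.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 3.0, 3.0, 3.0],
    [0.0, 0.0, 0.0, 6.0, 6.0, 6.0],
]) + 0.1 * rng.random((4, 6))  # small non-negative noise

W, H = nmf(X, rank=2)

# Rescale so that columns of W have unit norm; this removes the scaling
# ambiguity between W and H before comparing coefficients.
norms = np.linalg.norm(W, axis=0)
W_unit = W / norms
H_scaled = H * norms[:, None]

# Cluster label of each sample = index of its largest coefficient.
labels = H_scaled.argmax(axis=0)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("labels:", labels)
print("relative reconstruction error:", rel_error)
```

The argmax over the columns of H plays the role of the hard assignment step in K-means, while the factorization itself is the continuous relaxation: the entries of H are allowed to take any non-negative value instead of being restricted to a 0/1 cluster indicator.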