M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 5 Issue 7, July 2016 | Popularity: 6.8 / 10
Implementing K-Means Clustering Algorithm Using MapReduce Paradigm
Botcha Chandrasekhara Rao, Medara Rambabu
Abstract: Clustering is a useful data mining technique that groups data points so that the points within a single group have similar characteristics, while the points in different groups are dissimilar. Partitioning methods such as the k-means algorithm are among the most widely used clustering algorithms. As applications increasingly have to deal with vast amounts of data, clustering such big data is a challenging problem. Recently, running partitioning clustering algorithms on large clusters of commodity machines using the MapReduce framework has received a lot of attention. The traditional way of clustering text documents is the vector space model, in which tf-idf weights are used by the k-means algorithm together with a supporting similarity measure. This project presents an approach to clustering text documents; results obtained by executing the MapReduce k-means algorithm on a single-node cluster show that the performance of the algorithm increases as the text corpus grows.
Keywords: Vector space model, map reduce, text clustering, map reduce k-means, Hadoop
Edition: Volume 5 Issue 7, July 2016
Pages: 1240 - 1244
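As an illustration of the approach the abstract describes, the following is a minimal sketch of one k-means iteration split into a map step and a reduce step. It is not the authors' implementation and does not use Hadoop: documents are assumed to be precomputed tf-idf vectors stored as term-to-weight dictionaries, cosine similarity is assumed as the similarity measure, and the function names (cosine_similarity, map_phase, reduce_phase, kmeans_iteration) are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_phase(doc_vector, centroids):
    """Map: emit (index of the most similar centroid, document vector)."""
    best = max(
        range(len(centroids)),
        key=lambda i: cosine_similarity(doc_vector, centroids[i]),
    )
    return best, doc_vector


def reduce_phase(vectors):
    """Reduce: average all vectors assigned to one centroid."""
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    n = len(vectors)
    return {term: weight / n for term, weight in total.items()}


def kmeans_iteration(docs, centroids):
    """Run one map/shuffle/reduce round and return the updated centroids."""
    groups = defaultdict(list)
    for doc in docs:
        key, value = map_phase(doc, centroids)
        groups[key].append(value)  # stands in for the shuffle step
    # A centroid with no assigned documents is kept unchanged.
    return [
        reduce_phase(groups[i]) if groups[i] else centroids[i]
        for i in range(len(centroids))
    ]
```

In a real MapReduce job the grouping by centroid index would be done by the framework's shuffle, and kmeans_iteration would be repeated until the centroids stop changing.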
Similar Articles
Downloads: 107
Review Papers, Computer Science & Engineering, India, Volume 3 Issue 10, October 2014
Pages: 2135 - 2136
Text Clustering and Classification on the Use of Side Information
Shilpa S. Raut, Prof. V. B. Maral
Downloads: 110
Survey Paper, Computer Science & Engineering, India, Volume 3 Issue 12, December 2014
Pages: 1787 - 1791
Document Clustering Approach for Forensic Analysis: A Survey
Prachi K. Khairkar, D. A. Phalke
Downloads: 111
Survey Paper, Computer Science & Engineering, India, Volume 3 Issue 11, November 2014
Pages: 2366 - 2368
A Survey on Effective Quality Enhancement of Text Clustering & Classification Using METADATA
Padmaja Shivane, Rakesh Rajani
Downloads: 112
M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 5 Issue 6, June 2016
Pages: 1207 - 1210
Out Lier Detection and Clustering Analysis in Data Stream Classification
Neethu S, Sajni Nirmal
Downloads: 113
Review Papers, Computer Science & Engineering, India, Volume 3 Issue 5, May 2014
Pages: 1735 - 1738
Text Document Clustering Approach: A Brief Review of Literature
Ruchika Mavis Daniel, Arun Kumar Shukla