Implementing K-Means Clustering Algorithm Using MapReduce Paradigm
International Journal of Science and Research (IJSR)

www.ijsr.net | Open Access | Fully Refereed | Peer Reviewed International Journal

ISSN: 2319-7064

M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 5 Issue 7, July 2016


Botcha Chandrasekhara Rao, Medara Rambabu

Clustering is a useful data mining technique that groups data points so that the points within a single group have similar characteristics, while the points in different groups are dissimilar. Partitioning methods such as the k-means algorithm are among the most widely used clustering algorithms. As more and more applications must deal with vast amounts of data, clustering such big data is a challenging problem. Recently, running partitioning clustering algorithms on a large cluster of commodity machines using the MapReduce framework has received a lot of attention. The traditional way of clustering text documents is the vector space model, in which tf-idf weights are used by the k-means algorithm together with a supporting similarity measure. This project presents an approach to clustering text documents; results obtained by executing the MapReduce k-means algorithm on a single-node cluster show that the performance of the algorithm improves as the size of the text corpus increases.
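
The approach summarized above can be illustrated with a minimal sketch of one k-means iteration written in map/reduce style. This is not the authors' implementation and does not use the Hadoop API: it is a single-process Python simulation, all function names (`tf_idf`, `cosine`, `map_phase`, `reduce_phase`, `kmeans_iteration`) are hypothetical, and cosine similarity is only assumed as the similarity measure, since the abstract does not name one.

```python
import math
from collections import Counter, defaultdict

def tf_idf(docs):
    """Turn tokenized documents into sparse tf-idf vectors (dict: term -> weight)."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def map_phase(vector, centroids):
    """Mapper: emit (index of the most similar centroid, document vector)."""
    best = max(range(len(centroids)),
               key=lambda i: cosine(vector, centroids[i]))
    return best, vector

def reduce_phase(vectors):
    """Reducer: average all vectors that were assigned to one centroid."""
    total = defaultdict(float)
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def kmeans_iteration(vectors, centroids):
    """One k-means pass: map, group by key (the shuffle), then reduce."""
    groups = defaultdict(list)  # stands in for the framework's shuffle step
    for v in vectors:
        key, value = map_phase(v, centroids)
        groups[key].append(value)
    # A centroid that received no documents is kept unchanged.
    return [reduce_phase(groups[i]) if groups[i] else centroids[i]
            for i in range(len(centroids))]

# Toy corpus (illustrative data, not from the paper).
docs = [
    ["hadoop", "cluster", "node"],
    ["hadoop", "mapreduce", "node"],
    ["recipe", "pasta", "sauce"],
    ["pasta", "sauce", "cheese"],
]
vectors = tf_idf(docs)
centroids = [vectors[0], vectors[2]]  # naive initialization for the sketch
new_centroids = kmeans_iteration(vectors, centroids)
```

In a real MapReduce job the mappers would run in parallel over splits of the corpus, the framework would perform the grouping, and the driver would repeat the iteration until the centroids stop changing; the sketch only shows the data flow of a single pass.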

Keywords: Vector space model, map reduce, text clustering, map reduce k-means, Hadoop

Edition: Volume 5 Issue 7, July 2016

Pages: 1240 - 1244


How to Cite this Article?

Botcha Chandrasekhara Rao, Medara Rambabu, "Implementing K-Means Clustering Algorithm Using MapReduce Paradigm", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=14071601, Volume 5 Issue 7, July 2016, 1240 - 1244



