International Journal of Science and Research (IJSR)


ISSN: 2319-7064



M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 3 Issue 10, October 2014

Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop

Y. K. Patil, Prof. V. S. Nandedkar

Document clustering is an important area of data mining. Companies such as Yahoo, Google, Facebook, and Twitter use Hadoop to implement real-time applications. Emails, social media blogs, movie review comments, and books are typical inputs for document clustering. This paper focuses on document clustering using Hadoop, a relatively new technology for parallel computing over documents. Document clustering takes less computing time on Hadoop than in Java-based implementations. The authors propose the design and implementation of Tf-Idf, K-means, and hierarchical clustering algorithms on Hadoop.

Keywords: Hadoop, Tf-Idf, Cosine Similarity, K-means and Hierarchical clustering
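
As an illustration of two of the keywords above, the following is a minimal single-process sketch of Tf-Idf weighting and cosine similarity. It is not the authors' Hadoop implementation. The function names `tf_idf` and `cosine_similarity`, the sample documents, and the specific weighting (term frequency normalized by document length, multiplied by the natural log of inverse document frequency) are assumptions chosen for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return vectors

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An all-zero vector has no direction.
    return dot / (norm_a * norm_b)

# Hypothetical toy corpus, already tokenized on whitespace.
docs = [
    "hadoop runs mapreduce jobs".split(),
    "hadoop runs clustering jobs".split(),
    "movie review comments".split(),
]
vectors = tf_idf(docs)
```

In this toy corpus the first two documents share three terms, so their similarity is positive, while the third shares none with the first and scores zero. A clustering algorithm such as K-means could use these similarities to group documents.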

Edition: Volume 3 Issue 10, October 2014

Pages: 1566 - 1570


How to Cite this Article?

Y. K. Patil, Prof. V. S. Nandedkar, "Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=OCT14526, Volume 3 Issue 10, October 2014, 1566 - 1570



Similar Articles with Keyword 'Hadoop'

Survey Paper, Computer Science & Engineering, India, Volume 4 Issue 10, October 2015

Pages: 1707 - 1708

Survey Paper on Twitter Sentiment Analysis Using Portar Stemming Algorithm

Nishad Patil, Tingre Sayali, Thorat Kalyani, Shivshetty Swapnil, Patil Shwetal

Survey Paper, Computer Science & Engineering, India, Volume 4 Issue 12, December 2015

Pages: 1290 - 1293

A Review on Need of MapReduce in Big Data Application

Sushant Shirish Nagavkar, Ashishkumar

Case Studies, Computer Science & Engineering, India, Volume 5 Issue 4, April 2016

Pages: 1486 - 1492

An Outlook on India's Healthcare System with a Medical Case Study and Review on Big Data and its Importance in Healthcare

Saikrishna Depuru, Dr. R. Raj Kumar

Research Paper, Computer Science & Engineering, India, Volume 6 Issue 6, June 2017

Pages: 1711 - 1716

Live Data Stream Classification for Reducing Query Processing Time: Design and Analysis

Spraha Kamriya, Vandana Kate

Survey Paper, Computer Science & Engineering, India, Volume 4 Issue 11, November 2015

Pages: 1249 - 1251

Survey on Resource Allocation in Phase-Level using MapReduce in Hadoop

Suryakant S. Bhalke



Similar Articles with Keyword 'Tf-Idf'

Research Paper, Computer Science & Engineering, India, Volume 5 Issue 5, May 2016

Pages: 1964 - 1967

Improving Performance of Hindi-English based Cross Language Information Retrieval using Selective Documents Technique and Query Expansion

Aditi Agrawal, Dr. A. J. Agrawal

Review Papers, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 981 - 984

Using SVM and Stopword removal method in Microblogging Classroom

Vidya Dhuttargaon, Amit R. Sarkar

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 5 Issue 7, July 2016

Pages: 1240 - 1244

Implementing K-Means Clustering Algorithm Using MapReduce Paradigm

Botcha Chandrasekhara Rao, Medara Rambabu

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 5 Issue 6, June 2016

Pages: 2206 - 2210

Document Clustering using Improved K-means Algorithm

Anjali Vashist, Rajender Nath

Review Papers, Computer Science & Engineering, India, Volume 5 Issue 1, January 2016

Pages: 710 - 712

World Wide Web Metasearch Using TF-IDF Method

S. P. Phadtare, S. B. Magdum



Similar Articles with Keyword 'Cosine Similarity'

Research Paper, Computer Science & Engineering, India, Volume 5 Issue 5, May 2016

Pages: 1964 - 1967

Improving Performance of Hindi-English based Cross Language Information Retrieval using Selective Documents Technique and Query Expansion

Aditi Agrawal, Dr. A. J. Agrawal

Review Papers, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 981 - 984

Using SVM and Stopword removal method in Microblogging Classroom

Vidya Dhuttargaon, Amit R. Sarkar

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 5 Issue 6, June 2016

Pages: 2206 - 2210

Document Clustering using Improved K-means Algorithm

Anjali Vashist, Rajender Nath

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 3 Issue 10, October 2014

Pages: 1566 - 1570

Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop

Y. K. Patil, Prof. V. S. Nandedkar

Dissertation Chapters, Computer Science & Engineering, India, Volume 3 Issue 4, April 2014

Pages: 178 - 184

Mining Contents in Web Pages and Ranking of Web Pages Using Cosine Similarity

Divya C.
