International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064



M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 3 Issue 10, October 2014


Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop

Y. K. Patil | Prof. V. S. Nandedkar


Abstract: Document clustering is an important area of data mining. Hadoop is used by companies such as Yahoo, Google, Facebook and Twitter to implement real-time applications. Emails, social media blogs, movie review comments and books are typical inputs for document clustering. This paper focuses on document clustering using Hadoop, a recent technology for processing documents in parallel. The computation time for document clustering on Hadoop is lower than that of Java-based implementations. In this paper, the authors propose the design and implementation of the Tf-Idf, K-means and Hierarchical clustering algorithms on Hadoop.


Keywords: Hadoop, Tf-Idf, Cosine Similarity, K-means, Hierarchical clustering
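
The abstract names Tf-Idf weighting, cosine similarity and K-means as the building blocks of the clustering pipeline. The following is a minimal single-machine sketch of how these pieces could fit together, with the K-means iteration split into a map-style assignment step and a reduce-style centroid update. It is not the authors' Hadoop implementation: the function names (`tf_idf`, `cosine`, `kmeans_map`, `kmeans_reduce`, `kmeans`), the choice of the first k documents as initial centroids, and the fixed iteration count are all assumptions made for illustration. Hierarchical clustering is not covered here.

```python
import math
from collections import Counter, defaultdict


def tf_idf(docs):
    """Return one sparse Tf-Idf vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def kmeans_map(vector, centroids):
    """Map-style step: emit (index of the most similar centroid, vector)."""
    best = max(range(len(centroids)), key=lambda i: cosine(vector, centroids[i]))
    return best, vector


def kmeans_reduce(vectors):
    """Reduce-style step: average all vectors assigned to one centroid."""
    total = defaultdict(float)
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def kmeans(vectors, k, iterations=10):
    """Run a fixed number of K-means iterations and return cluster indices."""
    # Assumption: the first k documents seed the centroids.
    centroids = [dict(v) for v in vectors[:k]]
    assignment = []
    for _ in range(iterations):
        groups = defaultdict(list)
        assignment = []
        for v in vectors:
            idx, vec = kmeans_map(v, centroids)
            groups[idx].append(vec)
            assignment.append(idx)
        # Keep the old centroid if no document was assigned to it.
        centroids = [
            kmeans_reduce(groups[i]) if groups[i] else centroids[i] for i in range(k)
        ]
    return assignment
```

In a Hadoop job, the assignment step would typically run in mappers keyed by centroid index and the averaging step in reducers, with one job per iteration; the sketch only mirrors that division of work in plain functions.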


Edition: Volume 3 Issue 10, October 2014


Pages: 1566 - 1570




How to Cite this Article?

Y. K. Patil, Prof. V. S. Nandedkar, "Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop", International Journal of Science and Research (IJSR), Volume 3 Issue 10, October 2014, pp. 1566-1570, https://www.ijsr.net/get_abstract.php?paper_id=OCT14526

Similar Articles with Keyword 'Hadoop'

Research Paper, Computer Science & Engineering, India, Volume 10 Issue 6, June 2021

Pages: 1188 - 1193

Profit Contribution of Bank Customer from Different Business Liabilities

Vinod Desai | Shalini B Ullagaddi | Vittal A Odeyar


Research Paper, Computer Science & Engineering, India, Volume 11 Issue 1, January 2022

Pages: 1229 - 1231

Big Data in Healthcare

Pratiksha Patil
