International Journal of Science and Research (IJSR)

Call for Papers | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064

M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 5 Issue 6, June 2016

Document Clustering using Improved K-means Algorithm

Anjali Vashist | Rajender Nath

Abstract: Clustering is an efficient technique for organizing a large quantity of unordered text documents into a small number of meaningful, coherent clusters, thereby providing a basis for intuitive and informative navigation and browsing. It has been studied widely because of its application in areas such as web mining, search engines, and information extraction. Documents are clustered according to various similarity measures. The existing K-means document clustering algorithm relies on random center generation, so the clusters it produces differ from run to run. This paper presents an Improved Document Clustering algorithm that generates a number of clusters for any set of text documents based on fixed center generation, collects only the exclusive words from the different documents in the dataset, and uses the cosine similarity measure to place similar documents in the proper clusters. Experimental results show that the proposed algorithm outperforms the existing algorithm in terms of F-measure, recall, precision, and time complexity.

Keywords: Document Clustering, Cosine Similarity, Term Finder, Tf-Idf, Threshold
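
The general idea summarized in the abstract (TF-IDF weighting, cosine similarity, and fixed rather than random starting centers) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the whitespace tokenizer, the choice of seed documents by index, and the iteration limit are all assumptions made for illustration, and the paper's term finder and threshold steps are not reproduced here.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = {}
    for vec in vectors:
        for term, weight in vec.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def cluster(docs, seed_indices, max_iter=10):
    """K-means-style clustering with fixed (non-random) initial centers.

    seed_indices is a hypothetical parameter: the documents at these
    positions serve as the starting centers, so repeated runs on the
    same input always give the same labels.
    """
    vectors = tf_idf_vectors(docs)
    centers = [vectors[i] for i in seed_indices]
    labels = None
    for _ in range(max_iter):
        # Assign each document to the most similar center (ties go to the first).
        new_labels = [
            max(range(len(centers)), key=lambda c: cosine_similarity(vec, centers[c]))
            for vec in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Recompute each center as the mean of its current members.
        for c in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = mean_vector(members)
    return labels


docs = [
    "cat dog pet",
    "dog cat animal",
    "stock market trade",
    "market stock price",
]
print(cluster(docs, seed_indices=[0, 2]))  # prints [0, 0, 1, 1]
```

Because the starting centers are fixed by index rather than drawn at random, calling `cluster` twice on the same documents returns identical labels, which is the property the abstract contrasts with standard K-means.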

Edition: Volume 5 Issue 6, June 2016

Pages: 2206 - 2210

How to Cite this Article?

Anjali Vashist, Rajender Nath, "Document Clustering using Improved K-means Algorithm", International Journal of Science and Research (IJSR), Volume 5 Issue 6, June 2016, pp. 2206-2210.

Similar Articles with Keyword 'Document Clustering'

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 4 Issue 6, June 2015

Pages: 2114 - 2117

Enhanced Document Clustering for Forensic Analysis

Rahul D. Kopulwar | Fazeel Irshad Zama

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 1983 - 1986

An Improved Hierarchical Technique for Document Clustering

Priti B. Kudal | Prof. Manisha Naoghare