Document Clustering using Improved K-means Algorithm
International Journal of Science and Research (IJSR)

www.ijsr.net | Open Access | Fully Refereed | Peer Reviewed International Journal

ISSN: 2319-7064

M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 5 Issue 6, June 2016

Anjali Vashist, Rajender Nath

Clustering is an efficient technique for organizing a large quantity of unordered text documents into a small number of meaningful, coherent clusters, thereby providing a basis for intuitive and informative navigation and browsing mechanisms. It has been studied widely because of its broad application in areas such as web mining, search engines, and information extraction. Documents are clustered according to various similarity measures. The existing K-means document clustering algorithm relies on random center generation, so the clusters it generates differ on every run. In this paper, an improved document clustering algorithm is presented that generates a number of clusters for any set of text documents based on fixed center generation, collects only the exclusive words from the different documents in the dataset, and uses the cosine similarity measure to place similar documents in the proper clusters. Experimental results show that the accuracy of the proposed algorithm is higher than that of the existing algorithm in terms of F-measure, recall, precision, and time complexity.
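The contrast between random and fixed center generation can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the fixed centers are chosen or how exclusive words are selected, so the sketch assumes the first k documents serve as initial centers and uses plain whitespace tokens. The function names (tf_idf_vectors, cosine, cluster) are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is zero."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = {}
    for vec in vectors:
        for term, weight in vec.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def cluster(docs, k, max_iter=20):
    """K-means with cosine similarity and deterministic initial centers."""
    vectors = tf_idf_vectors(docs)
    # Assumption: the first k documents are the fixed initial centers.
    centers = [dict(vec) for vec in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign each document to its most similar center (ties -> lowest index).
        new_labels = [
            max(range(k), key=lambda c: cosine(vec, centers[c]))
            for vec in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each center as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = mean_vector(members)
    return labels


docs = [
    "cats purr and cats sleep",
    "stocks fell as markets closed",
    "kittens and cats purr",
    "markets and stocks rallied",
]
print(cluster(docs, 2))
```

Because the initial centers are fixed rather than drawn at random, repeated calls on the same documents return the same assignment, which is the property the abstract attributes to the improved algorithm.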

Keywords: Document Clustering, Cosine Similarity, Term Finder, Tf-Idf, Threshold

Edition: Volume 5 Issue 6, June 2016

Pages: 2206 - 2210

How to Cite this Article?

Anjali Vashist, Rajender Nath, "Document Clustering using Improved K-means Algorithm", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=NOV164735, Volume 5 Issue 6, June 2016, 2206 - 2210

Similar Articles with Keyword 'Document Clustering'

Survey Paper, Computer Science & Engineering, India, Volume 3 Issue 12, December 2014

Pages: 1787 - 1791

Document Clustering Approach for Forensic Analysis: A Survey

Prachi K. Khairkar, D. A. Phalke

Review Papers, Computer Science & Engineering, India, Volume 3 Issue 5, May 2014

Pages: 1735 - 1738

Text Document Clustering Approach: A Brief Review of Literature

Ruchika Mavis Daniel, Arun Kumar Shukla

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 4 Issue 6, June 2015

Pages: 2114 - 2117

Enhanced Document Clustering for Forensic Analysis

Rahul D. Kopulwar, Fazeel Irshad Zama

Review Papers, Computer Science & Engineering, India, Volume 3 Issue 10, October 2014

Pages: 2218 - 2220

A Review of Modern Document Clustering Algorithm

Priti B. Kudal, Prof. Manisha Naoghare

Survey Paper, Computer Science & Engineering, India, Volume 3 Issue 11, November 2014

Pages: 2321 - 2323

A Survey of Forensic Analysis on Document Clustering

Nikhil Nagnath Boriwale, Satish.R.Todmal

Similar Articles with Keyword 'Cosine Similarity'

Research Paper, Computer Science & Engineering, India, Volume 5 Issue 5, May 2016

Pages: 1964 - 1967

Improving Performance of Hindi-English based Cross Language Information Retrieval using Selective Documents Technique and Query Expansion

Aditi Agrawal, Dr. A. J. Agrawal

Review Papers, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 981 - 984

Using SVM and Stopword removal method in Microblogging Classroom

Vidya Dhuttargaon, Amit R. Sarkar

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 5 Issue 6, June 2016

Pages: 2206 - 2210

Document Clustering using Improved K-means Algorithm

Anjali Vashist, Rajender Nath

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 3 Issue 10, October 2014

Pages: 1566 - 1570

Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop

Y. K. Patil, Prof. V. S. Nandedkar

Dissertation Chapters, Computer Science & Engineering, India, Volume 3 Issue 4, April 2014

Pages: 178 - 184

Mining Contents in Web Pages and Ranking of Web Pages Using Cosine Similarity

Divya C.

Similar Articles with Keyword 'Tf-Idf'

Research Paper, Computer Science & Engineering, India, Volume 5 Issue 5, May 2016

Pages: 1964 - 1967

Improving Performance of Hindi-English based Cross Language Information Retrieval using Selective Documents Technique and Query Expansion

Aditi Agrawal, Dr. A. J. Agrawal

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 5 Issue 7, July 2016

Pages: 1240 - 1244

Implementing K-Means Clustering Algorithm Using MapReduce Paradigm

Botcha Chandrasekhara Rao, Medara Rambabu

Review Papers, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 981 - 984

Using SVM and Stopword removal method in Microblogging Classroom

Vidya Dhuttargaon, Amit R. Sarkar

Review Papers, Computer Science & Engineering, India, Volume 5 Issue 1, January 2016

Pages: 710 - 712

World Wide Web Metasearch Using TF-IDF Method

S. P. Phadtare, S. B. Magdum

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 5 Issue 6, June 2016

Pages: 2206 - 2210

Document Clustering using Improved K-means Algorithm

Anjali Vashist, Rajender Nath

Similar Articles with Keyword 'Threshold'

Research Paper, Computer Science & Engineering, India, Volume 9 Issue 12, December 2020

Pages: 1615 - 1620

Traffic Sign Detection via Graph Based Ranking and Segmentation Algorithm

Premsai Cheedella

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 4 Issue 10, October 2015

Pages: 1963 - 1966

An Enhanced Keyword Search Using Flexible Ranking Over Differential Query Services in Cloud

Rasiya V M, Shanavas M A

Research Paper, Computer Science & Engineering, India, Volume 5 Issue 4, April 2016

Pages: 1690 - 1694

Analytics of Application Resource Utilization within the Virtual Machine

Priyanka H

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 6, June 2015

Pages: 1353 - 1357

Analysis of Energy Efficiency and Throughput for IEEE 802.11 based Mobile Adhoc Networks

Manpreet Kaur, Pooja Saini

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 12, December 2015

Pages: 2004 - 2008

Document Classification Using Part of Speech in Text Mining

Sonam Tripathi, Tripti Sharma
