M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 5 Issue 6, June 2016
Document Clustering using Improved K-means Algorithm
Anjali Vashist, Rajender Nath
Abstract: Clustering is an efficient technique that organizes a large quantity of unordered text documents into a small number of meaningful, coherent clusters, thereby providing a basis for intuitive and informative navigation and browsing mechanisms. It has been studied widely by researchers because of its broad application in areas such as web mining, search engines, and information extraction. It groups documents according to various similarity measures. The existing K-means document clustering algorithm is based on random center generation, so the clusters it generates differ from run to run. In this paper, an Improved Document Clustering algorithm is presented that generates the number of clusters for any set of text documents based on fixed center generation, collects only exclusive words from the different documents in the dataset, and uses the cosine similarity measure to place similar documents in the proper clusters. Experimental results show that the proposed algorithm outperforms the existing algorithm in terms of F-Measure, Recall, Precision, and time complexity.
Keywords: Document Clustering, Cosine Similarity, Term Finder, Tf-Idf, Threshold
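The abstract does not spell out the algorithm, so the following is only a minimal sketch of the ingredients it names (Tf-Idf weighting, cosine similarity, a threshold, and deterministic centers), not the authors' method. It assumes a simple single-pass scheme in which a document joins the most similar existing cluster if the cosine similarity to that cluster's seed document reaches a threshold, and otherwise becomes the fixed center of a new cluster. The function names, the whitespace tokenizer, the smoothed IDF formula, and the threshold value are all illustrative assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: lowercase whitespace tokenization, no stop-word removal.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed IDF, so weights stay positive for common terms.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_documents(docs, threshold=0.3):
    """Assign each document a cluster label in one deterministic pass.

    Hypothetical scheme (not taken from the paper): the seed document of
    each cluster acts as its fixed center, and the number of clusters
    follows from the threshold instead of being chosen in advance.
    """
    vectors = tfidf_vectors(docs)
    centers = []  # index of the seed document for each cluster
    labels = []
    for i, vec in enumerate(vectors):
        best_cluster, best_sim = None, threshold
        for cluster_id, center_idx in enumerate(centers):
            sim = cosine_similarity(vec, vectors[center_idx])
            if sim >= best_sim:
                best_cluster, best_sim = cluster_id, sim
        if best_cluster is None:
            # No center is similar enough: this document seeds a new cluster.
            centers.append(i)
            best_cluster = len(centers) - 1
        labels.append(best_cluster)
    return labels


sample_docs = [
    "apple banana apple fruit",
    "fruit apple banana",
    "engine car wheel",
    "car engine road",
]
print(cluster_documents(sample_docs, threshold=0.3))  # [0, 0, 1, 1]
```

Because no random initialization is involved, repeated runs over the same documents in the same order give the same labels, which is the property the abstract contrasts with random-center K-means. The result does depend on document order and on the chosen threshold.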
Edition: Volume 5 Issue 6, June 2016
Pages: 2206 - 2210
Similar Articles
Student Project, Computer Science & Engineering, India, Volume 11 Issue 5, May 2022
Pages: 650 - 654
Automatic Text Summarization and Audio Generation
Tanooja K, Tejasri K, Akhilesh T, Prasanna Kavya M
M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 4 Issue 6, June 2015
Pages: 2114 - 2117
Enhanced Document Clustering for Forensic Analysis
Rahul D. Kopulwar, Fazeel Irshad Zama
Research Paper, Computer Science & Engineering, India, Volume 5 Issue 5, May 2016
Pages: 1964 - 1967
Improving Performance of Hindi-English based Cross Language Information Retrieval using Selective Documents Technique and Query Expansion
Aditi Agrawal, Dr. A. J. Agrawal
Research Paper, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015
Pages: 1983 - 1986
An Improved Hierarchical Technique for Document Clustering
Priti B. Kudal, Prof. Manisha Naoghare
M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 5 Issue 7, July 2016
Pages: 1240 - 1244
Implementing K-Means Clustering Algorithm Using MapReduce Paradigm
Botcha Chandrasekhara Rao, Medara Rambabu