International Journal of Science and Research (IJSR)

Open Access | Fully Refereed | Peer Reviewed

ISSN: 2319-7064


Downloads: 118

M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 5 Issue 6, June 2016


Document Clustering using Improved K-means Algorithm

Anjali Vashist, Rajender Nath


Abstract: Clustering is an efficient technique that organizes a large quantity of unordered text documents into a small number of meaningful, coherent clusters, thereby providing a basis for intuitive and informative navigation and browsing. It has been studied extensively because of its broad application in areas such as web mining, search engines, and information extraction. Documents are clustered according to various similarity measures. The existing K-means document clustering algorithm relies on random center generation, so the clusters it produces differ from run to run. This paper presents an Improved Document Clustering algorithm that generates a number of clusters for any set of text documents based on fixed center generation, collects only the exclusive words from the different documents in the dataset, and uses the cosine similarity measure to place similar documents in the proper clusters. Experimental results show that the proposed algorithm outperforms the existing algorithm in terms of F-measure, recall, precision, and time complexity.
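
The idea of assigning documents to fixed centers by cosine similarity can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the fixed centers are chosen or how exclusive words are collected, so the sketch simply assumes the caller supplies the indices of the documents that act as centers, and it weights terms with a plain smoothed TF-IDF. The function names `tfidf_vectors`, `cosine`, and `assign_to_fixed_centers` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (assumption) so terms found in every document
        # still receive a non-zero weight.
        vectors.append({
            term: (count / len(tokens)) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_to_fixed_centers(vectors, center_indices):
    """Label each document with the position of its most similar center.

    Assumption: centers are the documents at `center_indices`, chosen
    deterministically by the caller instead of at random.
    """
    centers = [vectors[i] for i in center_indices]
    labels = []
    for vec in vectors:
        sims = [cosine(vec, center) for center in centers]
        labels.append(max(range(len(centers)), key=sims.__getitem__))
    return labels


docs = [
    "cat dog cat",
    "dog cat pet",
    "stock market trade",
    "market trade price",
]
labels = assign_to_fixed_centers(tfidf_vectors(docs), center_indices=[0, 2])
print(labels)
```

Because the centers are fixed rather than drawn at random, repeated runs on the same documents return the same labels, which is the property the abstract contrasts with standard K-means.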


Keywords: Document Clustering, Cosine Similarity, Term Finder, Tf-Idf, Threshold


Edition: Volume 5 Issue 6, June 2016


Pages: 2206 - 2210


How to Cite this Article?

Anjali Vashist, Rajender Nath, "Document Clustering using Improved K-means Algorithm", International Journal of Science and Research (IJSR), https://www.ijsr.net/get_abstract.php?paper_id=NOV164735, Volume 5 Issue 6, June 2016, 2206 - 2210

