Efficient Seed and K-Value Selection in K-Means Clustering using Relative Weight and New Distance Metric
International Journal of Science and Research (IJSR)

ISSN: 2319-7064



Survey Paper | Computer Science & Engineering | India | Volume 6 Issue 1, January 2017

Premsagar Dandge, Aruna Gupta

The k-means clustering algorithm groups data points that are similar to each other. It is popular for its simplicity and its tendency to converge. The distance metrics generally used with this algorithm, such as Euclidean distance and Manhattan distance, are best suited to numeric data such as geometric coordinates. These metrics do not give foolproof results for categorical data. We use a new distance metric to calculate the similarity between categorical data points. The new metric uses dynamic attribute weights and frequency probability to differentiate the data points, which ensures that the categorical properties of the attributes are taken into account during clustering. The k-means algorithm also needs to know the number of clusters in the dataset before cluster analysis can begin. We use a different technique, based on the data density distribution, to find the number of clusters. In addition, the initial cluster seeds are usually selected at random, which may increase the number of iterations required to reach a convergent solution. In the proposed method, seeds are selected according to the density distribution, which ensures that the initial seeds are evenly distributed. This reduces the overall number of iterations required for convergence.

Keywords: k-means clustering, categorical data, dynamic attribute weight, frequency probability, data density
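
The abstract does not state the formulas behind the distance metric or the seed selection, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that attribute weights are derived from value frequencies (here via Gini impurity, normalised to sum to 1), that distance is the total weight of mismatching attributes, and that density is the number of records within a given radius. All function names, the radius parameter, and the scoring rule (density multiplied by distance to the nearest chosen seed) are hypothetical choices. The number of clusters k is passed in; estimating it from the density distribution is not attempted here.

```python
from collections import Counter


def value_frequencies(records):
    """Relative frequency of every category value, one dict per attribute."""
    n = len(records)
    return [
        {value: count / n for value, count in Counter(column).items()}
        for column in zip(*records)
    ]


def attribute_weights(freqs):
    """Assumed weighting: Gini impurity per attribute, normalised to sum to 1.

    Attributes whose values are spread more evenly receive more weight.
    """
    raw = [1.0 - sum(p * p for p in f.values()) for f in freqs]
    total = sum(raw)
    if total == 0:  # every attribute is constant: fall back to equal weights
        return [1.0 / len(raw)] * len(raw)
    return [w / total for w in raw]


def weighted_mismatch(a, b, weights):
    """Assumed distance: total weight of the attributes on which a and b differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def density_seeds(records, k, radius, weights):
    """Pick k seed indices that are both dense and far from earlier seeds.

    Density of a record = number of records within `radius` of it
    (the record itself included). Assumes k does not exceed the
    number of distinct records.
    """
    n = len(records)
    density = [
        sum(1 for q in records if weighted_mismatch(p, q, weights) <= radius)
        for p in records
    ]
    # The first seed is the densest record (first one wins on ties).
    seeds = [max(range(n), key=lambda i: density[i])]
    while len(seeds) < k:
        def score(i):
            nearest = min(
                weighted_mismatch(records[i], records[s], weights) for s in seeds
            )
            return density[i] * nearest
        seeds.append(max(range(n), key=score))
    return seeds


if __name__ == "__main__":
    data = [
        ("red", "small", "round"),
        ("red", "small", "round"),
        ("red", "small", "oval"),
        ("blue", "large", "square"),
        ("blue", "large", "square"),
        ("blue", "large", "oval"),
    ]
    w = attribute_weights(value_frequencies(data))
    print(density_seeds(data, k=2, radius=0.5, weights=w))
```

Because an already chosen seed has distance zero to itself, its score is zero and it is not picked again while other distinct records remain. The selected seeds could then be passed as initial centres to a k-modes style loop that uses the same distance function.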

Edition: Volume 6 Issue 1, January 2017

Pages: 2084 - 2087

How to Cite this Article?

Premsagar Dandge, Aruna Gupta, "Efficient Seed and K-Value Selection in K-Means Clustering using Relative Weight and New Distance Metric", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=ART20164290, Volume 6 Issue 1, January 2017, 2084 - 2087



Similar Articles with Keyword 'k-means clustering'

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 6, June 2015

Pages: 1862 - 1864

User Profile Based Client Side Instant Search Mechanism With Use of TLB Mechanism and Fuzzy Search

Rupali A. Ingale, J. L. Chaudhari

Research Paper, Computer Science & Engineering, India, Volume 3 Issue 6, June 2014

Pages: 1634 - 1638

Performance Comparison of Hard and Fuzzy Clustering Algorithms on ESTs of Human Genes

Abhilasha Chaudhuri, Asha Ambhaikar

Research Paper, Computer Science & Engineering, India, Volume 3 Issue 7, July 2014

Pages: 938 - 943

HCR Using K-Means Clustering Algorithm

Meha Mathur, Anil Saroliya

Survey Paper, Computer Science & Engineering, India, Volume 4 Issue 5, May 2015

Pages: 1164 - 1169

A Survey on Scalable Big Data Analytics Platform

Ravindra Phule, Madhav Ingle

Research Paper, Computer Science & Engineering, India, Volume 5 Issue 2, February 2016

Pages: 1396 - 1400

An Improved Mining of Biomedical Data from Web Documents Using Clustering

Nikita Gupta, Gunjan Pahuja

Similar Articles with Keyword 'categorical data'

Survey Paper, Computer Science & Engineering, India, Volume 3 Issue 12, December 2014

Pages: 1188 - 1192

Outlier Detection Based on Surfeit Entropy for Large Scale Categorical Data Set

Neha L. Bagal

Research Paper, Computer Science & Engineering, India, Volume 3 Issue 11, November 2014

Pages: 1503 - 1509

Automatic Clustering Subspace for High Dimensional Categorical Data Using Neuro-Fuzzy Classification

R. Mahalingam, S. Omprakash

Research Paper, Computer Science & Engineering, India, Volume 3 Issue 10, October 2014

Pages: 1473 - 1479

Mining Spatial Data & Enhancing Classification Using Bio - Inspired Approaches

Poonam Kataria, Navpreet Rupal

Research Paper, Computer Science & Engineering, India, Volume 1 Issue 3, December 2012

Pages: 8 - 11

A New Link Based Approach for Categorical Data Clustering

Kavya S.A, M.V.Panduranga Rao, S.Basavaraj Patil


Similar Articles with Keyword 'data density'

Comparative Studies, Computer Science & Engineering, India, Volume 4 Issue 5, May 2015

Pages: 181 - 183

Low Energy Consumption with Secure Routing in Wireless Sensor Network

Shireesh Bhoyar, N. A. Chavhan
