International Journal of Science and Research (IJSR)

www.ijsr.net | Most Trusted Research Journal Since Year 2012

ISSN: 2319-7064

Research Paper | Computer Science & Engineering | India | Volume 2 Issue 11, November 2013

Exploration of Data Mining Techniques in Record Deduplication

R. Gayathri, A. Malathi

In today's business world, databases play a vital role in decision making. As an organization grows, so does its database, and this growth leads to the problem of dirty data. Dirty data here means replicated records in the database, which cause performance degradation, higher operational cost, and reduced data quality. Record deduplication addresses this problem: it identifies records that refer to the same entity but represent it differently, so that the replicas in the repository can be cleaned and removed. This paper surveys several record deduplication approaches and compares three of them, namely genetic programming, the Modified BAT algorithm, and the firefly algorithm, discussing the advantages and limitations of each.
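To make the idea of "the same entity with different representations" concrete, the following is a minimal sketch of pairwise duplicate detection in Python. It is not one of the three approaches surveyed in the paper (genetic programming, Modified BAT, or firefly); it assumes a plain string-similarity measure from the standard library, and the field names, sample records, and the 0.85 threshold are hypothetical choices made only for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(record):
    """Lowercase every field and collapse whitespace, then join the fields.

    This removes purely cosmetic differences (case, extra spaces) before
    the records are compared.
    """
    return " ".join(" ".join(str(v).lower().split()) for v in record.values())


def similarity(a, b):
    """Return a similarity ratio in [0, 1] between two normalized records."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_duplicates(records, threshold=0.85):
    """Return index pairs (i, j), i < j, whose similarity meets the threshold.

    The threshold is an assumed, illustrative value; in practice it would
    have to be tuned for the data at hand.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        if similarity(a, b) >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical sample data: the first two records describe the same person.
records = [
    {"name": "John A. Smith", "city": "New York"},
    {"name": "john a smith", "city": "New  York"},
    {"name": "Maria Garcia", "city": "Madrid"},
]

print(find_duplicates(records))
```

Comparing every pair of records is quadratic in the number of records, so a sketch like this only suits small repositories; it is meant to show what a deduplication method has to decide, not how the surveyed algorithms decide it.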

Keywords: Record Deduplication, preprocessing, Cleaning, Dirty data, genetic programming, mbat algorithm, firefly algorithm

Edition: Volume 2 Issue 11, November 2013

Pages: 216 - 219


How to Cite this Article?

R. Gayathri, A. Malathi, "Exploration of Data Mining Techniques in Record Deduplication", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=02013451, Volume 2 Issue 11, November 2013, 216 - 219



