Exploration of Data Mining Techniques in Record Deduplication
International Journal of Science and Research (IJSR)

www.ijsr.net | Open Access | Fully Refereed | Peer Reviewed International Journal

ISSN: 2319-7064

Research Paper | Computer Science & Engineering | India | Volume 2 Issue 11, November 2013

R. Gayathri, A. Malathi

Abstract: In today's business world, databases play a vital role in decision making. As an organization grows, so does its database, and this growth leads to the problem of dirty data. Dirty data here means replicated records in the database, which cause performance degradation, increased operational cost, and reduced data quality. Such replicas can be removed through record deduplication, the process of identifying records that refer to the same entity under different representations. Cleaning the repository and removing replicas therefore becomes a mandatory task. This paper surveys several record deduplication approaches and compares three of them, namely genetic programming, the modified bat algorithm (MBAT), and the firefly algorithm, discussing the advantages and limitations of each.
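To make the idea of "the same entity with different representations" concrete, the following is a minimal Python sketch of a naive pairwise duplicate check. It is not one of the three approaches the paper compares (genetic programming, MBAT, or firefly); it is only a simple string-similarity baseline. The record fields, the sample data, the helper names, and the 0.85 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(value):
    # Lowercase and collapse whitespace so trivial formatting differences are ignored.
    return " ".join(value.lower().split())


def record_similarity(a, b, fields):
    # Average the per-field string similarity (0.0 to 1.0) over the chosen fields.
    scores = [
        SequenceMatcher(None, normalize(a[f]), normalize(b[f])).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def find_duplicates(records, fields, threshold=0.85):
    # Compare every pair of records (quadratic cost, acceptable only for a small demo)
    # and keep the index pairs whose similarity reaches the assumed threshold.
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        if record_similarity(a, b, fields) >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical sample records: the first two describe the same person.
records = [
    {"name": "John A. Smith", "city": "New York"},
    {"name": "Jon A Smith", "city": "new york"},
    {"name": "Maria Lopez", "city": "Madrid"},
]

print(find_duplicates(records, ["name", "city"]))
```

With these sample records, only the first two are flagged as a likely duplicate pair. A fixed threshold with equal field weights is the crudest possible choice; approaches such as those surveyed in the paper aim to find better ways of combining this kind of evidence.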

Keywords: record deduplication, preprocessing, cleaning, dirty data, genetic programming, MBAT algorithm, firefly algorithm

Edition: Volume 2 Issue 11, November 2013

Pages: 216 - 219

How to Cite this Article?

R. Gayathri, A. Malathi, "Exploration of Data Mining Techniques in Record Deduplication", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=02013451, Volume 2 Issue 11, November 2013, 216 - 219

Similar Articles with Keyword 'Record Deduplication'

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 3, March 2015

Pages: 757 - 760

Record Deduplication Approaches and Algorithm for Removing Duplicate Data

Nikita A. Pande, Namrata D. Ghuse

Similar Articles with Keyword 'preprocessing'

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 11, November 2015

Pages: 1165 - 1168

Privacy Preserving Closed Frequent Pattern Mining

Anju Vijayan

Research Paper, Computer Science & Engineering, India, Volume 3 Issue 7, July 2014

Pages: 938 - 943

HCR Using K-Means Clustering Algorithm

Meha Mathur, Anil Saroliya

Review Papers, Computer Science & Engineering, India, Volume 3 Issue 11, November 2014

Pages: 2810 - 2813

Palm Print Recognition System - Review

Mohit Trehan, Nitin Bhagat

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 2730 - 2733

Preprocessing of Various Data Sets Using Different Classification Algorithms for Evolutionary Programming

R. Karthick, Dr. A. Malathi

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 4 Issue 6, June 2015

Pages: 2114 - 2117

Enhanced Document Clustering for Forensic Analysis

Rahul D. Kopulwar, Fazeel Irshad Zama

Similar Articles with Keyword 'Cleaning'

Review Papers, Computer Science & Engineering, India, Volume 4 Issue 1, January 2015

Pages: 2180 - 2182

Review of Improved Cross Redundant Data Cleaning Algorithm for RFID and WSN Integration

Jayashri M. Dupare, N. U. Sambhe

Survey Paper, Computer Science & Engineering, India, Volume 3 Issue 11, November 2014

Pages: 1850 - 1856

A Review on Detection of Outliers Over High Dimensional Streaming Data Using Cluster Based Hybrid Approach

Abhishek B. Mankar, Namrata Ghuse

Survey Paper, Computer Science & Engineering, India, Volume 3 Issue 12, December 2014

Pages: 751 - 754

A Survey on Duplicate Detection in Hierarchical Data

Nikhil Gawande, S. R. Todamal

M.Tech / M.E / PhD Thesis, Computer Science & Engineering, India, Volume 4 Issue 7, July 2015

Pages: 721 - 723

Techniques for Duplicate Detection in Hierarchical Data

Suvarna Kale, Basha Vankudothu

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 2525 - 2528

Efficient Technique for Network Lifetime Enhancement by Cleaning Dirty Data

Komal V. Shiyale, Pranay D. Saraf

Similar Articles with Keyword 'Dirty data'

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 2525 - 2528

Efficient Technique for Network Lifetime Enhancement by Cleaning Dirty Data

Komal V. Shiyale, Pranay D. Saraf

Similar Articles with Keyword 'genetic programming'

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 3, March 2015

Pages: 757 - 760

Record Deduplication Approaches and Algorithm for Removing Duplicate Data

Nikita A. Pande, Namrata D. Ghuse

Similar Articles with Keyword 'firefly algorithm'

Research Paper, Computer Science & Engineering, India, Volume 3 Issue 5, May 2014

Pages: 1721 - 1725

A New Method for Noisy Image Segmentation using Firefly Algorithm

Bhavana Vishwakarma, Amit Yerpude

Research Paper, Computer Science & Engineering, India, Volume 4 Issue 3, March 2015

Pages: 757 - 760

Record Deduplication Approaches and Algorithm for Removing Duplicate Data

Nikita A. Pande, Namrata D. Ghuse

Survey Paper, Computer Science & Engineering, India, Volume 5 Issue 2, February 2016

Pages: 1593 - 1599

A Survey of Digital Image Watermarking Optimization Algorithms Inspired by Nature

Komal Sinsinwar, SPS Chauhan

Research Paper, Computer Science & Engineering, India, Volume 6 Issue 7, July 2017

Pages: 540 - 544

Lung Anomaly Detection System (LADS) Using SVM based on Firefly Algorithm

Rasha Abdul Razak A P, Harish Binu K P
