Bootstrapping in Text Mining Applications
International Journal of Science and Research (IJSR)

www.ijsr.net | Open Access | Fully Refereed | Peer Reviewed International Journal

ISSN: 2319-7064




Research Paper | Information Technology | India | Volume 5 Issue 1, January 2016


C. K. Chandrasekhar, M. R. Srinivasan, B. Ramesh Babu

Text mining involves analyzing large corpora of documents that contain thousands of words and a high level of noise. Dimensionality reduction, noise mitigation, and accurate, stable cluster formation are the principal challenges of upstream analytics. This paper proposes a methodology for both dimensionality reduction and noise reduction using k-fold rotation estimation. Principal Component Analysis (PCA) enables the selection of a reduced set of dimensions (words). The resulting noise-reduced data set is the input to clustering algorithms. Experiments using benchmark data sets from the Brown corpus [5] and real-life feedback data from a service provider show that our approach delivers improved results on the well-known performance measures recall, precision, and F-measure [14]. We used a combination of the projective transform known as principal component analysis and visual scree plot techniques [8, 6, 12] for dimensionality reduction, and a k-fold rotation sampling technique [1] for noise elimination and the formation of stable clusters. Experimental results with corpora of different sizes demonstrate that the approach delivers better clustering accuracy than the standard k-means clustering algorithm [2].
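
The abstract names recall, precision, and F-measure as the evaluation measures but does not say how they are computed for clusters. A minimal sketch follows, assuming the common pairwise-counting formulation: a pair of documents is a true positive when it shares both a cluster and a reference category. The function name pairwise_scores and the toy labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_scores(true_labels, cluster_ids):
    """Pairwise precision, recall and F-measure of a clustering.

    Assumption: pairwise counting is used; the paper may define the
    measures differently.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label lists must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        if same_cluster and same_class:
            tp += 1  # pair grouped together, and correctly so
        elif same_cluster:
            fp += 1  # pair grouped together, but categories differ
        elif same_class:
            fn += 1  # pair split apart, but categories match

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical toy data: six documents, two reference categories.
true_labels = ["news", "news", "news", "fiction", "fiction", "fiction"]
cluster_ids = [0, 0, 1, 1, 1, 1]

precision, recall, f_measure = pairwise_scores(true_labels, cluster_ids)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

In this toy case one "news" document is placed in the "fiction" cluster, which lowers both precision (mixed pairs inside a cluster) and recall (same-category pairs split across clusters).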

Keywords: k-Fold Rotation Estimation, Clustering, k-Means, Principal Component Analysis, Dimensionality Reduction, Precision, Recall, F-Score, Scree Plot

Edition: Volume 5 Issue 1, January 2016

Pages: 337 - 344


How to Cite this Article?

C. K. Chandrasekhar, M. R. Srinivasan, B. Ramesh Babu, "Bootstrapping in Text Mining Applications", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=NOV152700, Volume 5 Issue 1, January 2016, 337 - 344







