International Journal of Science and Research (IJSR)

Call for Papers | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064


Research Paper | Information Technology | India | Volume 5 Issue 1, January 2016

Bootstrapping in Text Mining Applications

C. K. Chandrasekhar | M. R. Srinivasan | B. Ramesh Babu

Abstract: Text mining involves analyzing large corpora of documents that contain thousands of words and a high level of noise. Dimensionality reduction, noise mitigation, and accurate, stable cluster formation are the principal challenges of upstream analytics. This paper proposes a methodology for both dimensionality and noise reduction using k-fold rotation estimation. Principal Component Analysis enables selecting a reduced set of dimensions (words). The resulting noise-reduced data set is the input to clustering algorithms. Experiments using benchmark data sets from the Brown corpus [5] and real-life feedback data from a service provider show that our approach delivers improved results as measured by the well-known performance measures recall, precision, and F-measure [14]. We used a combination of projective transforms known as principal component analysis (PCA) and visual scree plot techniques [8, 6, 12] for dimensionality reduction, and a k-fold rotation sampling technique [1] for noise elimination and the formation of stable clusters. Experimental results with corpora of different sizes demonstrate that the approach delivers better clustering accuracy than the standard k-means clustering algorithm [2].
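The abstract names recall, precision, and F-measure as its evaluation measures but does not say how they are computed for clusters. As a minimal sketch, assuming the common pairwise definition (two documents count as a true positive when they share both a cluster and a reference category), the measures could be computed as below. The function name `pairwise_scores` and the sample labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_scores(true_labels, cluster_ids):
    """Return pairwise (precision, recall, F-measure) for a clustering.

    Assumption: a pair of documents is a true positive when both the
    reference category and the assigned cluster agree. The paper may
    use a different (e.g. per-class) definition.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label lists must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        if same_cluster and same_class:
            tp += 1  # correctly grouped together
        elif same_cluster:
            fp += 1  # grouped together but from different categories
        elif same_class:
            fn += 1  # same category but split across clusters

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Illustrative labels only: five documents, two reference categories.
reference = ["news", "news", "news", "fiction", "fiction"]
clusters = [0, 0, 1, 1, 1]
print(pairwise_scores(reference, clusters))
```

In this toy case one "news" document lands in the wrong cluster, so both precision and recall fall below 1. A perfect clustering would score 1.0 on all three measures.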

Keywords: k-Fold Rotation Estimation, Clustering, k-Means, Principal Component Analysis, Dimensionality Reduction, Precision, Recall, F-Score, Scree Plot

Edition: Volume 5 Issue 1, January 2016

Pages: 337 - 344

How to Cite this Article?

C. K. Chandrasekhar, M. R. Srinivasan, B. Ramesh Babu, "Bootstrapping in Text Mining Applications", International Journal of Science and Research (IJSR), Volume 5 Issue 1, January 2016, pp. 337-344.
