Bootstrapping in Text Mining Applications
International Journal of Science and Research (IJSR)

www.ijsr.net | Open Access | Fully Refereed | Peer Reviewed International Journal

ISSN: 2319-7064

Research Paper | Information Technology | India | Volume 5 Issue 1, January 2016


C. K. Chandrasekhar, M. R. Srinivasan, B. Ramesh Babu

Text mining involves analyzing large corpora of documents that contain thousands of words and a high level of noise. Dimensionality reduction, noise mitigation, and accurate, stable cluster formation are the principal challenges of upstream analytics. This paper proposes a methodology for both dimensionality reduction and noise reduction using k-fold rotation estimation. Principal component analysis (PCA) enables the selection of a reduced set of dimensions (words). The resulting noise-reduced data set is the input to clustering algorithms. Experiments using benchmark data sets from the Brown corpus [5] and real-life feedback data from a service provider show that our approach delivers improved results on the well-known performance measures recall, precision, and F-measure [14]. We used a combination of the projective transform known as PCA and visual scree plot techniques [8, 6, 12] for dimensionality reduction, and a k-fold rotation sampling technique [1] for noise elimination and the formation of stable clusters. Experimental results with corpora of different sizes demonstrate that the approach delivers better clustering accuracy than the standard k-means clustering algorithm [2].
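As a rough illustration of two ideas named in the abstract, the sketch below shows a generic k-fold rotation over document indices and the standard precision, recall, and F-measure computed for one cluster against one reference class. It is not the authors' procedure: the abstract does not say how the folds are used for noise elimination, and the function names `k_fold_rotation` and `precision_recall_f`, the interleaved fold assignment, and the toy document ids are all assumptions made for the example.

```python
from typing import Iterator, List, Set, Tuple


def k_fold_rotation(n_items: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (retained, held_out) index lists; each item is held out exactly once.

    Hypothetical helper: folds are assigned by interleaving (item i goes to fold i % k).
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must satisfy 2 <= k <= n_items")
    indices = list(range(n_items))
    for fold in range(k):
        held_out = indices[fold::k]  # every k-th item, starting at `fold`
        held = set(held_out)
        retained = [i for i in indices if i not in held]
        yield retained, held_out


def precision_recall_f(cluster: Set[int], gold: Set[int]) -> Tuple[float, float, float]:
    """Score one cluster against one reference class, both given as sets of document ids."""
    overlap = len(cluster & gold)
    precision = overlap / len(cluster) if cluster else 0.0
    recall = overlap / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure as the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Rotate 10 toy documents through 5 folds.
    for retained, held_out in k_fold_rotation(n_items=10, k=5):
        print("held out:", held_out, "retained:", retained)

    # Toy cluster and reference class (made-up document ids).
    p, r, f = precision_recall_f({1, 2, 3, 4}, {2, 3, 4, 5, 6})
    print(f"precision={p:.2f} recall={r:.2f} F={f:.3f}")
```

In a fuller pipeline, the retained indices of each rotation would select the documents passed to PCA and k-means, and the scores would be compared across rotations. That wiring is left out here because the abstract does not specify it.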

Keywords: k-Fold Rotation Estimation, Clustering, k-Means, Principal Component Analysis, Dimensionality Reduction, Precision, Recall, F-Score, Scree Plot

Edition: Volume 5 Issue 1, January 2016

Pages: 337 - 344


How to Cite this Article?

C. K. Chandrasekhar, M. R. Srinivasan, B. Ramesh Babu, "Bootstrapping in Text Mining Applications", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=NOV152700, Volume 5 Issue 1, January 2016, 337 - 344



