Research Paper | Neural Networks | United Kingdom | Volume 11 Issue 1, January 2022
News Articles Tag Categorization using Neural Networks and Natural Language Processing
Ganesh Manohar Bhat
Abstract: This research paper and its accompanying implementation employ a combination of neural networks and natural language processing techniques to identify and assign category tags to unlabelled news articles; the approach can be extended to other literary publications as well. The methods are applied to a dataset of news headlines, each paired with a brief description of the article and other associated information, so that the trained models can predict tags for unlabelled articles in the future. Unlabelled news articles are becoming increasingly common, which makes it difficult for the everyday reader to find articles of importance and interest. Readers are drawn to news articles primarily by their headlines, yet unlabelled articles leave ambiguous statements not only in the headline but also in the article description. The goal of this implementation is therefore to classify articles using both the headline and the description, and to produce a clear indication of the category each article belongs to. Once the models have been built, their accuracy is evaluated to determine their effectiveness for similar datasets in the future. An extension of this work could identify the type of language used in articles in order to distinguish different writing styles. The methods used in this paper are a text-based Convolutional Neural Network (Text CNN), a Bidirectional Gated Recurrent Unit (GRU), and a Long Short-Term Memory (LSTM) network with attention.
The performance of each model is then discussed in terms of how effectively it handles the test data, as visualized by the confusion matrices of the respective models.
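Since the models are compared via their confusion matrices, a minimal, framework-agnostic sketch of how such a matrix is tallied from true versus predicted tags may be helpful. The tag names below are hypothetical placeholders, not the categories from the paper's dataset:

```python
def confusion_matrix(y_true, y_pred, labels):
    """Tally (true, predicted) label pairs: rows are true tags, columns are predicted tags."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Hypothetical tag set and model outputs for illustration only.
labels = ["politics", "sports", "tech"]
y_true = ["politics", "sports", "tech", "tech", "sports"]
y_pred = ["politics", "tech", "tech", "tech", "sports"]

cm = confusion_matrix(y_true, y_pred, labels)
# Diagonal entries are correct predictions, so overall accuracy is
# their sum divided by the number of samples.
accuracy = sum(cm[i][i] for i in range(len(labels))) / len(y_true)
```

Off-diagonal cells show which tag pairs the model confuses (here, one "sports" article predicted as "tech"), which is exactly the per-category error breakdown the paper inspects.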
Keywords: Neural Networks, Gated Recurrent Units, Natural Language Processing, Long Short-Term Memory (LSTM), Text CNN, TensorFlow, Keras
Edition: Volume 11 Issue 1, January 2022
Pages: 328 - 338