International Journal of Science and Research (IJSR)

Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064



Research Paper | Computational Linguistics | India | Volume 10 Issue 3, March 2021


Comparison of Various Models in the Context of Language Identification (Indo Aryan Languages)

Salman Alam


Abstract: Automatic language identification is a text classification task in which a machine identifies the language present in a given multilingual text. This paper compares several machine learning models on this task. The corpus covers five major, closely related Indo-Aryan languages: Hindi, Bhojpuri, Awadhi, Magahi and Braj. I compare models such as Random Forest Classifier, SVC, SGD Classifier, Multinomial Logistic Regression, Gaussian Naïve Bayes and Bernoulli Naïve Bayes. Of these models, Multinomial Naïve Bayes attained the best accuracy, 74%.
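To make the classification setup concrete, here is a minimal sketch of a Multinomial Naïve Bayes language identifier written with the Python standard library only. The abstract does not say which features or library the paper used, so everything here is an assumption: the character-bigram features, the Laplace (add-one) smoothing, the function names, and the four toy sentences, which were written for this illustration and are not taken from the paper's corpus.

```python
import math
from collections import Counter, defaultdict

# Toy training data, invented for illustration only (NOT the paper's corpus).
# Labels: "hin" = Hindi, "bho" = Bhojpuri.
TOY_SAMPLES = [
    ("मैं घर जा रहा हूँ", "hin"),
    ("तुम क्या कर रहे हो", "hin"),
    ("हम घर जात बानी", "bho"),
    ("तू का करत बाड़", "bho"),
]


def char_ngrams(text, n=2):
    """Return overlapping character n-grams, padding one space at each end."""
    padded = f" {text} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(samples, n=2):
    """Collect per-label document counts and n-gram counts."""
    doc_counts = Counter()
    gram_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        grams = char_ngrams(text, n)
        doc_counts[label] += 1
        gram_counts[label].update(grams)
        vocab.update(grams)
    return {"n": n, "docs": doc_counts, "grams": gram_counts, "vocab": vocab}


def scores(model, text):
    """Log prior plus smoothed log likelihood of the text under each label."""
    total_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    result = {}
    for label, n_docs in model["docs"].items():
        counts = model["grams"][label]
        total = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for gram in char_ngrams(text, model["n"]):
            # Add-one smoothing keeps unseen n-grams from zeroing the score.
            score += math.log((counts[gram] + 1) / (total + vocab_size))
        result[label] = score
    return result


def predict(model, text):
    """Return the label with the highest log score."""
    label_scores = scores(model, text)
    return max(label_scores, key=label_scores.get)


model = train(TOY_SAMPLES)
print(predict(model, "बानी"))
print(predict(model, "रहा हूँ"))
```

Character n-grams are a common choice for separating closely related languages that share a script, because short function words and verb endings often carry most of the signal. A comparison like the one the abstract describes would swap the classifier while holding the features and the train/test split fixed.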


Keywords: Hindi, Magahi, Bhojpuri, Braj, Awadhi, SVC, Multinomial NB, RNN, Linear SVC, SGD Classifier, Indo-Aryan


Edition: Volume 10 Issue 3, March 2021


Pages: 185 - 188




How to Cite this Article?

Salman Alam, "Comparison of Various Models in the Context of Language Identification (Indo Aryan Languages)", International Journal of Science and Research (IJSR), Volume 10 Issue 3, March 2021, pp. 185-188, https://www.ijsr.net/get_abstract.php?paper_id=SR21303115028
