Parts of Speech (POS) Tagging in Telugu Corpora Using CRF Algorithm
International Journal of Science and Research (IJSR)


ISSN: 2319-7064



Research Paper | Computational Linguistics | India | Volume 13 Issue 11, November 2024


     


Rajula Valaraju


Abstract: Natural Language Processing (NLP), a branch of computer science and Artificial Intelligence (AI), enables machines to comprehend human language and assist with linguistic tasks. An early step in many NLP tasks is part-of-speech (POS) tagging, which assigns a tag to each word based on its meaning and context. This paper discusses POS tagging in Telugu using Conditional Random Fields (CRF), a sequence modelling algorithm that is particularly effective at identifying entities or text patterns, such as POS tags, in highly inflectional and agglutinative languages. Telugu is one such language, widely spoken in southern India (mainly Andhra Pradesh and Telangana). It belongs to the Dravidian family and follows subject-object-verb (SOV) word order. Compared with other machine learning algorithms for sequence labelling, CRF has proven more effective at overcoming the label bias problem. To learn the language features and to tag the test corpus, the study uses an annotated corpus of 62,996 words and a tag set of 18 tags. The study achieved an accuracy of 80.17%.
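To make the idea of CRF-based tagging concrete, the following is a minimal sketch and not the author's implementation. It assumes suffix-based features (a common choice for agglutinative languages), hand-set illustrative weights in place of weights learned from the annotated corpus, BIS-style tag names (N_NN, V_VM), and an informally romanized example sentence. All function names, features, and weights here are hypothetical.

```python
from typing import Dict, List, Tuple

def token_features(sentence: List[str], i: int) -> Dict[str, float]:
    """Binary features for token i; suffixes approximate inflectional endings."""
    word = sentence[i]
    feats = {"bias": 1.0, f"word={word}": 1.0}
    for n in (1, 2, 3, 4):
        if len(word) > n:
            feats[f"suffix{n}={word[-n:]}"] = 1.0
    # Neighbouring words give the model local context.
    feats[f"prev={sentence[i - 1]}" if i > 0 else "BOS"] = 1.0
    feats[f"next={sentence[i + 1]}" if i < len(sentence) - 1 else "EOS"] = 1.0
    return feats

def viterbi(sentence: List[str], tags: List[str],
            emit_w: Dict[Tuple[str, str], float],
            trans_w: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain model.

    The score of a sequence is the sum of feature weights (emit_w) and
    tag-to-tag transition weights (trans_w), as in a linear-chain CRF.
    """
    if not sentence:
        return []

    def emit(i: int, tag: str) -> float:
        return sum(emit_w.get((f, tag), 0.0) * v
                   for f, v in token_features(sentence, i).items())

    # best[t] = (score of best path ending in tag t, that path)
    best = {t: (emit(0, t), [t]) for t in tags}
    for i in range(1, len(sentence)):
        col = {}
        for t in tags:
            score, path = max(
                (best[p][0] + trans_w.get((p, t), 0.0), best[p][1]) for p in tags
            )
            col[t] = (score + emit(i, t), path + [t])
        best = col
    return max(best.values())[1]

def token_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of tokens whose predicted tag matches the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

# Hypothetical hand-set weights; a real CRF would learn these from annotated data.
TAGS = ["N_NN", "V_VM"]
EMIT_W = {
    ("suffix3=udu", "N_NN"): 2.0,
    ("suffix2=am", "N_NN"): 1.0,
    ("suffix3=adu", "V_VM"): 2.0,
}
TRANS_W = {("N_NN", "V_VM"): 0.5}  # a noun followed by a verb, as in SOV order

sentence = ["ramudu", "pustakam", "chadivadu"]  # illustrative romanization
predicted = viterbi(sentence, TAGS, EMIT_W, TRANS_W)
print(predicted)
```

Because the score is computed over the whole sequence rather than one token at a time, the transition weights let neighbouring tags influence each other, which is the property the abstract associates with avoiding label bias.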


Keywords: POS tagging, CRF Model, BIS Tag set, Telugu Language


Edition: Volume 13 Issue 11, November 2024


Pages: 188-190


DOI: https://www.doi.org/10.21275/SR241102123024


Rajula Valaraju, "Parts of Speech (POS) Tagging in Telugu Corpora Using CRF Algorithm", International Journal of Science and Research (IJSR), Volume 13 Issue 11, November 2024, pp. 188-190, https://www.ijsr.net/getabstract.php?paperid=SR241102123024, DOI: https://www.doi.org/10.21275/SR241102123024
