Aye Myat Mon, Thandar Thein
Abstract: Natural Language Processing (NLP) is one of the most important research areas in the study of human language. For any language, a spell checker is an essential component of many office automation and machine translation systems. In this paper, we develop a Myanmar spell checker system that handles typographic errors, sequence errors, phonetic errors, and context errors. A Myanmar text corpus is created to support the spell checker. Typographic errors are checked with a corpus look-up approach. The system uses Myanmar3 Unicode so that it can automatically reorder the character sequence. A compound misused word detection algorithm is proposed for checking phonetic errors, and a Bayesian classifier is applied for checking context errors. The Levenshtein Distance algorithm is applied to improve user efficiency by providing a suggestion list for misspelled Myanmar words. We report evaluation results for the system; our approach can handle various types of Myanmar spelling errors.
Keywords: Levenshtein Distance Algorithm, Myanmar Spell Checker, Myanmar Text Corpus, Natural Language Processing, Naïve Bayesian Classifier
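
As an illustration of the suggestion step mentioned in the abstract, the following is a minimal sketch of ranking lexicon entries by Levenshtein distance. It is not the system's actual implementation: the function names `levenshtein` and `suggest`, the `max_distance` and `limit` parameters, and the small English lexicon are all assumptions made for the example. The distance here is computed over Unicode code points; a Myanmar system might instead compare syllables or character clusters, which this sketch does not attempt.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insert, delete, substitute) between a and b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def suggest(word: str, lexicon: Iterable[str],
            max_distance: int = 2, limit: int = 5) -> List[str]:
    """Return up to `limit` lexicon entries within `max_distance` edits of `word`.

    The threshold and limit are illustrative assumptions, not values from the paper.
    """
    scored = [(levenshtein(word, entry), entry) for entry in lexicon]
    # Sort by distance first, then alphabetically to make ties deterministic.
    close = sorted((d, entry) for d, entry in scored if d <= max_distance)
    return [entry for _, entry in close[:limit]]


if __name__ == "__main__":
    # Hypothetical lexicon; a real system would draw entries from its text corpus.
    print(suggest("speling", ["spelling", "spewing", "apple"]))
```

Sorting by distance puts the closest candidates first, so a user would see the most likely corrections at the top of the list.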