Authors: Kaung Myat Thu, H. Mamata Devi, Th. Rupachandra Singh
Date of Publication: 5th December 2024
Abstract: Morphological Analysis and Generation (MAG) is essential in natural language processing, especially for morphologically rich languages. It is the first step toward many NLP tasks, including lemmatization, POS tagging, spell checking, grammar checking, machine translation, text summarization, and information extraction. MAG deals with word formation and the grammatical structure inside a word. Every MAG system comprises three main parts: a morpheme lexicon, a set of morphotactic or orthographic rules, and a decision algorithm. This paper reviews popular approaches that researchers have taken. We found that the corpus-based machine learning approach (SVM, NN, CRF, MDL, ...), the paradigm-based approach, the two-level technique, Finite State Automata (FSA) based techniques, Finite State Transducer (FST) based techniques, suffix stripping, and the DAWG (Directed Acyclic Word Graph) are the popular, successful methods reported in the literature. Because little or no research and development on morphological analysis and generation exists for the Myanmar language, this study reviews the literature on other, similar languages.
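To make the three parts named in the abstract concrete, the following is a minimal sketch of suffix stripping, one of the surveyed approaches. It is not taken from the paper: the toy English lexicon, the tag names (PL, 3SG, PAST, PROG, BASE), and the function names `analyze` and `generate` are all assumptions chosen for illustration, and orthographic rules (for example, consonant doubling) are deliberately left out.

```python
# Illustrative sketch only: toy English data, hypothetical names and tags.

# Part 1: morpheme lexicon (stems with a category, suffixes with features).
STEMS = {"walk": "V", "play": "V", "book": "N", "cat": "N"}

# Part 2: morphotactic rules, i.e. which suffix may attach to which category
# and what grammatical feature it contributes.
SUFFIXES = {
    "s": {"N": "PL", "V": "3SG"},
    "ed": {"V": "PAST"},
    "ing": {"V": "PROG"},
}


def analyze(word):
    """Part 3 (decision step): strip each known suffix and keep the splits
    whose stem is in the lexicon and whose category licenses the suffix."""
    analyses = []
    if word in STEMS:
        analyses.append((word, STEMS[word], "BASE"))
    for suffix, licensed in SUFFIXES.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[: -len(suffix)]
            category = STEMS.get(stem)
            if category in licensed:
                analyses.append((stem, category, licensed[category]))
    return analyses


def generate(stem, feature):
    """Inverse direction: build a surface form from a stem and a feature.
    Returns None when the lexicon or the rules do not license the form."""
    category = STEMS.get(stem)
    if category is None:
        return None
    if feature == "BASE":
        return stem
    for suffix, licensed in SUFFIXES.items():
        if licensed.get(category) == feature:
            return stem + suffix
    return None
```

Under these assumptions, `analyze("walked")` yields the single analysis `("walk", "V", "PAST")`, and `generate("play", "PROG")` yields `"playing"`. A realistic system for a morphologically rich language would need a far larger lexicon and orthographic rules, which is where the FST-based and two-level techniques mentioned above are typically used.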