Date of Publication: 8th June 2017
Abstract: A wide spectrum of languages is used for communication around the world. Searching for information on the World Wide Web requires computational linguistics, because the majority of search engines use a bag-of-words model, which causes problems in extracting information when queries contain multiword expressions. This motivates looking beyond those limits at the kinds of queries a human can submit and at how an interpretation of a query, in the form of annotations, could be used to obtain good results. An essential step in Natural Language Processing is obtaining the grammatical information of each word in the input according to its appearance in the text. POS taggers have been developed for several other Indian languages, but on the assumption that no POS tagger is available for the Konkani language, this work aims to develop one. Furthermore, manual POS tagging is a much harder task because of the large volume of data. This paper aims at part-of-speech tagging for a Konkani corpus.