Author: Gouthami Kumari G S
Date of Publication: 7th July 2016
Abstract: Big Data is a term for data sets so large or complex that traditional data processing applications are inadequate to handle them. Big Data processing systems must deal with very high data ingest rates (velocity) and massive volumes of varied data. Most data translation approaches aim primarily to replace or eliminate human intervention through automation in key areas of information processing, applied to voluminous, varied data arriving at high velocity. When applying Knowledge Engineering (KE) to Big Data applications, the major challenge is that the data may be structured, unstructured, or semi-structured. Data is written in multiple languages, and the same character can have different encodings when text is exchanged between customers and the enterprise. Several areas are being considered in Big Data management: linguistic structures of the text, keyword taxonomies, content categorization, language translation, and context- and temporally-based information retrieval. The challenges include how to capture, transfer, store, clean, analyze, share, secure, and visualize the data. This paper provides a survey of the pre-processing methods used in data translation, such as those in various Statistical Machine Translation approaches.