Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering

Big Data Concepts, Challenges and Solution in Hadoop Ecosystem

Author : Dr. Ujjwal Agarwal 1

Date of Publication: 12th October 2017

Abstract: Data becomes big data when its volume, variety, and velocity exceed the capabilities of our system architectures and algorithms. This paper discusses three major sources of big data: machine-generated data, people-generated data, and organization-generated data. It also covers the 6 V's of big data (volume, velocity, variety, valence, veracity, and value) and the different varieties of data: structured, semi-structured, and unstructured, with examples such as sensor data, images, PDF, CSV, JSON, and RDBMS tables. Approximately 5% of available data is in structured form; the rest is either unstructured or semi-structured. Big data poses many challenges because of its volume, variety, and other complexities. Hadoop is a platform that offers solutions for storing, processing, and analyzing big data. The main objective of this paper is to describe how Hadoop can address the different challenges of big data by using HDFS (Hadoop Distributed File System), MapReduce, and Hadoop ecosystem components such as Hive, Sqoop, HBase, Pig, Spark, Flume, and Kafka.
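
The MapReduce model named in the abstract can be illustrated with a small word-count sketch. This is only an in-memory illustration of the map, shuffle, and reduce steps in plain Python, not Hadoop code and not taken from the paper; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen for this example.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce over a list of text lines (assumed input)."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real Hadoop job the input lines would be read from HDFS blocks and the map and reduce steps would run in parallel on different nodes; here everything runs in a single process to keep the data flow visible.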
