Author: Er. Zainab Mirza
Date of Publication: 7th March 2016
Abstract: In this paper, we attempt to implement Hadoop for storing and retrieving different parameters of network log data for analysis. The implementation also uses shell scripts, Hive, and PHP, in conjunction with other web technologies, to obtain better performance from Hadoop when managing network traffic.