International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Open Access Journal

ISSN : 2394-2320 (Online)

Monthly Journal for Computer Science and Engineering

Enhancing Map Reduce Performance in Heterogeneous Distributed Environment

Authors: Suyash Mishra ¹, Dr. Anuranjan Mishra ²

Date of Publication: 7th September 2017

Abstract: The volume of data used in today's enterprises is growing at an exponential rate. This growth has created a need to process and analyze large volumes of data quickly to support business decision making. MapReduce, the core processing engine of Hadoop, is widely used to meet the continuously increasing demands that massive data sets place on computing resources. Because MapReduce is highly scalable, it allows parallel and distributed processing across multiple computing nodes. This paper discusses various scheduling methodologies and identifies which is most appropriate for improving MapReduce performance. It also attempts to identify each scheduling method's scaling or processing limitations and the situations to which it is best suited. MapReduce is used mainly for short jobs, which require low response times. The current Hadoop implementation assumes that the underlying computing nodes in a cluster are homogeneous, with the same processing capability and memory. As a result, Hadoop's scheduler suffers severe performance degradation in heterogeneous environments. In a heterogeneous environment, Longest Approximate Time to End (LATE) scheduling can be the most efficient of the scheduling methods compared. Various studies have reported that LATE improves Hadoop response times by approximately a factor of two in a cluster.
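
The abstract names LATE but does not spell out its heuristic. As a rough illustration only, the sketch below follows the commonly described LATE estimate: a task's progress rate is its progress score divided by its elapsed time, its time left is (1 - progress) / rate, and the task with the longest estimated time left is the candidate for speculative re-execution. The names `RunningTask`, `estimated_time_left`, and `pick_speculation_candidate` are hypothetical and are not part of Hadoop's API; the paper's own formulation may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RunningTask:
    # Hypothetical record for illustration; not a Hadoop class.
    task_id: str
    progress: float   # fraction of work completed, in [0, 1]
    elapsed_s: float  # seconds since the task started


def estimated_time_left(task: RunningTask) -> float:
    """Estimate remaining seconds as (1 - progress) / progress_rate.

    Assumes progress > 0 and elapsed_s > 0 so that the rate is defined.
    """
    rate = task.progress / task.elapsed_s
    return (1.0 - task.progress) / rate


def pick_speculation_candidate(
    tasks: Iterable[RunningTask],
) -> Optional[RunningTask]:
    """Return the unfinished task expected to finish last, or None."""
    # Skip finished tasks and tasks with no usable progress information.
    candidates = [
        t for t in tasks
        if 0.0 < t.progress < 1.0 and t.elapsed_s > 0.0
    ]
    if not candidates:
        return None
    return max(candidates, key=estimated_time_left)


if __name__ == "__main__":
    running = [
        RunningTask("fast-node-task", progress=0.9, elapsed_s=90.0),
        RunningTask("slow-node-task", progress=0.2, elapsed_s=100.0),
    ]
    chosen = pick_speculation_candidate(running)
    print(chosen.task_id if chosen else "nothing to speculate")
```

Ranking by estimated time left, rather than by raw progress, is what makes the heuristic suitable for heterogeneous nodes: a task on a slow node is flagged because of its low rate even if it started early. A fuller scheduler would also cap the number of speculative copies and launch them only on fast nodes, which this sketch omits.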
