Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering

Batch Normalization and Its Optimization Techniques: Review

Authors: Niveditha Kumaran¹, Ashlesha Vaidya²

Date of Publication: 9th August 2017

Abstract: Batch normalization is a boon to the training of deep neural networks. It addresses the problem of internal covariate shift and facilitates the use of higher learning rates. It also permits the use of saturating non-linearities and removes the need for dropout as a regulariser. However, mini-batch normalization is not self-sufficient and comes with a few limitations, such as an inability to deal with non-i.i.d. inputs and decreased effectiveness with a batch size of one. In this paper, we explore normalization, the need for its optimization, and evaluate the optimization techniques proposed by researchers.
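As a minimal illustration of the transform reviewed here, the following sketch shows the batch normalization forward pass in the spirit of Ioffe and Szegedy [5]: each feature is normalized with the mean and variance of the current mini-batch and then rescaled by learnable parameters gamma and beta. The function and variable names below are illustrative, not taken from any particular library.

import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    # x: mini-batch of activations, shape (batch_size, num_features)
    # gamma, beta: learnable scale and shift, shape (num_features,)
    mu = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                    # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # zero-mean, unit-variance activations
    return gamma * x_hat + beta            # scale and shift restore expressiveness

# Example: a mini-batch of 32 samples with 4 features
x = np.random.randn(32, 4) * 5.0 + 3.0
gamma, beta = np.ones(4), np.zeros(4)
y = batch_norm_forward(x, gamma, beta)
print(y.mean(axis=0), y.var(axis=0))       # approximately 0 and 1 per feature

Note that with a batch size of one the mini-batch mean equals the single sample, so the normalized activation collapses to zero and the output reduces to beta; this is the small-batch limitation mentioned in the abstract and addressed by batch renormalization [9].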

References:

    1. Sutskever, Ilya, Martens, James, Dahl, George E., and Hinton, Geoffrey E. On the importance of initialization and momentum in deep learning. In ICML (3), volume 28 of JMLR Proceedings, pp. 1139–1147. JMLR.org, 2013.
    2. Shimodaira, Hidetoshi. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90(2):227–244, October 2000.
    3. Srivastava, Nitish, Hinton, Geoffrey, Krizhevsky, Alex, Sutskever, Ilya, and Salakhutdinov, Ruslan. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15(1):1929–1958, January 2014.
    4. LeCun, Y., Bottou, L., Orr, G., and Muller, K. Efficient backprop. In Orr, G. and Muller, K. (eds.), Neural Networks: Tricks of the Trade. Springer, 1998.
    5. Ioffe, S. and Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pp. 448–456, 2015.
    6. Goldberger, J., Roweis, S., Hinton, G., and Salakhutdinov, R. Neighbourhood components analysis. In Advances in Neural Information Processing Systems 17, 2004.
    7. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826, 2016.
    8. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. ImageNet Large Scale Visual Recognition Challenge, 2014.
    9. Ioffe, S. Batch renormalization: Towards reducing minibatch dependence in batch-normalized models. arXiv preprint arXiv:1702.03275, 2017.
    10. Chen, J., Monga, R., Bengio, S., and Jozefowicz, R. Revisiting distributed synchronous SGD. arXiv preprint arXiv:1604.00981, 2016.
    11. https://gab41.lab41.org/batch-normalization-what-the-hey-d480039a9e3b
    12. http://cs231n.github.io/optimization-1/
    13. https://www.semanticscholar.org/paper/ImageNet-pre-trained-models-with-batch-normalization-Simon-Rodner/1d5fe82303712a70c1d231ead2ee03f042d8ad70
