Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering

Multi-View Video Summarization Using RNN and SURF-Based High-Level Moving Object Feature Frames

Authors: Vinsent Paramanantham 1, Dr. S. SureshKumar 2

Date of Publication: 19th May 2022

Abstract: Multi-view video summarization is a process that eases storage consumption, facilitates organized storage, and supports other mainline video-analytics tasks. This in turn helps to search, browse, and retrieve video data quickly, in minimum time and without losing crucial content. In static video summarization there is less challenge with time and sequence issues when rearranging the video synopsis, and low-level features are easy to compute and retrieve. High-level features such as event detection, emotion detection, object recognition, face detection, and gesture detection, however, require comprehension of the video content. This research proposes an approach to overcome the difficulties in handling high-level features. Distinguishable contents in the videos are identified by object detection and a feature-based area strategy. The major aspect of the proposed solution is to retrieve the attributes of a motion source from a video frame. Wavelet decomposition is achieved by dividing the details of the objects available in the video frame. A motion-frequency scoring method records the time of motions in the video. Using the frequency motion feature of a video is a challenge given the continuous change of object shape; therefore, object positions and corner points are spotted using Speeded Up Robust Features (SURF) feature points. Support vector machine clustering extracts keyframes. A memory-based recurrent neural network (RNN) recognizes the objects in the video frames and remembers long sequences; an RNN is an artificial neural network in which nodes form temporal relationships. The attention layer in the proposed RNN network extracts details about the objects in motion. The motion objects identified in the three video clippings are finally summarized using a video summarization algorithm. The simulation was performed in MATLAB R2014b.
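To make the front end of the described pipeline concrete, the following Python/OpenCV sketch illustrates per-frame SURF feature extraction, a simple motion-frequency score from frame differencing, and clustering-based keyframe selection. It is a hypothetical illustration, not the authors' implementation (which was done in MATLAB R2014b): the function names frame_features and keyframe_indices are invented for illustration, SURF requires an opencv-contrib build with non-free modules enabled, and k-means is used here as a simpler stand-in for the paper's support-vector-machine clustering.

# Sketch: SURF descriptors + motion-frequency score per frame, then
# clustering of frame features to pick candidate keyframes.
# Assumes opencv-contrib-python (non-free SURF enabled) and scikit-learn.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def frame_features(video_path, hessian_threshold=400, diff_threshold=25):
    """Return one feature vector per frame: mean SURF descriptor + motion score."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    cap = cv2.VideoCapture(video_path)
    prev_gray, features = None, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # SURF keypoints/descriptors capture object positions and corner points.
        _, desc = surf.detectAndCompute(gray, None)
        desc_mean = desc.mean(axis=0) if desc is not None else np.zeros(64)
        # Motion-frequency score: fraction of pixels changed since the previous frame.
        if prev_gray is None:
            motion = 0.0
        else:
            diff = cv2.absdiff(gray, prev_gray)
            motion = float((diff > diff_threshold).mean())
        prev_gray = gray
        features.append(np.append(desc_mean, motion))
    cap.release()
    return np.array(features)

def keyframe_indices(features, n_keyframes=10):
    """Cluster frame feature vectors and return the frame closest to each centroid."""
    k = min(n_keyframes, len(features))
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
    idx = [int(np.argmin(np.linalg.norm(features - c, axis=1)))
           for c in km.cluster_centers_]
    return sorted(set(idx))

In the proposed method, the keyframes and per-frame motion features selected at this stage would then be passed to the attention-based RNN stage, which orders the recognized moving objects over time before the final multi-view summary is assembled.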

