Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering

Automatic Subtitle Generation In Videos Using Mel Frequency Cepstral Coefficients

Author : Aseem Mahajan, Lakshmi Sureshbabu, Jessal Manidhar R, Kumar Bhagwat, Poonam P. Bari

Date of Publication : 7th March 2016

Abstract: Video is currently one of the most popular forms of multimedia on PCs and the internet. Without proper subtitles, deaf and hearing-impaired viewers find it very difficult to follow the content of a video. The proposed system is a media player that generates subtitles automatically using speech recognition. The input is a video file, from which the audio track is extracted into an audio file. Speech recognition using Mel Frequency Cepstral Coefficients (MFCC) is then performed on the extracted audio. The Java Speech Application Programming Interface (API) is used for speech processing with the help of a grammar design. The generated subtitle file is then synchronized with the input video file. Subtitle generation is performed offline and supports English-language videos only.
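
Two steps of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their system is built on the Java Speech API); it only shows the mel-scale mapping that underlies MFCC features and one possible way to write timed subtitle cues. The 2595·log10(1 + f/700) mel formula is a commonly used convention, the SubRip (.srt) output format is an assumption because the abstract does not name a subtitle format, and all function names are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Map a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(n_filters: int, f_min: float, f_max: float) -> list:
    """Center frequencies (Hz) of filters spaced evenly on the mel scale.

    An MFCC front end places its triangular filters at these centers,
    so resolution is finer at low frequencies than at high ones.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


def srt_timestamp(seconds: float) -> str:
    """Format a time offset as HH:MM:SS,mmm (SubRip style; assumed format)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Render (start_s, end_s, text) tuples as numbered SubRip cues.

    The start and end offsets are what keep the recognized text
    synchronized with the video during playback.
    """
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

In a full system, the recognizer would supply the text and time offsets for each cue; here they would be passed in by hand, for example `to_srt([(0.0, 1.5, "Hello")])`.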
