International Journal of Science and Research (IJSR)
Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064


Case Studies | Computer Science & Engineering | India | Volume 8 Issue 4, April 2019


Video Description and Collision Detection for Visually Impaired

Vinay Kumar Saini | Hitesh Kwatra | Himanshu Narang


Abstract: We propose an approach that uses deep learning to analyse a video and then describes it in understandable language using NLP techniques. For most people, watching a brief video and describing what happened is an easy task. For machines, extracting the meaning from video pixels and generating natural-sounding language is a very complex problem. Solutions have been proposed for narrow domains with a small set of known actions and objects. We plan to extract features for each frame, mean-pool the features across the entire video, and input the pooled vector at every time step to the LSTM network. The LSTM outputs one word at each time step, based on the video features, until it emits the end-of-sentence tag; the resulting words form a sentence describing the events in the video. A sequence model, specifically a Recurrent Neural Network (RNN), then decodes the vector into a sentence. In this work, we plan to show that translating a visual vector into a sequence of English words works for videos as it does for static images. We did this in all the experiments, and it helped considerably with generalization. Another set of weights that could be sensibly initialized is W_e, the word embeddings. We tried initializing them from a large news corpus, but no significant gains were observed, so we decided to leave them uninitialized for simplicity. Lastly, we applied some model-level techniques to avoid overfitting. We tried dropout and ensembling models, and explored the size (i.e., capacity) of the model by trading off the number of hidden units against depth. We also propose a collision detection system so that, along with a description of what is happening around them, the person receives a collision warning if the distance between an object and the person becomes smaller than a certain threshold.
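
The steps named in the abstract (mean pooling of per-frame features, word-by-word decoding until an end-of-sentence tag, and a distance threshold for collision warnings) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `mean_pool`, `greedy_decode`, `collision_warning`, the `<eos>` tag string, and the scripted `toy_step` function standing in for a trained LSTM are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

EOS = "<eos>"  # hypothetical end-of-sentence tag; the abstract does not name one


def mean_pool(frame_features: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors into a single video-level vector."""
    if not frame_features:
        raise ValueError("need at least one frame")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frames must have the same feature size")
    n = len(frame_features)
    return [sum(f[i] for f in frame_features) / n for i in range(dim)]


# A step function takes the pooled video vector and the previous state,
# and returns the next word plus the new state (stand-in for one LSTM step).
StepFn = Callable[[List[float], Optional[int]], Tuple[str, int]]


def greedy_decode(step: StepFn, video_vector: List[float], max_len: int = 20) -> List[str]:
    """Feed the same video vector at every time step until EOS is produced."""
    words: List[str] = []
    state: Optional[int] = None
    for _ in range(max_len):
        word, state = step(video_vector, state)
        if word == EOS:
            break
        words.append(word)
    return words


def collision_warning(distance: float, threshold: float) -> bool:
    """Warn when the object is closer to the person than the threshold."""
    return distance < threshold


def toy_step(video_vector: List[float], state: Optional[int]) -> Tuple[str, int]:
    """Scripted stand-in for a trained model; it ignores the video vector."""
    script = ["a", "person", "walks", EOS]
    i = 0 if state is None else state
    return script[i], i + 1


if __name__ == "__main__":
    pooled = mean_pool([[1.0, 2.0], [3.0, 4.0]])
    print(pooled)
    print(" ".join(greedy_decode(toy_step, pooled)))
    print(collision_warning(0.5, 1.0))
```

In a real system the step function would be replaced by a trained LSTM, and the distance would come from an object detector or a depth estimate. Units and the threshold value are left open here because the abstract does not specify them.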


Keywords: NLP, Computer Vision, Deep Learning, RNN, LSTM, Collision Detection, Object Detection


Edition: Volume 8 Issue 4, April 2019


Pages: 1306 - 1308


How to Cite this Article?

Vinay Kumar Saini, Hitesh Kwatra, Himanshu Narang, "Video Description and Collision Detection for Visually Impaired", International Journal of Science and Research (IJSR), Volume 8 Issue 4, April 2019, pp. 1306-1308, https://www.ijsr.net/get_abstract.php?paper_id=ART20197124
