Mrunmayee Patil, Ramesh Kagalkar
Abstract: An image can be defined as a matrix of square pixels arranged in rows and columns. Image processing is a leading technology that enhances raw images captured in day-to-day life by devices such as cameras and mobile phones, for use in various applications. An image-to-text-and-speech conversion system can help blind and physically challenged people understand the scene shown in an image. The core idea of such a system is to overcome the challenges that a blind person faces in real life. Image segmentation and edge detection play an important role in implementing it. We formulate the interaction between image segmentation and object recognition in the framework of the Canny algorithm. The system goes through several phases: preprocessing, feature extraction, object recognition, edge detection, image segmentation and text-to-speech (TTS) conversion. Its database consists of a large set of sample images, which helps to identify similar kinds of objects across different images. The system consists of two main modules: image-to-text and text-to-speech. The image-to-text module generates text descriptions in natural language based on its understanding of the image. The text-to-speech module converts that natural-language text into synthesized speech.
Keywords: Image Processing, Feature Extraction, Edge Detection, Image Segmentation, Object Recognition, Text-to-Speech (TTS)
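
To make the edge detection phase named in the abstract concrete, the following is a minimal sketch, not the system's actual implementation. It treats a grayscale image as a matrix of pixel rows, computes Sobel gradient magnitudes and applies a single threshold. This is only the gradient step on which the Canny algorithm builds; full Canny also applies Gaussian smoothing, non-maximum suppression and hysteresis thresholding. The function names `gradient_magnitude` and `edge_map` and the default threshold of 100 are illustrative assumptions.

```python
import math

# Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_magnitude(image):
    """Return the Sobel gradient magnitude of a grayscale image.

    `image` is a list of equal-length rows of intensities (hypothetical
    input format). Border pixels are left at 0.0 because the 3x3 kernel
    does not fit there.
    """
    rows, cols = len(image), len(image[0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            magnitude[r][c] = math.hypot(gx, gy)
    return magnitude


def edge_map(image, threshold=100.0):
    """Mark pixels whose gradient magnitude reaches `threshold` as edges (1)."""
    return [
        [1 if value >= threshold else 0 for value in row]
        for row in gradient_magnitude(image)
    ]


if __name__ == "__main__":
    # A 5x6 test image: dark left half, bright right half.
    sample = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
    for row in edge_map(sample):
        print(row)
```

A single global threshold keeps the sketch short; a full Canny detector instead uses two thresholds with hysteresis, so that weak edge pixels survive only when they connect to strong ones.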