Volume 15 / Issue 18

DOI: 10.3217/jucs-015-18-3325


Robust Extraction of Text from Camera Images using Colour and Spatial Information Simultaneously

Shyama Prosad Chowdhury (Queen's University Belfast, United Kingdom)

Soumyadeep Dhar (Videonetics Technology Pvt. Ltd., India)

Karen Rafferty (Queen's University Belfast, United Kingdom)

Amit Kumar Das (Bengal Engineering and Science University, India)

Bhabatosh Chanda (Indian Statistical Institute, India)

Abstract: The extraction of text from camera-based colour scene images is growing rapidly in importance and use. Text within a camera-grabbed image can carry a large amount of metadata about the scene, which can be useful for identification, indexing and retrieval. While the segmentation and recognition of text from document images is quite successful, the detection of coloured scene text remains a new challenge for camera-based images. Common problems in extracting text from such images are the lack of prior knowledge of text features such as colour, font, size and orientation, and of the location of the probable text regions. In this paper, we document the development of a fully automatic and extremely robust text segmentation technique that can be used for any type of camera-grabbed frame, be it a single image or video. A new algorithm is proposed that can overcome the current problems of text segmentation. The algorithm exploits text appearance in terms of colour and spatial distribution. When the new text extraction technique was tested on a variety of camera-based images, it was found to outperform existing techniques. The proposed technique also overcomes problems that can arise from an unconstrained complex background. The novelty of the work arises from the fact that this is the first time that colour and spatial information have been used simultaneously for text extraction.

Keywords: camera image, discrete edge boundary, text extraction, text localisation, video frame

Categories: I.4, I.4.6, I.4.8