Exploiting edge features for scene text understanding systems and scene text searching applications

Abstract: Scene text has attracted increasing interest in recent years, as witnessed by a large number of applications such as car licence plate recognition, navigation systems, and self-driving cars that rely on traffic signs. In this research, we tackle the challenges of designing robust and reliable automatic scene text reading systems. The two major steps of such a system, scene text localization and scene text recognition, have been studied, and novel algorithms have been developed to address them. Our work is based on the observation that providing primary scene text regions with a high probability of being text is very important for localizing and recognizing text in scenes. This factor influences both the accuracy and the efficiency of detection and recognition systems. Inspired by the success of object proposal research in general object detection and recognition, two state-of-the-art scene text proposal techniques are proposed, namely Text-Edge-Box (TEB) and Max-Pooling Text Proposal (MPT). In TEB, the proposed bottom-up features, extracted from binary Canny edge maps, are used to group edge connected components into proposals and to score them. In MPT, a novel grouping solution inspired by the max-pooling idea is proposed. Unlike existing grouping techniques, it does not rely on any text-specific heuristic rules or thresholds to make grouping decisions. Based on the proposed scene text proposal techniques, we designed an end-to-end scene text reading system by integrating the proposals with state-of-the-art scene text recognition models, in which false-positive proposal suppression and word recognition are performed concurrently. Furthermore, we developed an assisted scene text searching system by building a web-page user interface on top of the proposed end-to-end system.
The system can be accessed from any smart device at the link:
Experiments on various public scene text datasets show that the proposed scene text proposal techniques outperform other state-of-the-art scene text proposal methods under different evaluation frameworks. The designed end-to-end system also outperforms other scene-text-proposal-based end-to-end systems and is competitive with other systems presented in the robust reading competition community. It ranked fifth in the champion list (Dec-2017): =evaluation&task=4.
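As a rough illustration of what "grouping edge connected components into proposals" can mean, the following minimal sketch labels 8-connected components in a binary edge map and merges horizontally neighbouring components into box proposals. It is not the TEB or MPT algorithm from the thesis: the function names, the toy edge map, and the `max_gap` parameter are all assumptions made for illustration, and `max_gap` is exactly the kind of hand-set threshold that the abstract says MPT avoids. Scoring of proposals is omitted.

```python
from collections import deque


def edge_components(edge_map):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected edge components."""
    h = len(edge_map)
    w = len(edge_map[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not edge_map[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one component.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and edge_map[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def group_into_proposals(boxes, max_gap=2):
    """Greedily merge boxes, left to right, when they overlap vertically
    and are separated by at most max_gap empty columns (hypothetical rule)."""
    proposals = []
    for box in sorted(boxes):
        if proposals:
            px0, py0, px1, py1 = proposals[-1]
            x0, y0, x1, y1 = box
            overlaps_vertically = y0 <= py1 and py0 <= y1
            if overlaps_vertically and x0 - px1 - 1 <= max_gap:
                proposals[-1] = (px0, min(py0, y0), max(px1, x1), max(py1, y1))
                continue
        proposals.append(box)
    return proposals


# Toy binary edge map: two nearby "characters" on one line, one isolated blob.
EDGE_MAP = [
    [1, 1, 0, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
]

components = edge_components(EDGE_MAP)
proposals = group_into_proposals(components)
```

On the toy map, the two adjacent components in the top rows are merged into a single proposal, while the isolated component at the bottom right stays separate. In practice the edge map would come from a Canny detector applied to a real image rather than a hand-written grid.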
Dinh Nguyen Van. Exploiting edge features for scene text understanding systems and scene text searching applications. Artificial Intelligence [cs.AI]. Sorbonne Université; Nanyang Technological University, 2018. English. ⟨NNT : 2018SORUS473⟩. ⟨tel-02924995⟩



