Re-Attention Is All You Need: Memory-Efficient Scene Text Detection via Re-Attention on Uncertain Regions

Authors: Hsiang Chun Chang; Hung Jen Chen; Yu Chia Shen; Hong Han Shuai; Wen-Huang Cheng
Dates: 2023-02-20; 2023-02-20; 2021-01-01
Type: conference paper
ISBN: 9781665417143
ISSN: 2153-0858
DOI: 10.1109/IROS51168.2021.9636510
Scopus ID: 2-s2.0-85124373433
Scopus URL: https://api.elsevier.com/content/abstract/scopus_id/85124373433
Handle: https://scholars.lib.ntu.edu.tw/handle/123456789/628552
SDGs: SDG4

Abstract: Scene text detection plays an important role in vision-based robot navigation toward potential landmarks such as nameplates, information signs, and floor buttons in elevators. Recently, segmentation-based methods for scene text detection have received increasing attention, because segmentation results can be used to efficiently predict scene text of various shapes, including the irregular text found in most scene text images. However, two kinds of text remain unsolved: 1) tiny and 2) blurry instances. Moreover, the annotations for tiny or blurry text are usually ignored during training, even though such text can still offer visual cues that help robots understand the world. Therefore, in this paper, we propose a new approach to effectively detect both clear and blurry text. Specifically, we propose a re-attention module that adds no learnable parameters: it first predicts text regions as candidate regions, then applies the same network to the candidate regions again, which reduces the required memory. Moreover, to keep errors in the first detection from propagating to the re-attended area, we propose a new fusion module that learns to integrate the results from the re-attended regions with the first prediction. Experimental results show that the proposed method outperforms state-of-the-art methods on four challenging datasets.
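The two-pass idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_detector` is a hypothetical stand-in for the segmentation network (a 3x3 mean filter), the uncertainty thresholds `lo` and `hi` and the zoom factor `k` are assumed values, and the fixed weighted average in `re_attend` is only a placeholder for the learned fusion module. The sketch shows one detector being applied twice, first to the full image and then to an enlarged crop of the uncertain region, so no second set of parameters is needed.

```python
def toy_detector(img):
    """Hypothetical stand-in for the text detector: zero-padded 3x3 mean filter."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def uncertain_box(prob, lo=0.05, hi=0.5):
    """Bounding box (y0, x0, y1, x1), end-exclusive, of scores in [lo, hi].

    The thresholds are assumptions chosen for this toy example.
    Returns None when no pixel is uncertain.
    """
    pts = [(y, x) for y, row in enumerate(prob)
           for x, p in enumerate(row) if lo <= p <= hi]
    if not pts:
        return None
    ys = [p[0] for p in pts]
    xs = [p[1] for p in pts]
    return min(ys), min(xs), max(ys) + 1, max(xs) + 1


def upsample(img, k):
    """Nearest-neighbour enlargement by an integer factor k."""
    return [[v for v in row for _ in range(k)] for row in img for _ in range(k)]


def downsample_max(img, k):
    """Max-pool by k to bring the second prediction back to crop resolution."""
    h, w = len(img) // k, len(img[0]) // k
    return [[max(img[y * k + dy][x * k + dx]
                 for dy in range(k) for dx in range(k))
             for x in range(w)] for y in range(h)]


def re_attend(img, detector, k=3, fuse_w=0.5):
    """Run the same detector twice and blend the results in the uncertain box."""
    first = detector(img)                      # first pass on the full image
    box = uncertain_box(first)
    if box is None:
        return first
    y0, x0, y1, x1 = box
    crop = [row[x0:x1] for row in img[y0:y1]]  # candidate region only
    second = downsample_max(detector(upsample(crop, k)), k)  # second pass
    fused = [row[:] for row in first]
    for y in range(y0, y1):
        for x in range(x0, x1):
            # Fixed-weight blend; the paper's fusion module is learned instead.
            fused[y][x] = ((1 - fuse_w) * first[y][x]
                           + fuse_w * second[y - y0][x - x0])
    return fused


# A single bright pixel stands in for a tiny text instance.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
first_pass = toy_detector(image)
fused_map = re_attend(image, toy_detector)
```

In this toy setting the isolated pixel scores low in the first pass because the filter averages it with its empty neighbours, while the enlarged crop lets the same filter respond strongly. Only the cropped region is processed at the higher resolution, which is the intuition behind the memory saving the abstract claims.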