Text Spotting in Natural Scenes Based on Feature Pyramid Neural Network
Journal
Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
ISBN
9781665480451
Date Issued
2022-01-01
Author(s)
Liou, Guojhen
Abstract
Natural scene text recognition (STR) is a popular research area in computer vision because of its many applications. Earlier studies relied mainly on hand-crafted features, which often limited recognition performance. In recent years, deep neural networks have brought significant progress to STR tasks. This study analyzes and evaluates model optimization strategies for text detection and text recognition. For text detection, the EAST model serves as the baseline, and both its feature-extraction backbone and the feature-fusion stage in the middle of the network are optimized. For text recognition, the SRN model serves as the baseline, and its feature-extraction backbone is optimized. For end-to-end integration, a text orientation classifier based on MobileNetV3 is trained so that the system can recognize vertically written text. Experimental results show that, in the detection task, the improved model raises precision by 6.9%, recall by 2.3%, and F-measure by 4.6% compared with the original EAST model [1]. In the recognition task, it raises accuracy by 8.8% and normalized edit distance by 9.7% compared with the original SRN model [2]. Finally, the two tasks are integrated into an end-to-end system architecture that handles vertically written Chinese text, making the approach more practical. The models are available at https://github.com/Isaac93546/SRN_OCR.
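The abstract reports precision, recall, F-measure, and normalized edit distance. The Python sketch below shows one common way such metrics are computed. It is an illustration under stated assumptions, not the evaluation code from the paper: the function names are hypothetical, and it assumes that the edit-distance metric is reported as a similarity (1 minus the Levenshtein distance divided by the longer string length), which would explain why a higher value counts as an improvement.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_similarity(pred: str, truth: str) -> float:
    """Assumed metric: 1 - edit_distance / max(len(pred), len(truth))."""
    longest = max(len(pred), len(truth))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(pred, truth) / longest
```

Averaging normalized_similarity over every (prediction, ground truth) pair in a test set would give a single score in the range 0 to 1, and f_measure combines the detection precision and recall into one number.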
Subjects
Computer vision | convolutional neural network | scene text | text detection | text recognition
Type
conference paper
