https://scholars.lib.ntu.edu.tw/handle/123456789/629046
Title: | Text Spotting in Natural Scenes Based on Feature Pyramid Neural Network | Authors: | CHIEN-KANG HUANG Liou, Guojhen |
Keywords: | Computer vision | convolutional neural network | scene text | text detection | text recognition | Issue Date: | 1-Jan-2022 | Source publication: | Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022 | Abstract: | Natural scene text recognition (STR) is a popular research area in computer vision because of its many applications. Earlier studies relied mainly on hand-crafted features, which often limited recognition performance. In recent years, deep neural networks have brought significant progress to STR. This study analyzes and evaluates model optimization strategies for text detection and text recognition. For text detection, the EAST model serves as the baseline, and both its front-end feature-extraction backbone and its mid-network feature-fusion stage are optimized. For text recognition, the SRN model serves as the baseline, and its front-end feature-extraction backbone is optimized. For end-to-end integration, a text orientation classifier based on MobileNetV3 is trained so that vertically written text can be recognized. Experimental results show that, in the detection task, the improved model raises precision by 6.9%, recall by 2.3%, and F-measure by 4.6% over the original EAST model [1]. In the recognition task, it raises accuracy by 8.8% and normalized edit distance by 9.7% over the original SRN model [2]. Finally, the two tasks are integrated into an end-to-end system that handles vertically written Chinese text, which makes the algorithm more practical. The models are available at https://github.com/Isaac93546/SRN_OCR. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/629046 | ISBN: | 9781665480451 | DOI: | 10.1109/BigData55660.2022.10020609 |
Appears in Collections: | Department of Engineering Science and Ocean Engineering |
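The abstract reports detection results as precision, recall, and F-measure, and recognition results partly as normalized edit distance. As a rough illustration of how such figures are commonly computed, the Python sketch below may help. It is not taken from the paper or its repository: the function names are hypothetical, the detection counts are assumed to come from some box-matching step that is not shown, and the edit-distance metric is assumed to be the similarity form (1 minus the Levenshtein distance divided by the longer string length), which would be consistent with the abstract describing an increase as an improvement.

```python
def detection_scores(true_positives, num_predictions, num_ground_truth):
    """Precision, recall, and F-measure from match counts (hypothetical helper)."""
    precision = true_positives / num_predictions if num_predictions else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    total = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_similarity(predicted, reference):
    """Assumed metric form: 1 - distance / max(len); higher is better."""
    longest = max(len(predicted), len(reference))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(predicted, reference) / longest
```

For example, `detection_scores(8, 10, 16)` corresponds to 8 correct boxes out of 10 predicted and 16 annotated, and `normalized_edit_similarity("abcd", "abcf")` compares a predicted string with its reference after one substitution.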