Authors: Wang, Cyun Bo; JIAN-JIUN DING
Dates: 2024-01-12; 2024-01-12; 2023-01-01
ISBN: 9798350300673
Handle: https://scholars.lib.ntu.edu.tw/handle/123456789/638461
Abstract: This paper presents EffSegmentNet, a powerful real-time semantic segmentation model. It consists of two components: (1) a novel MetaFormer-based encoder, termed EffVisionFormer, which captures multiscale image features efficiently; and (2) a lightweight decoder that uses the multiscale features from the encoder to produce fast yet accurate segmentation results. The proposed EffSegmentNet achieves a strong balance of inference speed, accuracy, and parameter count. On the Cityscapes test set, we attain 71.9% mIoU at 195.1 frames per second (FPS) on an NVIDIA RTX 2080Ti card. Furthermore, the proposed EffSegmentNet uses only 4.4 million parameters, which demonstrates its advantage for real-time segmentation.
Title: EffSegmentNet: Efficient Design for Real-time Semantic Segmentation
Type: conference paper
DOI: 10.1109/APSIPAASC58517.2023.10317131
Scopus EID: 2-s2.0-85180007721
Scopus: https://api.elsevier.com/content/abstract/scopus_id/85180007721
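
The mIoU figure quoted in the abstract is a mean of per-class intersection-over-union scores. The following is a minimal sketch of how such a score is commonly computed from a confusion matrix; it is not taken from the paper. The function name `mean_iou`, the flattened-label input format, and the `ignore_index=255` default (a common convention for unlabeled pixels) are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over the classes that occur.

    pred, target: flat sequences of integer class labels, one per pixel.
    ignore_index: hypothetical label marking pixels excluded from scoring.
    """
    # confusion[t][p] counts pixels whose ground truth is t and prediction is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not contribute
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        if union > 0:  # classes absent from both pred and target are skipped
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Two classes, four pixels: class 0 IoU = 1/2, class 1 IoU = 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

To score a whole dataset under this sketch, the labels of all images would be concatenated (or the confusion matrix accumulated across images) before taking the per-class ratios, rather than averaging per-image scores.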