Liew L.-Y; SHENG-DE WANG
2023-06-09
2023-06-09
2022
0747668X
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85127030796&doi=10.1109%2fICCE53296.2022.9730159&partnerID=40&md5=91986267b58fa5d7b236641c6e458ac1
https://scholars.lib.ntu.edu.tw/handle/123456789/632360
Object detection tasks implemented with complex convolutional neural network (CNN) algorithms are both computationally and memory intensive, which makes them difficult to deploy on CPU-only embedded systems with limited edge computing capabilities. Heterogeneous multiprocessor systems are well suited to these tasks. Such systems usually integrate a CPU with other processing units, such as a GPU, DSP, or FPGA, so that each task runs on the unit that can perform it with the best performance and energy efficiency. This paper proposes a workflow that combines a series of optimization approaches, namely model pruning, model quantization, and a multi-threading design, to implement an object detection task based on YOLOv4-CSP on an FPGA-based heterogeneous multiprocessor system. The YOLOv4-CSP network architecture is a state-of-the-art one-stage detection model, widely known for its fast inference time in object detection tasks. The experiments show that significant edge performance can be achieved with fewer computing resources when implementing object detection with complex CNN algorithms.
© 2022 IEEE.
edge-computing; FPGA SoCs; heterogeneous multiprocessor system; model optimization; YOLOv4-CSP
[SDGs]SDG7
Complex networks; Convolutional neural networks; Deep learning; Embedded systems; Energy efficiency; Field programmable gate arrays (FPGA); Multilayer neural networks; Multiprocessing systems; Network architecture; Network-on-chip; Object detection; Object recognition; Convolutional neural network; Edge computing; Embedded-system; FPGA soc; Heterogeneous multiprocessor systems; Model optimization; Neural networks algorithms; Objects detection; Performance optimizations; YOLOv4-CSP
Object Detection Edge Performance Optimization on FPGA-Based Heterogeneous Multiprocessor Systems
conference paper
10.1109/ICCE53296.2022.9730159
2-s2.0-85127030796