PerfNetRT: Platform-Aware Performance Modeling for Optimized Deep Neural Networks
Journal
Proceedings - 2020 International Computer Symposium, ICS 2020
Pages
153-158
Date Issued
2020
Author(s)
Abstract
As deep learning techniques based on artificial neural networks are widely applied across diverse application domains, the performance that such models deliver on the target hardware platform should be taken into account during system design in order to meet application-specific timing requirements. Specifically, neural network optimization frameworks are available for boosting the execution efficiency of a trained model on vendor-specific hardware platforms, e.g., OpenVINO [1] for Intel hardware and TensorRT [2] for NVIDIA GPUs, and system designers need access to the estimated performance of the optimized models running on the specific hardware in order to make better design decisions. In this work, we have developed PerfNetRT to facilitate the design decision-making process by offering the estimated inference time of a trained model that is optimized for an NVIDIA GPU using TensorRT. Our preliminary results show that PerfNetRT is able to produce accurate estimates of the inference time for popular models, including LeNet [3], AlexNet [4], and VGG16 [5], that are optimized with TensorRT and run on an NVIDIA GTX 1080Ti. © 2020 IEEE.
Subjects
benchmark; machine learning; machine learning accelerators; performance prediction
Other Subjects
Computer hardware; Deep learning; Deep neural networks; Learning systems; Program processors; Systems analysis; Application specific; Diverse applications; Learning techniques; Neural network optimization; Performance Model; Specific hardware; System design process; Timing requirements; Neural networks
Type
conference paper
