Low Precision Deep Learning Training on Mobile Heterogeneous Platform
Journal
Proceedings - 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018
Pages
109-117
Date Issued
2018
Author(s)
Abstract
Recent advances in System-on-Chip architectures have made deep learning suitable for a number of applications on mobile devices. Unfortunately, due to the computational cost of neural network training, deep learning on mobile devices is often limited to inference tasks, e.g., prediction. In this paper, we propose a deep learning framework that enables both training and inference tasks on mobile devices. While accommodating the heterogeneity of computing hardware found on mobile devices, it uses OpenCL to efficiently leverage modern SoC capabilities, e.g., multi-core CPUs, integrated GPUs, and shared memory architecture, and to accelerate deep learning computation. In addition, our system encodes the arithmetic operations of deep networks down to 8-bit fixed-point on mobile devices. As a proof of concept, we trained three well-known neural networks on mobile devices and observed significant performance gains, energy consumption reductions, and memory savings. © 2018 IEEE.
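The abstract mentions encoding network arithmetic down to 8-bit fixed-point but does not detail the scheme. Below is a minimal illustrative sketch, not the paper's implementation, of symmetric 8-bit quantization of a float buffer in C++; the function names and the single per-buffer scale factor are assumptions chosen for illustration.

// Sketch: map the largest magnitude in a float buffer to 127 and
// round every element to int8; dequantize by dividing by the scale.
#include <cstdint>
#include <cmath>
#include <vector>
#include <algorithm>
#include <cstdio>

std::vector<int8_t> quantize_int8(const std::vector<float>& x, float& scale_out) {
    float max_abs = 0.0f;
    for (float v : x) max_abs = std::max(max_abs, std::fabs(v));
    // Guard against an all-zero buffer.
    float scale = (max_abs > 0.0f) ? (127.0f / max_abs) : 1.0f;
    std::vector<int8_t> q(x.size());
    for (size_t i = 0; i < x.size(); ++i) {
        float r = std::round(x[i] * scale);
        r = std::min(127.0f, std::max(-127.0f, r));  // clamp to int8 range
        q[i] = static_cast<int8_t>(r);
    }
    scale_out = scale;
    return q;
}

float dequantize(int8_t q, float scale) { return static_cast<float>(q) / scale; }

int main() {
    std::vector<float> w = {0.12f, -0.53f, 0.97f, -0.08f};
    float scale = 1.0f;
    std::vector<int8_t> qw = quantize_int8(w, scale);
    for (size_t i = 0; i < w.size(); ++i)
        std::printf("%f -> %d -> %f\n", w[i], qw[i], dequantize(qw[i], scale));
    return 0;
}

In practice such quantization would be applied per layer (or per channel) to weights and activations before dispatching the 8-bit kernels to the CPU or integrated GPU via OpenCL; the single-scale scheme here is only the simplest variant.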
SDGs
Other Subjects
Energy utilization; Fixed point arithmetic; Memory architecture; Mobile computing; Network architecture; Neural networks; Program processors; Programmable logic controllers; System-on-chip; GPGPU; Heterogeneous platforms; Heterogeneous systems; Neural network training; OpenCL; Shared memory architecture; System-on-chip architecture; Transfer learning; Deep learning
Type
conference paper