Title: GrateTile: Efficient Sparse Tensor Tiling for CNN Processing
Authors: Lin Y.-S.; Lu H.-C.; Tsao Y.-B.; Chih Y.-M.; Chen W.-C.; Chien S.-Y.
Document type: Conference paper
Publication year: 2020
Date available: 2021-09-02
DOI: 10.1109/SiPS50750.2020.9195243
Scopus EID: 2-s2.0-85096804138
URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85096804138&doi=10.1109%2fSiPS50750.2020.9195243&partnerID=40&md5=e37a8c6f339c549ed02371faba6853af
URL: https://scholars.lib.ntu.edu.tw/handle/123456789/581095
Abstract: We propose GrateTile, an efficient, hardware-friendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized sub-tensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompress sub-tensors on the fly in a tiled processing manner. GrateTile is suitable for architectures that favor aligned, coalesced data access, and it requires only minimal changes to the overall architectural design. We simulate GrateTile with state-of-the-art CNNs and show an average 55% DRAM bandwidth reduction while using only 0.6% of the feature map size for indexing storage. © 2020 IEEE.
Author keywords: Data Compression; Neural Network Hardware; Sparse Matrix
Indexed keywords: Dynamic random access storage; Indexing (of information); Signal processing; Tensors; Bandwidth reduction; Data access; Data storage; Feature maps; Indexing storage; On-the-fly; Sparse tensors; State of the art; Silicon compounds
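
The record does not spell out GrateTile's exact storage format, but the abstract's core idea (tile-wise compression of a sparse feature map with a small offset index so that any tile can be fetched and decompressed independently) can be sketched as follows. This is a minimal illustration under assumed choices (a per-tile nonzero bitmap plus packed nonzero values; uniform rather than uneven tile sizes), not the paper's actual encoding:

```python
def compress_tiles(fmap, tile=2):
    # fmap: H x W feature map as a list of lists.
    # Returns a packed stream of nonzero values, a per-tile
    # occupancy bitmap, and a per-tile offset into the stream.
    # The offsets are the small index that makes the compressed
    # format randomly accessible tile by tile.
    H, W = len(fmap), len(fmap[0])
    values, bitmaps, offsets = [], [], []
    for i in range(0, H, tile):
        for j in range(0, W, tile):
            offsets.append(len(values))
            bitmap = []
            for r in range(i, min(i + tile, H)):
                for c in range(j, min(j + tile, W)):
                    v = fmap[r][c]
                    bitmap.append(v != 0)
                    if v != 0:
                        values.append(v)
            bitmaps.append(bitmap)
    return values, bitmaps, offsets

def fetch_tile(values, bitmaps, offsets, k):
    # Decompress tile k on the fly without touching other tiles:
    # walk its bitmap, pulling nonzeros from the packed stream
    # starting at offsets[k] and filling zeros elsewhere.
    out, pos = [], offsets[k]
    for bit in bitmaps[k]:
        if bit:
            out.append(values[pos])
            pos += 1
        else:
            out.append(0)
    return out
```

For example, a 4x4 map split into 2x2 tiles yields four independently decodable tiles; `fetch_tile` reconstructs any one of them from the packed stream and its bitmap alone, which is the property that lets an accelerator process tiles in any order.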