RM-SSD: In-Storage Computing for Large-Scale Recommendation Inference
Journal
Proceedings - International Symposium on High-Performance Computer Architecture
Journal Volume
2022-April
Pages
1056-1070
Date Issued
2022
Author(s)
Abstract
To meet the strict service-level agreement (SLA) requirements of recommendation systems, the entire set of embeddings must be loaded into memory. However, as production-scale recommendation models and datasets grow, the size of the embeddings is approaching the limit of memory capacity. Limited physical memory constrains the algorithms that can be trained and deployed, posing a severe challenge for deploying advanced recommendation systems. Recent studies offload embedding lookups to SSDs, targeting embedding-dominated recommendation models. This paper takes it one step further and proposes to offload the entire recommendation system to an SSD with in-storage computing capability. The proposed SSD-side FPGA solution leverages a low-end FPGA to speed up both embedding-dominated and MLP-dominated models with high resource efficiency. We evaluate the performance of the proposed solution with a prototype SSD. Results show a 20-100× throughput improvement over the baseline SSD and a 1.5-15× improvement over the state of the art. © 2022 IEEE.
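For context, a minimal sketch of the workload shape the abstract describes: sparse embedding lookups dominate memory capacity, while the dense MLP dominates compute. All names, table sizes, and layer widths below are hypothetical illustrations, not the paper's implementation.

import numpy as np

NUM_TABLES = 8            # one embedding table per sparse feature (hypothetical count)
ROWS_PER_TABLE = 10**4    # kept small here; production tables reach 10^8+ rows
EMB_DIM = 64

# Embedding tables: the capacity bottleneck that recent work offloads to SSD.
tables = [np.random.randn(ROWS_PER_TABLE, EMB_DIM).astype(np.float32)
          for _ in range(NUM_TABLES)]

# Dense MLP weights: small by comparison, compute-bound rather than
# capacity-bound (the "MLP-dominated" side of the workload).
W1 = np.random.randn(NUM_TABLES * EMB_DIM, 256).astype(np.float32)
W2 = np.random.randn(256, 1).astype(np.float32)

def infer(sparse_ids):
    """One inference: gather one row per table, then a small MLP."""
    # Each lookup is a random read into a huge table -- the access pattern
    # that in-storage computing can serve directly from flash.
    gathered = [tables[t][i] for t, i in enumerate(sparse_ids)]
    x = np.concatenate(gathered)
    h = np.maximum(x @ W1, 0.0)      # ReLU
    return float((h @ W2)[0])

score = infer(np.random.randint(0, ROWS_PER_TABLE, size=NUM_TABLES))
print(f"click-through score: {score:.4f}")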
Subjects
n/a
Other Subjects
Embeddings; Field programmable gate arrays (FPGA); Computing capability; Large-scale; Lookups; Memory capacity; Physical memory; Production scale; Scale-up; Service-level agreement (SLA); Recommender systems
Type
conference paper