Jiing-Ping Wang; Ming-Guang Lin; An-Yeu Andy Wu
2024-10-14
2024-10-14
2024-04-22
https://scholars.lib.ntu.edu.tw/handle/123456789/722017
LATTE: Low-Precision Approximate Attention with Head-wise Trainable Threshold for Efficient Transformer
conference paper
10.1109/aicas59952.2024.10595864