SMITH: A Self-supervised Downstream-Aware Framework for Missing Testing Data Handling
Journal
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Journal Volume
13281 LNAI
Pages
499-510
Date Issued
2022
Author(s)
Abstract
Missing values in testing data have been a notorious problem in the machine learning community, since they can severely degrade the performance of a downstream model trained on complete data without any precaution. To perform prediction with such a downstream model, the missing values must be imputed first; therefore, the imputation quality and how to exploit the knowledge provided by the pre-trained, fixed downstream model are the keys to addressing this problem. In this paper, we address this problem with a focus on models learned from tabular data. We present SMITH, a novel Self-supervised downstream-aware framework for MIssing Testing data Handling, which consists of a transformer-based imputation model and a downstream label estimation algorithm. The former can be replaced by any existing imputation model of interest, yielding an additional performance gain over that model's original design. By advancing two self-supervised tasks and using the downstream model's predictions to guide the learning of our transformer-based imputation model, SMITH performs favorably against state-of-the-art methods on several benchmark datasets. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
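The core idea of the abstract (imputing a missing test-time value so that a fixed, pre-trained downstream model can still predict well) can be illustrated with a minimal sketch. This is not the authors' SMITH method: the logistic-regression scorer, the grid search over candidate values, and the mean-based penalty standing in for a self-supervised signal are all assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical fixed downstream model: a logistic-regression scorer
# trained on complete data (weights chosen for illustration only).
W = np.array([1.5, -2.0, 0.8])
b = 0.1

def predict_proba(x):
    """Confidence of the fixed downstream model for class 1."""
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

# A test row with a missing value in feature index 1.
x_test = np.array([0.4, np.nan, -0.2])
col_mean = 0.5  # training-set mean of that feature (assumed known)

# Downstream-aware imputation: pick the candidate value whose imputed
# row the fixed model classifies most confidently, regularized toward
# the column mean (a crude stand-in for a self-supervised signal).
def objective(v):
    x = x_test.copy()
    x[1] = v
    p = predict_proba(x)
    confidence = max(p, 1.0 - p)        # distance from the decision boundary
    prior = -0.1 * (v - col_mean) ** 2  # penalty for straying from the mean
    return confidence + prior

candidates = np.linspace(-2.0, 2.0, 81)
best = max(candidates, key=objective)

x_imputed = x_test.copy()
x_imputed[1] = best
print(x_imputed)
```

SMITH instead learns a transformer-based imputation model under self-supervised and downstream-prediction objectives, but the sketch shows the same principle: the downstream model's output, not only the data distribution, informs how the gap is filled.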
Subjects
Downstream-aware; Missing testing data; Self-supervised learning; Tabular data; Transformer
Other Subjects
Electric transformer testing; Machine learning; Machine learning communities; Missing values; Performance; Testing data; Data handling
Type
conference paper
