Generalization-Aware Zero-Shot Neural Architecture Search for Self-Supervised Transformers
Journal
Proceedings of the International Joint Conference on Neural Networks
Start Page
1
End Page
8
Date Issued
2025-11-14
Author(s)
Ko, Jun-Hua
Abstract
Neural Architecture Search (NAS) aims to automate the design of neural networks, enabling the discovery of highly effective architectures. Recent advances in NAS have shown significant success in identifying high-performing Transformer architectures for computer vision and natural language processing tasks. However, most NAS research has focused on supervised learning frameworks, which rely heavily on labeled data; this dependence makes real-world deployment challenging, given the high cost of annotation. Moreover, previous studies often prioritize raw performance while neglecting generalization ability, particularly in scenarios with limited labeled data. To address these challenges, this study introduces a generalization-aware zero-shot proxy based on self-supervised learning. By combining this proxy with a complementary zero-shot proxy, we identify architectures that balance generalization ability and expressivity. Experimental results demonstrate that the architectures discovered by the proposed approach achieve competitive performance on the ImageNet and WikiText-2 datasets while reducing the required labeled data by up to 75% and 99%, respectively.
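The abstract's core recipe, scoring untrained candidate architectures with two zero-shot proxies and fusing the scores to pick a winner, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the two proxy functions are hypothetical placeholders (the actual self-supervised generalization proxy and the complementary expressivity proxy are not reproduced here), and rank-sum aggregation is one common fusion rule in zero-shot NAS, assumed purely for illustration.

```python
import random

# Hypothetical stand-ins for the paper's two zero-shot proxies.
# Each returns a scalar score for an (untrained) architecture;
# the real proxy definitions are not specified in this record.
def generalization_proxy(arch: str) -> float:
    random.seed(hash(arch) % (2**32))
    return random.random()

def expressivity_proxy(arch: str) -> float:
    random.seed((hash(arch) + 1) % (2**32))
    return random.random()

def rank(scores):
    """Map each index to its rank (0 = best, i.e., highest score)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def select_architecture(candidates):
    """Return the candidate with the best combined (summed) proxy rank.

    Rank aggregation is one way to fuse heterogeneous zero-shot proxies
    without any training; whether the paper uses ranks, a weighted sum,
    or another rule is an assumption made here for illustration.
    """
    gen_ranks = rank([generalization_proxy(a) for a in candidates])
    exp_ranks = rank([expressivity_proxy(a) for a in candidates])
    combined = [g + e for g, e in zip(gen_ranks, exp_ranks)]
    return candidates[min(range(len(candidates)), key=lambda i: combined[i])]

if __name__ == "__main__":
    # Architectures are abstracted to string identifiers for this sketch.
    search_space = [f"transformer-{i}" for i in range(10)]
    print(select_architecture(search_space))
```

Because both proxies are evaluated on untrained networks, the selection step itself requires no labels; labeled data enters only when the chosen architecture is subsequently trained, which is consistent with the label-efficiency claims in the abstract.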
Event(s)
2025 International Joint Conference on Neural Networks, IJCNN 2025
Subjects
Computer Vision
Natural Language Processing
Neural Architecture Search
Self-Supervised Learning
Publisher
Institute of Electrical and Electronics Engineers Inc.
Type
conference paper
