Generalization-Aware Zero-Shot Neural Architecture Search for Self-Supervised Transformers
Journal
Proceedings of the International Joint Conference on Neural Networks
Start Page
1
End Page
8
Date Issued
2025-11-14
Author(s)
Ko, Jun-Hua
Abstract
Neural Architecture Search (NAS) aims to automate the design of neural networks, enabling the discovery of highly effective architectures. Recent advances in NAS have shown significant success in identifying high-performing Transformer architectures for computer vision and natural language processing tasks. However, most NAS research has focused on supervised learning frameworks, which rely heavily on labeled data; this dependence makes real-world deployment challenging, given the high cost of annotation. Moreover, previous studies often prioritize raw performance while neglecting generalization ability, particularly in scenarios with limited labeled data. To address these challenges, this study introduces a generalization-aware zero-shot proxy based on self-supervised learning. By combining this proxy with a complementary zero-shot proxy, we identify architectures that balance generalization ability and expressivity. Experimental results demonstrate that the architectures discovered by the proposed approach achieve competitive performance on the ImageNet and WikiText-2 datasets while reducing the required labeled data by up to 75% and 99%, respectively.
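The abstract's core recipe, scoring untrained candidate architectures with two zero-shot proxies and fusing the scores to pick a winner, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the two proxy functions are hypothetical placeholders (the actual self-supervised generalization proxy and the complementary expressivity proxy are not reproduced here), and rank-sum aggregation is one common fusion rule in zero-shot NAS, assumed purely for illustration.

```python
import random

# Hypothetical stand-ins for the paper's two zero-shot proxies.
# Each returns a scalar score for an (untrained) architecture;
# the real proxy definitions are not specified in this record.
def generalization_proxy(arch: str) -> float:
    random.seed(hash(arch) % (2**32))
    return random.random()

def expressivity_proxy(arch: str) -> float:
    random.seed((hash(arch) + 1) % (2**32))
    return random.random()

def rank(scores):
    """Map each index to its rank (0 = best, i.e., highest score)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def select_architecture(candidates):
    """Return the candidate with the best combined (summed) proxy rank.

    Rank aggregation is one way to fuse heterogeneous zero-shot proxies
    without any training; whether the paper uses ranks, a weighted sum,
    or another rule is an assumption made here for illustration.
    """
    gen_ranks = rank([generalization_proxy(a) for a in candidates])
    exp_ranks = rank([expressivity_proxy(a) for a in candidates])
    combined = [g + e for g, e in zip(gen_ranks, exp_ranks)]
    return candidates[min(range(len(candidates)), key=lambda i: combined[i])]

if __name__ == "__main__":
    # Architectures are abstracted to string identifiers for this sketch.
    search_space = [f"transformer-{i}" for i in range(10)]
    print(select_architecture(search_space))
```

Because both proxies are evaluated on untrained networks, the selection step itself requires no labels; labeled data enters only when the chosen architecture is subsequently trained, which is consistent with the label-efficiency claims in the abstract.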
Event(s)
2025 International Joint Conference on Neural Networks, IJCNN 2025
Subjects
Computer Vision
Natural Language Processing
Neural Architecture Search
Self-Supervised Learning
Publisher
Institute of Electrical and Electronics Engineers Inc.
Type
conference paper
