Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning
Journal
IEEE/ACM Transactions on Audio Speech and Language Processing
Journal Volume
30
Pages
202-217
Date Issued
2022
Author(s)
Abstract
Previous works have shown that automatic speaker verification (ASV) is seriously vulnerable to malicious spoofing attacks, such as replay, synthetic speech, and the recently emerged adversarial attacks. Great efforts have been dedicated to defending ASV against replay and synthetic speech; however, only a few approaches have been explored to deal with adversarial attacks. All existing approaches to tackling adversarial attacks on ASV require knowledge of how the adversarial samples are generated, but it is impractical for defenders to know the exact attack algorithms applied by in-the-wild attackers. This work is among the first to perform adversarial defense for ASV without knowing the specific attack algorithms. Inspired by self-supervised learning models (SSLMs), which can alleviate superficial noise in their inputs and reconstruct clean samples from corrupted ones, this work regards adversarial perturbations as one kind of noise and conducts adversarial defense for ASV with SSLMs. Specifically, we propose to perform adversarial defense from two perspectives: 1) adversarial perturbation purification and 2) adversarial perturbation detection. The purification module aims to alleviate the adversarial perturbations in the samples and pull the contaminated adversarial inputs back towards the decision boundary. Experimental results show that the proposed purification module effectively counters adversarial attacks and outperforms traditional filters in both alleviating adversarial noise and maintaining the performance on genuine samples. The detection module aims to distinguish adversarial samples from genuine ones based on the statistical properties of ASV scores derived from a single ASV integrated with different numbers of SSLMs. Experimental results show that the detection module helps shield the ASV by detecting adversarial samples.
Both the purification and detection methods are helpful for defending against different kinds of attack algorithms. Moreover, since there is no common metric for evaluating ASV performance under adversarial attacks, this work also formalizes evaluation metrics for adversarial defense, taking both purification-based and detection-based approaches into account. We sincerely encourage future works to benchmark their approaches within the proposed evaluation framework. © 2014 IEEE.
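The detection idea described in the abstract — scoring an input after successive purification passes and flagging inputs whose ASV scores shift substantially — can be sketched as below. This is a toy illustration under stated assumptions, not the authors' implementation: `sslm_purify` (a moving-average stand-in for an SSLM reconstruction pass), `asv_score` (cosine similarity against an enrollment vector), and the threshold `tau` are all hypothetical.

```python
# Hedged sketch of statistics-based adversarial detection: pass an utterance
# through 0..k_max cascaded purification steps, collect the ASV score after
# each pass, and flag inputs whose scores spread widely as adversarial.
# All components below are simplified stand-ins, not the paper's models.
import numpy as np

def sslm_purify(x):
    """Hypothetical SSLM pass: a moving-average filter that attenuates
    superficial (adversarial) noise while keeping the smooth signal."""
    kernel = np.ones(5) / 5.0
    return np.convolve(x, kernel, mode="same")

def asv_score(x, enroll):
    """Hypothetical ASV scoring: cosine similarity to an enrollment vector."""
    return float(np.dot(x, enroll) /
                 (np.linalg.norm(x) * np.linalg.norm(enroll) + 1e-9))

def detect_adversarial(x, enroll, k_max=4, tau=0.05):
    """Score the input after 0..k_max purification passes; a large score
    spread suggests adversarial perturbation (assumed threshold tau)."""
    scores, cur = [], x
    for _ in range(k_max + 1):
        scores.append(asv_score(cur, enroll))
        cur = sslm_purify(cur)
    spread = max(scores) - min(scores)
    return spread > tau, spread

# Toy data: a smooth "utterance" close to enrollment vs. one carrying a
# strong high-frequency perturbation (standing in for adversarial noise).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
enroll = np.sin(2 * np.pi * 5 * t)
genuine = enroll + 0.05 * rng.standard_normal(256)
adversarial = enroll + 0.5 * rng.standard_normal(256)

flag_gen, spread_gen = detect_adversarial(genuine, enroll)
flag_adv, spread_adv = detect_adversarial(adversarial, enroll)
```

Purification barely moves the score of the clean input, so its spread stays small, while the perturbed input's score rises sharply as noise is filtered out, yielding a large spread — the statistical gap the detection module exploits.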
Subjects
adversarial attacks
adversarial defense
Automatic speaker verification
self-supervised learning
Job analysis
Network security
Perturbation techniques
Purification
Speech recognition
Supervised learning
Benchmark testing
Feature extraction
Performance
Perturbation method
Synthetic speech
Task analysis
Speech synthesis
Type
journal article