Privacy Leakage via Speech-induced Vibrations on Room Objects through Remote Sensing based on Phased-MIMO

Shi, Cong; Zhang, Tianfang; Xu, Zhaoyi; Li, Shuping; Gao, Donglin; Li, Changming; Petropulu, Athina; Wu, Chung-Tse; Chen, Yingying

doi:10.1145/3576915.3616634

Journal

CCS 2023 - Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security

ISBN

9798400700507

Date Issued

2023-11-15

Author(s)

Shi, Cong

Zhang, Tianfang

Xu, Zhaoyi

Li, Shuping

Gao, Donglin

Li, Changming

Petropulu, Athina

Wu, Chung-Tse

Chen, Yingying

DOI

10.1145/3576915.3616634

URI

https://scholars.lib.ntu.edu.tw/handle/123456789/640302

URL

https://api.elsevier.com/content/abstract/scopus_id/85179843635

Abstract

Speech eavesdropping has long been an important threat to the privacy of individuals and enterprises. Recent research has shown the possibility of deriving private speech information from sound-induced vibrations. Acoustic signals transmitted through a solid medium or air may induce vibrations upon solid surfaces, which can be picked up by various sensors (e.g., motion sensors, high-speed cameras and lasers), without using a microphone. To date, these threats are limited to scenarios where the sensor is in contact with the vibration surface or at least in the visual line-of-sight. In this paper, we revisit this important line of research and show that a remote, long-distance, and even thru-the-wall speech eavesdropping attack is possible. We discover a new form of speech eavesdropping attack that remotely elicits speech from minute surface vibrations upon common room objects (e.g., paper bags, plastic storage bin) via mmWave sensing, signal processing, and advanced deep learning techniques. While mmWave signals have high sensitivity for vibrations, they have limited sensing distance and normally do not penetrate through walls. We overcome this key challenge through designing and implementing a high-resolution software-defined phased-MIMO radar that integrates transmit beamforming, virtual array, and receive beamforming. The proposed system enhances sensing directivity by focusing all the mmWave beams toward a target room object, allowing mmWave signals to pick up minute speech-induced vibrations from a long distance and even through walls. To realize the attack, we design an object identification technique that scans objects in a room and identifies a prominent object that is most sensitive to speech vibrations for vibration feature extraction. We successfully demonstrate speech privacy leakage using speech-induced vibrations via the development of a deep learning framework. 
Our framework can leverage domain adaptation techniques to infer speech content based only on the unlabeled vibration data of a victim. We validate the proof-of-concept attack on digit recognition through extensive experiments involving 40 speakers, five common room objects, and attack scenarios with mmWave devices inside and outside the room. Our phased-MIMO-based attack can achieve success rates of 88% to 98% when speech labels are used for training and 64% to 86% when they are not. For thru-the-wall attacks, the corresponding success rates are 81% to 94% and 58% to 74%. Furthermore, we discuss possible defense methods to mitigate this unprecedented security threat.

Subjects

mmWave sensing | phased-MIMO | Speech privacy attack

SDGs

SDG16

Type

conference paper
