Title: M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval
Authors: Layne Berry; Yi-Jen Shih; Hsuan-Fu Wang; Heng-Jui Chang; Hung-Yi Lee; David Harwath
Type: conference paper
Dates (as recorded): 2025-06-02; 2025-06-02; 2023-06-04
ISBN: 9781728163277
ISSN: 1520-6149
DOI: 10.1109/icassp49357.2023.10096882
Scopus ID: 2-s2.0-86000377893
Scopus record: https://www.scopus.com/record/display.uri?eid=2-s2.0-86000377893&origin=resultslist
Repository handle: https://scholars.lib.ntu.edu.tw/handle/123456789/729820