Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering

Su, Hung Ting; Niu, Yulei; Lin, Xudong; WINSTON HSU; Chang, Shih Fu

doi:10.1109/CVPRW59228.2023.00523

Journal

IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Journal Volume

2023-June

ISBN

9798350302493

Date Issued

2023-01-01

Author(s)

Su, Hung Ting

Niu, Yulei

Lin, Xudong

WINSTON HSU

Chang, Shih Fu

DOI

10.1109/CVPRW59228.2023.00523

URI

https://scholars.lib.ntu.edu.tw/handle/123456789/636152

URL

https://api.elsevier.com/content/abstract/scopus_id/85170827546

Abstract

Causal Video Question Answering (CVidQA) queries not only association or temporal relations but also causal relations in a video. Existing question synthesis methods pretrain question generation (QG) systems on reading comprehension datasets, taking text descriptions as input. However, such QG models learn to ask only association questions (e.g., "what is someone doing...") and perform poorly because association knowledge transfers poorly to CVidQA, which focuses on causal questions such as "why is someone doing...". Observing this, we propose to exploit causal knowledge to generate question-answer pairs and introduce a novel framework, Causal Knowledge Extraction from Language Models (CaKE-LM), which leverages causal commonsense knowledge from language models (LMs) to tackle CVidQA. To extract knowledge from LMs, CaKE-LM generates causal questions containing two events, one of which triggers the other (e.g., "score a goal" triggers "soccer player kicking ball"), by prompting the LM with the action (soccer player kicking ball) to retrieve the intention (to score a goal). CaKE-LM significantly outperforms conventional methods, by 4% to 6% in zero-shot CVidQA accuracy, on the NExT-QA and Causal-VidQA datasets. We also conduct comprehensive analyses and provide key findings for future research.
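
The prompting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt template, the function names (`build_prompt`, `make_causal_qa`) and the question wording are assumptions made for illustration, and the language model is replaced by a stub that returns a canned completion.

```python
from typing import Callable, Dict

# Hypothetical prompt template; the abstract does not give the actual wording.
PROMPT_TEMPLATE = "{action}. The intention of this action is to"


def build_prompt(action: str) -> str:
    """Turn an action description into a prompt that asks for its intention."""
    text = action.strip().rstrip(".")
    return PROMPT_TEMPLATE.format(action=text[:1].upper() + text[1:])


def make_causal_qa(action: str, lm: Callable[[str], str]) -> Dict[str, str]:
    """Query the LM for the intention, then wrap both events in a 'why' question."""
    intention = lm(build_prompt(action)).strip().rstrip(".")
    # Drop a leading "to " so the answer is not rendered as "To to ...".
    if intention.lower().startswith("to "):
        intention = intention[3:]
    return {
        "question": f"Why is the {action.strip()}?",
        "answer": f"To {intention}.",
    }


def stub_lm(prompt: str) -> str:
    """Stand-in for a real language model (assumption); returns a fixed completion."""
    return "score a goal"


qa = make_causal_qa("soccer player kicking ball", stub_lm)
print(qa["question"])
print(qa["answer"])
```

In practice `stub_lm` would be swapped for a call to an actual language model, and the resulting question-answer pairs would serve as synthetic training data for a video QA model.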

Type

conference paper
