NQuAD: 70,000+ Questions for Machine Comprehension of the Numerals in Text
Journal
International Conference on Information and Knowledge Management, Proceedings
Pages
2925-2929
Date Issued
2021
Author(s)
Abstract
Numeral information plays an important role in narratives of several domains such as medicine, engineering, and finance. Previous work has focused on foundational explorations of numeracy and shows that fine-grained numeracy is a challenging task. In machine reading comprehension, our statistics show that only a few numeral-related questions appear in previous datasets, which indicates that few benchmark datasets are designed for numeracy learning. In this paper, we present a Numeral-related Question Answering Dataset, NQuAD, for fine-grained numeracy, and propose several baselines for future work. We compare NQuAD with three machine reading comprehension datasets and show that NQuAD is more challenging than the numeral-related questions in other datasets. NQuAD is published under the CC BY-NC-SA 4.0 license for academic purposes. © 2021 ACM.
Subjects
cloze test
machine reading comprehension
numeracy
Benchmark datasets
Cloze test
Fine grained
Machine reading comprehension
Question Answering
Reading comprehension
Statistics
Type
conference paper
