Authors: Chen C.-C; Huang H.-H; HSIN-HSI CHEN
Dates: 2022-04-25; 2022-04-25; 2021
Scopus record: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85119179870&doi=10.1145%2f3459637.3482155&partnerID=40&md5=c0e10eb5a78e5bfc9b9bb97b21b0e053
Repository handle: https://scholars.lib.ntu.edu.tw/handle/123456789/607401
Abstract: Numeral information plays an important role in narratives of several domains such as medicine, engineering, and finance. Previous works focus on the foundation exploration toward numeracy and show that fine-grained numeracy is a challenging task. In machine reading comprehension, our statistics show that only a few numeral-related questions appear in previous datasets. It indicates that few benchmark datasets are designed for numeracy learning. In this paper, we present a Numeral-related Question Answering Dataset, NQuAD, for fine-grained numeracy, and propose several baselines for future works. We compare NQuAD with three machine reading comprehension datasets and show that NQuAD is more challenging than the numeral-related questions in other datasets. NQuAD is published under the CC BY-NC-SA 4.0 license for academic purposes. © 2021 ACM.
Keywords: cloze test; machine reading comprehension; numeracy; Benchmark datasets; Cloze test; Fine grained; Machine reading comprehension; Question Answering; Reading comprehension; Statistics; [SDGs] SDG4
Title: NQuAD: 70,000+ Questions for Machine Comprehension of the Numerals in Text
Type: conference paper
DOI: 10.1145/3459637.3482155
Scopus EID: 2-s2.0-85119179870