Recycling Numeracy Data Augmentation with Symbolic Verification for Math Word Problem Solving
Journal
ACM International Conference Proceeding Series
ISBN
9781450391153
Date Issued
2021-12-14
Author(s)
Abstract
Most studies of automatic math word problem solving rely on a dataset to train a model that either transforms a question directly into the corresponding answer or translates the question into a sequence of operations that forms a program for deriving the answer. The program, serving as an intermediate symbolic form between the question and the answer, gives the model more information from which to learn arithmetic reasoning. However, manually composing programs for numerous questions is labor-intensive, so only one medium-sized dataset, MathQA, is available. This work proposes a novel recycling numeracy data augmentation (RNDA) approach that automatically generates high-quality training instances in the MathQA style. Experimental results show that the model trained on the augmented data achieves state-of-the-art performance. We will release the dataset as a resource for the research community.
Subjects
data augmentation | math word problem solving | recycling
Other Subjects
Statistics; Data augmentation; High quality; Labour-intensive; Learn+; Math word problem solving; Quality training; Sequences of operations; State-of-the-art performance; Symbolic verification; Word problem solving; Recycling
Type
conference paper
