Pretrained Language Models are Symbolic Mathematics Solvers too!

Abstract

Solving symbolic mathematics has always been in the arena of human ingenuity that needs compositional reasoning and recurrence. However, recent studies have shown that large-scale language models such as transformers are universal and, surprisingly, can be trained on a sequence-to-sequence task to solve complex mathematical equations. These large transformer models need humongous amounts of training data to generalize to unseen symbolic mathematics problems. In this paper, we present a sample-efficient way of solving the symbolic tasks by first pretraining the transformer model on language translation and then fine-tuning the pretrained transformer model to solve the downstream task of symbolic mathematics. We achieve comparable accuracy on the integration task with our pretrained model while using around $1.5$ orders of magnitude fewer training samples than the state-of-the-art deep learning approach for symbolic mathematics. The test accuracy on differential equation tasks is considerably lower than on integration, as these tasks need higher-order recursions that are not present in language translation. We pretrain our model on different pairs of language translations. Our results show language bias in solving symbolic mathematics tasks. Finally, we study the robustness of the fine-tuned model on symbolic math tasks against distribution shift, and our approach generalizes better in distribution shift scenarios for function integration.
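To make the sequence-to-sequence framing concrete, the sketch below shows one way an integration problem could be serialized into a source and a target token sequence, much like a sentence pair in translation. The prefix-notation encoding, the nested-tuple expression format, and the names `to_prefix`, `integrand`, and `antiderivative` are assumptions made for illustration; the abstract does not specify how expressions are tokenized.

```python
# Hypothetical illustration: prefix notation over a nested-tuple expression
# tree is an assumed encoding, not one stated in the abstract.

def to_prefix(expr):
    """Flatten a nested-tuple expression tree into a list of prefix tokens."""
    if isinstance(expr, tuple):
        op, *args = expr
        tokens = [op]
        for arg in args:
            tokens.extend(to_prefix(arg))
        return tokens
    # Leaves (variables, constants) become a single token.
    return [str(expr)]

# Integrand x * cos(x) and one antiderivative, x * sin(x) + cos(x).
integrand = ("mul", "x", ("cos", "x"))
antiderivative = ("add", ("mul", "x", ("sin", "x")), ("cos", "x"))

# The "source sentence" and "target sentence" of the translation-style task.
source_tokens = to_prefix(integrand)
target_tokens = to_prefix(antiderivative)

print(" ".join(source_tokens))
print(" ".join(target_tokens))
```

Under this assumed encoding, a model pretrained on translation would be fine-tuned to map the first token sequence to the second, in the same way it maps a sentence in one language to a sentence in another.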
