Pretrained Language Models are Symbolic Mathematics Solvers too!

Abstract

Solving symbolic mathematics has always been in the arena of human ingenuity, requiring compositional reasoning and recurrence. However, recent studies have shown that large-scale language models such as transformers are universal and, surprisingly, can be trained on a sequence-to-sequence task to solve complex mathematical equations. These large transformer models need humongous amounts of training data to generalize to unseen symbolic mathematics problems. In this paper, we present a sample-efficient way of solving symbolic tasks by first pretraining the transformer model on language translation and then fine-tuning the pretrained model on the downstream task of symbolic mathematics. We achieve comparable accuracy on the integration task with our pretrained model while using around $1.5$ orders of magnitude fewer training samples than the state-of-the-art deep learning approach for symbolic mathematics. The test accuracy on differential equation tasks is considerably lower than on integration, as they need higher-order recursions that are not present in language translation. We propose that the generalizability of our pretrained language model follows from the Anna Karenina Principle (AKP). We pretrain our model with different pairs of language translations. Our results show language bias in solving symbolic mathematics tasks. Finally, we study the robustness of the fine-tuned model on symbolic math tasks against distribution shift, and our approach generalizes better in distribution shift scenarios for function integration.
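To make the sequence-to-sequence framing concrete, the following is a minimal sketch of how an integration problem might be serialized into a source/target token pair. It assumes prefix (Polish) notation over expression trees, which is one common encoding for this kind of task; the abstract does not specify the encoding, and the operator names, the `ARITY` table, and both helper functions are hypothetical choices made only for illustration.

```python
# Illustrative only: the token vocabulary and helpers below are assumptions,
# not the paper's actual preprocessing.
# An expression is a leaf string or a tuple (operator, operand, ...).
ARITY = {"add": 2, "mul": 2, "pow": 2, "sin": 1, "cos": 1}


def to_prefix(expr):
    """Flatten an expression tree into a list of prefix-notation tokens."""
    if isinstance(expr, str):
        return [expr]
    op, *args = expr
    tokens = [op]
    for arg in args:
        tokens.extend(to_prefix(arg))
    return tokens


def from_prefix(tokens):
    """Rebuild the expression tree from prefix tokens (inverse of to_prefix)."""

    def parse(pos):
        tok = tokens[pos]
        if tok not in ARITY:  # leaf: a variable or a constant
            return tok, pos + 1
        pos += 1
        args = []
        for _ in range(ARITY[tok]):
            arg, pos = parse(pos)
            args.append(arg)
        return (tok, *args), pos

    expr, end = parse(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete expression")
    return expr


# Integrand 2*x*cos(x^2) and its antiderivative sin(x^2).
integrand = ("mul", ("mul", "2", "x"), ("cos", ("pow", "x", "2")))
antiderivative = ("sin", ("pow", "x", "2"))

source_tokens = to_prefix(integrand)       # what the encoder would read
target_tokens = to_prefix(antiderivative)  # what the decoder would emit
```

Under this framing, a model pretrained on translation would be fine-tuned to map `source_tokens` to `target_tokens`, exactly as it maps a sentence in one language to a sentence in another.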
