Predicting Variable Types in Dynamically Typed Programming Languages

Abstract

Dynamic programming languages are quite popular because they increase the programmer's productivity. However, the absence of types in the source code makes programs written in these languages difficult to understand, and virtual machines that execute these programs cannot produce optimized code. To overcome this challenge, we develop a technique to predict the types of all identifiers, including variables and function return types. We propose the first implementation of $2^{nd}$-order Inside-Outside Recursive Neural Networks with two variants, (i) Child-Sum Tree-LSTMs and (ii) N-ary RNNs, that can handle a large amount of tree branching. We predict the types of all the identifiers given the Abstract Syntax Tree by performing just two passes over the tree, bottom-up and top-down, keeping both the content and context representations for all the nodes of the tree. This allows these representations to interact by combining different paths from the parent, siblings and children, which is crucial for predicting types. Our best model achieves 44.33\% across 21 classes and top-3 accuracy of 71.5\% on our gathered Python data set from popular Python benchmarks.
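
The two-pass idea can be illustrated with a minimal sketch. It is not the paper's model: there are no learned weights and no LSTM gating, and the names `Node`, `embed`, `inside_pass`, `outside_pass` and the size `DIM` are hypothetical, chosen only for illustration. The bottom-up pass gives every node a content vector built from its own label and the sum of its children's content (in the spirit of the child-sum variant); the top-down pass gives every node a context vector built from its parent's context, its parent's label and its siblings' content.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional

DIM = 4  # hypothetical representation size


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    content: Optional[List[float]] = None  # inside (bottom-up) representation
    context: Optional[List[float]] = None  # outside (top-down) representation


def embed(label: str) -> List[float]:
    # Toy deterministic embedding; a stand-in (assumption) for a learned lookup table.
    seed = sum(map(ord, label))
    return [math.sin((i + 1) * seed) for i in range(DIM)]


def vsum(vectors: List[List[float]]) -> List[float]:
    # Element-wise sum of equally sized vectors.
    total = [0.0] * DIM
    for v in vectors:
        total = [a + b for a, b in zip(total, v)]
    return total


def squash(v: List[float]) -> List[float]:
    # Plain tanh non-linearity in place of the gated units a real model would use.
    return [math.tanh(x) for x in v]


def inside_pass(node: Node) -> None:
    # Bottom-up: children first, then combine their content with the node's own label.
    for child in node.children:
        inside_pass(child)
    node.content = squash(vsum([embed(node.label)] + [c.content for c in node.children]))


def outside_pass(node: Node, parent_context: Optional[List[float]] = None) -> None:
    # Top-down: the root starts with an empty (zero) context.
    node.context = parent_context if parent_context is not None else [0.0] * DIM
    for child in node.children:
        siblings = [s.content for s in node.children if s is not child]
        child_context = squash(vsum([node.context, embed(node.label)] + siblings))
        outside_pass(child, child_context)


# Toy tree standing in for the AST of `x = f(1)`.
target = Node("Name:x")
tree = Node("Assign", [target, Node("Call", [Node("Name:f"), Node("Num")])])
inside_pass(tree)
outside_pass(tree)
features = target.content + target.context  # what a type classifier could consume
```

After both passes every node carries a content and a context vector, so a classifier over the type classes could read the concatenation for each identifier node. Because the context of `x` depends on its sibling's content, the same identifier receives different features when the right-hand side of the assignment changes.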
