On the Practical Computational Power of Finite Precision RNNs for Language Recognition

Abstract

While Recurrent Neural Networks (RNNs) are famously known to be Turingcomplete, this relies on infinite precision in the states and unboundedcomputation time. We consider the case of RNNs with finite precision whosecomputation time is linear in the input length. Under these limitations, weshow that different RNN variants have different computational power. Inparticular, we show that the LSTM and the Elman-RNN with ReLU activation arestrictly stronger than the RNN with a squashing activation and the GRU. This isachieved because LSTMs and ReLU-RNNs can easily implement counting behavior. Weshow empirically that the LSTM does indeed learn to effectively use thecounting mechanism.

Quick Read (beta)

loading the full paper ...