Nested LSTMs

  • 2018-01-31 05:52:08
  • Joel Ruben Antony Moniz, David Krueger

Abstract

We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple levels of memory. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. Specifically, instead of computing the value of the (outer) memory cell as $c^{outer}_t = f_t \odot c_{t-1} + i_t \odot g_t$, NLSTM memory cells use the concatenation $(f_t \odot c_{t-1}, i_t \odot g_t)$ as input to an inner LSTM (or NLSTM) memory cell, and set $c^{outer}_t = h^{inner}_t$. Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters in our experiments on various character-level language modeling tasks, and the inner memories of an NLSTM learn longer-term dependencies than the higher-level units of a stacked LSTM.
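
The update rule described in the abstract can be illustrated with a minimal pure-Python sketch of a single NLSTM step with one level of nesting. This is not the authors' implementation: the helper names (`init_gates`, `gates`, `nlstm_step`), the uniform weight initialization, and the choice to compute the inner gates directly from the concatenation $(f_t \odot c_{t-1}, i_t \odot g_t)$ are assumptions made for illustration only.

```python
import math
import random


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def tanh(v):
    return [math.tanh(a) for a in v]


def mul(a, b):
    """Elementwise (Hadamard) product of two equal-length vectors."""
    return [x * y for x, y in zip(a, b)]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def affine(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def init_gates(n_in, n_out, rng):
    """Hypothetical init: one (W, b) pair per gate i, f, o and candidate g."""
    return {
        k: ([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)
        for k in "ifog"
    }


def gates(params, z):
    """Standard LSTM gate activations computed from a single input vector z."""
    i = sigmoid(affine(*params["i"], z))
    f = sigmoid(affine(*params["f"], z))
    o = sigmoid(affine(*params["o"], z))
    g = tanh(affine(*params["g"], z))
    return i, f, o, g


def nlstm_step(outer, inner, x, h_prev, c_prev, c_inner_prev):
    # Outer gates as in a standard LSTM, from the concatenation [x_t, h_{t-1}].
    i, f, o, g = gates(outer, x + h_prev)  # "+" on lists concatenates
    # A plain LSTM would set c_t = f * c_prev + i * g here.
    # The NLSTM instead passes the concatenation of the two terms to an inner LSTM.
    z = mul(f, c_prev) + mul(i, g)  # concatenation, length 2n
    i_in, f_in, o_in, g_in = gates(inner, z)
    c_inner = add(mul(f_in, c_inner_prev), mul(i_in, g_in))
    h_inner = mul(o_in, tanh(c_inner))
    c_outer = h_inner  # c^{outer}_t = h^{inner}_t
    h = mul(o, tanh(c_outer))
    return h, c_outer, c_inner


# Usage: run two time steps with input size 3 and hidden size 4 (arbitrary values).
rng = random.Random(0)
n_x, n = 3, 4
outer = init_gates(n_x + n, n, rng)
inner = init_gates(2 * n, n, rng)
h = c = c_inner = [0.0] * n
for x in ([0.1, 0.2, 0.3], [0.0, -0.1, 0.5]):
    h, c, c_inner = nlstm_step(outer, inner, x, h, c, c_inner)
print(h)
```

The only difference from an ordinary LSTM step in this sketch is the block between the outer gate computation and the output `h`: the sum of the two terms is replaced by an inner cell that carries its own state `c_inner` across time steps. Deeper nesting would, under the same assumptions, replace the inner cell's own sum with another call of the same pattern.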

 
