Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets

Abstract

At present, the state-of-the-art computational models across a range of sequential data processing tasks, including language modeling, are based on recurrent neural network architectures. This paper begins with the observation that most research on developing computational models capable of processing sequential data fails to explicitly analyze the long distance dependencies (LDDs) within the datasets the models process. In this context, we make five research contributions. First, we argue that a key step in modeling sequential data is to understand the characteristics of the LDDs within the data. Second, we present a method to compute and analyze the LDD characteristics of any sequential dataset, and demonstrate this method on a number of sequential datasets that are frequently used for model benchmarking. Third, based on the analysis of the LDD characteristics within the benchmarking datasets, we observe that LDDs are far more complex than previously assumed, and depend on at least four factors: (i) the number of unique symbols in a dataset, (ii) the size of the dataset, (iii) the number of interacting symbols within an LDD, and (iv) the distance between the interacting symbols. Fourth, we verify these factors by using synthetic datasets generated using Strictly k-Piecewise (SPk) languages. We then demonstrate how SPk languages can be used to generate benchmarking datasets with varying degrees of LDDs. The advantage of these synthesized datasets is that they enable the targeted testing of recurrent neural architectures. Finally, we demonstrate how understanding the characteristics of the LDDs in a dataset can inform better hyper-parameter selection for current state-of-the-art recurrent neural architectures and also aid in understanding them...
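
To make the role of SPk languages concrete, the following is a minimal sketch, not the paper's generator. It relies only on the standard definition of a strictly k-piecewise language: a set of forbidden subsequences of length k, where a string belongs to the language if none of them occurs in it as a (possibly non-contiguous) subsequence. The function names, the example alphabet, the forbidden set, and the use of rejection sampling are all assumptions made for illustration.

```python
import random


def contains_subsequence(string, pattern):
    """Return True if `pattern` occurs in `string` as a subsequence.

    The symbols of `pattern` must appear in order, but any number of
    other symbols may sit between them; this gap is the long distance
    dependency.
    """
    remaining = iter(string)
    # `symbol in remaining` advances the iterator past the first match,
    # so each pattern symbol is searched for after the previous one.
    return all(symbol in remaining for symbol in pattern)


def in_sp_language(string, forbidden):
    """Return True if `string` avoids every forbidden subsequence."""
    return not any(contains_subsequence(string, p) for p in forbidden)


def sample_sp_strings(alphabet, forbidden, length, count, seed=0):
    """Draw `count` strings of a fixed length from the language.

    Hypothetical helper: uses naive rejection sampling, which does not
    terminate if no string of this length belongs to the language.
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < count:
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        if in_sp_language(candidate, forbidden):
            samples.append(candidate)
    return samples


# Illustrative SP2 grammar (assumed): "a" may never be followed by "b",
# at any distance.
FORBIDDEN = {"ab"}
SAMPLES = sample_sp_strings("abcd", FORBIDDEN, length=10, count=5)
```

In this sketch, the four factors named in the abstract map onto parameters: the alphabet sets the number of unique symbols, `count` the dataset size, the length k of each forbidden pattern the number of interacting symbols, and `length` an upper bound on the distance between them.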
