This paper extends previous analyses of gradient decay to a class of discrete-time recurrent networks, called Dynamical Recurrent Neural Networks (DRNN), obtained by modeling synapses as Finite Impulse Response (FIR) filters instead of multiplicative scalars. Using elementary matrix manipulations, we provide an upper bound on the norm of the weight matrix which ensures that the gradient vector, when propagated backward in time through the error-propagation network, decays exponentially to zero. The bound applies to all FIR architecture proposals as well as to fixed-point recurrent networks, regardless of delay and connectivity. In addition, we show that the computational overhead of the learning algorithm can be reduced drastically by taking advantage of the exponential decay of the gradient.
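The norm condition can be illustrated numerically in the simplest setting. The sketch below is not the paper's FIR bound: it assumes plain scalar synapses (no delay lines), a tanh activation whose slope never exceeds 1, and the induced infinity norm, chosen only because it is easy to compute by hand. Under those assumptions the backward recursion delta(t-1) = W^T diag(1 - h(t)^2) delta(t) shrinks by at least the factor ||W^T|| per step whenever that norm is below 1. All function names, the network size, and the tolerance are hypothetical choices made for illustration.

```python
import math
import random


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inf_norm_vec(v):
    return max(abs(x) for x in v)


def inf_norm_mat(m):
    # Induced infinity norm: largest absolute row sum.
    return max(sum(abs(x) for x in row) for row in m)


def scaled_random_matrix(n, target, rng):
    # Random n x n matrix rescaled so the infinity norm of its transpose is `target`.
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    scale = target / inf_norm_mat(transpose(w))
    return [[x * scale for x in row] for row in w]


def forward(w, h0, steps):
    # Assumed dynamics: h(t) = tanh(W h(t-1)); returns [h(0), ..., h(steps)].
    states = [h0]
    for _ in range(steps):
        states.append([math.tanh(a) for a in matvec(w, states[-1])])
    return states


def backward_norms(w, states, delta_last):
    # delta(t-1) = W^T diag(1 - h(t)^2) delta(t); record the norm at every step.
    wt = transpose(w)
    delta = delta_last
    norms = [inf_norm_vec(delta)]
    for h in reversed(states[1:]):
        delta = matvec(wt, [(1.0 - x * x) * d for x, d in zip(h, delta)])
        norms.append(inf_norm_vec(delta))
    return norms


def truncation_depth(rate, tol):
    # Number of backward steps after which rate**k has dropped to tol or below
    # (requires 0 < rate < 1 and 0 < tol < 1).
    return math.ceil(math.log(tol) / math.log(rate))


rng = random.Random(0)
n, steps = 5, 30
w = scaled_random_matrix(n, 0.8, rng)
rate = inf_norm_mat(transpose(w))  # contraction factor per backward step
states = forward(w, [rng.uniform(-1.0, 1.0) for _ in range(n)], steps)
norms = backward_norms(w, states, [1.0] * n)

for k, value in enumerate(norms):
    bound = rate ** k * norms[0]
    print(f"k={k:2d}  norm={value:.3e}  bound={bound:.3e}")
print("truncation depth for tol=1e-6:", truncation_depth(rate, 1e-6))
```

Each printed norm stays below the geometric bound, and `truncation_depth` indicates how many backward steps are worth computing before the remaining contribution falls under the chosen tolerance. Stopping the backward pass there is one way to read the abstract's remark about reducing computational overhead, though the paper's own procedure may differ.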
Index Terms:
Recurrent neural networks, Gradient descent, Forgetting behavior
Citation:
Alex Aussem, "Sufficient Conditions for Error Back Flow Convergence in Dynamical Recurrent Neural Networks," IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00), vol. 4, p. 4577, 2000