The echo state network (ESN) is a recurrent neural network with a sparsely connected hidden layer. The connectivity and weights of the hidden neurons are fixed and randomly assigned. The weights of the output neurons can be learned so that the network can produce or reproduce specific temporal patterns. The main interest of this network is that, although its behaviour is non-linear, the only weights modified during training are those of the synapses that connect the hidden neurons to the output neurons. Thus, the error function is quadratic with respect to the parameter vector and can be differentiated easily, so that training reduces to solving a linear system.

Alternatively, one may consider a nonparametric Bayesian formulation of the output layer, under which a prior distribution is imposed over the output weights, and the output weights are marginalized out when generating predictions, given the training data. This idea has been demonstrated using Gaussian priors, whereby a Gaussian process model with an ESN-driven kernel function is obtained. Such a solution was shown to outperform ESNs with trainable sets of weights in several benchmarks.

Publicly available implementations of ESNs include an efficient C++ library for various kinds of echo state networks with Python/NumPy bindings, and an efficient MATLAB implementation of an echo state network.

The echo state network belongs to the recurrent neural network (RNN) family and shares its architecture and supervised learning principle. Unlike feedforward neural networks, recurrent neural networks are dynamical systems rather than functions. Recurrent neural networks are typically used for learning dynamical processes: signal processing in engineering and telecommunications, vibration analysis, seismology, control of engines and generators; signal forecasting and generation: text, music, electrical signals; and modelling of biological systems, neuroscience, memory modelling, brain-computer interfaces, filtering and Kalman processes, military applications, volatility modelling, and so on.

For the training of RNNs a number of learning algorithms are available, such as backpropagation through time and real-time recurrent learning. Convergence is not guaranteed due to instability and bifurcation phenomena.

The main approach of the ESN is, firstly, to drive a random, large, fixed recurrent neural network with the input signal, which induces a nonlinear response signal in each neuron within this "reservoir" network, and, secondly, to connect a desired output signal to the reservoir by a trainable linear combination of all these response signals. Another feature of the ESN is its autonomous operation in prediction: if the echo state network is trained with an input that is a backshifted version of the output, it can then be used for signal generation and prediction by using the previous output as input (see the sketch below).

The main idea of ESNs is tied to Liquid State Machines (LSMs), which were developed independently of and simultaneously with ESNs by Wolfgang Maass. LSMs, ESNs and the more recently researched Backpropagation Decorrelation learning rule for RNNs are increasingly grouped under the name Reservoir Computing. Schiller and Steil also demonstrated that in conventional training approaches for RNNs, in which all weights are adapted, the dominant changes occur in the output weights. In cognitive neuroscience, Peter F. Dominey analysed a related process in the context of modelling sequence processing in the mammalian brain, in particular speech recognition in the human brain. The basic idea also included a model of temporal input discrimination in biological neuronal networks.
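To make the reservoir-plus-readout idea concrete, the following NumPy sketch sets up a fixed random reservoir, drives it with an input signal, and shows the generative mode in which the previous output is fed back as the next input. It is a minimal illustration only: the reservoir size, sparsity, scaling constants and spectral-radius target are arbitrary example choices, and the readout matrix W_out is assumed to have been trained separately (for instance as in the Variants section below).

```python
import numpy as np

# Minimal ESN sketch. All sizes and constants are illustrative assumptions,
# not values prescribed by the text.
rng = np.random.default_rng(0)
n_in, n_res = 1, 200

# Fixed, randomly assigned reservoir: sparse internal weights, rescaled so the
# spectral radius stays below 1 (a common heuristic for the echo state property).
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res)) * (rng.random((n_res, n_res)) < 0.05)
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

def step(x, u):
    """One reservoir update: the nonlinear 'echo' of the input history."""
    return np.tanh(W @ x + W_in @ u)

def run(inputs, W_out):
    """Driven mode: the input induces a response in every reservoir neuron;
    only the linear readout W_out maps reservoir states to outputs."""
    x = np.zeros(n_res)
    outputs = []
    for u in inputs:                      # each u has shape (n_in,)
        x = step(x, u)
        outputs.append(W_out @ x)
    return np.array(outputs)

def generate(u0, W_out, n_steps):
    """Autonomous mode: after training on a back-shifted input, the previous
    output is fed back as the next input (assumes output dim == input dim)."""
    x, u = np.zeros(n_res), u0
    outputs = []
    for _ in range(n_steps):
        x = step(x, u)
        y = W_out @ x
        outputs.append(y)
        u = y                             # previous output becomes next input
    return np.array(outputs)
```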
An early clear formulation of the reservoir computing idea is due to K. Kirby, who disclosed this concept in a largely forgotten conference contribution. The first formulation of the reservoir computing idea known today stems from L. Schomaker, who described how a desired target output can be obtained from an RNN by learning to combine signals from a randomly configured ensemble of spiking neural oscillators.
Variants
Echo state networks can be built in different ways. They can be set up with or without directly trainable input-to-output connections, with or without output-to-reservoir feedback, with different neuron types, different reservoir-internal connectivity patterns, and so on. The output weights can be calculated by linear regression with any algorithm, whether online or offline. In addition to least-squares solutions, margin maximization criteria, as known from training support vector machines, are used to determine the output weights. The fixed RNN acts as a random, nonlinear medium whose dynamic response, the "echo", is used as a signal basis. The linear combination of this basis can be trained to reconstruct the desired output by minimizing some error criterion (a least-squares sketch is given below).
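As a minimal sketch of the least-squares readout described above: given reservoir states collected while the network is driven by the training input, the output weights can be obtained in closed form by ridge (regularized least-squares) regression. The function name, the ridge parameter and the washout length are illustrative assumptions, not values prescribed by the text.

```python
import numpy as np

def train_readout(states, targets, ridge=1e-6, washout=100):
    """Fit the linear readout W_out by regularized least squares.

    states  : (T, n_res) matrix of collected reservoir states
    targets : (T, n_out) matrix of desired outputs
    ridge and washout are illustrative assumptions.
    """
    X = states[washout:]          # discard the initial transient
    Y = targets[washout:]
    # Ridge regression: W_out = Y^T X (X^T X + ridge * I)^{-1}
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y).T
    return W_out                  # shape (n_out, n_res)
```

Setting ridge to zero recovers the plain least-squares solution; the washout simply discards the reservoir's initial transient before fitting, a common practice when training ESN readouts.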
Significance
RNNs were rarely used in practice before the introduction of the ESN, because fitting these models requires a version of gradient descent to adjust the connections. As a result, the algorithms are slow and, much worse, the learning process is vulnerable to bifurcations, so convergence cannot be guaranteed. ESN training, by contrast, does not suffer from the bifurcation problem and is in addition easy to implement. For some time, ESNs outperformed other nonlinear dynamical models. However, the problems that made RNNs slow and error-prone have since been addressed with the advent of deep learning, and the unique selling point of ESNs has been lost. In addition, RNNs have proven themselves in several practical areas, such as language processing; coping with tasks of similar complexity using reservoir computing methods would require memory of excessive size. Nevertheless, ESNs are still used in some areas, such as many signal processing applications. In particular, they have been widely used as a computing principle that mixes well with non-digital computing substrates, for example optical microchips, mechanical nano-oscillators, polymer mixtures, or even artificial soft limbs.