Volterra series
The Volterra series is a model for non-linear behavior similar to the Taylor series. It differs from the Taylor series in its ability to capture 'memory' effects. The Taylor series can be used for approximating the response of a nonlinear system to a given input if the output of this system depends strictly on the input at that particular time. In the Volterra series the output of the nonlinear system depends on the input to the system at all other times. This provides the ability to capture the 'memory' effect of devices like capacitors and inductors.
It has been applied in the fields of medicine and biology, especially neuroscience. It is also used in electrical engineering to model intermodulation distortion in many devices including power amplifiers and frequency mixers. Its main advantage lies in its generality: it can represent a wide range of systems. Thus it is sometimes considered a non-parametric model.
In mathematics, a Volterra series denotes a functional expansion of a dynamic, nonlinear, time-invariant functional. Volterra series are frequently used in system identification. The Volterra series, which is used to prove the Volterra theorem, is an infinite sum of multidimensional convolutional integrals.
History
The Volterra series is a modernized version of the theory of analytic functionals due to the Italian mathematician Vito Volterra in work dating from 1887. Norbert Wiener became interested in this theory in the 1920s from contact with Volterra's student Paul Lévy. He applied his theory of Brownian motion to the integration of Volterra analytic functionals.
The use of Volterra series for system analysis originated from a restricted 1942 wartime report of Wiener, then professor of mathematics at MIT. It used the series to make an approximate analysis of the effect of radar noise in a nonlinear receiver circuit. The report became public after the war. As a general method of analysis of nonlinear systems, Volterra series came into use after about 1957 as the result of a series of reports, at first privately circulated, from MIT and elsewhere. The name Volterra series came into use a few years later.
Mathematical theory
The theory of Volterra series can be viewed from two different perspectives: either one considers an operator mapping between two real function spaces or a functional mapping from a real function space into the real numbers. The latter, functional perspective is in more frequent use, due to the assumed time-invariance of the system.
Continuous time
A continuous time-invariant system with $x(t)$ as input and $y(t)$ as output can be expanded in a Volterra series as

$$y(t) = h_0 + \sum_{n=1}^{N} \int_a^b \cdots \int_a^b h_n(\tau_1, \dots, \tau_n) \prod_{j=1}^{n} x(t - \tau_j)\, d\tau_j .$$

Here the constant term $h_0$ on the right-hand side is usually taken to be zero by suitable choice of output level. The function $h_n(\tau_1, \dots, \tau_n)$ is called the n-th order Volterra kernel. It can be regarded as a higher-order impulse response of the system. For the representation to be unique, the kernels must be symmetrical in the n variables $\tau$. A kernel that is not symmetrical can be replaced by a symmetrized kernel, which is the average over the n! permutations of these n variables.
If N is finite, the series is said to be truncated. If a, b, and N are finite, the series is called doubly finite.
Sometimes the nth-order term is divided by n factorial, a convention which is convenient when taking the output of one Volterra system as the input of another.
The causality condition: Since in any physically realizable system the output can only depend on previous values of the input, the kernels will be zero if any of the variables are negative. The integrals may then be written over the half range from zero to infinity.
So if the operator is causal, $a \geq 0$.
Fréchet's approximation theorem: The use of the Volterra series to represent a time-invariant functional relation is often justified by appealing to a theorem due to Fréchet. This theorem states that a time-invariant functional relation can be approximated uniformly and to an arbitrary degree of precision by a sufficiently high finite order Volterra series. Among other conditions, the set of admissible input functions for which the approximation will hold is required to be compact. It is usually taken to be an equicontinuous, uniformly bounded set of functions, which is compact by the Arzelà–Ascoli theorem. In many physical situations, this assumption about the input set is a reasonable one. The theorem, however, gives no indication as to how many terms are needed for a good approximation, which is an essential question in applications.
Discrete time
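This is similar to the continuous-time case:

$$y(n) = h_0 + \sum_{p=1}^{P} \sum_{\tau_1=a}^{b} \cdots \sum_{\tau_p=a}^{b} h_p(\tau_1, \dots, \tau_p) \prod_{j=1}^{p} x(n - \tau_j),$$

where $x(n)$ and $y(n)$ are real-valued and the $h_p(\tau_1, \dots, \tau_p)$ are called discrete-time Volterra kernels.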
If P is finite, the series operator is said to be truncated. If a, b and P are finite, the series operator is called a doubly finite Volterra series. If $a \geq 0$, the operator is said to be causal.
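As an illustration, a causal, doubly finite series truncated at second order can be evaluated directly from its kernels. The following is a minimal sketch, not a reference implementation: the function name `volterra2`, the kernel values and the convention that the input is zero before the first sample are all assumptions made for the example.

```python
import numpy as np

def volterra2(x, h0, h1, h2):
    """Output of a causal, doubly finite Volterra series truncated at order 2.

    h1 has shape (M,) and h2 has shape (M, M); lags run from 0 to M - 1,
    and the input is assumed to be zero before the first sample.
    """
    M = len(h1)
    xp = np.concatenate([np.zeros(M - 1), np.asarray(x, dtype=float)])
    y = np.empty(len(x))
    for n in range(len(x)):
        # window[tau] = x(n - tau) for tau = 0 .. M - 1
        window = xp[n:n + M][::-1]
        y[n] = h0 + h1 @ window + window @ h2 @ window
    return y

# Hypothetical kernels with memory length M = 2
h1 = np.array([1.0, 0.5])
h2 = np.array([[0.2, 0.1],
               [0.1, 0.0]])
y = volterra2([1.0, 2.0, 0.0], 0.0, h1, h2)
print(y)
```

The last output sample is nonzero although the last input sample is zero, which is the memory effect that a memoryless polynomial model cannot reproduce.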
Without loss of generality, the kernel can always be taken to be symmetrical. Because multiplication is commutative, any kernel can be symmetrized by forming a new kernel equal to the average of the original kernel over all permutations of its variables.
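The symmetrization can be sketched in a few lines. The helper name `symmetrize` and the numerical values are hypothetical; the kernel is assumed to be stored as an array with one axis per lag variable.

```python
import itertools
import math
import numpy as np

def symmetrize(h):
    """Average an order-p kernel over all p! permutations of its axes."""
    p = h.ndim
    total = sum(np.transpose(h, perm)
                for perm in itertools.permutations(range(p)))
    return total / math.factorial(p)

# A non-symmetric second-order kernel (hypothetical values)
h2 = np.array([[1.0, 4.0],
               [0.0, 2.0]])
h2_sym = symmetrize(h2)

# Both kernels give the same second-order term for any window of lagged inputs
w = np.array([3.0, -1.0])
print(w @ h2 @ w, w @ h2_sym @ w)
```

Only the symmetric part of a kernel contributes to the output, which is why the symmetrized kernel can replace the original without changing the system.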
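For a causal system with symmetrical kernels, the p-th order term can be rewritten in triangular form, with the sums restricted to ordered lags:

$$\sum_{\tau_1=a}^{b} \sum_{\tau_2=\tau_1}^{b} \cdots \sum_{\tau_p=\tau_{p-1}}^{b} \tilde{h}_p(\tau_1, \dots, \tau_p) \prod_{j=1}^{p} x(n - \tau_j),$$

where the triangular kernel $\tilde{h}_p$ equals the symmetrical kernel $h_p$ multiplied by the number of distinct permutations of $(\tau_1, \dots, \tau_p)$, so that each unordered set of lags is counted once.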
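For the second-order term, the equivalence between the full symmetric sum and the triangular sum can be checked numerically. This is a sketch under stated assumptions: the helper names are hypothetical, and the triangular kernel is taken to double the off-diagonal entries of the symmetric kernel, since each of them stands in for two ordered pairs of lags.

```python
import numpy as np

def to_triangular(h2_sym):
    """Triangular second-order kernel equivalent to a symmetric one.

    Entries with tau1 < tau2 are doubled because they represent both
    (tau1, tau2) and (tau2, tau1); entries below the diagonal are dropped.
    """
    return 2.0 * np.triu(h2_sym, k=1) + np.diag(np.diag(h2_sym))

def second_order_term_triangular(h2_tri, window):
    """Sum over tau1 <= tau2 only; window[tau] holds x(n - tau)."""
    M = len(window)
    return sum(h2_tri[t1, t2] * window[t1] * window[t2]
               for t1 in range(M) for t2 in range(t1, M))

h2_sym = np.array([[0.2, 0.1, 0.0],
                   [0.1, 0.3, 0.5],
                   [0.0, 0.5, 0.4]])
window = np.array([1.0, -2.0, 0.5])

full = window @ h2_sym @ window
tri = second_order_term_triangular(to_triangular(h2_sym), window)
print(full, tri)
```

The triangular form needs M(M + 1)/2 coefficients instead of M² at second order, which is the practical reason for using it.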
Methods to estimate the kernel coefficients
Estimating the Volterra coefficients individually is complicated, since the basis functionals of the Volterra series are correlated. This leads to the problem of simultaneously solving a set of integral equations for the coefficients. Hence, estimation of Volterra coefficients is generally performed by estimating the coefficients of an orthogonalized series, e.g. the Wiener series, and then recomputing the coefficients of the original Volterra series. The main appeal of the Volterra series over the orthogonalized series lies in its intuitive, canonical structure, i.e. all interactions of the input within one term have one fixed degree. The orthogonalized basis functionals will generally be quite complicated.
An important aspect in which the following methods differ is whether the orthogonalization of the basis functionals is performed over the idealized specification of the input signal or over the actual realization of the input. The latter methods, despite their lack of mathematical elegance, have been shown to be more flexible and precise.
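The correlation between basis functionals of different order can be seen in a small numerical experiment. The sketch below assumes a zero-mean Gaussian white-noise input of variance `A` and, for brevity, uses only the zero-lag terms; the corrected terms `x**2 - A` and `x**3 - 3*A*x` are the Hermite-polynomial forms on which the Wiener series is built.

```python
import numpy as np

rng = np.random.default_rng(0)
A = 2.0                                      # assumed input variance
x = rng.normal(0.0, np.sqrt(A), 200_000)     # zero-mean Gaussian white noise

# Homogeneous (Volterra-type) terms of different order are correlated:
corr_0_2 = np.mean(1.0 * x**2)               # close to A
corr_1_3 = np.mean(x * x**3)                 # close to 3 * A**2

# Hermite-corrected (Wiener-type) terms are orthogonal to lower orders:
orth_0_2 = np.mean(1.0 * (x**2 - A))         # close to 0
orth_1_3 = np.mean(x * (x**3 - 3 * A * x))   # close to 0

print(corr_0_2, corr_1_3, orth_0_2, orth_1_3)
```

Because the homogeneous terms overlap in this way, the coefficient of one term cannot be estimated by a single projection without accounting for the others, whereas the orthogonalized terms can be estimated one at a time.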
Crosscorrelation method
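This method, developed by Lee and Schetzen, orthogonalizes with respect to the actual mathematical description of the signal, i.e. the projection onto the new basis functionals is based on the knowledge of the moments of the random signal.
We can write the Volterra series in terms of homogeneous operators, as

$$y(n) = h_0 + \sum_{p=1}^{P} H_p x(n),$$

where

$$H_p x(n) = \sum_{\tau_1=a}^{b} \cdots \sum_{\tau_p=a}^{b} h_p(\tau_1, \dots, \tau_p) \prod_{j=1}^{p} x(n - \tau_j).$$

To allow identification by orthogonalization, the Volterra series must be rearranged in terms of orthogonal non-homogeneous G operators (the Wiener series):

$$y(n) = \sum_p H_p x(n) \equiv \sum_p G_p x(n).$$

The G operators can be defined by the following conditions:

$$E\{H_i x(n)\, G_j x(n)\} = 0, \quad i < j,$$
$$E\{G_i x(n)\, G_j x(n)\} = 0, \quad i \neq j,$$

whenever $H_i x(n)$ is an arbitrary homogeneous Volterra functional and $x(n)$ is stationary white noise (SWN) with zero mean and variance A.
Recalling that every Volterra functional is orthogonal to all Wiener functionals of greater order, and considering the following Volterra functional:

$$H^{*}_{\bar{p}} x(n) = \prod_{j=1}^{\bar{p}} x(n - \tau_j),$$

we can write

$$E\{y(n)\, H^{*}_{\bar{p}} x(n)\} = E\left\{ \sum_{p=0}^{\infty} G_p x(n)\, H^{*}_{\bar{p}} x(n) \right\}.$$

If x is SWN, $\tau_1 \neq \tau_2 \neq \dots \neq \tau_{\bar{p}}$, and by letting $A = \sigma_x^2$, we have:

$$E\left\{ y(n) \prod_{j=1}^{\bar{p}} x(n - \tau_j) \right\} = E\left\{ G_{\bar{p}} x(n) \prod_{j=1}^{\bar{p}} x(n - \tau_j) \right\} = \bar{p}!\, A^{\bar{p}}\, k_{\bar{p}}(\tau_1, \dots, \tau_{\bar{p}}),$$

where $k_p$ denotes the p-th order Wiener kernel.
So if we exclude the diagonal elements, i.e. if $\tau_i \neq \tau_j$ for all $i \neq j$, it is

$$k_p(\tau_1, \dots, \tau_p) = \frac{E\{ y(n)\, x(n - \tau_1) \cdots x(n - \tau_p) \}}{p!\, A^p}.$$

If we want to consider the diagonal elements, the solution proposed by Lee and Schetzen is:

$$k_p(\tau_1, \dots, \tau_p) = \frac{E\left\{ \left( y(n) - \sum_{m=0}^{p-1} G_m x(n) \right) x(n - \tau_1) \cdots x(n - \tau_p) \right\}}{p!\, A^p}.$$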
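The crosscorrelation estimate can be sketched for a second-order system, for which the first- and second-order Wiener kernels coincide with the Volterra kernels. The sketch assumes a Gaussian white-noise input of known variance `A`; the system kernels `h1` and `h2`, the record length and the memory length are hypothetical values chosen for the example. Each kernel of order p is estimated as the expectation of the output, with the lower-order Wiener functionals subtracted, times p delayed copies of the input, divided by p! A^p.

```python
import numpy as np

rng = np.random.default_rng(1)
A, N, M = 1.0, 200_000, 3          # variance, record length, memory (assumed)
x = rng.normal(0.0, np.sqrt(A), N)

# Hypothetical "unknown" system: second-order Volterra series, symmetric h2
h1 = np.array([1.0, -0.5, 0.25])
h2 = np.array([[0.30, 0.10, 0.00],
               [0.10, -0.20, 0.05],
               [0.00, 0.05, 0.10]])

# X[n, tau] = x(n - tau); the first M - 1 samples are skipped to avoid edge effects
X = np.stack([x[M - 1 - tau: N - tau] for tau in range(M)], axis=1)
y = X @ h1 + np.einsum("ni,ij,nj->n", X, h2, X)

k0 = y.mean()                              # zeroth-order Wiener kernel
k1 = (y[:, None] * X).mean(axis=0) / A     # E{y(n) x(n - tau)} / A
resid = y - k0 - X @ k1                    # output minus G_0 and G_1
k2 = np.einsum("n,ni,nj->ij", resid, X, X) / len(y) / (2 * A**2)

print(np.round(k1, 2))
print(np.round(k2, 2))
```

Subtracting the lower-order functionals before correlating is what makes the diagonal entries of `k2` come out correctly; without it, the diagonal would be biased by the mean of the output.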
The main drawback of this technique is that the estimation errors, made on all elements of lower-order kernels, will affect each diagonal element of order p by means of the summation, conceived as the solution for the estimation of the diagonal elements themselves.
Efficient formulas that avoid this drawback, together with references for diagonal kernel element estimation, can be found in the literature.
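Once the Wiener kernels have been identified, the Volterra kernels can be obtained by using Wiener-to-Volterra formulas, reported in the following for a fifth-order Volterra series:

$$h_5 = k_5,$$
$$h_4 = k_4,$$
$$h_3(\tau_1, \tau_2, \tau_3) = k_3(\tau_1, \tau_2, \tau_3) - 10A \sum_{\tau_4} k_5(\tau_1, \tau_2, \tau_3, \tau_4, \tau_4),$$
$$h_2(\tau_1, \tau_2) = k_2(\tau_1, \tau_2) - 6A \sum_{\tau_3} k_4(\tau_1, \tau_2, \tau_3, \tau_3),$$
$$h_1(\tau_1) = k_1(\tau_1) - 3A \sum_{\tau_2} k_3(\tau_1, \tau_2, \tau_2) + 15A^2 \sum_{\tau_2} \sum_{\tau_3} k_5(\tau_1, \tau_2, \tau_2, \tau_3, \tau_3),$$
$$h_0 = k_0 - A \sum_{\tau_1} k_2(\tau_1, \tau_1) + 3A^2 \sum_{\tau_1} \sum_{\tau_2} k_4(\tau_1, \tau_1, \tau_2, \tau_2).$$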
Multiple-variance method
In the traditional orthogonal algorithm, using inputs with high variance has the advantage of stimulating high-order nonlinearity, so as to achieve more accurate high-order kernel identification. As a drawback, the use of high variance values causes high identification error in lower-order kernels, mainly due to nonideality of the input and truncation errors.
Conversely, the use of a lower variance in the identification process can lead to a better estimate of the lower-order kernels, but can be insufficient to stimulate high-order nonlinearity.
This phenomenon, which can be called locality of the truncated Volterra series, can be revealed by calculating the output error of a series as a function of the input variance.
This test can be repeated with series identified with different input variances, obtaining different curves, each with a minimum at the variance used in the identification.
To overcome this limitation, a low variance should be used for the lower-order kernels and gradually increased for higher-order kernels.
This is not a theoretical problem in Wiener kernel identification, since the Wiener functionals are orthogonal to each other, but an appropriate normalization is needed in the Wiener-to-Volterra conversion formulas to take the use of different variances into account.
Furthermore, new Wiener-to-Volterra conversion formulas are needed.
The traditional Wiener kernel identification should be changed as follows:
In the above formulas the impulse functions are introduced for the identification of diagonal kernel points.
If the Wiener kernels are extracted with the new formulas, the following Wiener to Volterra formulas are needed:
The drawback of this approach is that, for the identification of the n-th order kernel, all lower-order kernels must be identified again with the higher variance.
However, a marked improvement in the output mean squared error (MSE) is obtained if the Wiener and Volterra kernels are computed with the new formulas.
Exact orthogonal algorithm
This method and its more efficient version were invented by Korenberg. In this method the orthogonalization is performed empirically over the actual input. It has been shown to perform more precisely than the crosscorrelation method. Another advantage is that arbitrary inputs can be used for the orthogonalization and that fewer data points suffice to reach a desired level of accuracy. Also, estimation can be performed incrementally until some criterion is fulfilled.
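The idea of orthogonalizing over the actual input record can be illustrated with a small sketch. This is not Korenberg's published algorithm: it swaps in a library QR factorization, which plays the role of a Gram-Schmidt orthogonalization of the candidate terms over the data, and then maps the result back to Volterra coefficients. The input, the candidate terms and the coefficient vector `theta_true` are hypothetical, and the data are noise-free.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 500, 2
# Arbitrary input: neither Gaussian nor white (uniform noise, then smoothed)
x = np.convolve(rng.uniform(-1, 1, N), [1.0, 0.6], mode="same")

# X[n, tau] = x(n - tau)
X = np.stack([x[M - 1 - tau: N - tau] for tau in range(M)], axis=1)

# Candidate terms: 1, x(n), x(n-1), x(n)^2, x(n) x(n-1), x(n-1)^2
P = np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1],
                     X[:, 0]**2, X[:, 0] * X[:, 1], X[:, 1]**2])

theta_true = np.array([0.1, 1.0, -0.5, 0.3, 0.2, -0.1])  # hypothetical system
y = P @ theta_true

# Orthogonalize the candidate terms over the recorded data
Q, R = np.linalg.qr(P)
g = Q.T @ y                      # coefficients of the orthogonal terms
theta = np.linalg.solve(R, g)    # back to triangular-form Volterra coefficients

print(np.round(theta, 6))
```

Because the orthogonalization uses the recorded samples rather than assumed statistical moments, the coefficients are recovered from a short record of a non-white, non-Gaussian input, in line with the advantages described above.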