Dynamic time warping
In time series analysis, dynamic time warping is one of the algorithms for measuring similarity between two temporal sequences, which may vary in speed. For instance, similarities in walking could be detected using DTW, even if one person was walking faster than the other, or if there were accelerations and decelerations during the course of an observation. DTW has been applied to temporal sequences of video, audio, and graphics data — indeed, any data that can be turned into a linear sequence can be analyzed with DTW. A well known application has been automatic speech recognition, to cope with different speaking speeds. Other applications include speaker recognition and online signature recognition. It can also be used in partial shape matching application.
In general, DTW is a method that calculates an optimal match between two given sequences with certain restriction and rules:
- Every index from the first sequence must be matched with one or more indices from the other sequence, and vice versa
- The first index from the first sequence must be matched with the first index from the other sequence
- The last index from the first sequence must be matched with the last index from the other sequence
- The mapping of the indices from the first sequence to indices from the other sequence must be monotonically increasing, and vice versa, i.e. if are indices from the first sequence, then there must not be two indices in the other sequence, such that index is matched with index and index is matched with index, and vice versa
The sequences are "warped" non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension. This sequence alignment method is often used in time series classification. Although DTW measures a distance-like quantity between two given sequences, it doesn't guarantee the triangle inequality to hold.
In addition to a similarity measure between the two sequences, a so called "warping path" is produced, by warping according to this path the two signals may be aligned in time. The signal with an original set of points X, Y is transformed to X, Y. This finds applications in genetic sequence and audio synchronisation. In a related technique sequences of varying speed may be averaged using this technique see the [|average sequence] section.
This is conceptually very similar to the Needleman–Wunsch algorithm.
Implementation
This example illustrates the implementation of the dynamic time warping algorithm when the two sequences s and t are strings of discrete symbols. For two symbols x and y,d
is a distance between the symbols, e.g. d
=.int DTWDistance
where
DTW
is the distance between s
and t
with the best alignment.We sometimes want to add a locality constraint. That is, we require that if
s
is matched with t
, then is no larger than w, a window parameter.We can easily modify the above algorithm to add a locality constraint.
However, the above given modification works only if is no larger than w, i.e. the end point is within the window length from diagonal. In order to make the algorithm work, the window parameter w must be adapted so that .
int DTWDistance
Warping properties
The DTW algorithm produces a discrete matching between existing elements of one series to another. In other words, it does not allow time-scaling of segments within the sequence. Other methods allow continuous warping. For example, Correlation Optimized Warping divides the sequence into uniform segments that are scaled in time using linear interpolation, to produce the best matching warping. The segment scaling causes potential creation of new elements, by time-scaling segments either down or up, and thus produces a more sensitive warping than DTW's discrete matching of raw elements.Complexity
The time complexity of DTW algorithm is, where and are the lengths of the two input sequences. Assuming that, the time complexity can be said to be. The same is true for space complexity. The 50 years old quadratic time bound was recently broken, an implementation due to Gold and Sharir enables computing DTW in time and space.Fast computation
Fast techniques for computing DTW include PrunedDTW, SparseDTW, FastDTW, and the MultiscaleDTW.A common task, retrieval of similar time series, can be accelerated by using lower bounds such as LB_Keogh or LB_Improved. In a survey, Wang et al. reported slightly better results with the LB_Improved lower bound than the LB_Keogh bound, and found that other techniques were inefficient.
Average sequence
Averaging for dynamic time warping is the problem of finding an average sequence for a set of sequences.NLAAF is an exact method to average two sequences using DTW.
For more than two sequences, the problem is related to the one of the multiple alignment and requires heuristics.
DBA is currently a reference method to average a set of sequences consistently with DTW.
COMASA efficiently randomizes the search for the average sequence, using DBA as a local optimization process.
Supervised learning
A nearest-neighbour classifier can achieve state-of-the-art performance when using dynamic time warping as a distance measure.Alternative approach
An alternative technique for DTW is based on functional data analysis, in which the time series are regarded as discretizations of smooth functions of time and therefore continuous mathematics is applied. Smoothness and monotonicity of time warp functions may be obtained for instance by integrating a time-varying radial basis function, thus being a one-dimensional diffeomorphism.Optimal nonlinear time warping functions are computed by minimizing a measure of distance of the set of functions to their warped average. Roughness penalty terms for the warping functions may be added, e.g., by constraining the size of their curvature. The resultant warping functions are smooth, which facilitates further processing. This approach has been successfully applied to analyze patterns and variability of speech movements.
Another related approach are hidden Markov models and it has been shown that the Viterbi algorithm used to search for the most likely path through the HMM is equivalent to stochastic DTW.
Open-source software
- The C++ library implements Fast Nearest-Neighbor Retrieval algorithms under the GNU General Public License. It also provides a C++ implementation of dynamic time warping, as well as various lower bounds.
- The library is a Java implementation of DTW and a FastDTW implementation that provides optimal or near-optimal alignments with an O time and memory complexity, in contrast to the O requirement for the standard DTW algorithm. FastDTW uses a multilevel approach that recursively projects a solution from a coarser resolution and refines the projected solution.
- published to Maven Central.
- The provides Python and R packages with a comprehensive coverage of the DTW algorithm family members, including a variety of recursion rules, constraints, and substring matching.
- The mlpy Python library implements DTW.
- The Python library implements the Manhattan and Euclidean flavoured DTW measures including the LB_Keogh lower bounds.
- The C++/CUDA library implements subsequence alignment of Euclidean-flavoured DTW and z-normalized Euclidean distance similar to the popular UCR-Suite on CUDA-enabled accelerators.
- The machine learning library implements .
- The C# library implements DTW with various options.
- uses Greedy DTW as part of LaTeX symbol classifier program.
- The implements DTW to match mel-frequency cepstral coefficients of audio signals.
- : a GPL Java implementation of DBA.
- The C++ real-time gesture-recognition toolkit implements DTW.
- The software package implements DTW and nearest-neighbour classifiers, as well as their extensions.
- The Python library implements the classic O Dynamic Programming algorithm and bases on Numpy. It supports values of any dimension, as well as using custom norm functions for the distances. It is licensed under the MIT license.
- The Python library implements DTW in the time-series context.
- The CUDA Python library implements a state of the art improved Time Warp Edit Distance using only linear memory with phenomenal speedups.
- Is a Julia implementation of DTW and related algorithms such as FastDTW, SoftDTW, GeneralDTW and DTW barycenters.
Applications