Heavy-tailed distribution

In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded: that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.
There are three important subclasses of heavy-tailed distributions: the fat-tailed distributions, the long-tailed distributions and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class.
There is still some discrepancy over the use of the term heavy-tailed. There are two other definitions in use. Some authors use the term to refer to those distributions which do not have all their power moments finite; and some others to those distributions that do not have a finite variance. The definition given in this article is the most general in use, and includes all distributions encompassed by the alternative definitions, as well as those distributions such as log-normal that possess all their power moments, yet which are generally considered to be heavy-tailed.

Definitions

Definition of heavy-tailed distribution

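The distribution of a random variable $X$ with distribution function $F$ is said to have a heavy (right) tail if the moment generating function of $X$, $M_X(t)$, is infinite for all $t > 0$.
That means
$$\int_{-\infty}^{\infty} e^{tx} \, dF(x) = \infty \quad \text{for all } t > 0.$$
An implication of this is that
$$\limsup_{x \to \infty} e^{tx} \Pr[X > x] = \infty \quad \text{for all } t > 0.$$
This is also written in terms of the tail distribution function
$$\overline{F}(x) \equiv \Pr[X > x]$$
as
$$\limsup_{x \to \infty} e^{tx} \, \overline{F}(x) = \infty \quad \text{for all } t > 0.$$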
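The tail criterion can be explored numerically. The following is a minimal sketch, not a proof: it evaluates $e^{tx} \Pr[X > x]$ for a Pareto tail and an exponential tail at a single fixed $t$. The function names, the Pareto parameters and the choice $t = 0.5$ are assumptions made for illustration. Note that a heavy tail requires divergence for every $t > 0$; the exponential distribution fails the criterion because the product stays bounded for $t$ below its rate.

```python
import math

def pareto_tail(x, alpha=2.0, x_min=1.0):
    """Tail function Pr[X > x] of a Pareto distribution (illustrative parameters)."""
    return 1.0 if x < x_min else (x_min / x) ** alpha

def exponential_tail(x, rate=1.0):
    """Tail function Pr[X > x] of an exponential distribution."""
    return math.exp(-rate * x) if x > 0 else 1.0

def scaled_tail(tail, t, x):
    """Return e^{t x} * Pr[X > x], the quantity appearing in the tail criterion."""
    return math.exp(t * x) * tail(x)

t = 0.5  # any t below the exponential rate shows the contrast
for x in (10.0, 50.0, 100.0):
    # The Pareto column grows without bound; the exponential column shrinks to 0.
    print(x, scaled_tail(pareto_tail, t, x), scaled_tail(exponential_tail, t, x))
```

For the Pareto tail the product grows for any positive $t$, because a power of $x$ cannot offset exponential growth.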

Definition of long-tailed distribution

The distribution of a random variable $X$ with distribution function $F$ is said to have a long right tail if for all $t > 0$,
$$\lim_{x \to \infty} \Pr[X > x + t \mid X > x] = 1,$$
or equivalently
$$\overline{F}(x + t) \sim \overline{F}(x) \quad \text{as } x \to \infty,$$
where $\overline{F}(x) = \Pr[X > x]$ is the tail distribution function.
This has the intuitive interpretation that if a quantity with a long right tail exceeds some high level, the probability approaches 1 that it will also exceed any other, higher level.
All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
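As an illustration of the long-tail property, the sketch below computes the conditional probability $\Pr[X > x + t \mid X > x]$ as a ratio of tail functions. The Pareto and exponential tails, their parameters and the function names are assumptions chosen for the example.

```python
import math

def pareto_tail(x, alpha=2.0, x_min=1.0):
    """Tail function Pr[X > x] of a Pareto distribution (illustrative parameters)."""
    return 1.0 if x < x_min else (x_min / x) ** alpha

def exponential_tail(x, rate=1.0):
    """Tail function Pr[X > x] of an exponential distribution."""
    return math.exp(-rate * x) if x > 0 else 1.0

def tail_ratio(tail, x, t):
    """Pr[X > x + t | X > x], written as a ratio of tail probabilities."""
    return tail(x + t) / tail(x)

t = 5.0
for x in (10.0, 100.0, 500.0):
    # Pareto: the ratio climbs towards 1. Exponential: it stays at exp(-t).
    print(x, tail_ratio(pareto_tail, x, t), tail_ratio(exponential_tail, x, t))
```

The exponential ratio does not depend on $x$ (the memoryless property), which is why the exponential distribution is not long-tailed.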

Subexponential distributions

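Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables $X_1, X_2$ with common distribution function $F$, the convolution of $F$ with itself, written $F^{*2}$ and called the convolution square, is defined using Lebesgue–Stieltjes integration by:
$$\Pr[X_1 + X_2 \leq x] = F^{*2}(x) = \int_{-\infty}^{\infty} F(x - y) \, dF(y).$$
The n-fold convolution $F^{*n}$ is defined in the same way. The tail distribution function $\overline{F}$ is defined as $\overline{F}(x) = 1 - F(x)$.
A distribution $F$ on the positive half-line is subexponential if
$$\overline{F^{*2}}(x) \sim 2 \overline{F}(x) \quad \text{as } x \to \infty.$$
This implies that, for any $n \geq 1$,
$$\overline{F^{*n}}(x) \sim n \overline{F}(x) \quad \text{as } x \to \infty.$$
The probabilistic interpretation of this is that, for a sum of $n$ independent random variables $X_1, \ldots, X_n$ with common distribution $F$,
$$\Pr[X_1 + \cdots + X_n > x] \sim \Pr[\max(X_1, \ldots, X_n) > x] \quad \text{as } x \to \infty.$$
This is often known as the principle of the single big jump or catastrophe principle.
A distribution $F$ on the whole real line is subexponential if the distribution $F \, I([0, \infty))$ is. Here $I([0, \infty))$ is the indicator function of the positive half-line. Alternatively, a random variable $X$ supported on the real line is subexponential if and only if $X^{+} = \max(0, X)$ is subexponential.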
All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.
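The single big jump principle can be illustrated by simulation. The following is a rough Monte Carlo sketch under assumed settings (Pareto summands with index 1.5, five terms, level 200, a fixed seed); the function name and all parameter values are hypothetical, and the estimates are noisy because the events are rare.

```python
import random

def simulate_big_jump(n_terms=5, level=200.0, alpha=1.5, trials=100_000, seed=0):
    """Estimate Pr[sum > level] and Pr[max > level] for i.i.d. Pareto summands."""
    rng = random.Random(seed)
    sum_hits = 0
    max_hits = 0
    for _ in range(trials):
        # paretovariate(alpha) samples a Pareto variable with minimum value 1
        xs = [rng.paretovariate(alpha) for _ in range(n_terms)]
        if sum(xs) > level:
            sum_hits += 1
        if max(xs) > level:
            max_hits += 1
    return sum_hits / trials, max_hits / trials

p_sum, p_max = simulate_big_jump()
# For a subexponential distribution the two probabilities are of the same order:
# a large sum is typically caused by one large summand.
print(p_sum, p_max, p_max / p_sum)
```

Because the summands are positive, the event that the maximum exceeds the level is contained in the event that the sum does, so the second estimate never exceeds the first; the ratio is expected to approach 1 as the level grows.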

Common heavy-tailed distributions

All commonly used heavy-tailed distributions are subexponential.
Those that are one-tailed include:

the Pareto distribution;
the log-normal distribution;
the Lévy distribution;
the Weibull distribution with shape parameter greater than 0 but less than 1;
the Burr distribution;
the log-logistic distribution;
the log-gamma distribution;
the Fréchet distribution;
the log-Cauchy distribution, sometimes described as having a "super-heavy tail" because it exhibits logarithmic decay producing a heavier tail than the Pareto distribution.

Those that are two-tailed include:

The Cauchy distribution, itself a special case of both the stable distribution and the t-distribution;
The family of stable distributions, excepting the special case of the normal distribution within that family. Some stable distributions are one-sided, see e.g. Lévy distribution. See also financial models with long-tailed distributions and volatility clustering.
The t-distribution.
The skew lognormal cascade distribution.

Relationship to fat-tailed distributions

A fat-tailed distribution is a distribution for which the probability density function, for large $x$, goes to zero as a power $x^{-a}$. Since such a power is, for sufficiently large $x$, always bounded below by the probability density function of an exponential distribution, fat-tailed distributions are always heavy-tailed. Some distributions, however, have a tail which goes to zero more slowly than an exponential function (so they are heavy-tailed) but faster than a power (so they are not fat-tailed). An example is the log-normal distribution. Many other heavy-tailed distributions, such as the log-logistic and Pareto distributions, are however also fat-tailed.
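The position of the log-normal distribution between power and exponential decay can be checked by comparing logarithms. In the short derivation below, $\mu$ and $\sigma$ denote the usual log-normal parameters; this notation is assumed here and does not appear elsewhere in the article.

```latex
f(x) = \frac{1}{x \sigma \sqrt{2\pi}}
       \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \qquad x > 0.

% Against any power x^{-a}, a > 0: the squared logarithm dominates a \ln x.
\ln\!\bigl(x^{a} f(x)\bigr)
  = a \ln x - \frac{(\ln x - \mu)^2}{2\sigma^2} - \ln\!\bigl(x \sigma \sqrt{2\pi}\bigr)
  \to -\infty
  \quad\Longrightarrow\quad f(x) = o\bigl(x^{-a}\bigr).

% Against any exponential e^{-tx}, t > 0: the linear term t x dominates.
\ln\!\bigl(e^{tx} f(x)\bigr)
  = t x - \frac{(\ln x - \mu)^2}{2\sigma^2} - \ln\!\bigl(x \sigma \sqrt{2\pi}\bigr)
  \to +\infty
  \quad\Longrightarrow\quad e^{-tx} = o\bigl(f(x)\bigr).
```

So the log-normal density eventually falls below every power but stays above every exponential.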

Estimating the tail-index

There are parametric and non-parametric approaches to the problem of tail-index estimation.
To estimate the tail-index using the parametric approach, some authors employ the generalized extreme value (GEV) distribution or the Pareto distribution; they may apply the maximum-likelihood estimator (MLE).
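As one concrete instance of the parametric approach, the sketch below fits the index of a Pareto distribution by maximum likelihood, assuming the scale (minimum value) is known. The function name and the synthetic data are assumptions for illustration; fitting a GEV distribution would require a numerical optimiser and is not shown.

```python
import math
import random

def pareto_mle_alpha(samples, x_min):
    """Maximum-likelihood estimate of the Pareto index for a known scale x_min.

    Maximising the Pareto log-likelihood in alpha gives the closed form
    alpha_hat = m / sum(ln(x_i / x_min)) over the m observations >= x_min.
    """
    exceed = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in exceed)
    if log_sum <= 0.0:
        raise ValueError("need observations strictly above x_min")
    return len(exceed) / log_sum

# Synthetic check: Pareto data with index 2.5 and minimum value 1.
rng = random.Random(42)
data = [rng.paretovariate(2.5) for _ in range(20_000)]
print(pareto_mle_alpha(data, x_min=1.0))
```

The estimate should land near the true index, with a standard error of roughly the index divided by the square root of the sample size.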

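Pickands' tail-index estimator

Let $(X_n, n \geq 1)$ be a sequence of independent random variables with a common distribution function $F \in D(H(\xi))$, the maximum domain of attraction of the generalized extreme value distribution $H$, where $\xi \in \mathbb{R}$. If $\lim_{n \to \infty} k(n) = \infty$ and $\lim_{n \to \infty} k(n)/n = 0$, then the Pickands tail-index estimator is
$$\hat{\xi}^{\text{Pickands}}_{(k(n),n)} = \frac{1}{\ln 2} \ln\left( \frac{X_{(n-k(n)+1,n)} - X_{(n-2k(n)+1,n)}}{X_{(n-2k(n)+1,n)} - X_{(n-4k(n)+1,n)}} \right),$$
where $X_{(i,n)}$ denotes the $i$-th order statistic (in ascending order) of $X_1, \ldots, X_n$. This estimator converges in probability to $\xi$.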
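A minimal sketch of the Pickands estimator in its standard order-statistic form follows. The function name, the choice of $k$ and the synthetic Pareto data (for which the extreme value index is the reciprocal of the Pareto index) are assumptions for illustration.

```python
import math
import random

def pickands_estimator(samples, k):
    """Pickands estimate of the extreme value index from the upper order statistics."""
    xs = sorted(samples)  # ascending, so xs[i - 1] is the i-th order statistic
    n = len(xs)
    if k < 1 or 4 * k > n:
        raise ValueError("k must satisfy 1 <= k <= n / 4")

    def order_stat(i):
        return xs[i - 1]

    top = order_stat(n - k + 1) - order_stat(n - 2 * k + 1)
    bottom = order_stat(n - 2 * k + 1) - order_stat(n - 4 * k + 1)
    return math.log(top / bottom) / math.log(2.0)

# Synthetic check: Pareto data with index 2, i.e. extreme value index 0.5.
rng = random.Random(7)
data = [rng.paretovariate(2.0) for _ in range(100_000)]
print(pickands_estimator(data, k=2_000))
```

The estimator uses only three order statistics, so it is valid for any real index but has a comparatively large variance; the result is sensitive to the choice of $k$.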

Hill's tail-index estimator

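Let $(X_t, t \geq 1)$ be a sequence of independent and identically distributed random variables with distribution function $F \in D(H(\xi))$, the maximum domain of attraction of the generalized extreme value distribution $H$, where $\xi > 0$. The sample path is $\{X_t : 1 \leq t \leq n\}$, where $n$ is the sample size. If $\{k(n)\}$ is an intermediate order sequence, i.e. $k(n) \in \{1, \ldots, n-1\}$, $k(n) \to \infty$ and $k(n)/n \to 0$, then the Hill tail-index estimator is
$$\hat{\xi}^{\text{Hill}}_{(k(n),n)} = \frac{1}{k(n)} \sum_{i=n-k(n)+1}^{n} \ln X_{(i,n)} - \ln X_{(n-k(n),n)},$$
where $X_{(i,n)}$ is the $i$-th order statistic (in ascending order) of $X_1, \ldots, X_n$.
This estimator converges in probability to $\xi$ (its reciprocal estimates the tail index $\alpha = 1/\xi$), and is asymptotically normal provided $k(n) \to \infty$ is restricted based on a higher-order regular variation property. Consistency and asymptotic normality extend to a large class of dependent and heterogeneous sequences, irrespective of whether $X_t$ is observed, or is a computed residual or filtered data from a large class of models and estimators, including mis-specified models and models with errors that are dependent.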
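The sketch below implements a Hill-type estimate in the common form that averages the log-excesses of the $k$ largest observations over the $(k+1)$-th largest. It returns an estimate of the extreme value index; the reciprocal is the corresponding tail-index estimate. The function name, the value of $k$ and the synthetic Pareto data are assumptions for illustration, and conventions for the threshold order statistic vary between authors.

```python
import math
import random

def hill_estimator(samples, k):
    """Hill estimate of the extreme value index from the k largest observations."""
    xs = sorted(samples)  # ascending order statistics
    n = len(xs)
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    threshold = xs[n - k - 1]  # the (n - k)-th order statistic
    if threshold <= 0.0:
        raise ValueError("the upper order statistics must be positive")
    # Average log-excess of the k observations above the threshold.
    return sum(math.log(x / threshold) for x in xs[n - k:]) / k

# Synthetic check: Pareto data with tail index 2, i.e. extreme value index 0.5.
rng = random.Random(11)
data = [rng.paretovariate(2.0) for _ in range(50_000)]
xi_hat = hill_estimator(data, k=2_000)
print(xi_hat, 1.0 / xi_hat)  # index estimate and its reciprocal
```

In practice the estimate is usually plotted against $k$ (a Hill plot) and read off where the curve is stable, since no single $k$ is optimal for all distributions.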

Ratio estimator of the tail-index

The ratio estimator (RE-estimator) of the tail-index was introduced by Goldie and Smith. It is constructed similarly to Hill's estimator but uses a non-random "tuning parameter".
A comparison of Hill-type and RE-type estimators can be found in Novak.
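The article does not give the formula, so the following is only a sketch under the assumption that the ratio estimator takes its commonly cited form: the sum of log-excesses over a fixed, non-random threshold divided by the number of exceedances. The function name, the threshold value and the synthetic data are hypothetical. In this form the result estimates the reciprocal of the tail index.

```python
import math
import random

def ratio_estimator(samples, threshold):
    """Ratio-type estimate: mean log-excess over a fixed (non-random) threshold.

    Assumed form: sum(ln(x / threshold) for x > threshold) / #{x > threshold}.
    """
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    exceed = [x for x in samples if x > threshold]
    if not exceed:
        raise ValueError("no observations exceed the threshold")
    return sum(math.log(x / threshold) for x in exceed) / len(exceed)

# Synthetic check: Pareto data with tail index 2, so the target value is 0.5.
rng = random.Random(13)
data = [rng.paretovariate(2.0) for _ in range(50_000)]
print(ratio_estimator(data, threshold=5.0))
```

The contrast with the Hill sketch is that the threshold here is chosen in advance rather than taken from an order statistic of the sample, so the number of exceedances is random.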

Software

A C tool for estimating the heavy-tail index.

Estimation of heavy-tailed density

Nonparametric approaches to estimating heavy- and superheavy-tailed probability density functions were given by Markovich. These include approaches based on variable-bandwidth and long-tailed kernel estimators; on a preliminary transform of the data to a new random variable on a finite or infinite interval that is more convenient for estimation, followed by the inverse transform of the resulting density estimate; and a "piecing-together approach", which combines a parametric model for the tail of the density with a non-parametric model approximating its mode. Nonparametric estimators require an appropriate selection of tuning (smoothing) parameters, such as the bandwidth of kernel estimators and the bin width of a histogram. Well-known data-driven methods for this selection are cross-validation and its modifications, and methods based on minimizing the mean squared error (MSE), its asymptotic form, or their upper bounds. A discrepancy method uses well-known nonparametric statistics, such as the Kolmogorov–Smirnov, von Mises and Anderson–Darling statistics, as a metric in the space of distribution functions, and takes quantiles of these statistics as a known uncertainty or discrepancy value. The bootstrap is another tool for finding smoothing parameters, approximating the unknown MSE by different schemes of re-sample selection.
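The transform-based idea can be sketched as follows: estimate the density of the log-transformed data with an ordinary Gaussian kernel estimator, then map the estimate back with the Jacobian of the transform. This is an illustrative simplification, not the specific method of the cited work; the logarithmic transform, the function name and the fixed bandwidth are all assumptions, and the bandwidth would normally be chosen by one of the data-driven methods mentioned above.

```python
import math
import random

def log_transform_kde(samples, bandwidth):
    """Return a density estimate for positive data via a KDE on the log scale."""
    logs = [math.log(x) for x in samples]  # requires strictly positive data
    norm = 1.0 / (len(logs) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        if x <= 0.0:
            return 0.0
        y = math.log(x)
        # Gaussian kernel estimate of the density of Y = ln X at y
        kde_y = norm * sum(math.exp(-0.5 * ((y - yi) / bandwidth) ** 2) for yi in logs)
        # Change of variables: f_X(x) = f_Y(ln x) / x
        return kde_y / x

    return density

# Synthetic check on standard log-normal data, whose density at 1 is about 0.399.
rng = random.Random(21)
data = [rng.lognormvariate(0.0, 1.0) for _ in range(5_000)]
f_hat = log_transform_kde(data, bandwidth=0.2)
print(f_hat(1.0), f_hat(10.0))
```

Working on the log scale compresses the heavy tail, so a single bandwidth behaves more reasonably than it would on the original scale.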
