Legendre transformation
In mathematics and physics, the Legendre transformation, named after Adrien-Marie Legendre, is an involutive transformation on the real-valued convex functions of one real variable. It is commonly used in classical mechanics to derive the Hamiltonian formalism out of the Lagrangian formalism and in thermodynamics to derive the thermodynamic potentials, as well as in the solution of differential equations of several variables.
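For sufficiently smooth functions on the real line, the Legendre transform $f^*$ of a function $f$ can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other. This can be expressed in Euler's derivative notation as
$$Df(\cdot) = \left(Df^*\right)^{-1}(\cdot),$$
where $D$ is the differentiation operator and $\cdot$ stands for the argument of the function, or, equivalently, as $f'(f^{*\prime}(x^*)) = x^*$ and $f^{*\prime}(f'(x)) = x$ in Lagrange's notation.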
The generalization of the Legendre transformation to affine spaces and non-convex functions is known as the convex conjugate, which can be used to construct a function's convex hull.
Definition
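Let $I \subset \mathbb{R}$ be an interval, and $f : I \to \mathbb{R}$ a convex function; then its Legendre transform is the function $f^* : I^* \to \mathbb{R}$ defined by
$$f^*(x^*) = \sup_{x \in I}\bigl(x^* x - f(x)\bigr), \qquad x^* \in I^*,$$
where $\sup$ is the supremum, and the domain $I^*$ is
$$I^* = \left\{ x^* \in \mathbb{R} : \sup_{x \in I}\bigl(x^* x - f(x)\bigr) < \infty \right\}.$$
The transform is always well-defined when $f(x)$ is convex.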
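The generalization to convex functions $f : X \to \mathbb{R}$ on a convex set $X \subset \mathbb{R}^n$ is straightforward: $f^* : X^* \to \mathbb{R}$ has domain
$$X^* = \left\{ x^* \in \mathbb{R}^n : \sup_{x \in X}\bigl(\langle x^*, x \rangle - f(x)\bigr) < \infty \right\}$$
and is defined by
$$f^*(x^*) = \sup_{x \in X}\bigl(\langle x^*, x \rangle - f(x)\bigr), \qquad x^* \in X^*,$$
where $\langle x^*, x \rangle$ denotes the dot product of $x^*$ and $x$.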
The function $f^*$ is called the convex conjugate function of $f$. For historical reasons, the conjugate variable is often denoted $p$, instead of $x^*$. If the convex function $f$ is defined on the whole line and is everywhere differentiable, then
$$f^*(p) = \sup_{x \in I}\bigl(p x - f(x)\bigr) = \bigl(p x - f(x)\bigr)\Big|_{x = (f')^{-1}(p)}$$
can be interpreted as the negative of the $y$-intercept of the tangent line to the graph of $f$ that has slope $p$.
The Legendre transformation is an application of the duality relationship between points and lines. The functional relationship specified by $f$ can be represented equally well as a set of $(x, y)$ points, or as a set of tangent lines specified by their slope and intercept values.
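The supremum in the definition and the tangent-line reading can be checked numerically. The following is a minimal sketch, not a reference implementation: the test function $f(x) = x^4/4$, the slope $p = 2$, and the sampling grid are illustrative assumptions, and the supremum is only approximated by a maximum over grid points.

```python
def legendre_transform(f, xs, p):
    """Approximate f*(p) = sup_x (p*x - f(x)) by a maximum over the samples xs."""
    return max(p * x - f(x) for x in xs)

def f(x):
    # Illustrative convex function (an assumption, not taken from the article).
    return x ** 4 / 4.0

# Hypothetical grid on [-5, 5] with step 0.001.
xs = [i / 1000.0 for i in range(-5000, 5001)]

p = 2.0
f_star = legendre_transform(f, xs, p)

# Tangent line of slope p touches the graph where f'(x0) = x0**3 = p.
x0 = p ** (1.0 / 3.0)
intercept = f(x0) - p * x0   # y-intercept of that tangent line

print("grid estimate of f*(p):     ", f_star)
print("negative tangent intercept: ", -intercept)
```

Both printed numbers should agree to within the grid resolution, which illustrates that $f^*(p)$ is the negative of the tangent line's $y$-intercept.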
Understanding the transform in terms of derivatives
For a differentiable convex function $f$ on the real line with an invertible first derivative, the Legendre transform $f^*$ can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other.
To see this, first note that if $f$ is differentiable and $\overline{x}$ is a critical point of the function $x \mapsto p \cdot x - f(x)$, then, by convexity, the supremum is achieved at $\overline{x}$. Therefore, $f^*(p) = p \cdot \overline{x} - f(\overline{x})$.
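Suppose that $f'$ is invertible and let $g$ denote its inverse. Then for each $p$, the point $g(p)$ is the unique critical point of $x \mapsto p x - f(x)$. Indeed, $f'(g(p)) = p$ and so $p - f'(g(p)) = 0$. Hence we have $f^*(p) = p \cdot g(p) - f(g(p))$ for each $p$. By differentiating with respect to $p$ we find
$$(f^*)'(p) = g(p) + p \cdot g'(p) - f'(g(p)) \cdot g'(p).$$
Since $f'(g(p)) = p$ this simplifies to $(f^*)'(p) = g(p)$. In other words, $(f^*)'$ and $f'$ are inverses.
In general, if $h'$ is an inverse of $f'$, then $h' = (f^*)'$, and so integration provides a constant $c$ so that $f^* = h + c$.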
In practical terms, given $f(x)$, the parametric plot of $x f'(x) - f(x)$ versus $f'(x)$ amounts to the graph of $f^*(p)$ versus $p$.
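The parametric construction can be sketched in a few lines. The choice $f(x) = e^x$, whose transform has the closed form $p(\ln p - 1)$ for $p > 0$, and the sample points are assumptions made purely for illustration.

```python
import math

def parametric_legendre(f, df, xs):
    """Return points (p, f*(p)) computed as (f'(x), x*f'(x) - f(x)) for x in xs."""
    return [(df(x), x * df(x) - f(x)) for x in xs]

# f(x) = exp(x), so f'(x) = exp(x) as well; sample points are arbitrary.
xs = [-1.0, 0.0, 0.5, 1.0, 2.0]
points = parametric_legendre(math.exp, math.exp, xs)

for p, value in points:
    closed_form = p * (math.log(p) - 1.0)
    print(f"p={p:.4f}  parametric={value:.6f}  closed form={closed_form:.6f}")
```

Each parametric point lands on the closed-form curve, without ever inverting $f'$ explicitly.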
In some cases (e.g. the thermodynamic potentials), a non-standard requirement is used, amounting to an alternative definition of $f^*$ with a minus sign,
$$f(x) - f^*(p) = x p.$$
Properties
- The Legendre transform of a convex function is convex.
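- It follows that the Legendre transformation is an involution, i.e., $f^{**} = f$: writing $g = (f')^{-1}$, so that $f^*(p) = p\,g(p) - f(g(p))$ and $(f^*)'(p) = g(p)$, and choosing $p_s$ with $g(p_s) = x$,
$$f^{**}(x) = x\,p_s - f^*(p_s) = g(p_s)\,p_s - \bigl(p_s\,g(p_s) - f(g(p_s))\bigr) = f(g(p_s)) = f(x).$$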
Examples
Example 1
The exponential function $f(x) = e^x$ has $f^*(p) = p(\ln p - 1)$ as a Legendre transform, since their respective first derivatives $e^x$ and $\ln p$ are inverse functions of each other.
This example illustrates that the respective domains of a function and its Legendre transform need not agree.
Example 2
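Let $f(x) = c x^2$ defined on ℝ, where $c > 0$ is a fixed constant.
For $x^*$ fixed, the function of $x$, $x^* x - f(x) = x^* x - c x^2$, has the first derivative $x^* - 2cx$ and second derivative $-2c$; there is one stationary point at $x = x^*/2c$, which is always a maximum.
Thus, $I^* = \mathbb{R}$ and
$$f^*(x^*) = \frac{{x^*}^2}{4c}.$$
The first derivatives of $f$, $2cx$, and of $f^*$, $x^*/(2c)$, are inverse functions to each other. Clearly, furthermore,
$$f^{**}(x) = \frac{1}{4\,(1/4c)}\,x^2 = c x^2,$$
namely $f^{**} = f$.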
Example 3
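Let $f(x) = x^2$ for $x \in I = [2, 3]$.
For $x^*$ fixed, $x^* x - f(x)$ is continuous on the compact interval $I$, hence it always takes a finite maximum on it; it follows that $I^* = \mathbb{R}$.
The stationary point at $x = x^*/2$ is in the domain $[2, 3]$ if and only if $4 \le x^* \le 6$; otherwise the maximum is taken either at $x = 2$ or at $x = 3$. It follows that
$$f^*(x^*) = \begin{cases} 2x^* - 4, & x^* < 4, \\ \dfrac{{x^*}^2}{4}, & 4 \le x^* \le 6, \\ 3x^* - 9, & x^* > 6. \end{cases}$$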
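A brute-force check of a transform on a compact interval can be sketched as follows. It takes $f(x) = x^2$ on $[2, 3]$ as the concrete instance, which gives the piecewise result $2x^* - 4$, ${x^*}^2/4$, $3x^* - 9$ on the three slope ranges; the grid size and the sampled slopes are arbitrary assumptions.

```python
def f_star_grid(p, n=10000):
    """Maximum of p*x - x**2 over n+1 equally spaced points of [2, 3]."""
    return max(p * x - x * x for x in (2.0 + i / n for i in range(n + 1)))

def f_star_piecewise(p):
    """Closed form: the maximum sits at x=2, at x=p/2, or at x=3."""
    if p < 4.0:
        return 2.0 * p - 4.0
    if p <= 6.0:
        return p * p / 4.0
    return 3.0 * p - 9.0

slopes = [-1.0, 3.0, 4.0, 5.0, 6.0, 8.0]
results = [(p, f_star_grid(p), f_star_piecewise(p)) for p in slopes]
for p, numeric, exact in results:
    print(p, numeric, exact)
```

Outside $4 \le x^* \le 6$ the maximum sits at an endpoint of the interval, which is why the transform becomes linear there.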
Example 4
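The function $f(x) = cx$ is convex, for every $x$ (strict convexity is not required for the Legendre transformation to be well defined). Clearly $x^* x - f(x) = (x^* - c)x$ is never bounded from above as a function of $x$, unless $x^* - c = 0$. Hence $f^*$ is defined on $I^* = \{c\}$ and $f^*(c) = 0$.
One may check involutivity: of course $x^* x - f^*(x^*)$ is always bounded as a function of $x^* \in \{c\}$, hence $I^{**} = \mathbb{R}$. Then, for all $x$ one has
$$\sup_{x^* \in \{c\}}\bigl(x x^* - f^*(x^*)\bigr) = xc,$$
and hence $f^{**}(x) = cx = f(x)$.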
Example 5: several variables
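Let
$$f(x) = \langle x, Ax \rangle + c$$
be defined on $X = \mathbb{R}^n$, where $A$ is a real, positive definite matrix.
Then $f$ is convex, and
$$\langle p, x \rangle - f(x) = \langle p, x \rangle - \langle x, Ax \rangle - c$$
has gradient $p - 2Ax$ and Hessian $-2A$, which is negative; hence the stationary point $x = A^{-1}p/2$ is a maximum.
We have $X^* = \mathbb{R}^n$, and
$$f^*(p) = \frac{1}{4}\langle p, A^{-1}p \rangle - c.$$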
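The quadratic case can be verified directly for a small matrix. This sketch assumes the form $f(x) = \langle x, Ax\rangle + c$ with stationary point $x = A^{-1}p/2$ and transform $\tfrac14\langle p, A^{-1}p\rangle - c$; the particular $2 \times 2$ matrix, constant, and vector $p$ are hypothetical values chosen for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, 0.5], [0.5, 1.0]]   # symmetric positive definite (illustrative choice)
c = 3.0
p = [1.0, -2.0]

def objective(x):
    # <p, x> - f(x) with f(x) = <x, A x> + c
    return dot(p, x) - dot(x, matvec(A, x)) - c

A_inv = inverse_2x2(A)
x_stat = [0.5 * t for t in matvec(A_inv, p)]            # stationary point A^{-1} p / 2
f_star_formula = 0.25 * dot(p, matvec(A_inv, p)) - c    # closed-form transform

print("objective at stationary point:", objective(x_stat))
print("closed-form f*(p):            ", f_star_formula)

# Nearby points should give strictly smaller values, since the Hessian -2A is negative.
nearby = [objective([x_stat[0] + dx, x_stat[1] + dy])
          for dx in (-0.1, 0.1) for dy in (-0.1, 0.1)]
print("all nearby values smaller:", all(v < objective(x_stat) for v in nearby))
```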
Behavior of differentials under Legendre transforms
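The Legendre transform is linked to integration by parts, $p\,dx = d(px) - x\,dp$.
Let $f(x, y)$ be a function of two independent variables $x$ and $y$, with the differential
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = p\,dx + v\,dy.$$
Assume that it is convex in $x$ for all $y$, so that one may perform the Legendre transform in $x$, with $p$ the variable conjugate to $x$. Since the new independent variable is $p$, the differentials $dx$ and $dy$ devolve to $dp$ and $dy$, i.e., we build another function with its differential expressed in terms of the new basis $dp$ and $dy$.
We thus consider the function $g(p, y) = f - px$ so that
$$dg = df - p\,dx - x\,dp = -x\,dp + v\,dy, \qquad x = -\frac{\partial g}{\partial p}, \qquad v = \frac{\partial g}{\partial y}.$$
The function $-g(p, y)$ is the Legendre transform of $f(x, y)$, where only the independent variable $x$ has been supplanted by $p$. This is widely used in thermodynamics, as illustrated below.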
Applications
Analytical mechanics
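A Legendre transform is used in classical mechanics to derive the Hamiltonian formulation from the Lagrangian formulation, and conversely. A typical Lagrangian has the form
$$L(v, q) = \tfrac{1}{2}\langle v, Mv \rangle - V(q),$$
where $(v, q)$ are coordinates on $\mathbb{R}^n \times \mathbb{R}^n$, $M$ is a positive definite real matrix, and
$$\langle x, y \rangle = \sum_j x_j y_j.$$
For every fixed $q$, $L(v, q)$ is a convex function of $v$, while $V(q)$ plays the role of a constant.
Hence the Legendre transform of $L(v, q)$ as a function of $v$ is the Hamiltonian function,
$$H(p, q) = \tfrac{1}{2}\langle p, M^{-1}p \rangle + V(q).$$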
In a more general setting, $(v, q)$ are local coordinates on the tangent bundle $T\mathcal{M}$ of a manifold $\mathcal{M}$. For each $q$, $L(v, q)$ is a convex function of the tangent space $V_q$. The Legendre transform gives the Hamiltonian $H(p, q)$ as a function of the coordinates $(p, q)$ of the cotangent bundle $T^*\mathcal{M}$; the inner product used to define the Legendre transform is inherited from the pertinent canonical symplectic structure. In this abstract setting, the Legendre transformation corresponds to the tautological one-form.
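For a one-dimensional instance of the typical Lagrangian, $L(v, q) = \tfrac12 m v^2 - V(q)$, the passage to the Hamiltonian can be sketched numerically. The mass, the harmonic potential, and the velocity grid below are hypothetical choices; the expected result $p^2/(2m) + V(q)$ is the scalar case of the matrix formula.

```python
def hamiltonian_numeric(p, q, m, potential, vs):
    """Maximum over sampled velocities v of p*v - L(v, q), with L = m*v**2/2 - V(q)."""
    return max(p * v - (0.5 * m * v * v - potential(q)) for v in vs)

m, k = 2.0, 3.0                            # hypothetical mass and spring constant

def potential(q):
    return 0.5 * k * q * q                 # harmonic potential, illustrative only

vs = [i / 1000.0 for i in range(-10000, 10001)]   # velocity grid on [-10, 10]

p, q = 1.5, 0.7
h_num = hamiltonian_numeric(p, q, m, potential, vs)
h_exact = p * p / (2.0 * m) + potential(q)
print(h_num, h_exact)
```

The maximizing velocity is $v = p/m$, which is the familiar relation $p = mv$ read in reverse.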
Thermodynamics
The strategy behind the use of Legendre transforms in thermodynamics is to shift from a function that depends on a variable to a new function that depends on a new variable, the conjugate of the original one. The new variable is the partial derivative of the original function with respect to the original variable. The new function is the difference between the original function and the product of the old and new variables. Typically, this transformation is useful because it shifts the dependence of, e.g., the energy from an extensive variable to its conjugate intensive variable, which can usually be controlled more easily in a physical experiment.
For example, the internal energy $U$ is an explicit function of the extensive variables entropy $S$, volume $V$, and chemical composition $N_i$,
$$U = U(S, V, \{N_i\}),$$
which has a total differential
$$dU = T\,dS - P\,dV + \sum_i \mu_i\,dN_i,$$
where $T$ is the temperature, $P$ the pressure, and $\mu_i$ the chemical potential of species $i$.
By using the (non-standard) Legendre transform of the internal energy, $U$, with respect to volume, $V$, it is possible to define the enthalpy as
$$H = U + PV = H(S, P, \{N_i\}),$$
which is an explicit function of the pressure, $P$. The enthalpy contains all of the same information as the internal energy, but is often easier to work with in situations where the pressure is constant.
It is likewise possible to shift the dependence of the energy from the extensive variable of entropy, $S$, to the intensive variable $T$, resulting in the Helmholtz and Gibbs free energies. The Helmholtz free energy, $A$, and Gibbs energy, $G$, are obtained by performing Legendre transforms of the internal energy and enthalpy, respectively,
$$A = U - TS,$$
$$G = H - TS = U + PV - TS.$$
The Helmholtz free energy is often the most useful thermodynamic potential when temperature and volume are held constant, while the Gibbs energy is often the most useful when temperature and pressure are held constant.
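As a worked sketch of how the natural variables change, the differentials of the transformed potentials follow from the product rule. It assumes the usual sign conventions $H = U + PV$, $A = U - TS$, $G = H - TS$ and the fundamental relation for $dU$, with $T$, $P$, $\mu_i$ denoting temperature, pressure, and chemical potentials.

```latex
\begin{align*}
dU &= T\,dS - P\,dV + \sum_i \mu_i\,dN_i, \\
dH &= d(U + PV) = T\,dS + V\,dP + \sum_i \mu_i\,dN_i, \\
dA &= d(U - TS) = -S\,dT - P\,dV + \sum_i \mu_i\,dN_i, \\
dG &= d(H - TS) = -S\,dT + V\,dP + \sum_i \mu_i\,dN_i.
\end{align*}
```

Each transform swaps one differential for that of its conjugate variable, which is why $A$ is natural at fixed $(T, V)$ and $G$ at fixed $(T, P)$.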
An example – variable capacitor
As another example from physics, consider a parallel-plate capacitor, in which the plates can move relative to one another. Such a capacitor would allow transfer of the electric energy which is stored in the capacitor into external mechanical work, done by the force acting on the plates. One may think of the electric charge as analogous to the "charge" of a gas in a cylinder, with the resulting mechanical force exerted on a piston.
Compute the force on the plates as a function of $x$, the distance which separates them. To find the force, compute the potential energy, and then apply the definition of force as the gradient of the potential energy function.
The energy stored in a capacitor of capacitance $C(x)$ and charge $Q$ is
$$U(Q, x) = \frac{1}{2} Q V = \frac{1}{2}\frac{Q^2}{C(x)},$$
where the dependence on the area of the plates, the dielectric constant of the material between the plates, and the separation $x$ are abstracted away as the capacitance $C(x)$.
The force $F$ between the plates due to the electric field is then
$$F(x) = -\frac{dU}{dx}.$$
If the capacitor is not connected to any circuit, then the charges on the plates remain constant as they move, and the force is the negative gradient of the electrostatic energy
$$F(x) = \frac{1}{2}\frac{dC}{dx}\frac{Q^2}{C^2} = \frac{1}{2}\frac{dC}{dx}V(x)^2.$$
However, suppose, instead, that the voltage $V$ between the plates is maintained constant by connection to a battery, which is a reservoir for charge at constant potential difference; now the charge is variable instead of the voltage, its Legendre conjugate. To find the force, first compute the non-standard Legendre transform,
$$U^* = U - QV = \frac{1}{2}QV - QV = -\frac{1}{2}QV = -\frac{1}{2}V^2 C(x).$$
The force now becomes the negative gradient of this Legendre transform, still pointing in the same direction,
$$F(x) = -\frac{dU^*}{dx} = \frac{1}{2}\frac{dC}{dx}V^2.$$
The two conjugate energies $U$ and $U^*$ happen to stand opposite to each other only because of the linearity of the capacitance, except that now $Q$ is no longer a constant. They reflect the two different pathways of storing energy into the capacitor, resulting in, for instance, the same "pull" between a capacitor's plates.
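That both routes give the same force can be checked with finite differences. The sketch below assumes an ideal parallel-plate model $C(x) = \varepsilon A / x$ with a made-up value of $\varepsilon A$, a made-up separation and voltage, and the energy expressions $U = Q^2/(2C)$ and $U^* = -\tfrac12 V^2 C$.

```python
EPS_AREA = 1.0e-10   # hypothetical permittivity times plate area, in F*m

def capacitance(x):
    return EPS_AREA / x                # ideal parallel-plate model (assumption)

def derivative(fn, x, h=1e-7):
    """Central finite-difference approximation of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

x0, voltage = 1.0e-3, 10.0             # hypothetical separation (m) and voltage (V)
charge = capacitance(x0) * voltage     # charge on the plates at separation x0

def energy_fixed_charge(x):
    return 0.5 * charge ** 2 / capacitance(x)      # U(Q, x) with Q held constant

def energy_fixed_voltage(x):
    return -0.5 * voltage ** 2 * capacitance(x)    # U* = U - QV with V held constant

force_q = -derivative(energy_fixed_charge, x0)
force_v = -derivative(energy_fixed_voltage, x0)
force_formula = 0.5 * derivative(capacitance, x0) * voltage ** 2
print(force_q, force_v, force_formula)
```

All three values are negative here because $dC/dx < 0$: the force acts to reduce the separation, i.e. the plates attract.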
Probability theory
In large deviations theory, the rate function is defined as the Legendre transformation of the logarithm of the moment generating function of a random variable. An important application of the rate function is in the calculation of tail probabilities of sums of i.i.d. random variables.
Microeconomics
The Legendre transformation arises naturally in microeconomics in the process of finding the supply $S(P)$ of some product given a fixed price $P$ on the market, knowing the cost function $C(Q)$, i.e. the cost for the producer to make/mine/etc. $Q$ units of the given product.
A simple theory explains the shape of the supply curve based solely on the cost function. Let us suppose the market price for one unit of our product is $P$. For a company selling this good, the best strategy is to adjust the production $Q$ so that its profit is maximized. We can maximize the profit
$$\text{profit} = \text{revenue} - \text{costs} = PQ - C(Q)$$
by differentiating with respect to $Q$ and solving
$$P - C'(Q_\text{opt}) = 0.$$
$Q_\text{opt}$ represents the optimal quantity $Q$ of goods that the producer is willing to supply, which is indeed the supply itself:
$$S(P) = Q_\text{opt}(P) = (C')^{-1}(P).$$
If we consider the maximal profit as a function of price, $\text{profit}_\text{max}(P)$, we see that it is the Legendre transform of the cost function $C(Q)$.
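The supply curve and the maximal profit can be sketched for a concrete cost function. The quadratic cost $C(Q) = 2Q^2$, the price values, and the quantity grid are hypothetical; for this choice the first-order condition gives $Q_\text{opt} = P/4$ and a maximal profit of $P^2/8$.

```python
def cost(q):
    return 2.0 * q * q                 # hypothetical convex cost function C(Q)

def best_response(price, quantities):
    """Return (optimal quantity, maximal profit) over the sampled quantities."""
    return max(((q, price * q - cost(q)) for q in quantities),
               key=lambda pair: pair[1])

quantities = [i / 1000.0 for i in range(0, 20001)]   # Q sampled on [0, 20]

prices = [1.0, 4.0, 10.0]
supply = [(price,) + best_response(price, quantities) for price in prices]
for price, q_opt, profit in supply:
    print(f"P={price}: Q_opt={q_opt} (P/4={price / 4.0}), "
          f"max profit={profit} (P^2/8={price ** 2 / 8.0})")
```

The maximal profit column is the Legendre transform of the cost function evaluated at the price, and the optimal quantity column is its derivative.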
Geometric interpretation
For a strictly convex function, the Legendre transformation can be interpreted as a mapping between the graph of the function and the family of tangents of the graph.
The equation of a line with slope $p$ and $y$-intercept $b$ is given by $y = px + b$. For this line to be tangent to the graph of a function $f$ at the point $(x_0, f(x_0))$ requires
$$f(x_0) = p x_0 + b$$
and
$$p = f'(x_0).$$
The function $f'$ is strictly monotone as the derivative of a strictly convex function. The second equation can be solved for $x_0 = (f')^{-1}(p)$, allowing elimination of $x_0$ from the first, and solving for the $y$-intercept $b$ of the tangent as a function of its slope $p$,
$$b = f(x_0) - p x_0 = f\bigl((f')^{-1}(p)\bigr) - p \cdot (f')^{-1}(p) = -f^*(p).$$
Here, $f^*$ denotes the Legendre transform of $f$.
The family of tangents of the graph of $f$ parameterized by $p$ is therefore given by
$$y = px - f^*(p),$$
or, written implicitly, by the solutions of the equation
$$F(x, y, p) = y + f^*(p) - px = 0.$$
The graph of the original function can be reconstructed from this family of lines as the envelope of this family by demanding
$$\frac{\partial F(x, y, p)}{\partial p} = (f^*)'(p) - x = 0.$$
Eliminating $p$ from these two equations gives
$$y = x \cdot \bigl((f^*)'\bigr)^{-1}(x) - f^*\Bigl(\bigl((f^*)'\bigr)^{-1}(x)\Bigr).$$
Identifying $y$ with $f(x)$ and recognizing the right side of the preceding equation as the Legendre transform of $f^*$, yields
$$f(x) = f^{**}(x).$$
Legendre transformation in more than one dimension
For a differentiable real-valued function $f$ on an open subset $U$ of $\mathbb{R}^n$, the Legendre conjugate of the pair $(U, f)$ is defined to be the pair $(V, g)$, where $V$ is the image of $U$ under the gradient mapping $Df$, and $g$ is the function on $V$ given by the formula
$$g(y) = \langle y, x \rangle - f(x), \qquad x = (Df)^{-1}(y),$$
where
$$\langle u, v \rangle = \sum_{k=1}^{n} u_k v_k$$
is the scalar product on $\mathbb{R}^n$. The multidimensional transform can be interpreted as an encoding of the convex hull of the function's epigraph in terms of its supporting hyperplanes.
Alternatively, if $X$ is a vector space and $Y$ is its dual vector space, then for each point $x$ of $X$ and $y$ of $Y$, there is a natural identification of the cotangent spaces $T^*X_x$ with $Y$ and $T^*Y_y$ with $X$. If $f$ is a real differentiable function over $X$, then its exterior derivative, $df$, is a section of the cotangent bundle $T^*X$ and as such, we can construct a map from $X$ to $Y$. Similarly, if $g$ is a real differentiable function over $Y$, then $dg$ defines a map from $Y$ to $X$. If both maps happen to be inverses of each other, we say we have a Legendre transform. The notion of the tautological one-form is commonly used in this setting.
When the function is not differentiable, the Legendre transform can still be extended, and is known as the Legendre-Fenchel transformation. In this more general setting, a few properties are lost: for example, the Legendre transform is no longer its own inverse unless extra assumptions, such as convexity, hold.
Legendre transformation on manifolds
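Let $M$ be a smooth manifold, and let $TM$ denote its tangent bundle. Let $L : TM \to \mathbb{R}$ be a smooth function, which we will refer to as the Lagrangian. The Legendre transformation of $L$ is its fiber derivative $\mathbf{F}L : TM \to T^*M$. This is a morphism of vector bundles defined as follows. Suppose that $n = \dim M$ and that $U \subseteq M$ is a chart. Then $TU \cong U \times \mathbb{R}^n$ is a chart on $TM$, and for any point $(x; v^1, \ldots, v^n)$ in this chart, the fiber derivative of $L$ is defined by
$$\mathbf{F}L(x; v^1, \ldots, v^n) = (x; p_1, \ldots, p_n), \qquad p_i = \frac{\partial L}{\partial v^i}(x; v^1, \ldots, v^n).$$
The associated energy function is the function $E : TM \to \mathbb{R}$ defined by
$$E(v) = \langle \mathbf{F}L(v), v \rangle - L(v),$$
where the angle brackets denote the natural pairing of a tangent and cotangent vector. The Legendre transform can be further generalized to a function from a vector bundle over $M$ to its dual bundle.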
Further properties
Scaling properties
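The Legendre transformation has the following scaling properties: For $a > 0$,
$$f(x) = a \cdot g(x) \;\Rightarrow\; f^*(p) = a \cdot g^*\!\left(\frac{p}{a}\right),$$
$$f(x) = g(a \cdot x) \;\Rightarrow\; f^*(p) = g^*\!\left(\frac{p}{a}\right).$$
It follows that if a function is homogeneous of degree $r$ then its image under the Legendre transformation is a homogeneous function of degree $s$, where $1/r + 1/s = 1$. Thus, the only monomial whose degree is invariant under Legendre transform is the quadratic.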
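The scaling rules $f = a\,g \Rightarrow f^*(p) = a\,g^*(p/a)$ and $f(x) = g(ax) \Rightarrow f^*(p) = g^*(p/a)$, together with the degree relation $1/r + 1/s = 1$, can be spot-checked on a grid. The test function $g(x) = x^4/4$ (degree $r = 4$, so $s = 4/3$), the constants, and the grid are assumptions for illustration only.

```python
def conjugate(f, xs, p):
    """Grid approximation of f*(p) = sup_x (p*x - f(x))."""
    return max(p * x - f(x) for x in xs)

def g(x):
    return x ** 4 / 4.0                # homogeneous of degree 4 (illustrative)

xs = [i / 1000.0 for i in range(-5000, 5001)]
a, p = 2.0, 3.0

# Rule 1: f(x) = a*g(x)  =>  f*(p) = a*g*(p/a)
scaled_value = conjugate(lambda x: a * g(x), xs, p)
rule_value = a * conjugate(g, xs, p / a)

# Rule 2: f(x) = g(a*x)  =>  f*(p) = g*(p/a)
scaled_argument = conjugate(lambda x: g(a * x), xs, p)
rule_argument = conjugate(g, xs, p / a)

# Homogeneity: g* should have degree s = 4/3, so g*(2p) / g*(p) = 2**(4/3).
ratio = conjugate(g, xs, 2.0 * p) / conjugate(g, xs, p)

print(scaled_value, rule_value)
print(scaled_argument, rule_argument)
print(ratio, 2.0 ** (4.0 / 3.0))
```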
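Behavior under translation
$$f(x) = g(x) + b \;\Rightarrow\; f^*(p) = g^*(p) - b,$$
$$f(x) = g(x + y) \;\Rightarrow\; f^*(p) = g^*(p) - p \cdot y.$$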
Behavior under inversion
$$f(x) = g^{-1}(x) \;\Rightarrow\; f^*(p) = -p \cdot g^*\!\left(\frac{1}{p}\right).$$
Behavior under linear transformations
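Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. For any convex function $f$ on $\mathbb{R}^n$, one has
$$(Af)^* = f^* A^*,$$
where $A^*$ is the adjoint operator of $A$ defined by
$$\langle Ax, y^* \rangle = \langle x, A^* y^* \rangle,$$
and $Af$ is the push-forward of $f$ along $A$,
$$(Af)(y) = \inf\{ f(x) : x \in \mathbb{R}^n,\ Ax = y \}.$$
A closed convex function $f$ is symmetric with respect to a given set $G$ of orthogonal linear transformations,
$$f(Ax) = f(x) \quad \text{for all } x \text{ and all } A \in G,$$
if and only if $f^*$ is symmetric with respect to $G$.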
Infimal convolution
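The infimal convolution of two functions $f$ and $g$ is defined as
$$(f \star_{\inf} g)(x) = \inf\{ f(x - y) + g(y) : y \in \mathbb{R}^n \}.$$
Let $f_1, \ldots, f_m$ be proper convex functions on $\mathbb{R}^n$. Then
$$(f_1 \star_{\inf} \cdots \star_{\inf} f_m)^* = f_1^* + \cdots + f_m^*.$$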
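The statement that the transform turns an infimal convolution into a sum can be illustrated in one dimension. In this sketch the two functions $f_1(x) = x^2/2$ and $f_2(x) = x^2$, the grid, and the sampled slopes are hypothetical choices, and both the infimum and the supremum are approximated over grid points.

```python
def conjugate(f, xs, p):
    """Grid approximation of f*(p) = sup_x (p*x - f(x))."""
    return max(p * x - f(x) for x in xs)

def infimal_convolution(f, g, ys):
    """Return x -> min over sampled y of f(x - y) + g(y)."""
    return lambda x: min(f(x - y) + g(y) for y in ys)

def f1(x):
    return 0.5 * x * x

def f2(x):
    return x * x

grid = [i / 100.0 for i in range(-400, 401)]       # [-4, 4] with step 0.01

h = infimal_convolution(f1, f2, grid)
h_values = {x: h(x) for x in grid}                 # tabulate the convolution once

slopes = [-1.0, 0.5, 2.0]
comparison = []
for p in slopes:
    lhs = conjugate(lambda x: h_values[x], grid, p)            # (f1 * f2)^*(p)
    rhs = conjugate(f1, grid, p) + conjugate(f2, grid, p)      # f1^*(p) + f2^*(p)
    comparison.append((p, lhs, rhs))
    print(p, lhs, rhs)
```

For these two quadratics the infimal convolution is $x^2/3$, whose transform $3p^2/4$ equals $p^2/2 + p^2/4$.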