Rate-distortion optimization is a method of improving video quality in video compression. The name refers to the joint optimization of the amount of distortion (loss of video quality) against the amount of data required to encode the video, the rate. While it is primarily used by video encoders, rate-distortion optimization can be used to improve quality in any encoding situation where decisions have to be made that affect both file size and quality simultaneously.
Background
The classical method of making encoding decisions is for the video encoder to choose the result that yields the highest-quality output image. However, this has the disadvantage that the choice it makes might require many more bits while giving comparatively little quality benefit. A common example of this problem arises in motion estimation, in particular when using quarter-pixel-precision motion vectors. Adding the extra precision to a block's motion vector during motion estimation might increase quality, but in some cases that extra quality is not worth the extra bits necessary to encode the motion vector at the higher precision, as the sketch below illustrates.
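The trade-off can be made concrete with a small numerical sketch. The candidate vectors, distortion values, and bit counts below are hypothetical, chosen only to show how a purely distortion-driven choice ignores the bit cost of the finer motion vector.

```python
# Hypothetical candidates for one block:
# (label, distortion as sum of squared differences, bits to code the motion vector)
candidates = [
    ("full-pel MV",    1500, 6),   # coarser vector: higher distortion, cheaper to code
    ("quarter-pel MV", 1460, 14),  # finer vector: slightly lower distortion, more bits
]

# A purely quality-driven encoder picks the candidate with the lowest distortion,
# ignoring how many extra bits the finer vector costs.
best = min(candidates, key=lambda c: c[1])
print("naive choice:", best[0])  # quarter-pel MV, despite the small quality gain
```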
How it works
Rate-distortion optimization solves the aforementioned problem by acting as a video quality metric, measuring both the deviation from the source material and the bit cost for each possible decision outcome. The bit cost is brought onto the same scale as the distortion by multiplying it by the Lagrange multiplier, a value representing the relationship between bit cost and quality for a particular quality level; the encoder then chooses the option that minimizes the combined cost D + λ·R. The deviation from the source is usually measured as the mean squared error, in order to maximize the PSNR video quality metric.

Calculating the bit cost is made more difficult by the entropy coders in modern video codecs, which require the rate-distortion optimization algorithm to pass each block of video being tested through the entropy coder to measure its actual bit cost. In MPEG-style codecs this means running the full encoding process: a discrete cosine transform, followed by quantization and entropy coding. Because of this, rate-distortion optimization is much slower than most other block-matching metrics, such as the simple sum of absolute differences (SAD) and the sum of absolute transformed differences (SATD). As such, it is usually used only for the final steps of the motion estimation process, such as deciding between different partition types in H.264/AVC.
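A minimal sketch of the selection rule described above, assuming the Lagrangian cost J = D + λ·R: distortion D is the deviation from the source block and the rate R is the measured bit cost of each candidate. The candidate data and the fixed λ value below are hypothetical; real encoders derive λ from the quantization parameter and obtain R by actually entropy-coding each candidate.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits

# Hypothetical candidates from the earlier sketch: (label, distortion, bits).
candidates = [
    ("full-pel MV",    1500, 6),
    ("quarter-pel MV", 1460, 14),
]

lam = 20.0  # hypothetical Lagrange multiplier for the current quality level

# Pick the candidate that minimizes the combined cost rather than distortion alone.
best = min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))
print("RD-optimal choice:", best[0])  # here the cheaper full-pel vector wins
```

With these numbers the full-pel vector costs 1500 + 20·6 = 1620 while the quarter-pel vector costs 1460 + 20·14 = 1740, so the extra precision is rejected as not worth its bits.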