Shannon's source coding theorem


In information theory, Shannon's source coding theorem establishes the limits to possible data compression, and the operational meaning of the Shannon entropy.
Named after Claude Shannon, the source coding theorem shows that it is impossible to compress the data such that the code rate is less than the Shannon entropy of the source, without it being virtually certain that information will be lost. However it is possible to get the code rate arbitrarily close to the Shannon entropy, with negligible probability of loss.
The source coding theorem for symbol codes places an upper and a lower bound on the minimal possible expected length of codewords as a function of the entropy of the input word and of the size of the target alphabet.

Statements

Source coding is a mapping from symbols from an information source to a sequence of alphabet symbols such that the source symbols can be exactly recovered from the binary bits or recovered within some distortion. This is the concept behind data compression.

Extension to non-stationary independent sources

Fixed Rate lossless source coding for discrete time non-stationary independent sources

Define typical set as:
Then, for given, for large enough,. Now we just encode the sequences in the typical set, and usual methods in source coding show that the cardinality of this set is smaller than. Thus, on an average, bits suffice for encoding with probability greater than, where and can be made arbitrarily small, by making larger.