Cyclic redundancy check

A cyclic redundancy check is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to raw data. Blocks of data entering these systems get a short check value attached, based on the remainder of a polynomial division of their contents. On retrieval, the calculation is repeated and, in the event the check values do not match, corrective action can be taken against data corruption. CRCs can be used for error correction.
CRCs are so called because the check value is a redundancy and the algorithm is based on cyclic codes. CRCs are popular because they are simple to implement in binary hardware, easy to analyze mathematically, and particularly good at detecting common errors caused by noise in transmission channels. Because the check value has a fixed length, the function that generates it is occasionally used as a hash function.
The CRC was invented by W. Wesley Peterson in 1961; the 32-bit CRC function, used in Ethernet and many other standards, is the work of several researchers and was published in 1975.

Introduction

CRCs are based on the theory of cyclic error-correcting codes. The use of systematic cyclic codes, which encode messages by adding a fixed-length check value, for the purpose of error detection in communication networks, was first proposed by W. Wesley Peterson in 1961.
Cyclic codes are not only simple to implement but have the benefit of being particularly well suited for the detection of burst errors: contiguous sequences of erroneous data symbols in messages. This is important because burst errors are common transmission errors in many communication channels, including magnetic and optical storage devices. Typically an n-bit CRC applied to a data block of arbitrary length will detect any single error burst not longer than n bits, and the fraction of all longer error bursts that it will detect is.
Specification of a CRC code requires definition of a so-called generator polynomial. This polynomial becomes the divisor in a polynomial long division, which takes the message as the dividend and in which the quotient is discarded and the remainder becomes the result. The important caveat is that the polynomial coefficients are calculated according to the arithmetic of a finite field, so the addition operation can always be performed bitwise-parallel.
In practice, all commonly used CRCs employ the Galois field of two elements, GF. The two elements are usually called 0 and 1, comfortably matching computer architecture.
A CRC is called an n-bit CRC when its check value is n bits long. For a given n, multiple CRCs are possible, each with a different polynomial. Such a polynomial has highest degree n, which means it has terms. In other words, the polynomial has a length of ; its encoding requires bits. Note that most polynomial specifications either drop the MSB or LSB, since they are always 1. The CRC and associated polynomial typically have a name of the form CRC-n-XXX as in the [|table] below.
The simplest error-detection system, the parity bit, is in fact a 1-bit CRC: it uses the generator polynomial , and has the name CRC-1.

Application

A CRC-enabled device calculates a short, fixed-length binary sequence, known as the check value or CRC, for each block of data to be sent or stored and appends it to the data, forming a codeword.
When a codeword is received or read, the device either compares its check value with one freshly calculated from the data block, or equivalently, performs a CRC on the whole codeword and compares the resulting check value with an expected residue constant.
If the CRC values do not match, then the block contains a data error.
The device may take corrective action, such as rereading the block or requesting that it be sent again. Otherwise, the data is assumed to be error-free.

Data integrity

CRCs are specifically designed to protect against common types of errors on communication channels, where they can provide quick and reasonable assurance of the integrity of messages delivered. However, they are not suitable for protecting against intentional alteration of data.
Firstly, as there is no authentication, an attacker can edit a message and recompute the CRC without the substitution being detected. When stored alongside the data, CRCs and cryptographic hash functions by themselves do not protect against intentional modification of data. Any application that requires protection against such attacks must use cryptographic authentication mechanisms, such as message authentication codes or digital signatures.
Secondly, unlike cryptographic hash functions, CRC is an easily reversible function, which makes it unsuitable for use in digital signatures.
Thirdly, CRC is a linear function with a property that
as a result, even if the CRC is encrypted with a stream cipher that uses XOR as its combining operation, both the message and the associated CRC can be manipulated without knowledge of the encryption key; this was one of the well-known design flaws of the Wired Equivalent Privacy protocol.

Computation

To compute an n-bit binary CRC, line the bits representing the input in a row, and position the -bit pattern representing the CRC's divisor underneath the left-hand end of the row.
In this example, we shall encode 14 bits of message with a 3-bit CRC, with a polynomial. The polynomial is written in binary as the coefficients; a 3rd-degree polynomial has 4 coefficients. In this case, the coefficients are 1, 0, 1 and 1. The result of the calculation is 3 bits long.
Start with the message to be encoded:


11010011101100

This is first padded with zeros corresponding to the bit length n of the CRC. This is done so that the resulting code word is in systematic form. Here is the first calculation for computing a 3-bit CRC:


11010011101100 000 <--- input right padded by 3 bits
1011 <--- divisor = x³ + x + 1
------------------
01100011101100 000 <--- result

The algorithm acts on the bits directly above the divisor in each step. The result for that iteration is the bitwise XOR of the polynomial divisor with the bits above it. The bits not above the divisor are simply copied directly below for that step. The divisor is then shifted right to align with the highest remaining 1 bit in the input, and the process is repeated until the divisor reaches the right-hand end of the input row. Here is the entire calculation:


11010011101100 000 <--- input right padded by 3 bits
1011 <--- divisor
01100011101100 000 <--- result 
 1011 <--- divisor...
00111011101100 000
 1011
00010111101100 000
 1011
00000001101100 000 <--- note that the divisor moves over to align with the next 1 in the dividend 
 1011 
00000000110100 000
 1011
00000000011000 000
 1011
00000000001110 000
 1011
00000000000101 000
 101 1
-----------------
00000000000000 100 <--- remainder. Division algorithm stops here as dividend is equal to zero.

Since the leftmost divisor bit zeroed every input bit it touched, when this process ends the only bits in the input row that can be nonzero are the n bits at the right-hand end of the row. These n bits are the remainder of the division step, and will also be the value of the CRC function.
The validity of a received message can easily be verified by performing the above calculation again, this time with the check value added instead of zeroes. The remainder should equal zero if there are no detectable errors.


11010011101100 100 <--- input with check value
1011 <--- divisor
01100011101100 100 <--- result
 1011 <--- divisor...
00111011101100 100
......
00000000001110 100
 1011
00000000000101 100
 101 1
------------------
00000000000000 000 <--- remainder

The following Python code outlines a function which will return the initial CRC remainder for a chosen input and polynomial, with either 1 or 0 as the initial padding. Note that this code works with string inputs rather than raw numbers:

def crc_remainder:
"""Calculate the CRC remainder of a string of bits using a chosen polynomial.
initial_filler should be '1' or '0'.
"""
polynomial_bitstring = polynomial_bitstring.lstrip
len_input = len
initial_padding = initial_filler *
input_padded_array = list
while '1' in input_padded_array:
cur_shift = input_padded_array.index
for i in range:
input_padded_array = str
return ''.join
def crc_check:
"""Calculate the CRC check of a string of bits using a chosen polynomial."""
polynomial_bitstring = polynomial_bitstring.lstrip
len_input = len
initial_padding = check_value
input_padded_array = list
while '1' in input_padded_array:
cur_shift = input_padded_array.index
for i in range:
input_padded_array = str
return

>>> crc_check
True
>>> crc_remainder
'100'

CRC-32 algorithm

This is a practical algorithm for the CRC-32 variant of CRC. The CRCTable is a memoization of a calculation that would have to be repeated for each byte of the message.
Function CRC32
Input:
data: Bytes //Array of bytes
Output:
crc32: UInt32 //32-bit unsigned crc-32 value
//Initialize crc-32 to starting value
crc32 ← 0xFFFFFFFF
for each byte in data do
nLookupIndex ← and 0xFF;
crc32 ← xor CRCTable //CRCTable is an array of 256 32-bit constants
//Finalize the CRC-32 value by inverting all the bits
crc32 ← crc32 xor 0xFFFFFFFF
return crc32

Mathematics

Mathematical analysis of this division-like process reveals how to select a divisor that guarantees good error-detection properties. In this analysis, the digits of the bit strings are taken as the coefficients of a polynomial in some variable x—coefficients that are elements of the finite field GF, instead of more familiar numbers. The set of binary polynomials is a mathematical ring.

Designing polynomials

The selection of the generator polynomial is the most important part of implementing the CRC algorithm. The polynomial must be chosen to maximize the error-detecting capabilities while minimizing overall collision probabilities.
The most important attribute of the polynomial is its length, because of its direct influence on the length of the computed check value.
The most commonly used polynomial lengths are:

9 bits
17 bits
33 bits
65 bits

A CRC is called an n-bit CRC when its check value is n-bits. For a given n, multiple CRCs are possible, each with a different polynomial. Such a polynomial has highest degree n, and hence terms. The remainder has length n. The CRC has a name of the form CRC-n-XXX.
The design of the CRC polynomial depends on the maximum total length of the block to be protected, the desired error protection features, and the type of resources for implementing the CRC, as well as the desired performance. A common misconception is that the "best" CRC polynomials are derived from either irreducible polynomials or irreducible polynomials times the factor , which adds to the code the ability to detect all errors affecting an odd number of bits. In reality, all the factors described above should enter into the selection of the polynomial and may lead to a reducible polynomial. However, choosing a reducible polynomial will result in a certain proportion of missed errors, due to the quotient ring having zero divisors.
The advantage of choosing a primitive polynomial as the generator for a CRC code is that the resulting code has maximal total block length in the sense that all 1-bit errors within that block length have different remainders and therefore, since the remainder is a linear function of the block, the code can detect all 2-bit errors within that block length. If is the degree of the primitive generator polynomial, then the maximal total block length is, and the associated code is able to detect any single-bit or double-bit errors. We can improve this situation. If we use the generator polynomial, where is a primitive polynomial of degree, then the maximal total block length is, and the code is able to detect single, double, triple and any odd number of errors.
A polynomial that admits other factorizations may be chosen then so as to balance the maximal total blocklength with a desired error detection power. The BCH codes are a powerful class of such polynomials. They subsume the two examples above. Regardless of the reducibility properties of a generator polynomial of degree r, if it includes the "+1" term, the code will be able to detect error patterns that are confined to a window of r contiguous bits. These patterns are called "error bursts".

Specification

The concept of the CRC as an error-detecting code gets complicated when an implementer or standards committee uses it to design a practical system. Here are some of the complications:

Sometimes an implementation prefixes a fixed bit pattern to the bitstream to be checked. This is useful when clocking errors might insert 0-bits in front of a message, an alteration that would otherwise leave the check value unchanged.
Usually, but not always, an implementation appends n 0-bits to the bitstream to be checked before the polynomial division occurs. Such appending is explicitly demonstrated in the Computation of CRC article. This has the convenience that the remainder of the original bitstream with the check value appended is exactly zero, so the CRC can be checked simply by performing the polynomial division on the received bitstream and comparing the remainder with zero. Due to the associative and commutative properties of the exclusive-or operation, practical table driven implementations can obtain a result numerically equivalent to zero-appending without explicitly appending any zeroes, by using an equivalent, faster algorithm that combines the message bitstream with the stream being shifted out of the CRC register.
Sometimes an implementation exclusive-ORs a fixed bit pattern into the remainder of the polynomial division.
Bit order: Some schemes view the low-order bit of each byte as "first", which then during polynomial division means "leftmost", which is contrary to our customary understanding of "low-order". This convention makes sense when serial-port transmissions are CRC-checked in hardware, because some widespread serial-port transmission conventions transmit bytes least-significant bit first.
Byte order: With multi-byte CRCs, there can be confusion over whether the byte transmitted first is the least-significant byte or the most-significant byte. For example, some 16-bit CRC schemes swap the bytes of the check value.
Omission of the high-order bit of the divisor polynomial: Since the high-order bit is always 1, and since an n-bit CRC must be defined by an -bit divisor which overflows an n-bit register, some writers assume that it is unnecessary to mention the divisor's high-order bit.
Omission of the low-order bit of the divisor polynomial: Since the low-order bit is always 1, authors such as Philip Koopman represent polynomials with their high-order bit intact, but without the low-order bit. This convention encodes the polynomial complete with its degree in one integer.

These complications mean that there are three common ways to express a polynomial as an integer: the first two, which are mirror images in binary, are the constants found in code; the third is the number found in Koopman's papers. In each case, one term is omitted. So the polynomial may be transcribed as:

0x3 = 0b0011, representing
0xC = 0b1100, representing
0x9 = 0b1001, representing

In the table below they are shown as:

Name	Normal	Reversed	Reversed reciprocal
CRC-4	0x3	0xC	0x9

Obfuscation

CRCs in proprietary protocols might be obfuscated by using a non-trivial initial value and a final XOR, but these techniques do not add cryptographic strength to the algorithm and can be reverse engineered using straightforward methods.

Standards and common use

Numerous varieties of cyclic redundancy checks have been incorporated into technical standards. By no means does one algorithm, or one of each degree, suit every purpose; Koopman and Chakravarty recommend selecting a polynomial according to the application requirements and the expected distribution of message lengths. The number of distinct CRCs in use has confused developers, a situation which authors have sought to address. There are three polynomials reported for CRC-12, twenty-two conflicting definitions of CRC-16, and seven of CRC-32.
The polynomials commonly applied are not the most efficient ones possible. Since 1993, Koopman, Castagnoli and others have surveyed the space of polynomials between 3 and 64 bits in size, finding examples that have much better performance than the polynomials of earlier protocols, and publishing the best of these with the aim of improving the error detection capacity of future standards. In particular, iSCSI and SCTP have adopted one of the findings of this research, the CRC-32C polynomial.
The design of the 32-bit polynomial most commonly used by standards bodies, CRC-32-IEEE, was the result of a joint effort for the Rome Laboratory and the Air Force Electronic Systems Division by Joseph Hammond, James Brown and Shyan-Shiang Liu of the Georgia Institute of Technology and Kenneth Brayer of the Mitre Corporation. The earliest known appearances of the 32-bit polynomial were in their 1975 publications: Technical Report 2956 by Brayer for Mitre, published in January and released for public dissemination through DTIC in August, and Hammond, Brown and Liu's report for the Rome Laboratory, published in May. Both reports contained contributions from the other team. During December 1975, Brayer and Hammond presented their work in a paper at the IEEE National Telecommunications Conference: the IEEE CRC-32 polynomial is the generating polynomial of a Hamming code and was selected for its error detection performance. Even so, the Castagnoli CRC-32C polynomial used in iSCSI or SCTP matches its performance on messages from 58 bits to 131 kbits, and outperforms it in several size ranges including the two most common sizes of Internet packet. The ITU-T G.hn standard also uses CRC-32C to detect errors in the payload.
CRC32C computation is implemented in hardware as an operation of SSE4.2 instruction set, first introduced in Intel processors' Nehalem microarchitecture. CRC32C operations are also defined in AArch64.

Polynomial representations of cyclic redundancy checks

The table below lists only the polynomials of the various algorithms in use. Variations of a particular protocol can impose pre-inversion, post-inversion and reversed bit ordering as described above. For example, the CRC32 used in Gzip and Bzip2 use the same polynomial, but Gzip employs reversed bit ordering, while Bzip2 does not.
Note that even parity polynomials in GF with degree greater than 1 are never primitive. Even parity polynomial marked as primitive in this table represent a primitive polynomial multiplied by.

Implementations

;
CRC catalogues

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...