Hamming weight
The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used. It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the most typical case, a string of bits, this is the number of 1's in the string, or the digit sum of the binary representation of a given number and the ℓ₁ norm of a bit vector. In this binary case, it is also called the population count, popcount, sideways sum, or bit summation.
String       | Hamming weight
11101        | 4
11101000     | 4
00000000     | 0
678012340567 | 10
Figure: a plot of the population count for numbers 0 to 256.
History and usage
The Hamming weight is named after Richard Hamming, although he did not originate the notion. The Hamming weight of binary numbers was already used in 1899 by James W. L. Glaisher to give a formula for the number of odd binomial coefficients in a single row of Pascal's triangle. Irving S. Reed introduced a concept equivalent to Hamming weight in the binary case in 1954.
Hamming weight is used in several disciplines, including information theory, coding theory, and cryptography. Examples of applications of the Hamming weight include:
- In modular exponentiation by squaring, the number of modular multiplications required for an exponent e is log₂ e + weight(e). This is the reason that the public key value e used in RSA is typically chosen to be a number of low Hamming weight, such as 65537 = 2¹⁶ + 1, which has weight 2.
- The Hamming weight determines path lengths between nodes in Chord distributed hash tables.
- IrisCode lookups in biometric databases are typically implemented by calculating the Hamming distance to each stored record.
- In computer chess programs using a bitboard representation, the Hamming weight of a bitboard gives the number of pieces of a given type remaining in the game, or the number of squares of the board controlled by one player's pieces, and is therefore an important contributing term to the value of a position.
- Hamming weight can be used to efficiently compute find first set using the identity ffs(x) = pop(x ^ (x − 1)). This is useful on platforms such as SPARC that have hardware Hamming weight instructions but no hardware find-first-set instruction.
- The Hamming weight operation can be interpreted as a conversion from the unary numeral system to binary numbers.
- In implementation of some succinct data structures like bit vectors and wavelet trees.
Efficient implementation
The problem of how to implement it efficiently has been widely studied. A single operation for the calculation, or parallel operations on bit vectors, are available on some processors. For processors lacking those features, the best solutions known are based on adding counts in a tree pattern: adjacent bits are summed into 2-bit fields, the 2-bit fields into 4-bit fields, and so on, until a single field holds the total. For example, the number of 1 bits in the 16-bit binary number a = 0110 1100 1011 1010 can be counted in four such steps.
Here, the operations are written as in the C programming language, so X >> Y means to shift X right by Y bits, X & Y means the bitwise AND of X and Y, and + is ordinary addition. The best algorithms known for this problem are based on the concept illustrated above and are given here:
//types and constants used in the functions below
//uint64_t is an unsigned 64-bit integer variable type
const uint64_t m1 = 0x5555555555555555; //binary: 0101...
const uint64_t m2 = 0x3333333333333333; //binary: 00110011..
const uint64_t m4 = 0x0f0f0f0f0f0f0f0f; //binary: 4 zeros, 4 ones...
const uint64_t m8 = 0x00ff00ff00ff00ff; //binary: 8 zeros, 8 ones...
const uint64_t m16 = 0x0000ffff0000ffff; //binary: 16 zeros, 16 ones...
const uint64_t m32 = 0x00000000ffffffff; //binary: 32 zeros, 32 ones
const uint64_t h01 = 0x0101010101010101; //the sum of 256 to the power of 0,1,2,3...
//This is a naive implementation, shown for comparison,
//and to help in understanding the better functions.
//This algorithm uses 24 arithmetic operations.
int popcount64a(uint64_t x)
{
    x = (x & m1 ) + ((x >>  1) & m1 ); //put count of each  2 bits into those  2 bits
    x = (x & m2 ) + ((x >>  2) & m2 ); //put count of each  4 bits into those  4 bits
    x = (x & m4 ) + ((x >>  4) & m4 ); //put count of each  8 bits into those  8 bits
    x = (x & m8 ) + ((x >>  8) & m8 ); //put count of each 16 bits into those 16 bits
    x = (x & m16) + ((x >> 16) & m16); //put count of each 32 bits into those 32 bits
    x = (x & m32) + ((x >> 32) & m32); //put count of each 64 bits into those 64 bits
    return x;
}
//This uses fewer arithmetic operations than any other known
//implementation on machines with slow multiplication.
//This algorithm uses 17 arithmetic operations.
int popcount64b(uint64_t x)
{
    x -= (x >> 1) & m1;             //put count of each 2 bits into those 2 bits
    x = (x & m2) + ((x >> 2) & m2); //put count of each 4 bits into those 4 bits
    x = (x + (x >> 4)) & m4;        //put count of each 8 bits into those 8 bits
    x += x >>  8;  //put count of each 16 bits into their lowest 8 bits
    x += x >> 16;  //put count of each 32 bits into their lowest 8 bits
    x += x >> 32;  //put count of each 64 bits into their lowest 8 bits
    return x & 0x7f;
}
//This uses fewer arithmetic operations than any other known
//implementation on machines with fast multiplication.
//This algorithm uses 12 arithmetic operations, one of which is a multiply.
int popcount64c(uint64_t x)
{
    x -= (x >> 1) & m1;             //put count of each 2 bits into those 2 bits
    x = (x & m2) + ((x >> 2) & m2); //put count of each 4 bits into those 4 bits
    x = (x + (x >> 4)) & m4;        //put count of each 8 bits into those 8 bits
    return (x * h01) >> 56;         //the multiply sums the per-byte counts into the top byte
}
The above implementations have the best worst-case behavior of any known algorithm. However, when a value is expected to have few nonzero bits, it may instead be more efficient to use algorithms that count these bits one at a time. As Wegner described in 1960, the bitwise AND of x with x − 1 differs from x only in zeroing out the least significant nonzero bit: subtracting 1 changes the rightmost string of 0s to 1s, and changes the rightmost 1 to a 0. If x originally had n bits that were 1, then after only n iterations of this operation, x will be reduced to zero. The following implementation is based on this principle.
//This is better when most bits in x are 0
//This algorithm works the same for all data sizes.
//This algorithm uses 3 arithmetic operations and 1 comparison/branch per "1" bit in x.
int popcount64d(uint64_t x)
{
    int count;
    for (count = 0; x; count++)
        x &= x - 1;
    return count;
}
If greater memory usage is allowed, we can calculate the Hamming weight faster than the above methods. With unlimited memory, we could simply create a large lookup table of the Hamming weight of every 64-bit integer. If we can store a lookup table of the Hamming weight of every 16-bit integer, we can do the following to compute the Hamming weight of any 32-bit integer.
static uint16_t wordbits[65536] = { /* bitcounts of integers 0 through 65535, inclusive */ };
//This algorithm uses 3 arithmetic operations and 2 memory reads.
int popcount32e(uint32_t x)
{
    return wordbits[x & 0xFFFF] + wordbits[x >> 16];
}
//Optionally, the wordbits table could be filled using this function
void popcount32e_init(void)
{
    uint32_t i;
    uint16_t x;
    int count;
    for (i = 0; i <= 0xFFFF; i++)
    {
        x = i;
        for (count = 0; x; count++) //borrowed from popcount64d() above
            x &= x - 1;
        wordbits[i] = count;
    }
}
Muła et al. have shown that a vectorized version of popcount64b can run faster than dedicated instructions.
Language support
Some C compilers provide intrinsic functions that provide bit-counting facilities. For example, GCC includes a builtin function __builtin_popcount that will use a processor instruction if available or an efficient library implementation otherwise. LLVM-GCC has included this function since version 1.5 in June 2005.
In the C++ Standard Library, the bit-array data structure bitset has a count() method that counts the number of bits that are set. In C++20, a new header <bit> was added, containing the functions std::popcount and std::has_single_bit, taking arguments of unsigned integer types.
In Java, the growable bit-array data structure BitSet has a BitSet.cardinality() method that counts the number of bits that are set. In addition, there are Integer.bitCount(int) and Long.bitCount(long) functions to count bits in primitive 32-bit and 64-bit integers, respectively. Also, the arbitrary-precision integer class BigInteger has a BigInteger.bitCount() method that counts bits.
In Common Lisp, the function logcount, given a non-negative integer, returns the number of 1 bits. (For negative integers it returns the number of 0 bits in two's-complement notation.) In either case the integer can be a BIGNUM.
Starting in GHC 7.4, the Haskell base package has a popCount function available on all types that are instances of the Bits class.
MySQL's version of the SQL language provides BIT_COUNT as a standard function.
Fortran 2008 has the standard, intrinsic, elemental function popcnt, returning the number of nonzero bits within an integer.
Some programmable scientific pocket calculators feature special commands to calculate the number of set bits, e.g. #B on the HP-16C and WP 43S, #BITS or BITSUM on HP-16C emulators, and nBITS on the WP 34S.
Free Pascal implements popcnt since version 3.0.
Processor support
- The IBM STRETCH computer in the 1960s calculated the number of set bits as well as the number of leading zeros as a by-product of all logical operations.
- Cray supercomputers early on featured a population count machine instruction, rumoured to have been specifically requested by the U.S. government National Security Agency for cryptanalysis applications.
- Some of Control Data Corporation's Cyber 70/170 series machines included a population count instruction; in COMPASS, this instruction was coded as CXi.
- The 64-bit SPARC version 9 architecture defines a POPC instruction, but most implementations do not implement it, requiring that it be emulated by the operating system.
- Donald Knuth's model computer MMIX, which is to replace MIX in his book The Art of Computer Programming, has had an SADD instruction since 1999. SADD a,b,c counts all bits that are 1 in b and 0 in c and writes the result to a.
- Compaq's Alpha 21264A, released in 1999, was the first Alpha-series CPU design that had the count extension.
- Analog Devices' Blackfin processors feature the ONES instruction to perform a 32-bit population count.
- AMD's Barcelona architecture introduced the advanced bit manipulation (ABM) ISA, including the POPCNT instruction, as part of the SSE4a extensions in 2007.
- Intel Core processors introduced a POPCNT instruction with the SSE4.2 instruction set extension, first available in a Nehalem-based Core i7 processor, released in November 2008.
- The ARM architecture introduced the VCNT instruction as part of the Advanced SIMD extensions.
- The RISC-V architecture introduced the PCNT instruction (named CPOP in the ratified specification) as part of the Bit Manipulation extension.