Unum (number format)
Unums are an arithmetic and a binary representation format for real numbers analogous to floating point, proposed by John L. Gustafson as an alternative to the now ubiquitous IEEE 754 arithmetic. The first version of unums, now officially known as Type I unum, was introduced in his book The End of Error. Gustafson has since created two newer revisions of the unum format, Type II and Type III, in late 2016. Type III unum is also known as posits and valids; posits present arithmetic for single real values and valids present the interval arithmetic version. This data type can serve as a replacement for IEEE 754 floats for programs which do not depend on specific features of IEEE 754. Details of valids have yet to be officially articulated by Gustafson.
Type I and Type II Unum
The two defining features of the type I unum format are:- a variable-width storage format for both the significand and exponent, and
- a u-bit, which determines whether the unum corresponds to an exact number, or an interval between consecutive exact unums. In this way, the unums cover the entire extended real number line .
Unum implementations have been explored in Julia. including type II unum. Unum had been explored in MATLAB. Also, Roger Stokes has a learning lab for type II unum in J language.
William M. Kahan and John L. Gustafson discussed unums at the Arith23 conference.
Type III Unum – Posit
In February 2017, Gustafson officially introduced unum type III, posits and valids. Posit is a hardware-friendly version of unum where difficulties faced in the original type I unum due to its variable size are resolved. Similar size posits when compared to floats offer a bigger dynamic range and more fraction bits for accuracy. In an independent study, Lindstrom, Lloyd and Hittinger from Lawrence Livermore National Laboratory confirmed that posits out-perform floats in accuracy. Posits have particularly superior accuracy in the range near one, where most computations occur. This makes it very attractive to the current trend in deep learning to minimise the number of bits used. It potentially helps any applications to achieve a speedup by enabling the use of fewer bits thus reducing network and memory bandwidth, and power requirements, and bring us one step closer to exascale.Posits have a different format than IEEE 754 floats. They consist of four parts: sign, regime, exponent, and fraction. For a n-bit posit, regime can be of length 2 to. The format of the regime is such that it is a repetition of a same-sign bit and terminated by a different-sign bit.
Example 1:
Example 1 | Example 2 |
000000000000001 | 1110 |
Example 1 shows a regime with 14 same-sign bits, terminated by a different-sign bit. As there are 14 same-sign bits, the runlength of the regime is 14.
Example 2 shows a regime with 3 same-sign bits, terminated by a different-sign bit. As there are 3 same-sign bits, the runlength of the regime is 3.
Sign, exponent and fraction bits are very similar to IEEE 754; however, posits may omit either or both of the exponent and fraction bits, leaving a posit that consists of only sign and regime bits. Example 3 shows the longest possible regime runlength for a 16-bit posit, where the regime terminating bit, exponent bit and fraction bits are beyond the length of the size of the posit. Example 4 illustrates the shortest possible runlength of 1 for a 16-bit posit with one exponent bit and 12 fraction bits.
Example 3: Regime runlength = 15 | Example 4: Regime runlength = 1 |
0111111111111111 | 0101100000000001 |
The recommended posit sizes and corresponding exponent bits and quire sizes:
Posit size | Number of exponent bits | Quire size |
8 | 0 | 32 |
16 | 1 | 128 |
32 | 2 | 512 |
64 | 3 | 2048 |
Note: 32-bit posit is expected to be sufficient to solve almost all classes of applications.
Quire
Quire is one of the most useful features of posits. It is a special data type that will give posits "near-infinity" number of bits to accumulate dot products. It is based on the work of Ulrich W. Kulisch and Willard L. Miranker.Implementations
There are several software and hardware implementations of posits from the community. The first ever complete parameterized posit arithmetic hardware generator was proposed in 2018. The earliest software implementation in Julia came from Isaac Yonemoto. A C++ version with support for any posit sizes combined with any number of exponent bits is also provided. A fast implementation in C, SoftPosit, provided by the NGA research team based on Berkeley SoftFloat is the latest addition to the available software implementations.SoftPosit
SoftPosit is a software implementation of posits that is based on Berkeley SoftFloat. This allows software comparison between posits and floats. It currently supports- Add
- Subtract
- Multiply
- Divide
- Fused-multiply-add
- Fused-dot-product
- Square root
- Convert posit to signed and unsigned integer
- Convert signed and unsigned integer to posit
- Convert posit to another posit size
- Less than, equal, less than equal comparison
- Round to nearest integer
- convert double to posit
- convert posit to double
- cast unsigned integer to posit
Currently it supports x86_64 systems. It has been tested on GNU gcc 4.8.5 Apple LLVM version 9.1.0.
Examples:
Add with posit8_t
- include "softposit.h"
Fused dot product with quire16_t
//Convert double to posit
posit16_t pA = convertDoubleToP16;
posit16_t pB = convertDoubleToP16;
posit16_t pC = convertDoubleToP16;
posit16_t pD = convertDoubleToP16;
quire16_t qZ;
//Set quire to 0
qZ = q16_clr;
//accumulate products without roundings
qZ = q16_fdp_add;
qZ = q16_fdp_add;
//Convert back to posit
posit16_t pZ = q16_to_p16;
//To check answer
double dZ = convertP16ToDouble;
Critique
William M. Kahan, the principal architect of IEEE 754-1985 criticizes type I unums on the following grounds :- The description of unums sidesteps using calculus for solving physics problems.
- Unums can be expensive in terms of time and power consumption.
- Each computation in unum space is likely to change the bit length of the structure. This requires either unpacking them into a fixed-size space, or data allocation, deallocation, and garbage collection during unum operations, similar to the issues for dealing with variable-length records in mass storage.
- Unums provide only two kinds of numerical exception, quiet and signaling NaN.
- Unum computation may deliver overly loose bounds from the selection of an algebraically correct but numerically unstable algorithm.
- The costs and benefits of unum over short precision floating point for problems requiring low precision are not obvious.
- Solving differential equations and evaluating integrals with unums guarantee correct answers but may not be as fast as methods that usually work.