Consistent Overhead Byte Stuffing (COBS) is an algorithm for encoding data bytes that results in efficient, reliable, unambiguous packet framing regardless of packet content, thus making it easy for receiving applications to recover from malformed packets. It employs a particular byte value, typically zero, to serve as a packet delimiter. When zero is used as the delimiter, the algorithm replaces each zero data byte with a non-zero value, so that no zero data bytes appear in the packet and none can be misinterpreted as a packet boundary.
Byte stuffing is a process that transforms a sequence of data bytes that may contain 'illegal' or 'reserved' values into a potentially longer sequence that contains no occurrences of those values. The extra length of the transformed sequence is typically referred to as the overhead of the algorithm. The COBS algorithm tightly bounds the worst-case overhead, limiting it to a minimum of one byte and a maximum of ⌈n/254⌉ bytes for n data bytes (one byte in 254, rounded up). Consequently, the time to transmit the encoded byte sequence is highly predictable, which makes COBS useful for real-time applications in which jitter may be problematic. The algorithm is computationally inexpensive and its average overhead is low compared to other unambiguous framing algorithms.
COBS does, however, require up to 254 bytes of lookahead: before transmitting its first byte, it needs to know the position of the first zero byte (if any) in the following 254 bytes.
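The overhead bound can be written as a small piece of arithmetic. The following is a minimal sketch, assuming the worst case is one overhead byte per 254 data bytes, rounded up, with a floor of one byte; the function name cobs_max_overhead is hypothetical and the trailing packet delimiter is not counted.

```c
#include <stddef.h>

/* Hypothetical helper: worst-case number of COBS overhead bytes for
 * n data bytes, assuming a bound of at least 1 and at most ceil(n / 254).
 * The 0x00 packet delimiter is not included in the count. */
size_t cobs_max_overhead(size_t n)
{
    if (n == 0)
        return 1;               /* an empty packet still needs one code byte */
    return (n + 253) / 254;     /* integer form of ceil(n / 254) */
}
```

Under this assumption, a 1000-byte packet carries at most 4 bytes of overhead, which is what makes the transmission time easy to predict.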
Packet framing and stuffing
When packetized data is sent over any serial medium, some protocol is required to demarcate packet boundaries. This is done by using a framing marker, a special bit-sequence or character value that indicates where the boundaries between packets fall. Data stuffing is the process that transforms the packet data before transmission to eliminate all occurrences of the framing marker, so that when the receiver detects a marker, it can be certain that the marker indicates a boundary between packets.
COBS transforms an arbitrary string of bytes in the range [0,255] into bytes in the range [1,255]. Having eliminated all zero bytes from the data, a zero byte can now be used to unambiguously mark the end of the transformed data. This is done by appending a zero byte to the transformed data, thus forming a packet that consists of the COBS-encoded data followed by a zero byte marking the end of the packet.
There are two equivalent ways to describe the COBS encoding process:
Prefixed block description: Append a zero byte to the data, then break the result into groups of either 254 non-zero bytes, or 0 to 253 non-zero bytes followed by a zero byte. Encode each group by deleting its trailing zero byte (if any) and prepending the number of non-zero bytes plus one.
Linked list description: Insert a zero byte at the beginning of the packet and after every run of 254 non-zero bytes, then replace each zero byte with the offset to the next zero byte, or to the end of the packet.
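Because stuffed data contains no zero bytes, a receiver can find packet boundaries with a plain scan for the delimiter. The following is a minimal sketch of that receiver-side step; the helper name frame_payload_length and the buffer layout are assumptions made for illustration, not part of any particular COBS library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the number of bytes in buf[0..len) that
 * precede the first 0x00 delimiter, i.e. the length of the next frame's
 * stuffed payload. Returns len if no delimiter is present, which means
 * the frame has not been completely received yet. */
size_t frame_payload_length(const uint8_t *buf, size_t len)
{
    size_t i = 0;
    while (i < len && buf[i] != 0x00)
        i++;
    return i;
}
```

A receiver that joins the stream mid-packet, or that sees a corrupted packet, can simply discard bytes up to the next zero and resume from there.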
These examples show how various data sequences would be encoded by the COBS algorithm. In the examples, all bytes are expressed as hexadecimal values. The bytes of an encoded packet fall into four categories:
Unaltered data byte: a data byte which has not been altered by encoding. All non-zero data bytes remain unaltered.
Altered zero data byte: a zero data byte that was altered by encoding. All zero data bytes are replaced during encoding by the offset to the following zero byte. Such a byte is effectively a pointer to the next packet byte that requires interpretation: if the addressed byte is non-zero, then it is the following group header byte or altered zero data byte, which in turn points to the next byte requiring interpretation; if the addressed byte is zero, then it is the end of the packet.
Overhead byte: a group header byte containing an offset to a following group, but which does not correspond to a data byte. These appear in two places: at the beginning of every encoded packet, and after every group of 254 non-zero bytes.
Packet delimiter: a zero byte that appears at the end of every packet to indicate end-of-packet to the data receiver. This packet delimiter byte is not part of COBS proper; it is an additional framing byte that is appended to the encoded output.
[Table: examples 1 through 10; columns: Example, Unencoded data (hex), Encoded with COBS (hex)]
The diagram below uses example 3 from the table above to illustrate how each modified data byte is located, and how it is identified as either a data byte or an end-of-frame byte.
[OHB]                                 : Overhead byte
  3+ -------------->|                 : Points to relative location of first zero symbol
                    2+ -------->|     : Is a zero data byte, pointing to next zero symbol
                              [EOP]   : Location of end-of-packet zero symbol
  0     1     2     3     4     5     : Byte position
  03    11    22    02    33    00    : COBS data frame
        11    22    00    33          : Extracted data

OHB = Overhead Byte
EOP = End Of Packet
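The pointer chain in the diagram can be followed mechanically. The sketch below records the position of every byte that requires interpretation in a delimited frame; the function name cobs_chain and its error convention (returning 0) are assumptions chosen for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: follows the offset chain in frame[0..len), which is
 * expected to end with the 0x00 delimiter, and stores the position of each
 * byte that requires interpretation in pos[]. Returns how many positions
 * were stored (the last one is the delimiter), or 0 if the chain runs past
 * the end of the frame or pos[] is too small to hold it. */
size_t cobs_chain(const uint8_t *frame, size_t len, size_t *pos, size_t max_pos)
{
    size_t n = 0;
    size_t i = 0;
    while (i < len && n < max_pos) {
        pos[n++] = i;
        if (frame[i] == 0x00)
            return n;       /* reached the end-of-packet delimiter */
        i += frame[i];      /* the byte's value is the offset to the next one */
    }
    return 0;               /* no delimiter where the chain pointed */
}
```

For the frame 03 11 22 02 33 00, the chain visits positions 0 (the overhead byte), 3 (the altered zero data byte), and 5 (the end-of-packet delimiter); every other byte is unaltered data.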
Examples 7 through 10 show how the overhead varies depending on the data being encoded for packet lengths of 255 or more.
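The way overhead depends on the data for longer packets can be checked without producing any output, by counting bytes group by group. The following is a minimal sketch under one stated assumption: the encoder omits the extra code byte when the data ends with a full group of 254 non-zero bytes. The name cobs_encoded_length is hypothetical, and the count excludes the packet delimiter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of the COBS encoding of src[0..len),
 * excluding the 0x00 packet delimiter. Assumes no extra code byte is
 * emitted when the data ends exactly on a full 254-byte group. */
size_t cobs_encoded_length(const uint8_t *src, size_t len)
{
    size_t out = 1;     /* leading overhead byte */
    size_t run = 0;     /* non-zero bytes seen in the current group */

    for (size_t i = 0; i < len; i++) {
        out++;          /* every input byte occupies one output byte */
        if (src[i] == 0) {
            run = 0;    /* zero byte becomes a code byte, starting a new group */
        } else if (++run == 254 && i + 1 < len) {
            out++;      /* group is full: one more overhead byte is needed */
            run = 0;
        }
    }
    return out;
}
```

With this counting rule, 254 non-zero bytes encode to 255 bytes (overhead 1), while 255 non-zero bytes encode to 257 bytes (overhead 2); a 255-byte input consisting of one zero followed by 254 non-zero bytes encodes to 256 bytes (overhead 1).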
Implementation
The following declarations give the interface of a COBS encoder and decoder in the C programming language:
#include <stdint.h>
#include <stddef.h>

/*
 * StuffData byte stuffs "length" bytes of data
 * at the location pointed to by "ptr", writing
 * the output to the location pointed to by "dst".
 *
 * Returns the length of the encoded data.
 */
size_t StuffData(const uint8_t *ptr, size_t length, uint8_t *dst);

/*
 * UnStuffData decodes "length" bytes of data at
 * the location pointed to by "ptr", writing the
 * output to the location pointed to by "dst".
 *
 * Returns the length of the decoded data.
 */
size_t UnStuffData(const uint8_t *ptr, size_t length, uint8_t *dst);