Bencoding is most commonly used in torrent files, and as such is part of the BitTorrent specification. These metadata files are simply bencoded dictionaries. While less efficient than a pure binary encoding, bencoding is simple and is unaffected by endianness, which is important for a cross-platform application like BitTorrent. It is also fairly flexible, as long as applications ignore unexpected dictionary keys, so that new ones can be added without creating incompatibilities.
An integer is encoded as i<base ten ASCII>e. Leading zeros are not allowed. Negative values are encoded by prefixing the number with a hyphen-minus. The number 42 would thus be encoded as i42e, 0 as i0e, and -42 as i-42e. Negative zero is not permitted.
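The integer rules can be illustrated with a minimal Python sketch. The function names encode_int and decode_int are hypothetical, chosen for this example only; the decoder assumes it is handed exactly one complete bencoded integer and rejects the non-canonical forms described above.

```python
def encode_int(n: int) -> bytes:
    """Encode an int as a bencoded integer, e.g. 42 -> b'i42e'."""
    return b"i" + str(n).encode("ascii") + b"e"


def decode_int(data: bytes) -> int:
    """Decode one complete bencoded integer, rejecting non-canonical forms."""
    if len(data) < 3 or data[:1] != b"i" or data[-1:] != b"e":
        raise ValueError("not a bencoded integer")
    digits = data[1:-1]
    # Strip an optional leading hyphen-minus before checking the digits.
    body = digits[1:] if digits.startswith(b"-") else digits
    if not body.isdigit():
        raise ValueError("integer body must consist of ASCII digits")
    if body != b"0" and body.startswith(b"0"):
        raise ValueError("leading zeros are not allowed")
    if digits == b"-0":
        raise ValueError("negative zero is not permitted")
    return int(digits.decode("ascii"))


print(encode_int(42), encode_int(0), encode_int(-42))
```

Because Python integers have arbitrary precision, this sketch places no limit on the magnitude of the value.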
A byte string is encoded as <length>:<contents>. The length is encoded in base 10, like integers, but must be non-negative; the contents are just the bytes that make up the string. The string "spam" would be encoded as 4:spam. This is identical to how netstrings work, except that netstrings additionally append a comma suffix after the byte sequence. The specification does not deal with encoding of characters outside the ASCII set; to mitigate this, some BitTorrent applications explicitly communicate the encoding in various non-standard ways.
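As a rough sketch of the length-prefixed layout, the following Python functions encode a byte string and read one back from a given offset. The names encode_bytes and decode_bytes are hypothetical, and the decoder is deliberately lenient about details the text above does not address, such as leading zeros in the length field.

```python
from typing import Tuple


def encode_bytes(s: bytes) -> bytes:
    """Encode a byte string as <length>:<contents>, e.g. b'spam' -> b'4:spam'."""
    return str(len(s)).encode("ascii") + b":" + s


def decode_bytes(data: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Decode a byte string starting at pos; return (contents, next position)."""
    colon = data.index(b":", pos)  # raises ValueError if no colon is found
    length_field = data[pos:colon]
    if not length_field.isdigit():
        raise ValueError("length must be a non-negative base-10 number")
    length = int(length_field.decode("ascii"))
    start = colon + 1
    end = start + length
    if end > len(data):
        raise ValueError("byte string runs past the end of the input")
    # The contents are taken verbatim; no character encoding is assumed.
    return data[start:end], end


print(encode_bytes(b"spam"))
print(decode_bytes(b"4:spami42e"))
```

Returning the next position lets a caller continue parsing whatever value follows, since the format has no separators.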
A list of values is encoded as l<contents>e. The contents consist of the bencoded elements of the list, in order, concatenated. A list consisting of the string "spam" and the number 42 would be encoded as l4:spami42ee. Note the absence of separators between elements, and that the first character is the letter 'l', not the digit '1'.
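The concatenation rule for lists lends itself to a short recursive sketch. The function name encode_value is hypothetical; it handles only integers, byte strings, and lists, and it assumes Python bytes objects stand in for bencoded byte strings.

```python
def encode_value(value) -> bytes:
    """Bencode an int, a bytes object, or a (possibly nested) list of these."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it to avoid surprises.
        raise TypeError("booleans have no bencoded form")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        # Elements are encoded in order and joined with no separator.
        return b"l" + b"".join(encode_value(item) for item in value) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


print(encode_value([b"spam", 42]))
```

An empty list encodes as le, and nesting simply wraps one l...e pair inside another.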
A dictionary is encoded as d<contents>e. The elements of the dictionary are encoded with each key immediately followed by its value. All keys must be byte strings and must appear in lexicographical order. A dictionary that associates the values 42 and "spam" with the keys "foo" and "bar", respectively (in other words, {"bar": "spam", "foo": 42}), would be encoded as d3:bar4:spam3:fooi42ee.
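The dictionary rules can be sketched in Python as follows. The function name bencode is hypothetical, and the sketch assumes that sorting Python bytes objects (which compares raw byte values) matches the lexicographical order required for keys.

```python
def bencode(value) -> bytes:
    """Bencode ints, bytes, lists, and dicts whose keys are bytes."""
    if isinstance(value, bool):
        raise TypeError("booleans have no bencoded form")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        parts = []
        for key in sorted(value):  # raw byte order, independent of insertion order
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be byte strings")
            # Each key is immediately followed by its value.
            parts.append(bencode(key) + bencode(value[key]))
        return b"d" + b"".join(parts) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


first = bencode({b"foo": 42, b"bar": b"spam"})
second = bencode({b"bar": b"spam", b"foo": 42})
print(first)
print(first == second)
```

Because the keys are sorted before encoding, the two dictionaries above produce identical bytes even though they were built in a different order.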
Features & drawbacks
Bencode is a very specialized kind of binary coding with some unique properties:
For each possible value, there is only a single valid bencoding; i.e. there is a bijection between values and their encodings. This has the advantage that applications may compare bencoded values by comparing their encoded forms, eliminating the need to decode the values.
Many bencoded values can be decoded by hand. However, because bencoded values often contain binary data, manual decoding can become quite complex, and bencode is not considered a human-readable encoding format.
Bencoding serves a similar purpose to data languages such as JSON and YAML, allowing complex yet loosely structured data to be stored in a platform-independent way.
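To make the decoding process concrete, here is a minimal recursive decoder sketch in Python. The names bdecode and _decode_at are hypothetical. The sketch is lenient: it does not enforce the canonical-form rules (no leading zeros, sorted keys), so it should not be taken as a validating parser.

```python
from typing import Any, Tuple


def _decode_at(data: bytes, pos: int) -> Tuple[Any, int]:
    """Decode one value starting at pos; return (value, next position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end].decode("ascii")), end + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon].decode("ascii"))
        start = colon + 1
        if start + length > len(data):
            raise ValueError("byte string runs past the end of the input")
        return data[start:start + length], start + length
    if lead == b"l":
        items = []
        pos += 1
        while data[pos:pos + 1] != b"e":
            item, pos = _decode_at(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        result = {}
        pos += 1
        while data[pos:pos + 1] != b"e":
            key, pos = _decode_at(data, pos)
            if not isinstance(key, bytes):
                raise ValueError("dictionary keys must be byte strings")
            value, pos = _decode_at(data, pos)
            result[key] = value
        return result, pos + 1
    raise ValueError(f"unexpected byte at offset {pos}")


def bdecode(data: bytes) -> Any:
    """Decode a complete bencoded value, rejecting trailing data."""
    value, pos = _decode_at(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after the bencoded value")
    return value


print(bdecode(b"d3:bar4:spam3:fooi42ee"))
```

Byte strings are returned as Python bytes objects rather than text, since the contents may be arbitrary binary data.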
However, this uniqueness can cause some problems:
There are very few bencode editors.
Because bencoded files contain binary data, and because of some of the intricacies involved in the way binary strings are typically stored, it is often not safe to edit bencode files in text editors.
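One way such damage can occur is sketched below in Python. The function simulate_text_editor_round_trip is a hypothetical stand-in for an editor that loads a file as UTF-8, substitutes a replacement character for invalid bytes, and saves the result; real editors vary, so this is an assumption made for illustration only.

```python
def simulate_text_editor_round_trip(raw: bytes) -> bytes:
    """Hypothetical editor: load as UTF-8 (replacing invalid bytes), then save."""
    text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8")


# A 4-byte binary string; the length prefix promises exactly 4 bytes of content.
original = b"4:\xff\xfe\x00\x01"
saved = simulate_text_editor_round_trip(original)

print(len(original), len(saved))  # 6 10
print(saved == original)          # False
```

Each invalid byte is replaced by a three-byte sequence, so the contents no longer match the declared length of 4 and the file is no longer a valid encoding of the original value.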