Intel HEX


Intel hexadecimal object file format, Intel hex format or Intellec Hex is a file format that conveys binary information in ASCII text form. It is commonly used for programming microcontrollers, EPROMs, and other types of programmable logic devices. In a typical application, a compiler or assembler converts a program's source code to machine code and outputs it into a HEX file. Common file extensions for the resulting files are .HEX and .H86. The HEX file is then read by a programmer to write the machine code into a PROM, or is transferred to the target system for loading and execution.

History

The Intel hex format was originally designed in 1973 for Intel's Intellec Microcomputer Development Systems, to load and execute programs from paper tape and to replace the "paper-intensive" BNPF/BPNF format. It also eased the transmission of data from customers to Intel for ROM production. The format was used to program PROMs via paper tape or to control punched-card-controlled EPROM programmers.
From 1975, it was also used by the MCS Series II floppy-disk-based ISIS-II systems, with the file extension HEX.

Format

Intel HEX consists of lines of ASCII text that are separated by line feed or carriage return characters or both. Each text line contains hexadecimal characters that encode multiple binary numbers. The binary numbers may represent data, memory addresses, or other values, depending on their position in the line and the type and length of the line. Each text line is called a record.

Record structure

A record consists of six fields that appear in order from left to right:
  1. Start code, one character, an ASCII colon ':'.
  2. Byte count, two hex digits, indicating the number of bytes in the data field. The maximum byte count is 255. 16 and 32 are commonly used byte counts.
  3. Address, four hex digits, representing the 16-bit beginning memory address offset of the data. The physical address of the data is computed by adding this offset to a previously established base address, thus allowing memory addressing beyond the 64 kilobyte limit of 16-bit addresses. The base address, which defaults to zero, can be changed by various types of records. Base addresses and address offsets are always expressed as big endian values.
  4. Record type, two hex digits, 00 to 05, defining the meaning of the data field.
  5. Data, a sequence of n bytes of data, represented by 2n hex digits. Some records omit this field. The meaning and interpretation of data bytes depends on the application.
  6. Checksum, two hex digits, a computed value that can be used to verify the record has no errors.
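
The field layout above can be illustrated with a minimal Python sketch. The names `Record` and `parse_record` are hypothetical and chosen only for this example; the function splits a line into its fields but does not verify the checksum.

```python
from typing import NamedTuple


class Record(NamedTuple):
    byte_count: int
    address: int
    record_type: int
    data: bytes
    checksum: int


def parse_record(line: str) -> Record:
    """Split one Intel HEX text line into its fields (checksum not verified)."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must begin with the start code ':'")
    raw = bytes.fromhex(line[1:])  # decode each pair of hex digits into a byte
    if len(raw) < 5:
        raise ValueError("record is too short")
    byte_count = raw[0]
    # count (1) + address (2) + type (1) + data (n) + checksum (1)
    if len(raw) != byte_count + 5:
        raise ValueError("byte count does not match the data field length")
    return Record(
        byte_count=byte_count,
        address=int.from_bytes(raw[1:3], "big"),  # big-endian address offset
        record_type=raw[3],
        data=raw[4:4 + byte_count],
        checksum=raw[-1],
    )
```

For instance, `parse_record(":00000001FF")` yields a byte count of 0, address 0, record type 1, an empty data field, and checksum FF.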

Checksum calculation

A record's checksum byte is the two's complement of the least significant byte of the sum of all decoded byte values in the record preceding the checksum. It is computed by summing the decoded byte values and extracting the LSB of the sum, and then calculating the two's complement of the LSB.
For example, in the case of the record :0300300002337A1E, the sum of the decoded byte values is 03 + 00 + 30 + 00 + 02 + 33 + 7A = E2, which has LSB value E2. The two's complement of E2 is 1E, which is the checksum byte appearing at the end of the record.
The validity of a record can be checked by computing its checksum and verifying that the computed checksum equals the checksum appearing in the record; an error is indicated if the checksums differ. Since the record's checksum byte is the two's complement (and therefore the additive inverse) of the data checksum, this process can be reduced to summing all decoded byte values, including the record's checksum, and verifying that the LSB of the sum is zero. When applied to the preceding example, this method produces the following result: 03 + 00 + 30 + 00 + 02 + 33 + 7A + 1E = 100, which has LSB value 00.
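
Both procedures can be sketched in a few lines of Python. The function names `checksum` and `is_valid` are hypothetical, and the sample record `:0300300002337A1E` is used here only as an illustration whose bytes sum to E2.

```python
def checksum(record_bytes: bytes) -> int:
    """Two's complement of the least significant byte of the byte sum."""
    return (-sum(record_bytes)) & 0xFF


def is_valid(line: str) -> bool:
    """True if all decoded bytes, checksum included, sum to zero modulo 256."""
    raw = bytes.fromhex(line.strip()[1:])  # drop the ':' start code
    return (sum(raw) & 0xFF) == 0
```

Here `checksum(bytes.fromhex("0300300002337A"))` evaluates to 0x1E, and `is_valid(":0300300002337A1E")` is true because the eight bytes sum to 0x100.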

Text line terminators

Intel HEX records are separated by one or more ASCII line termination characters so that each record appears alone on a text line. This enhances legibility by visually delimiting the records and it also provides padding between records that can be used to improve machine parsing efficiency.
Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems. For example, Linux programs use a single LF character to terminate lines, whereas Windows programs use a CR followed by a LF.
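
A reader that accepts files from any of these systems should not depend on one particular terminator. As a minimal sketch (the helper name `split_records` is hypothetical), Python's `str.splitlines()` treats LF, CR, and CR LF alike:

```python
from typing import List


def split_records(text: str) -> List[str]:
    """Return the non-empty lines of a HEX file, whatever the line terminator."""
    # str.splitlines() recognises LF, CR, and CR LF (among other boundaries).
    return [line for line in text.splitlines() if line.strip()]
```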

Record types

Intel HEX has six standard record types:
Each entry below gives the hex code, the record type, a description, and an example record:
  00, Data. Contains data and a 16-bit starting address for the data. The byte count specifies the number of data bytes in the record. The example has 0B (eleven) data bytes (61, 64, 64, 72, 65, 73, 73, 20, 67, 61, 70) located at consecutive addresses beginning at address 0010. Example: :0B0010006164647265737320676170A7
  01, End Of File. Must occur exactly once per file, in the last line of the file. The data field is empty and the address field is typically 0000. Example: :00000001FF
  02, Extended Segment Address. The data field contains a 16-bit segment base address compatible with 80x86 real mode addressing. The address field is ignored. The segment address from the most recent 02 record is multiplied by 16 and added to each subsequent data record address to form the physical starting address for the data. This allows addressing up to one megabyte of address space. Example: :020000021200EA
  03, Start Segment Address. For 80x86 processors, specifies the initial content of the CS:IP registers. The address field is 0000, the byte count is always 04, the first two data bytes are the CS value, and the latter two are the IP value. Example: :0400000300003800C1
  04, Extended Linear Address. Allows for 32-bit addressing. The record's address field is ignored and its byte count is always 02. The two data bytes specify the upper 16 bits of the 32-bit absolute address for all subsequent type 00 records; these upper address bits apply until the next 04 record. The absolute address for a type 00 record is formed by combining the upper 16 address bits of the most recent 04 record with the low 16 address bits of the 00 record. If a type 00 record is not preceded by any type 04 records, its upper 16 address bits default to 0000. Example: :020000040800F2
  05, Start Linear Address. The address field is 0000 and the byte count is always 04. The four data bytes represent a 32-bit address value. In the case of 80386 and higher CPUs, this address is loaded into the EIP register. Example: :04000005000000CD2A
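
The way record types 02 and 04 move the base address can be sketched as follows. This is an illustrative sketch, not a complete loader: the function name and the pre-parsed `(record_type, address, data)` tuples are hypothetical, and the sketch assumes type codes 00 (data), 01 (end of file), 02 (extended segment address), and 04 (extended linear address). It ignores the 64-kilobyte wrap-around that a strict segmented loader might apply.

```python
def physical_addresses(records):
    """Yield (physical_address, data) for each data record.

    `records` is an iterable of (record_type, address, data) tuples, a
    hypothetical pre-parsed form used only for this sketch.
    """
    base = 0  # the base address defaults to zero
    for record_type, address, data in records:
        if record_type == 0x00:      # Data: offset is added to the base
            yield base + address, data
        elif record_type == 0x01:    # End Of File: stop reading
            return
        elif record_type == 0x02:    # Extended Segment Address: segment * 16
            base = int.from_bytes(data, "big") * 16
        elif record_type == 0x04:    # Extended Linear Address: upper 16 bits
            base = int.from_bytes(data, "big") << 16
        # Types 03 and 05 carry start addresses and leave the base unchanged.
```

With a segment value of 1200, a data record at offset 0010 lands at physical address 12010; with an upper linear address of 0800, the same offset lands at 08000010.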

Named formats

The original 4-bit/8-bit Intellec Hex Paper Tape Format and Intellec Hex Computer Punched Card Format supported only record types 00 and 01.
The Extended Intellec Hex Format additionally supports record type 02.
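Special names are sometimes used to denote the formats of HEX files that employ specific subsets of record types. For example:
  1. I8HEX files use only record types 00 and 01 (16-bit addresses).
  2. I16HEX files use only record types 00 through 03 (20-bit addresses).
  3. I32HEX files use only record types 00, 01, 04, and 05 (32-bit addresses).

File example

This example shows a file that has four data records followed by an end-of-file record:

    :10010000214601360121470136007EFE09D2190140
    :100110002146017E17C20001FF5F16002148011928
    :10012000194E79234623965778239EDA3F01B2CAA7
    :100130003F0156702B5E712B722B732146013421C7
    :00000001FF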
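
A file of this shape, data records followed by a single end-of-file record, could be produced from raw bytes by a sketch like the following. The names `make_record` and `encode` are hypothetical, and the 16-byte chunk size is just one of the commonly used byte counts. The sketch handles only 16-bit addresses (no type 02 or 04 records), so `address.to_bytes(2, "big")` raises `OverflowError` if the data would run past FFFF.

```python
def make_record(record_type: int, address: int, data: bytes = b"") -> str:
    """Format one record: ':' + count + address + type + data + checksum."""
    body = (
        bytes([len(data)])
        + address.to_bytes(2, "big")   # big-endian 16-bit address offset
        + bytes([record_type])
        + data
    )
    checksum = (-sum(body)) & 0xFF     # two's complement of the sum's LSB
    return ":" + (body + bytes([checksum])).hex().upper()


def encode(data: bytes, start: int = 0, chunk: int = 16) -> list:
    """Emit data records of up to `chunk` bytes, then an end-of-file record."""
    lines = [
        make_record(0x00, start + i, data[i:i + chunk])
        for i in range(0, len(data), chunk)
    ]
    lines.append(make_record(0x01, 0))  # End Of File, address 0000
    return lines
```

For example, `encode(bytes([0x02, 0x33, 0x7A]), start=0x0030)` returns `[":0300300002337A1E", ":00000001FF"]`.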

Variants

Besides Intel's own extension, several third parties have also defined variants and extensions of the Intel hex format, including Digital Research, Zilog, Texas Instruments, Microchip, and c't. These can carry information on program entry points and register contents, use a swapped byte order in the data fields, or differ in other ways.
The Digital Research hex format for 8086 processors supports segment information by adding record types to distinguish between code, data, stack, and extra segments.
Most assemblers for CP/M-80 do not use record type 01h to indicate the end of a file, but use a zero-length data record of type 00h instead. This eases the concatenation of multiple hex files.
Texas Instruments defines a variant where addresses are based on the bit-width of a processor's registers, not bytes.
Microchip defines variants INHX8S, INHX8M, INHX16 and INHX32 for their PIC microcontrollers.
Alfred Arnold's cross-macro-assembler AS, Werner Hennig-Roleff's 8051-emulator SIM51, and Matthias R. Paul's cross-converter BINTEL are also known to define extensions to the Intel hex format.