Commodore BASIC


Commodore BASIC, also known as PET BASIC or CBM-BASIC, is the dialect of the BASIC programming language used in Commodore International's 8-bit home computer line, stretching from the PET of 1977 to the C128 of 1985.
The core is based on 6502 Microsoft BASIC, and as such it shares many characteristics with other 6502 BASICs of the time, such as Applesoft BASIC. Commodore licensed BASIC from Microsoft in 1977 on a "pay once, no royalties" basis after Jack Tramiel turned down Bill Gates' offer of a $3 per unit fee, stating, "I'm already married," and would pay no more than $25,000 for a perpetual license.
The original PET version was very similar to the original Microsoft implementation with few modifications. BASIC 2.0 on the C64 was also similar, and was also seen on some C128s and other models. Later PETs featured BASIC 4.0, similar to the original but added a number of commands for working with floppy disks. BASIC 3.5 was the first to really deviate, adding a number of commands for graphics and sound support on the C16 and Plus/4. Several later versions were based on 3.5, but saw little use. The last, BASIC 10.0, was part of the unreleased Commodore 65.

History

Commodore took the source code of the flat-fee BASIC and further developed it internally for all their other 8-bit home computers. It was not until the Commodore 128 that a Microsoft copyright notice was displayed. However, Microsoft had built an easter egg into the version 2 or "upgrade" Commodore Basic that proved its provenance: typing the command WAIT 6502, 1 would result in Microsoft! appearing on the screen.
The popular Commodore 64 came with BASIC v2.0 in ROM despite the computer being released after the PET/CBM series that had version 4.0 because the 64 was intended as a home computer, while the PET/CBM series were targeted at business and educational use where their built-in programming language was presumed to be more heavily used. This saved manufacturing costs, as the V2 fit into smaller ROMs.

Technical details

A convenient feature of Commodore's ROM-resident BASIC interpreter and KERNAL was the full-screen editor. Although Commodore keyboards only featured two cursor keys which alternated direction when the shift key was held, the screen editor allowed users to enter direct commands or to input and edit program lines from anywhere on the screen. If a line was prefixed with a line number, it was tokenized and stored in program memory. Lines not beginning with a number were executed by pressing the key whenever the cursor happened to be on the line. This marked a significant upgrade in program entry interfaces compared to other common home computer BASICs at the time, which typically used line editors, invoked by a separate EDIT command, or a "copy cursor" that truncated the line at the cursor's position.
It also had the capability of saving named files to any device, including the cassette – a popular storage device in the days of the PET, and one that remained in use throughout the lifespan of the 8-bit Commodores as an inexpensive form of mass storage. Most systems only supported filenames on diskette, which made saving multiple files on other devices more difficult. The user of one of these other systems had to note the recorder's counter display at the location of the file, but this was inaccurate and prone to error. With the PET, files from cassettes could be requested by name. The device would search for the filename by reading data sequentially, ignoring any non-matching filenames. The file system was also supported by a powerful record structure that could be loaded or saved to files. Commodore cassette data was recorded digitally, rather than less expensive analog methods used by other manufacturers. Therefore, the specialized Datasette was required rather than a standard tape recorder. Adapters were available that used an analog-to-digital converter to allow use of a standard recorder, but these cost only a little less than the Datasette.
The command may be used with the optional parameter which will load a program into the memory address contained in the first two bytes of the file. If the parameter is not used, the program will load into the start of the BASIC program area, which widely differs between machines. Some Commodore BASIC variants supplied BLOAD and BSAVE commands that worked like their counterparts in Applesoft BASIC, loading or saving bitmaps from specified memory locations.
The PET does not support relocatable programs and the command will always load at the first two bytes contained in the program file. This created a problem when trying to load BASIC programs saved on other Commodore machines as they would load at a higher address than where the PET's BASIC expected the program to be, there were workarounds to "move" programs to the proper location. If a program was saved on a CBM-II machine, the only way to load it on a PET was by modifying the first two bytes with a disk sector editor as the CBM-II series had their BASIC program area at $0, which would result in a PET attempting to load into the zero page and locking up.
Like the original Microsoft BASIC interpreter, on which it is based, Commodore BASIC is slower than native machine code. Test results have shown that copying 16 kilobytes from ROM to RAM takes less than a second in machine code, compared to over a minute in BASIC. To execute faster than the interpreter, programmers started using various techniques to speed up execution. One was to store often-used floating point values in variables rather than using literal values, as interpreting a variable name was faster than interpreting a literal number. Since floating point is default type for all commands, it's faster to use floating point numbers as arguments, rather than integers. When speed was important, some programmers converted sections of BASIC programs to 6502 or 6510 assembly language that was loaded separately from a file or POKEd into memory from DATA statements at the end of the BASIC program, and executed from BASIC using the SYS command, either from direct mode or from the program itself. When the execution speed of machine language was too great, such as for a game or when waiting for user input, programmers could poll by reading selected memory locations to delay or halt execution.
Commodore BASIC keywords could be abbreviated by entering first an unshifted keypress, and then a shifted keypress of the next letter. This set the high bit, causing the interpreter to stop reading and parse the statement according to a lookup table. This meant that the statement up to where the high bit was set was accepted as a substitute for typing the entire command out. However, since all BASIC keywords were stored in memory as single byte tokens, this was a convenience for statement entry rather than an optimization.
In the default uppercase-only character set, shifted characters appear as a graphics symbol; e.g. the command, GOTO, could be abbreviated G. Most such commands were two letters long, but in some cases they were longer. In cases like this, there was an ambiguity, so more unshifted letters of the command were needed, such as GO being required for GOSUB. Some commands had no abbreviated form, either due to brevity or ambiguity with other commands. For example, the command, INPUT had no abbreviation because its spelling collided with the separate INPUT# keyword, which was located nearer to the beginning of the keyword lookup table. The heavily used PRINT command had a single ? shortcut, as was common in most Microsoft BASIC dialects. Abbreviating commands with shifted letters is unique to Commodore BASIC.
By abbreviating keywords, it was possible to fit more code on a single line. This allowed for a slight saving on the overhead to store otherwise necessary extra program lines, but nothing more. All BASIC commands were tokenized and took up 1 byte in memory no matter which way they were entered. Such long lines could be difficult to edit. The LIST command displayed the entire command keyword - extending the program line beyond the 2 or 4 screen lines which could be entered into program memory.
A unique feature of Commodore BASIC is the use of control codes to perform tasks such as clearing the screen or positioning the cursor within a program; these can be invoked either by issuing a command where X corresponds to the control code to be issued or by pressing the key in question between quote marks, thus pressing following a quote mark will cause BASIC to display the visual representation of the control code which is then acted upon at program execution. This is in comparison to other implementations of BASIC which typically have dedicated commands to clear the screen or move the cursor.
BASIC 3.5 and up have proper commands for clearing the screen and moving the cursor.
Program lines in Commodore BASIC do not require spaces anywhere, e.g.,, and it was common to write programs with no spacing. This feature was added to conserve memory since the tokenizer never removes any space inserted between keywords: the presence of spaces results in extra 0x20 bytes in the tokenized program which are merely skipped during execution. Spaces between the line number and program statement are removed by the tokenizer.
Program lines can be 80 characters total on most machines, but machines with 40 column text would cause the line to wrap around to the next line on the screen, and on the VIC-20, which had a 22 column display, program lines could occupy as many as four. BASIC 7.0 on the Commodore 128 increased the limit of a program line to 160 characters. By using abbreviations such as instead of, it is possible to fit even more on a line. BASIC 7.0 displays a error if the user enters a program line over 160 characters in length. Earlier versions do not produced an error and simply display the READY prompt two lines down if the line length is exceeded. The line number is counted in the number of characters in the program line, so a five digit line number will result in four fewer characters allowed than a one digit number.
The order of execution of Commodore BASIC lines was not determined by line numbering; instead, it followed the order in which the lines were linked in memory. Program lines were stored in memory as a singly linked list with a pointer, a line number, and then the tokenized code for the line. While a program was being entered, BASIC would constantly reorder program lines in memory so that the line numbers and pointers were all in ascending order. However, after a program was entered, manually altering the line numbers and pointers with the POKE commands could allow for out-of-order execution or even give each line the same line number. In the early days, when BASIC was used commercially, this was a software protection technique to discourage casual modification of the program.
Line numbers can range from 0 to 65520 and take five bytes to store regardless of how many digits are in the line number, although execution is faster the fewer digits there are. Putting multiple statements on a line will use less memory and execute faster.
and statements will search downward from the current line to find a line number if a forward jump is performed, in case of a backwards jump, they will return to the start of the program to begin searching. This will slow down larger programs, so it is preferable to put commonly used subroutines near the start of a program.
Variable names are only significant to 2 characters; thus the variable names VARIABLE1, VARIABLE2, and VA all refer to the same variable.
Commodore BASIC also supports bitwise operators , and, although this feature was part of the core Microsoft 6502 BASIC code, it was usually omitted in other implementations such as Applesoft BASIC.
The native number format of Commodore BASIC, like that of its parent MS BASIC, was floating point. Most contemporary BASIC implementations used one byte for the characteristic and three bytes for the mantissa. The accuracy of a floating point number using a three-byte mantissa is only about 6.5 decimal digits, and round-off error is common. 6502 implementations of Microsoft BASIC utilized 40-bit floating point arithmetic, meaning that variables took five bytes to store unlike the 32-bit floating point found in BASIC-80.
While 8080/Z80 implementations of Microsoft BASIC supported integer and double precision variables, 6502 implementations were floating point only.
Although Commodore BASIC supports signed integer variables in the range -32768 to 32767, in practice they are only used for array variables and serve the function of conserving memory by limiting array elements to two bytes each. Denoting any variable as integer simply causes BASIC to convert it back to floating point, slowing down program execution and wasting memory as each percent sign takes one additional byte to store. Also, it is not possible to or memory locations above 32767 with address defined as a signed integer.
A period can be used in place of the number 0, this will execute slightly faster.
The statement, used to start machine language programs, was added by Commodore and was not in the original Microsoft BASIC code, which featured only the USR function for invoking machine language routines. It automatically loads the CPU's registers with the values in --this can be used to pass data to machine language routines or as a means of calling kernal functions from BASIC.
Since Commodore 8-bit machines other than the C128 cannot automatically boot disk software, the usual technique is to include a BASIC stub like to begin program execution. It is possible to automatically start software after loading and not require the user to type a statement, this is done by having a piece of code that hooks the BASIC "ready" vector at.
As with most other versions of Microsoft BASIC, if an array is not declared with a statement, it is automatically set to ten elements. Larger arrays must be declared or BASIC will display an error when the program is run and an array cannot be re-dimensioned in a program unless all variables are wiped via a CLR statement. Numeric arrays are automatically filled with zeros when they are created, there may be a momentary delay in program execution if a large array is dimensioned.
String variables are represented by tagging the variable name with a dollar sign. Thus, the variables AA$, AA, and AA% would each be understood as distinct. Array variables are also considered distinct from simple variables, thus and do not refer to the same variable. The size of a string array merely refers to how many strings are stored in the array, not the size of each element, which is allocated dynamically. Unlike some other implementations of Microsoft BASIC, Commodore BASIC does not require string space to be reserved at the start of a program.
Unlike other 8-bit machines such as the Apple II, Commodore's machines all have a built-in clock that is initialized to 0 at power on and updated with every tick of the PIA/VIA/TED/CIA timer, thus 60 times per second. It is assigned two system variables in BASIC, and, which both contain the current time. TI is read-only and cannot be modified; doing so will result in a Syntax Error message. may be used to set the time via a six number string. The clock is not a very reliable method of timekeeping since it stops whenever interrupts are turned off and accessing the IEC port will slow the clock update by a few ticks.
The function in Commodore BASIC can use the clock to generate random numbers; this is accomplished by, however it is of relatively limited use as only numbers between 0 and 255 are returned. Otherwise, works the same as other implementations of Microsoft BASIC in that a pseudo-random sequence is used via a fixed 5-byte seed value stored at power on in memory locations on the C64. with any number higher than 0 will generate a random number amalgamated from the value included with the function and the seed value, which is updated by 1 each time an RND function is executed. with a negative number goes to a point in the sequence of the current seed value specified by the number.
Since true random number generation is impossible with the statement, it is more typical on the C64 and C128 to utilize the SID chip's white noise channel for random numbers.
BASIC 2.0 notoriously suffered from extremely slow garbage collection of strings. Garbage collection is automatically invoked any time a function is executed and if there are many string variables and arrays that have been manipulated over the course of a program, clearing them can take more than an hour under the worst conditions. It is also not possible to abort garbage collection as BASIC does not scan the RUN/STOP key while performing this routine. BASIC 4.0 introduced an improved garbage collection system with back pointers and all later implementations of Commodore BASIC also have it.
The function in BASIC 2.0 suffered from another technical flaw in that it cannot handle signed numbers over 32768, thus if the function is invoked on a C64, a negative amount of free BASIC memory will be displayed. The PET and VIC-20 never had more than 32k of total memory available to BASIC, so this limitation did not become apparent until the C64 was developed. The function on BASIC 3.5 and 7.0 corrected this problem and on BASIC 7.0 was also "split" into two functions, one to display free BASIC program text memory and the other to display free variable memory.
Many BASIC extensions were released for the Commodore 64, due to the relatively limited capabilities of its native BASIC 2.0. One of the most popular extensions was the DOS Wedge, which was included on the Commodore 1541 Test/Demo Disk. This 1 KB extension to BASIC added a number of disk-related commands, including the ability to read a disk directory without destroying the program in memory. Its features were subsequently incorporated in various third-party extensions, such as the popular Epyx FastLoad cartridge. Other BASIC extensions added additional keywords to make it easier to code sprites, sound, and high-resolution graphics like Simons' BASIC.
Although BASIC 2.0's lack of sound or graphics features was frustrating to many users, some critics argued that it was ultimately beneficial since it forced the user to learn machine language.
The limitations of BASIC 2.0 on the C64 led to use of built-in ROM machine language from BASIC. To load a file to a designated memory location, the filename, drive, and device number would be read by a call: ; the location would be specified in the X and Y registers: ; and the load routine would be called:.
A disk magazine for the C64, Loadstar, was a venue for hobbyist programmers, who shared collections of proto-commands for BASIC, called with the command.
From a modern programming point of view, the earlier versions of Commodore BASIC presented a host of bad programming traps for the programmer. As most of these issues derived from Microsoft BASIC, virtually every home computer BASIC of the era suffered from similar deficiencies. Every line of a Microsoft BASIC program was assigned a line number by the programmer. It was common practice to increment numbers by some value to make inserting lines during program editing or debugging easier, but bad planning meant that inserting large sections into a program often required restructuring the entire code. A common technique was to start a program at some low line number with an jump table, with the body of the program structured into sections starting at a designated line number like 1000, 2000, and so on. If a large section needed to be added, it could just be assigned the next available major line number and inserted to the jump table.
Later BASIC versions on Commodore and other platforms included a and command, as well as an AUTO line numbering command that would automatically select and insert line numbers according to a selected increment. In addition, all variables are treated as global variables. Clearly defined loops are hard to create, often causing the programmer to rely on the command. Flag variables often needed to be created to perform certain tasks. Earlier BASICs from Commodore also lack debugging commands, meaning that bugs and unused variables are hard to trap. structures, a standard part of Z80 Microsoft BASICs, were added to BASIC 3.5 after being unavailable in earlier versions of Commodore BASIC.

Use as user interface

In common with other home computers, Commodore's machines booted directly into the BASIC interpreter. BASIC's file and programming commands could be entered in direct mode to load and execute software. If program execution was halted using the RUN/STOP key, variable values would be preserved in RAM and could be PRINTed for debugging. The 128 even dedicated its second 64k bank to variable storage, allowing values to persist until a NEW or RUN command was issued. This, along with the advanced screen editor included with Commodore BASIC gave the programming environment a REPL-like feel; programmers could insert and edit program lines at any screen location, interactively building the program. This is in contrast to business-oriented operating systems of the time like CP/M or MS-DOS, which typically booted into a command line interface. If a programming language was required on these platforms, it had to be loaded separately.
While some versions of Commodore BASIC included disk-specific DLOAD and DSAVE commands, the version built into the Commodore 64 lacked these, requiring the user to specify the disk drive's device number to the standard LOAD command, which otherwise defaulted to tape. Another omission from the Commodore 64s BASIC 2.0 was a DIRECTORY command to display a disk's contents without clearing main memory. On the 64, viewing files on a disk was implemented as loading a "program" which when listed showed the directory as a pseudo BASIC program, with the file's block size as the line number. This had the effect of overwriting the currently loaded program. Addons like the DOS Wedge overcame this by rendering the directory listing direct to screen memory.

Versions and features

A list of CBM BASIC versions in chronological order, with successively added features:

Released versions