Magnetic tape data storage


Magnetic tape data storage is a system for storing digital information on magnetic tape using digital recording. Initially, large open reels were the most common format, but modern magnetic tape is most commonly packaged in cartridges and cassettes, such as the widely supported Linear Tape-Open.
The device that performs the writing or reading of data is called a tape drive, and autoloaders and tape libraries are often used to automate cartridge handling.
Although magnetic tape was initially used mainly for primary data storage, newer uses include system backup, data archiving and data exchange.

Open reels

Initially, magnetic tape for data storage was wound on reels. This de facto standard for large computer systems persisted through the late 1980s, with steadily increasing capacity due to thinner substrates and changes in encoding. Tape cartridges and cassettes were available starting in the mid-1970s and were frequently used with small computer systems. With the introduction of the IBM 3480 cartridge in 1984, described as "about one-fourth the size... yet it stored up to 20 percent more data," large computer systems started to move away from open reel tapes and towards cartridges.

UNIVAC

Magnetic tape was first used to record computer data in 1951 on the Eckert-Mauchly UNIVAC I. The UNISERVO drive recording medium was a thin metal strip of nickel-plated phosphor bronze. Recording density was 128 characters per inch on eight tracks; at the nominal data rate of 12,800 characters per second, this corresponds to a linear speed of 100 inches per second. Of the eight tracks, six were data, one was a parity track, and one was a clock, or timing, track. Making allowances for the empty space between tape blocks, the actual transfer rate was around 7,200 characters per second. A small reel of mylar tape provided separation between the metal tape and the read/write head.

IBM formats

IBM used ferric-oxide-coated tape similar to that used in audio recording, and its technology soon became the de facto industry standard. The tape was half an inch wide and wound on removable reels. Several tape lengths were available, with tape one and one-half mils thick being somewhat standard. During the 1980s, longer tapes became available using a much thinner PET film. Most tape drives could accept reels only up to a fixed maximum size. CDC used IBM-compatible 1/2-inch magnetic tapes, but also offered a 1-inch-wide variant, with 14 tracks, in the CDC 626 drive.
A so-called mini-reel was common for smaller data sets, such as for software distribution. These smaller reels often had no fixed length: the tape was cut to fit the amount of data recorded on it as a cost-saving measure.
Early IBM tape drives, such as the IBM 727 and IBM 729, were mechanically sophisticated floor-standing drives that used vacuum columns to buffer long U-shaped loops of tape. Between servo control of powerful reel motors, a low-mass capstan drive, and the low-friction and controlled tension of the vacuum columns, fast start and stop of the tape at the tape-to-head interface could be achieved: 1.5 ms from stopped tape to full speed. The fast acceleration is possible because the tape mass in the vacuum columns is small; the length of tape buffered in the columns provides time to spin the high-inertia reels. When active, the two tape reels thus fed tape into or pulled tape out of the vacuum columns, intermittently spinning in rapid, unsynchronized bursts, resulting in visually striking action. Stock shots of such vacuum-column tape drives in motion were widely used to represent "the computer" in movies and television.
Early half-inch tape had seven parallel tracks of data along the length of the tape, allowing six-bit characters plus one bit of parity written across the tape. This was known as seven-track tape. With the introduction of the IBM System/360 mainframe, nine-track tapes were introduced to support the new 8-bit characters that it used.
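The seven-track frame described above (six data bits plus one parity bit written across the tape) can be illustrated with a minimal Python sketch. The function names and the placement of the parity bit in the top bit of the frame are assumptions made for illustration; the choice between odd and even parity is left as a parameter because the text does not specify it.

```python
def add_parity(char6: int, odd: bool = True) -> int:
    """Return a 7-bit frame: six data bits plus a parity bit in bit 6.

    Bit placement is an illustrative assumption, not a documented layout.
    """
    if not 0 <= char6 < 64:
        raise ValueError("expected a six-bit value (0..63)")
    ones = bin(char6).count("1")
    # Odd parity: the total number of 1 bits in the frame must be odd.
    # Even parity: the total number of 1 bits must be even.
    parity = (ones + 1) % 2 if odd else ones % 2
    return (parity << 6) | char6


def frame_is_valid(frame7: int, odd: bool = True) -> bool:
    """Check that a 7-bit frame has the expected parity."""
    ones = bin(frame7 & 0x7F).count("1")
    return ones % 2 == (1 if odd else 0)
```

A single flipped bit in any of the seven tracks changes the count of 1 bits by one, so `frame_is_valid` reports the error; two flipped bits in the same frame go undetected, which is the usual limitation of a single parity bit.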
Recording density increased over time. Common seven-track densities started at 200 six-bit characters per inch, then 556, and finally 800. Nine-track tapes had densities of 800, then 1600, and finally 6250. This translates into about 5 megabytes to 140 megabytes per standard-length reel of tape. The end of a file was designated by a special recorded pattern called a tape mark, and the end of the recorded data on a tape by two successive tape marks. The physical beginning and end of usable tape were indicated by reflective adhesive strips of aluminum foil placed on the back side.
Effective density also increased as the interblock gap decreased from a nominal 0.75 inch on seven-track tape reels to a nominal 0.30 inch on 6250 bpi nine-track tape reels.
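How recording density and interblock gap combine into reel capacity can be sketched as follows. This is a rough estimate under stated assumptions: the 2,400-foot reel length and the block sizes in the usage note are illustrative values chosen here, not figures taken from the text above, and tape marks and leader are ignored.

```python
def reel_capacity(reel_feet: float, density_per_inch: float,
                  gap_inches: float, block_size: int) -> int:
    """Estimate characters (or bytes) stored on one reel.

    Each block occupies block_size / density inches of tape and is
    followed by one interblock gap. Only whole blocks are counted.
    """
    tape_inches = reel_feet * 12
    block_inches = block_size / density_per_inch
    whole_blocks = int(tape_inches // (block_inches + gap_inches))
    return whole_blocks * block_size


# Assumed reel length and block sizes (hypothetical, for illustration only).
seven_track = reel_capacity(2400, 200, 0.75, block_size=800)
nine_track = reel_capacity(2400, 6250, 0.30, block_size=8192)
```

With these assumed inputs the estimates fall near the two ends of the range quoted above. Shrinking the block size at a fixed gap lowers capacity quickly, because a larger share of the tape is spent on gaps.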
At least partly due to the success of the S/360, and the resultant standardization on 8-bit character codes and byte addressing, nine-track tapes were very widely used throughout the computer industry during the 1970s and 1980s. IBM stopped introducing new reel-to-reel products once it introduced the cartridge-based 3480 family in 1984.

DEC format

LINCtape, and its derivative, DECtape, were variations on this "round tape". They were essentially a personal storage medium. The tape featured a fixed formatting track which, unlike standard tape, made it feasible to read and rewrite blocks repeatedly in place. LINCtapes and DECtapes had similar capacity and data transfer rate to the diskettes that displaced them, but their "seek times" were on the order of thirty seconds to a minute.

Cartridges and cassettes

In the context of magnetic tape, the term cassette or cartridge means a length of magnetic tape in a plastic enclosure with one or two reels for controlling the motion of the tape. The type of packaging is a large determinant of the load and unload times as well as the length of tape that can be held. In a single reel cartridge there is a takeup reel in the drive while a dual reel cartridge has both takeup and supply reels in the cartridge. A tape drive uses one or more precisely controlled motors to wind the tape from one reel to the other, passing a read/write head as it does.
data cartridge can hold up to 10GiB uncompressed.
A different type is the endless tape cartridge, which has a continuous loop of tape wound on a special reel that allows tape to be withdrawn from the center of the reel and then wrapped up around the edge, and therefore does not need to rewind to repeat. This type is similar to a cassette in that there is no take-up reel inside the tape drive.
The IBM 7340 Hypertape drive, introduced in 1961, used a dual-reel cassette capable of holding 2 million six-bit characters.
In the 1970s and 1980s, audio Compact Cassettes were frequently used as an inexpensive data storage system for home computers, or in some cases for diagnostics or boot code for larger systems such as the Burroughs B1700. Compact cassettes were logically, as well as physically, sequential; they had to be rewound and read from the start to load data. Early cartridges were available before personal computers had affordable disk drives, and could be used as random access devices, automatically winding and positioning the tape, albeit with access times of many seconds.
Experienced computer gamers could tell a lot by listening to the loading noise from the tape.
In 1984 IBM introduced the 3480 family of single-reel cartridges and tape drives, which were then manufactured by a number of vendors through at least 2004. Initially providing 200 megabytes per cartridge, the family's capacity increased over time to 2.4 gigabytes per cartridge. DLT, also a cartridge-based tape format, was available beginning in 1984, but in 2007 further development was stopped in favor of LTO.
In 2003 IBM introduced the IBM 3592 family to supersede the IBM 3590. While the name is similar, there is no compatibility between the 3590 and the 3592. Like the 3590 and 3480 before it, this tape format has half inch tape spooled into a single reel cartridge. Initially introduced to support 300 gigabytes, the current sixth generation released in 2018 supports a native capacity of 20 terabytes.
The LTO single-reel cartridge was announced in 1997 at 100 gigabytes and in its eighth generation supports 12 terabytes in the same sized cartridge. LTO has displaced nearly all other tape technologies in computer applications, with the exception of some IBM 3592 family products at the high end.

Technical details

Linear density

Recording density for computer tapes is described with the acronym BPI, sometimes written bpi.

BPI is the metric for the density at which data is stored on magnetic media. The term can refer to bits per inch, but more often refers to bytes per inch, in particular when the tracks of a format are byte-organized, as in nine-track tapes.

Tape width

The width of the media is the primary classification criterion for tape technologies. Half-inch has historically been the most common width of tape for high-capacity data storage. Many other sizes exist and most were developed to either have smaller packaging or higher capacity.

Recording method

Recording method is also an important way to classify tape technologies, generally falling into two categories:

Linear

The linear method arranges data in long parallel tracks that span the length of the tape. Multiple tape heads simultaneously write parallel tape tracks on a single medium. This method was used in early tape drives. It is the simplest recording method, but also has the lowest data density.
A variation on linear technology is linear serpentine recording, which uses more tracks than tape heads. Each head still writes one track at a time. After making a pass over the whole length of the tape, all heads shift slightly and make another pass in the reverse direction, writing another set of tracks. This procedure is repeated until all tracks have been read or written. By using the linear serpentine method, the tape medium can have many more tracks than read/write heads. Compared to simple linear recording, using the same tape length and the same number of heads, data storage capacity is substantially higher.
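The pass structure of linear serpentine recording can be sketched in a few lines of Python. The track numbering used here (each head owns a contiguous band of tracks and steps one track further into its band on every pass) is an assumption made for illustration; real formats define their own layouts.

```python
def serpentine_passes(total_tracks: int, heads: int):
    """List the passes needed to cover every track.

    Returns a list of (direction, tracks) tuples, one per pass.
    Assumed layout: head h writes track h * passes + p on pass p.
    """
    if heads <= 0 or total_tracks % heads != 0:
        raise ValueError("total_tracks must be a positive multiple of heads")
    passes = total_tracks // heads
    plan = []
    for p in range(passes):
        # The tape reverses at each end, so direction alternates per pass.
        direction = "forward" if p % 2 == 0 else "reverse"
        tracks = [h * passes + p for h in range(heads)]
        plan.append((direction, tracks))
    return plan


plan = serpentine_passes(total_tracks=8, heads=2)
```

For eight tracks and two heads this yields four passes, so the same two heads store four times as much as a simple linear layout on the same length of tape. With an even number of passes the tape finishes back at its starting end.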

Scanning

Scanning recording methods write short dense tracks across the width of the tape medium, not along the length. Tape heads are placed on a drum or disk which rapidly rotates while the relatively slow-moving tape passes it.
An early method used to get a higher data rate than the prevailing linear method was transverse scan. In this method, a spinning disk with the tape heads embedded in the outer edge is placed perpendicular to the path of the tape. This method is used in Ampex's DCRsi instrumentation data recorders and the old Ampex quadruplex videotape system. Another early method was arcuate scan. In this method, the heads are on the face of a spinning disk which is laid flat against the tape. The path of the tape heads forms an arc.
Helical scan recording writes short dense tracks in a diagonal manner. This method is used by virtually all current videotape systems and several data tape formats.

Block layout and speed matching

In a typical format, data is written to tape in blocks with inter-block gaps between them, and each block is written in a single operation with the tape running continuously during the write. However, since the rate at which data is written or read to the tape drive is not deterministic, a tape drive usually has to cope with a difference between the rate at which data goes on and off the tape and the rate at which data is supplied or demanded by its host.
Various methods have been used alone and in combination to cope with this difference. If the host cannot keep up with the tape drive transfer rate, the tape drive can be stopped, backed up, and restarted, a back-and-forth motion known as shoe-shining. A large memory buffer can be used to queue the data. In the past, the host block size affected the data density on tape, but on modern drives, data is typically organized into fixed-size blocks which may or may not be compressed or encrypted, and host block size no longer affects data density on tape. The Linear Tape-Open article covers this. Modern tape drives offer a speed matching feature, where the drive can dynamically decrease the physical tape speed as needed to avoid shoe-shining.
In the past, the size of the inter-block gap was constant, while the size of the data block was based on host block size, affecting tape capacity – for example, on count key data storage. On most modern drives, this is no longer true. Linear Tape-Open type drives use a fixed-size block for tape, independent of the host block size, and the inter-block gap is variable to assist with speed matching during writes. On drives with compression, the compressibility of the data will affect the capacity.
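The idea behind speed matching can be sketched as a simple selection rule. This is a simplified model, not a description of any drive's firmware: the list of supported streaming rates is hypothetical, and a real drive also uses its buffer and variable gaps as described above.

```python
def choose_streaming_rate(host_rate: float, drive_rates: list[float]) -> float:
    """Pick the fastest supported tape rate that the host can sustain.

    If the host is slower than every supported rate, the slowest rate
    is returned; in that case the drive cannot stream continuously.
    """
    rates = sorted(drive_rates)
    sustainable = [r for r in rates if r <= host_rate]
    return sustainable[-1] if sustainable else rates[0]


def must_reposition(host_rate: float, drive_rates: list[float]) -> bool:
    """True when even the slowest tape rate outruns the host."""
    return host_rate < min(drive_rates)


# Hypothetical streaming rates in MB/s, for illustration only.
RATES = [40.0, 80.0, 120.0, 160.0]
```

For a host supplying 100 MB/s this rule selects 80 MB/s, so the tape keeps moving while the buffer absorbs the surplus. A host supplying 20 MB/s falls below every available rate, and the drive would still have to stop and reposition.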

Sequential access to data

Tape is characterized by sequential access to data. While tape can provide fast sequential data transfers, it takes tens of seconds to load a cassette and position the tape head to an arbitrary place. By contrast, hard disk technology can perform the equivalent action in tens of milliseconds and can be thought of as offering random access to data.
Logical filesystems require data and metadata to be stored on the data storage medium. Storing metadata in one place and data in another requires lots of slow repositioning activity on most tape systems. As a result, most tape systems use a trivial filesystem in which files are addressed by number, not by filename. Metadata such as file name or modification time is typically not stored at all. Tape labels store such metadata, and they are used for interchanging data between systems. File archiver and backup tools have been created to pack multiple files along with the related metadata into a single 'tape file'. Serpentine tape drives can improve access time by switching to the appropriate track; tape partitions were used for directory information. The Linear Tape File System is a method of storing file metadata on a separate part of the tape. This makes it possible to copy and paste files or directories to a tape as if it were just like another disk, but does not change the fundamental sequential access nature of tape.

Access time

Tape has quite a long latency for random accesses since the deck must wind an average of one-third the tape length to move from one arbitrary data block to another. Most tape systems attempt to alleviate the intrinsic long latency, either using indexing, where a separate lookup table is maintained which gives the physical tape location for a given data block number, or by marking blocks with a tape mark that can be detected while winding the tape at high speed.
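The one-third figure follows from treating the start and target positions as independent and uniformly distributed along the tape: the mean distance between two such points on a length L is L/3. A short numerical check in Python, assuming that uniform model and ignoring serpentine track switching, which can shorten real seeks:

```python
def mean_seek_fraction(n: int = 200) -> float:
    """Average |a - b| over an n-by-n grid of positions on a unit-length tape.

    Positions are midpoints of n equal segments, so the result
    approaches the continuous value 1/3 as n grows.
    """
    positions = [(i + 0.5) / n for i in range(n)]
    total = sum(abs(a - b) for a in positions for b in positions)
    return total / (n * n)


fraction = mean_seek_fraction()
```

The computed fraction is within a small margin of 1/3; multiplying it by the tape length and dividing by the winding speed gives a rough average seek time for a given format.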

Data compression

Most tape drives now include some kind of lossless data compression. There are several algorithms which provide similar results: LZ, IDRC, ALDC and DLZ1. Embedded in tape drive hardware, these compress a relatively small buffer of data at a time, so cannot achieve extremely high compression even of highly redundant data. A ratio of 2:1 is typical, with some vendors claiming 2.6:1 or 3:1. The ratio actually obtained with real data is often less than the stated figure; the compression ratio cannot be relied upon when specifying the capacity of equipment, e.g., a drive claiming a compressed capacity of 500GB may not be adequate to back up 500GB of real data. Data that is already stored efficiently may not allow any significant compression; a sparse database may offer much larger factors. Software compression can achieve much better results with sparse data, but uses the host computer's processor, and can slow the backup if it is unable to compress as fast as the data is written.
The compression algorithms used in low-end products are not the most effective known today, and better results can usually be obtained by turning off hardware compression and using software compression instead.
Plain text, raw images, and database files typically compress much better than other types of data stored on computer systems. By contrast, encrypted data and pre-compressed data normally increase in size if data compression is applied. In some cases this data expansion can be as much as 15%.
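The gap between advertised and usable capacity is simple arithmetic, sketched below. The 250 GB native capacity is an assumption derived from the 500 GB example above at the typical 2:1 ratio; the 1.3:1 ratio is a hypothetical figure for mixed real-world data.

```python
def effective_capacity_gb(native_gb: float, ratio: float) -> float:
    """Usable capacity for data that compresses at the given ratio.

    A ratio below 1.0 models data that expands when compressed.
    """
    return native_gb * ratio


def fits(data_gb: float, native_gb: float, ratio: float) -> bool:
    """True if data_gb of source data fits on one cartridge."""
    return data_gb <= effective_capacity_gb(native_gb, ratio)


NATIVE_GB = 250.0                      # assumed: 500 GB advertised at 2:1
advertised = effective_capacity_gb(NATIVE_GB, 2.0)
mixed_data = effective_capacity_gb(NATIVE_GB, 1.3)        # hypothetical ratio
expanding = effective_capacity_gb(NATIVE_GB, 1 / 1.15)    # 15% expansion
```

Under these assumptions, 500 GB of data fits only if it really compresses at 2:1. At 1.3:1 the cartridge holds about 325 GB, and data that expands by 15% leaves less room than the native capacity.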

Encryption

Standards exist to encrypt tapes. Encryption is used so that even if a tape is stolen, the thieves cannot use the data on the tape. Key management is crucial to maintain security. Encryption is more efficient if done after compression, as encrypted data cannot be compressed effectively. Some enterprise tape drives can quickly encrypt data. Symmetric streaming encryption algorithms can also provide high performance.
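Why compression has to come before encryption can be demonstrated with Python's standard `zlib` module. This is a sketch under a stated assumption: seeded pseudorandom bytes stand in for ciphertext, since the output of a good cipher is statistically close to random data. No real encryption is performed here.

```python
import random
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Highly redundant plaintext, as a best case for compression.
plaintext = b"magnetic tape data storage " * 4000

# Stand-in for ciphertext: pseudorandom bytes of the same length.
rng = random.Random(0)
pseudo_ciphertext = bytes(rng.getrandbits(8) for _ in range(len(plaintext)))

ratio_plain = compressed_ratio(plaintext)
ratio_cipher = compressed_ratio(pseudo_ciphertext)
```

The redundant plaintext shrinks to a small fraction of its size, while the random-looking data does not shrink at all. A pipeline that encrypts first therefore gives the compressor nothing to work with.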

Cartridge memory and self-identification

Some tape cartridges, notably LTO cartridges, have small associated data storage chips built into the cartridges to record metadata about the tape, such as the type of encoding, the size of the storage, dates and other information. It is also common for tape cartridges to have bar codes on their labels in order to assist an automated tape library.

Viability

Tape remains viable in modern data centers because:
  1. it is the lowest cost medium for storing large amounts of data and
  2. as a removable medium it allows the creation of an air gap which can prevent data from being hacked, encrypted or deleted and
  3. its longevity allows for extended data retention which may be required by regulatory agencies.
The lowest cost storage tiers of cloud storage can also be tape.

High-density magnetic media

Sony announced in 2014 that it had developed a tape storage technology with the highest reported magnetic tape data density, 148 Gbit/in², potentially allowing a native tape capacity of 185 TB. The technology uses a new vacuum thin-film forming process able to form extremely fine crystal particles. Sony developed it further and in 2017 announced a reported data density of 201 Gbit/in², giving a standard compressed tape capacity of 330 TB.
In May 2014, Fujifilm followed Sony and announced that it would develop a 154 TB tape cartridge in conjunction with IBM, with an areal data storage density of 85.9 Gbit/in² on linear magnetic particulate tape. The technology developed by Fujifilm, called NANOCUBIC, reduces the particulate volume of BaFe magnetic tape while increasing the smoothness of the tape, which raises the signal-to-noise ratio during read and write and enables high frequency response.

Chronological list of tape formats