DBM (computing)


In computing, a DBM is a library and file format providing fast, single-keyed access to data. A key-value database from the original Unix, dbm is an early example of a NoSQL system.

History

The original dbm library and file format was a simple database engine, originally written by Ken Thompson and released by AT&T in 1979. The name is a three letter acronym for DataBase Manager, and can also refer to the family of database engines with APIs and features derived from the original dbm.
The dbm library stores arbitrary data by use of a single key in fixed-size buckets and uses hashing techniques to enable fast retrieval of the data by key.
The hashing scheme used is a form of extendible hashing, so that the hashing scheme expands as new buckets are added to the database, meaning that, when nearly empty, the database starts with one bucket, which is then split when it becomes full. The two resulting child buckets will themselves split when they become full, so the database grows as keys are added.
The dbm library and its derivatives are pre-relational databases they manage associative arrays, implemented as on-disk hash tables. In practice, they can offer a more practical solution for high-speed storage accessed by key, as they do not require the overhead of connecting and preparing queries. This is balanced by the fact that they can generally only be opened for writing by a single process at a time. An agent daemon can handle requests from multiple processes, but introduces IPC overhead.

Implementations

The original AT&T dbm library has been replaced by its many successor implementations. Notable examples include:
The following databases are dbm-inspired, but they do not directly provide a dbm interface, even though it would be trivial to wrap one:
As of 2001, the ndbm implementation of DBM was standard on Solaris and IRIX, whereas gdbm is ubiquitous on Linux. The Berkeley DB implementations were standard on some free operating systems. After a change of licensing of the BDB to GNU AGPL in 2013, projects like Debian have moved to LMDB.

Reliability

A 2018 AFL fuzzing test against many DBM-family databases exposed many problems in implementations when it comes to corrupt or invalid database files. Only freecdb by Daniel J. Bernstein showed no crashes. The authors of gdbm, tdb, and lmdb were prompt to respond. Berkeley DB fell behind due to the sheer amount of other issues; the fixes would be irrelevant to OSS users anyways due to the licensing change locking them back on an old version.