Two-phase locking
In databases and transaction processing, two-phase locking is a concurrency control method that guarantees serializability.
It is also the name of the resulting set of database transaction schedules. The protocol utilizes locks, applied by a transaction to data, which may block other transactions from accessing the same data during the transaction's life.
By the 2PL protocol, locks are applied and removed in two phases:
- Expanding phase: locks are acquired and no locks are released.
- Shrinking phase: locks are released and no locks are acquired.
Data-access locks
A lock is a system object associated with a shared resource such as a data item of an elementary type, a row in a database, or a page of memory. In a database, a lock on a database object may need to be acquired by a transaction before accessing the object. Correct use of locks prevents undesired, incorrect or inconsistent operations on shared resources by other concurrent transactions. When a database object with an existing lock acquired by one transaction needs to be accessed by another transaction, the existing lock for the object and the type of the intended access are checked by the system. If the existing lock type does not allow this specific attempted concurrent access type, the transaction attempting access is blocked. In practice, a lock on an object does not directly block a transaction's operation upon the object, but rather blocks that transaction from acquiring another lock on the same object, needed to be held/owned by the transaction before performing this operation. Thus, with a locking mechanism, needed operation blocking is controlled by a proper lock blocking scheme, which indicates which lock type blocks which lock type.Two major types of locks are utilized:
- Write-lock is associated with a database object by a transaction before writing this object.
- Read-lock is associated with a database object by a transaction before reading this object.
- An existing write-lock on a database object blocks an intended write upon the same object by another transaction by blocking a respective write-lock from being acquired by the other transaction. The second write-lock will be acquired and the requested write of the object will take place after the existing write-lock is released.
- A write-lock blocks an intended read by another transaction by blocking the respective read-lock .
- A read-lock blocks an intended write by another transaction by blocking the respective write-lock.
- A read-lock does not block an intended read by another transaction. The respective read-lock for the intended read is acquired immediately after the intended read is requested, and then the intended read itself takes place.
Two-phase locking and its special cases
Two-phase locking
According to the two-phase locking protocol, a transaction handles its locks in two distinct, consecutive phases during the transaction's execution:- Expanding phase : locks are acquired and no locks are released.
- Shrinking phase : locks are released and no locks are acquired.
Typically, without explicit knowledge in a transaction on end of phase 1, it is safely determined only when a transaction has completed processing and requested commit. In this case, all the locks can be released at once.
Conservative two-phase locking
The difference between 2PL and C2PL is that C2PL's transactions obtain all the locks they need before the transactions begin. This is to ensure that a transaction that already holds some locks will not block waiting for other locks. Conservative 2PL prevents deadlocks.Strict two-phase locking
To comply with the S2PL protocol, a transaction needs to comply with 2PL, and release its write locks only after it has ended, i.e., being either committed or aborted. On the other hand, read locks are released regularly during phase 2. This protocol is not appropriate in B-trees because it causes Bottleneck.Strong strict two-phase locking
or Rigorousness, or Rigorous scheduling, or Rigorous two-phase lockingTo comply with strong strict two-phase locking the locking protocol releases both write and read locks applied by a transaction only after the transaction has ended, i.e., only after both completing executing and becoming either committed or aborted. This protocol also complies with the S2PL rules. A transaction obeying SS2PL can be viewed as having phase 1 that lasts the transaction's entire execution duration, and no phase 2. Thus, only one phase is actually left, and "two-phase" in the name seems to be still utilized due to the historical development of the concept from 2PL, and 2PL being a super-class. The SS2PL property of a schedule is also called Rigorousness. It is also the name of the class of schedules having this property, and an SS2PL schedule is also called a "rigorous schedule". The term "Rigorousness" is free of the unnecessary legacy of "two-phase," as well as being independent of any mechanism. The property's respective locking mechanism is sometimes referred to as Rigorous 2PL.
SS2PL is a special case of S2PL, i.e., the SS2PL class of schedules is a proper subclass of S2PL.
SS2PL has been the concurrency control protocol of choice for most database systems and utilized since their early days in the 1970s. It is proven to be an effective mechanism in many situations, and provides besides Serializability also Strictness, which is instrumental for efficient database recovery, and also Commitment ordering for participating in distributed environments where a CO based distributed serializability and global serializability solutions are employed. Being a subset of CO, an efficient implementation of distributed SS2PL exists without a distributed lock manager, while distributed deadlocks are resolved automatically. The fact that SS2PL employed in multi database systems ensures global serializability has been known for years before the discovery of CO, but only with CO came the understanding of the role of an atomic commitment protocol in maintaining global serializability, as well as the observation of automatic distributed deadlock resolution. As a matter of fact, SS2PL inheriting properties of Recoverability and CO is more significant than being a subset of 2PL, which by itself in its general form, besides comprising a simple serializability mechanism, in not known to provide SS2PL with any other significant qualities. 2PL in its general form, as well as when combined with Strictness, i.e., Strict 2PL, are not known to be utilized in practice. The popular SS2PL does not require marking "end of phase 1" as 2PL and S2PL do, and thus is simpler to implement. Also, unlike the general 2PL, SS2PL provides, as mentioned above, the useful Strictness and Commitment ordering properties.
Many variants of SS2PL exist that utilize various lock types with various semantics in different situations, including cases of lock-type change during a transaction. Notable are variants that use Multiple granularity locking.
Comments:
- SS2PL vs. S2PL: Both provide Serializability and Strictness. Since S2PL is a superclass of SS2PL it may, in principle, provide more concurrency. However, no concurrency advantage is typically practically noticed, and the overhead of dealing with an end-of-phase-1 mechanism in S2PL, separate from transaction end, is not justified. Also, while SS2PL provides Commitment ordering, S2PL does not. This explains the preference of SS2PL over S2PL.
- Especially before 1990, but also after, in many articles and books, e.g.,, the term "Strict 2PL" has been frequently defined by the locking protocol "Release all locks only after transaction end," which is the protocol of SS2PL. Thus, "Strict 2PL" could not be there the name of the intersection of Strictness and 2PL, which is larger than the class generated by the SS2PL protocol. This has caused confusion. With an explicit definition of S2PL as the intersection of Strictness and 2PL, a new name for SS2PL, and an explicit distinction between the classes S2PL and SS2PL, the articles and have intended to clear the confusion: the first using the name "rigorousness," and the second "SS2PL."
- A more general property than SS2PL exists, Strict commitment ordering, which as well provides both serializability, strictness, and CO, and has similar locking overhead. Unlike SS2PL, SCO does not block upon a read-write conflict at the cost of a possible delayed commit, and upon such conflict type SCO has shorter average transaction completion time and better performance than SS2PL. While SS2PL obeys the lock compatibility table above, SCO has the following table:
Summary - Relationships among classes
Deadlocks in 2PL
Locks block data-access operations. Mutual blocking between transactions results in a deadlock, where execution of these transactions is stalled, and no completion can be reached. Thus deadlocks need to be resolved to complete these transactions' executions and release related computing resources. A deadlock is a reflection of a potential cycle in the precedence graph, that would occur without the blocking. A deadlock is resolved by aborting a transaction involved with such potential cycle, and breaking the cycle. It is often detected using a wait-for graph, which indicates which transaction is "waiting for" lock release by which transaction, and a cycle means a deadlock. Aborting one transaction per cycle is sufficient to break the cycle. If a transaction has been aborted due to deadlock resolution, it is up to the application to decide what to do next. Usually, an application will restart the transaction from the beginning but may delay this action to give other transactions sufficient time to finish in order to avoid causing another deadlock.In a distributed environment an atomic commitment protocol, typically the Two-phase commit protocol, is utilized for atomicity. When recoverable data partitioned among 2PC participants, then distributed deadlocks, deadlocks involving two or more participants in 2PC, are resolved automatically as follows:
When SS2PL is effectively utilized in a distributed environment, then global deadlocks due to locking generate voting-deadlocks in 2PC, and are resolved automatically by 2PC. For the general case of 2PL, global deadlocks are similarly resolved automatically by the synchronization point protocol of phase-1 end in a distributed transaction, and being propagated to the participants in a distributed transaction the same way as a decision point in atomic commitment; in analogy to decision point in CO, a conflicting operation in 2PL cannot happen before phase-1 end synchronization point, with the same resulting voting-deadlock in the case of a global data-access deadlock; the voting-deadlock.
Comment: