Gbcast
Gbcast is a reliable multicast protocol that provides ordered, fault-tolerant message delivery in a group of receivers within a network of machines that experience crash failure. The protocol is capable of solving Consensus in a network of unreliable processors, and can be used to implement state machine replication. Gbcast can be used in a standalone manner, or can support the virtual synchrony execution model, in which case Gbcast is normally used for group membership management while other, faster, protocols are often favored for routine communication tasks.
History
Introduced in 1985, Gbcast was the first widely deployed reliable multicast protocol to implement state machine replication with dynamically reconfigurable membership. Although this problem had been treated theoretically under various models in prior work, Gbcast innovated by showing that the same multicasts used to update replicated data within the state machine can also be used to dynamically reconfigure the group membership, which can then evolve to permit members to join and leave at will, in addition to being removed upon failure. This functionality, together with a state transfer mechanism used to initialize joining members, represents the basis of the virtual synchrony process group execution model.The term state machine replication was first suggested by Leslie Lamport and was widely adopted after publication of a survey paper written by Fred B. Schneider. The model covers any system in which some deterministic object is replicated in such a way that a series of commands can be applied to the replicas fault-tolerantly. A reconfigurable state machine is one that can vary its membership, adding new members or removing old ones. Some state machine protocols can also ride out the temporary unavailability of a subset of the current members without requiring reconfiguration when such situations arise, including Gbcast and also Paxos, Lamport's widely cited protocol for state machine replication.
State machine replication is closely related to the distributed Consensus problem, in which a collection of processes must agree upon some decision outcome, such as the winner of an election. In particular, it can be shown that any solution to the state machine replication problem would also be capable of solving distributed consensus. As a consequence, impossibility results for distributed consensus apply to solutions to the state machine replication problem. Implications of this finding are discussed under [|liveness].
Gbcast is somewhat unusual in that most solutions to the state machine replication problem are closely integrated with the application being replicated. Gbcast, in contrast, is designed as a multicast API and implemented by a library that delivers messages to group members. Lamport, Malkhi and Zhou note that few reliable multicast protocols have the durability properties required to correctly implement the state machine model. Gbcast does exhibit the necessary properties.
The Gbcast protocol was first described in a 1985 publication that discussed infrastructure supporting the virtual synchrony model in the Isis Toolkit. Additional details were provided in a later 1987 journal article, and an open-source version of the protocol was released by the Cornell developers in November of that year. Isis used the protocol primarily for maintaining the membership of process groups but also offered an API that could be called directly by end-users. The technology became widely used starting in 1988, when the Isis system was commercialized and support became available. Commercial support for the system ended in 1998 when Stratus Computer, then the parent of Isis Distributed Systems, refocused purely on hardware solutions for the telecommunications industry.
Examples of systems that used Isis in production settings include the New York Stock Exchange, where it was employed for approximately a decade to manage a configurable, fault-tolerant and self-healing reporting infrastructure for the trading floor, to relay quotes and trade reports from the "back office" systems used by the exchange to overhead display. The French Air Traffic Control System continues to use Isis; since 1996 the system has been employed to create fault-tolerant workstation clusters for use by air traffic controllers and to reliably relay routing updates between air traffic control centers; over time the French technology has also been adopted by other European ATC systems. The US Navy AEGIS has used Isis since 1993 to support a reliable and self-healing communication infrastructure. Isis also had several hundred other production users in the financial, telecommunications, process control, SCADA and other critical infrastructure domains. More details can be found in.
Problem statement
The fundamental problem solved by Gbcast is this: we are given an initial set of group members and wish to support a multicast abstraction, permitting members of the group to send messages that encode various commands or requests. The protocol must agree on the messages to deliver, and on their ordering, so that if any member of the group delivers a message, every member of the group that doesn't fail will deliver that message and in the same order with respect to other delivered messages.The set of group members changes each time a member fails or joins, and Gbcast is also used to maintain group membership by means of special multicasts that are delivered to the application as "new view" events, but that also adjust the group membership list maintained by the Gbcast protocol library. The application thus sees a series of membership views that start with an "initial view" when a particular group member joins, and then evolve over time, and that are ordered with respect to other view-changing events and multicast messages. These multicasts are delivered to all the non-failed members listed in the view during which delivery is scheduled, a property referred to as virtual synchrony.
Network partitions can split a group into two or more disjoint subgroups, creating the risk of split brain behavior, in which some group members take a decision without knowing that some other partition of the group has taken a different, conflicting decision. Gbcast offers protection against this threat: the protocol ensures that progress occurs only in a single primary partition of the group. Thus, should a network partition arise, at most one subgroup of members will continue operations, while the other is certain to stall and shut down.
Should a failed member recover, after communication is restored, that member can rejoin. An incarnation number is used to avoid ambiguity: a counter that will be incremented each time a process joins the group, and is treated as part of the process identifier. Any given tuple joins the group at most once, then remains in the group until it fails, or is forced to leave because a time out occurred.
Any dynamically reconfigurable system, including both Gbcast and Paxos, can enter states from which no further progress is possible. For example, this could happen if operational processes are wrongly removed from the configuration, and then too many real crashes occur within the remaining members of the view. In such situations, the data center management infrastructure is responsible for restarting the entire application. This is in contrast to the behavior of non-reconfigurable Paxos, which can tolerate disruptions of unlimited duration and then will resume once enough group members are accessible, without intervention of the management infrastructure. The following terms are used in the detailed protocol description.
Processes
- Processes run on processors that operate at arbitrary speed.
- Processes may experience crash failures.
- A process is uniquely identified by a three-tuple:.
- Processes with stable storage may re-join the protocol after failures, after incrementing the incarnation number.
- Processes do not collude, lie, or otherwise attempt to subvert the protocol.
Network
- All processes in the system can send messages to all other processes in the system.
- Messages are sent asynchronously: there is no time bound on message delivery.
- Messages may be lost, reordered, or duplicated.
- Messages are delivered without corruption.
To simplify the presentation, we assume that a TCP-like acknowledgement / retransmission scheme is employed, creating the illusion of a reliable, sequenced, non-repeating message channel between each pair of processes. A timeout occurs if this channel abstraction retries repeatedly and is unable to obtain an acknowledgement for some message. Using the same TCP-like channels, we can also support a 1-to-all capability, whereby a single process sends some message over its channels to all the other members of some view of some group. This is done by mapping the 1-to-all request into multiple 1-to-1 messages. Notice that these 1-to-all channels lack any atomicity guarantee: if the sender fails while a message is being sent, it might reach just some of the destinations.
Process Groups and Views
Multicast Messages
Roles
Gbcast is best understood in terms of a set of roles.Application
Leader
Failure detection
Failed process
New Leader
Quorums
Quorums are used to guarantee the safety properties of Gbcast by ensuring that there is a single globally agreed-upon sequence of group views and multicast messages and by preventing progress in more than one partition if a group becomes fragmented into two or more partitions. Quorums are defined for a specific view.Given view i with n members, a quorum of the view is any majority subset of the members of that view. Notice that this is in contrast to the way the term is defined in systems that have a static underlying membership: for Gbcast, the quorum size will change over time as the membership of a group changes and new views become defined.
Safety and liveness properties
In order to guarantee safety, Gbcast defines three safety properties and ensures they hold, regardless of the pattern of failures:Non-triviality
Consistency
Conditional liveness
Basic Gbcast
This protocol is the one used under normal conditions.Recall that in Gbcast, each operational process has a current view, and each view defines a leader. Only a process that believes itself to be the leader in the current view can initiate a new multicast; other members must relay multicasts by sending them to the leader, over 1-to-1 connections, and then waiting for the leader to run the protocol.
Should the leader fail while some member that is not the leader is attempting to relay a multicast, the sender must determine the status of its pending request. This is accomplished as follows: Notice that members observe the delivery of their own multicasts. Accordingly, if a new view becomes defined in which the old leader has failed, either the multicast has been delivered, or the delivery of the new view allows it to conclude that the leader failed to relay the pending message, and that it should be resent by asking the new leader to relay it.
Prepare step
Promise step
Commit step
Delivery step
Message flow: Basic Gbcast, simplest case
Error cases in basic Gbcast
The simplest error cases are those in which one or more members fail, but a quorum remains active. In the example below, the group consists of with A playing the leader role. fails during the promise phase and a timeout occurs within the reliable channel from the leader to process. The leader therefore commits the delivery of M, but simultaneously initiates a protocol to remove from the group, which commits, creating the new view. If C has not actually failed, it can now rejoin the group but with a new incarnation number: in effect, C must rejoin as C'. Any messages from C to A or B will be rejected from the instant that each learns of the apparent failure: C will be shunned by A and B.Message flow: Basic Gbcast, failure of member other than the Leader
Notice that the Commit and the new Proposal are combined into a single message. This ensures that any process that commits an action after a new failure has been sensed simultaneously learns of that failure and will shun the associated process, and that the process will quickly be removed from the view. If C hasn't crashed, it can rejoin by incrementing its incarnation number and then requesting that it be added back into the group by the leader. It will be appended to the membership list with its new name, and will have the highest rank among members of the view.
Message flow: Basic Gbcast, add members {D,E,F}, failure of member other than the Leader
In the example shown below, a group that initially contains members is asked to add, but member C fails during the protocol. Membership change requests are treated as a special kind of multicast and the sequence of events is the same. The example is thus nearly identical to the prior one, but now a series of new view events are delivered to the application.At the end of the protocol, the new active view is view3= and the new quorum size is 3. But notice that there was an "intermediate" view, view2= with quorum size of 4. Had the leader not received 4 promises to the proposal phase that removed C, it would not have been able to run the commit phase for view3. This illustrates a basic policy: the quorum required to commit a new view is always based on the size of the prior view.
Takeover protocol, used when the leader fails
The next failure case is when a leader fails, resulting in a new leader. To take over as the leader, the new leader first runs a takeover protocol, and then the new leader can run basic Gbcast as above. The takeover protocol is as follows:Inquiry Step
Promise-List Step
Repeat If Necessary
Check for Quorums
Start as New Leader
Dueling Leaders
Failure Suspicions Piggyback on Outgoing Messages
Message flow: Basic Gbcast, failure of Leader, TakeOver, Basic Gbcast by the new leader
Message flow: Basic Gbcast, Add members {D,E,F}, failure of the Leader
As an example of a more complex case, here the leader fails in the middle of a commit that increases the size of the viewIn this example we see the inquiry iteration "in action": B learns of the protocol that adds in a first phase of the inquiry, hence it repeats the inquiry, this time contacting D, E and F. There is no need to repeat the inquiry at C since this would simply return the same information previously obtained.
In this example, the final commit actually causes two views to be delivered in succession at members B and C. Even though the two proposals were sent concurrently, the commit for view2 requires a promise from a quorum of view1, whereas the commit for view3 requires a quorum response from the members of view2. Although the sending of initial views isn't explicitly shown in the diagram, the joining members don't participate in the 1.1 protocol because they don't join the group until view2. Notice that at members B and C a pipelining effect arises: events associated with view2 are already being proposed even as events in view1 are still being committed.
Correctness
To show that Gbcast satisfies non-triviality we start by tracing backwards from an arbitrary delivery action to the point at which a client requested the corresponding event; clearly, only messages that were legitimately sent will be delivered. However, nontriviality for this protocol goes further: we must also show that messages from a given member are delivered only while that member is still a live participant in some view. Accordingly, we look at the case in which the leader initiates some multicast but then fails before it is delivered. Here, the new leader either discovers the pending proposal, and will order it before the view-change event, or the new leader fails to discover the pending proposal, in which case all members of the new view will shun any late-arriving incoming message from the old leader. Thus either a multicast message is delivered while the view in which it was sent is still pending, or it will not be delivered at all.To establish consistency we begin by analysis of the case in which there is just a single leader that never fails or loses connectivity with a quorum. Since the leader sequences the events and includes each member starting with the first view that contains that member, all members deliver the identical messages starting from the view in which they were added to the system.
When a new leader takes over, the inquiry is required to reach a quorum of members for the most recent committed view. This quorum necessarily will include at least one process that received any proposal that the old leader could have committed. Thus the new leader will learn of any potentially committed proposal and include it as a preflix to its own new proposals. It follows that if any process delivers any event, then if the system makes progress, every surviving member will eventually deliver that same event and in the same order.
We can show that a joining member will receive its initial view by analysis of the two relevant cases. If the leader doesn't fail, it sends the initial view on an eventually reliable channel. If the leader does fail and some member lacks its initial view, the new leader sends that view after receipt of the "promise-list" response to its inquiry-phase message.
A logical partitioning of the group is impossible because of the shunning rule. In order to commit any new view, the old leader must obtain promises from a quorum of the current view. A new leader, taking over, will learn of any view that could have become committed. To commit its own proposed next view, it will thus be required to interact with a quorum of that intermediary view, if any. In a scenario that could lead to partitioning, the leader, A, might have timed out on B and gone on to create a sequence of new views and events that excluded B. But in this case a majority of the old or of the intermediary view members will have learned that A believes B to have failed, and will shun B when it inquires. In either case, B is prevented from obtaining a quorum and hence cannot make progress. A symmetric argument shows that if B succeeds in defining a new view that excludes A, A would be unable to obtain a quorum for any other new view that it might attempt to propose.
Liveness
The Gbcast protocol will make progress provided that at all times in the execution, if view holds at time, then less than a quorum of members of fail within some subset of the members of the view. To maximize progress, it is important that excluded but still live members rejoin the group, so that erroneous failure detections don't cause the view to shrink in a persistent manner. However, the protocol will not recover and make progress if at any time, every process suspects more than a quorum of members of the current view of having failed.This property is similar to but "stronger" than <>W, the "weakest failure detector" for achieving consensus, as defined by Chandra and Toueg. To see this, consider a run in which a mutually suspecting quorum arises "too quickly" for processes that have been wrongly excluded from the view to rejoin it. Gbcast will not make progress and, indeed, the group will need to shut down and restart.
Arguably, such runs would be unlikely in the kinds of data centers where Gbcast is typically used, but clearly they can be constructed in an adversarial manner.
Discussion: Failure sensing
The Gbcast protocol presumes that the probability of incorrect failure suspicions will be low; the scheme breaks down if failure suspicions occur frequently and operational processes are often suspected as faulty. By analogy, consider the TCP protocol, in which the failure to receive an acknowledgement will eventually cause a connection to break. TCP is used nearly universally, a tremendous disruption to the Web would result if TCP connections frequently were to break when neither endpoint has failed. Thus timeouts are set conservatively. A similar assumption is required for systems that use Gbcast.In contrast, there are other failure detection schemes, such as the one explored by Chandra and Toueg, that can yield high rates of incorrect failure suspicions. Some protocols, including Paxos, are able to tolerate incorrect failure suspicions without any costly consequence. Whether one approach is inherently better than the other is beyond the scope of this discussion. We simply underscore that the approaches differ, and that Gbcast would be ineffective if timeouts are set overly aggressively.
One extreme scenario is worthy of further mention: network partitioning events. Modern data centers and networks often experience events in which a single machine and all the processes on it becomes transiently partitioned from a larger pool of machines that remain connected to one another. Such cases are treated as failures in Gbcast, but if the surviving, connected members include a sufficiently large number of processes, the majority portion of the system will simply reconfigure itself to exclude the disconnected member. It can reconnect and rejoin the group later when the partition heals.
A more extreme kind of partitioning is sometimes seen in data centers: in this situation, a network switch might fail, causing a collection of machines to become disconnected from the Internet and from the remainder of the data center. In such cases one could imagine a group in which all members begin to suspect all other members; Gbcast will not progress in this case and the management infrastructure would need to relaunch the entire application. On the other hand, in most large data centers, the operating systems of the machines experiencing such a failure would also shut down, restarting only when connectivity is restored. Thus in practice, the restart of the system is unavoidable. This said, there are protocols, such as Paxos, that could ride out such an outage if the machines themselves were to remain operational and later regained adequate connectivity.
The Transis system explored extensions to the Gbcast protocol that permit multiple partitions to form, to make independent progress, and then to remerge. This topic, however, is beyond the scope of the present discussion.
Discussion: Dueling leaders
In the Paxos protocol, a situation can arise in which two or more leaders "duel" by proposing different commands for the same slot. This can also occur in Gbcast.In the normal sequence of events, one leader takes over because the prior leader has failed, learns of any proposals the prior leader made during the inquiry phase, and then repeats those same proposals, extended with new ones. Thus no duel over the content of slots arises because the same proposals are repeated in the same slots.
The closest situation to a duel is seen if the old leader has become partitioned from the majority and the new leader, taking over, is unable to contact some set of members. Here the new leader may be unaware of some proposals that the old leader made, or might still issue, if those reach only the members the new leader didn't contact.
The shunning mechanism resolves such duels. When the new leader obtained a quorum during the INQUIRE phase, it also blocked the old leader from ever again achieving a quorum for any new PROPOSE it might initiate: a majority of members are now shunning the old leader. Thus if any proposal is missed by the new leader it necessarily is a proposal that didn't reach a quorum of members, and won't reach a quorum in the future. Moreover, members aware of such a proposal will be shunned by the new leader, since it considers them to have failed. Any member learning of new proposals from the new leader will shun them as well.
Shunning of leaders in Gbcast occurs in the pre-determined order of leader ranks: a higher-ranking leader only shuns a lower-ranking leader when it tries to take-over its place. The Paxos ballots mechanism serves precisely the same purpose, but differs in allowing participants to attempt to take-over repeatedly, eaach time assuming a new ballot. The result is that, one the one hand, Paxos leader demotion is reversible, and on the other, dueling leaders could theoretically continue forever.
Bi-simulation equivalence to Paxos
Although superficially quite different, upon close study Gbcast is seen to be surprisingly similar to Paxos. Indeed, Paxos can be "transformed" into Gbcast with the following sequence of steps. For brevity we describe these steps informally and omit a detailed proof.Note that this transformation does not address durability. Gbcast treats durable state as a property of the application, not the protocol, whereas Paxos logs events to a set of durable command logs, and hence can still recover its state even after the whole service is shut down and restarted. The equivalent behavior with Gbcast involves having the application log all received messages, but that case will not be considered here.
- Start with the basic Paxos protocol. Add a process incarnation number to distinguish a rejoining process from one that has been continuously a member of the view. Impose an age-based ordering on the members of the group, designate the oldest member as the leader. Non-leaders issue requests through the leader.
- Both protocols permit batching of requests: Basic Paxos has a concurrency parameter, alpha: a leader can concurrently run a maximum of alpha instances of the protocol. Gbcast permits the leader to propose multiple events in a single protocol instance, which could be message deliveries or view events.
- Paxos does not normally require reliable, ordered communication. Modify the protocol to run over the reliable one-to-one channel abstraction. We can now assume that any message sent will either be received and delivered in order, or that a timeout will occur at the sender side.
- The Paxos slot number will become the Gbcast sequence number. The Paxos ballot number is, in effect, transformed into the proposing leader-number used to discriminate between conflicting proposals during the inquire step.
- Define a category of view-modifying commands that operate by adding or removing processes from the group membership. Introduce a failure detection mechanism as used in Gbcast, asking the leader to remove any timed-out members. A member removed from the group that reestablishes connectivity to the group should rejoin with a new incarnation number. Report views by upcalls to the application.
- Basic Paxos can propose a multicast to just a quorum of group members, hence a typical member may have gaps in its command list. This is why, in Paxos, a learner must read a quorum of members and merge their command lists. In our modified protocol, any multicast is proposed to all non-failed members, while failed members are dropped from the view. Thus unlike Paxos, our modified protocol has the property that any single live member has the full committed event list. In effect, the protocol has a write quorum equal to the current membership view size, and a read quorum of 1. This can be convenient when building applications that maintain the actual state of a database or object and for which it is inconvenient to represent state as a series of updates in command lists that must be merged to learn the actual sequence of events.
It follows that Gbcast and Paxos can be transformed, each to the other, without changing assumptions and with the identical correctness properties. Obviously, the protocols don't look very similar, but they have a deep connection. Indeed, one can make a stronger claim: any sequence of delivery events exhibited by Gbcast can also arise in some run of Paxos, and vice versa: any sequence of learned events from Paxos can also arise in some run of Gbcast.
The type of proof outlined above is formally called a bi-simulation: one shows that any pair that one protocol can exhibit is also possible with the other protocol. Notice that in carrying out a bisimulation, features that one protocol supports but the other lacks can be ignored if they are not considered to be part of the "behavior" being studied. For example, the Gbcast reporting of new views are not treated as output events here.
Summary of differences between Paxos and Gbcast
- Gbcast has no durable state: the protocol does not maintain a log of events on disk, and durability is treated as an application-specific property. In contrast, Paxos guarantees durability: after recovering from a complete shutdown of the system, a Paxos application will still be able to learn the full log of received messages.
- In the propose phase, Gbcast must wait for responses from all participants, instead of making progress with the fastest quorum. In Gbcast, the cost of a failure suspicion is high and the protocol may cease to make progress if too many failures are suspected, forcing a management layer to restart the entire application group. Thus, in practice, Gbcast requires conservative timeout settings relative to Paxos.
- With Gbcast, if an error does occur, that process must drop out. With Paxos, if f>0, should a process be unable to participate in a protocol instance, it can continue to participate in subsequent protocol instances without error.
- Operational members of a view will never have gaps in their command lists with Gbcast. Operational members can have gaps in their command lists when using Paxos.
- With Paxos, to propose multiple commands we use alpha>1, but in this case commands can be committed in a different order from the order in which they were initiated. With Gbcast, the leader can initiate multiple commands by issuing a single propose that describes a series of actions. The group will be committed all at once, hence the order of initiation will be respected.
- With Gbcast, a command is delivered in the view in which it was initiated. Reconfigurable Paxos can commit a command in a slot associated with a membership view prior to the active membership view at the time when the commit occurs. Thus, in Paxos, if an application is in some way view sensitive, commands must carry a view identifier, so that recipients can determine whether or not the command is still executable.
- Gbcast does not require that the protocol be halted when changing configurations: the rate of new proposals can be constant even across membership changes. For many implementations of reconfigurable Paxos, this would not be the case.
- With both Gbcast and Paxos, reconfiguration is only possible if a quorum of the prior view is accessible and can acknowledge the new view. However, in Paxos, the requirement also extends to learning the outcomes of commands proposed for slots associated with the old view. In practice, this can cause the Paxos reconfiguration computation to extend over a longer period than for Gbcast, in which any state is stored within the application, not a long-lived command list: Paxos cannot discard the state associated with an old view until the new view is active and any replicas have learned the old state.
- Gbcast does not require a garbage collection protocol because, as each message or view is committed and reported to the application it can be discarded. Paxos maintains state using a quorum scheme in the command logs at its acceptors, and requires a garbage collection protocol to free these command slots once the outcome is committed and all learners have learned the outcome.
Liveness comparison
In Gbcast, progress will never resume if a circle of mutual suspicions arises, as noted above: once a quorum of mutually shunning processes arises, the shunning mechanism makes it impossible for any leader to obtain a quorum of promises.
With an Paxos protocol, this problem will not arise: once the excessive level of mutual suspicions ends, progress resumes. Thus Paxos makes progress with any failure-detection mechanism satisfying the <>W condition, even if periods arise during which more than a quorum of mutual suspicions occur.
For example, if we start with a group containing and cause an extended network partition, Paxos would resume when the network partition resolves but Gbcast will shut down permanently and some form of management infrastructure may need to restart the system. If it is necessary to preserve group state across the failure, such an infrastructure would identify the last member to fail and restart the group using some form of checkpoint stored by that last member.
In Paxos deployments, it is common to require human operator intervention to reconfigure Paxos. In such settings, Gbcast may be able to make progress during period when Paxos cannot. Suppose that a group has membership that slowly drops to less than a quorum of the original group size. Gbcast can continue to operate with even a single member. Paxos would cease to make progress during periods when less than a quorum of its view are active.
Need for state transfer
Systems such as Isis that implement Gbcast typically provide a state transfer mechanism: at the instant the new view showing some joining member is delivered, some existing member makes a checkpoint of its copy of the group state. This is then copied to the new member, which loads the checkpoint as the initial group state as of the instant it joined.. State transfer is needed because in Gbcast, once a member is dropped from a group, it will no longer receive updates. Gbcast is typically used by applications that maintain their state in memory and apply updates one by one as received, hence once a gap arises, a replica is no longer useful.Notice that this is in contrast to Paxos. In that protocol, gaps can arise as a consequence of the basic quorum update scheme, which doesn't guarantee that every member will see every update and can run over unreliable message passing layers that might never deliver some messages. The Paxos learner algorithm reads multiple histories and combines them to fill such gaps. Thus Paxos will normally ride out transient failures, continuing to operate without actually dropping the failed member from the group. The failed member misses updates, yet state transfer is not needed unless a group is being reconfigured.
Which dynamically reconfigurable state machine replication protocol came first?
The Gbcast protocol was published early in a period when several state machine protocols capable of managing their own membership were introduced: Gbcast, View-Stamped Replication, Basic Paxos, the partial synchrony protocol of Dwork, Lynch and Stockmeyer, etc. Among these, Gbcast was the first to be published, in papers that appeared in 1985 and 1987; the others were published starting in 1988. One could thus argue that Gbcast was really the first Paxos protocol. Such a statement, however, treats "Paxos" as a fairly broad term covering a family of protocols that all implement state machine replication, all support dynamic reconfiguration of their membership, and have identical correctness properties but vary in their liveness conditions. Under this definition, Gbcast is a Paxos protocol.If equivalence is formalized using bisimulation, in which any run that one protocol can exhibit is also exhibited by the other, and in which the assumptions made and the conditions for progress are identical, the comparison becomes more complex. Under this definition, Gbcast is not a Paxos protocol: although each can exhibit the same runs as the other, they have similar, but not identical, liveness conditions. However, this sort of stringent definition poses a different problem: if one adopts it, some versions of Paxos are not Paxos protocols. For example, "Cheap Paxos" and "Vertical Paxos" are not bisimulation-equivalent to Basic Paxos.
Thus the question has no answer unless one makes it more specific, and has a different answer depending upon the definition of equivalence one uses.