Graph traversal


In computer science, graph traversal refers to the process of visiting each vertex in a graph. Such traversals are classified by the order in which the vertices are visited. Tree traversal is a special case of graph traversal.

Redundancy

Unlike tree traversal, graph traversal may require that some vertices be visited more than once, since it is not necessarily known before transitioning to a vertex that it has already been explored. As graphs become more dense, this redundancy becomes more prevalent, causing computation time to increase; as graphs become more sparse, the opposite holds true.
Thus, it is usually necessary to remember which vertices have already been explored by the algorithm, so that vertices are revisited as infrequently as possible. This may be accomplished by associating each vertex of the graph with a "color" or "visitation" state during the traversal, which is then checked and updated as the algorithm visits each vertex. If the vertex has already been visited, it is ignored and the path is pursued no further; otherwise, the algorithm checks/updates the vertex and continues down its current path.
Several special cases of graphs imply the visitation of other vertices in their structure, and thus do not require that visitation be explicitly recorded during the traversal. An important example of this is a tree: during a traversal it may be assumed that all "ancestor" vertices of the current vertex have already been visited. Both the depth-first and breadth-first graph searches are adaptations of tree-based algorithms, distinguished primarily by the lack of a structurally determined "root" vertex and the addition of a data structure to record the traversal's visitation state.

Graph traversal algorithms

Note. — If each vertex in a graph is to be traversed by a tree-based algorithm, then the algorithm must be called at least once for each connected component of the graph. This is easily accomplished by iterating through all the vertices of the graph, performing the algorithm on each vertex that is still unvisited when examined.

Depth-first search

A depth-first search is an algorithm for traversing a finite graph. DFS visits the child vertices before visiting the sibling vertices; that is, it traverses the depth of any particular path before exploring its breadth. A stack is generally used when implementing the algorithm.
The algorithm begins with a chosen "root" vertex; it then iteratively transitions from the current vertex to an adjacent, unvisited vertex, until it can no longer find an unexplored vertex to transition to from its current location. The algorithm then backtracks along previously visited vertices, until it finds a vertex connected to yet more uncharted territory. It will then proceed down the new path as it had before, backtracking as it encounters dead-ends, and ending only when the algorithm has backtracked past the original "root" vertex from the very first step.
DFS is the basis for many graph-related algorithms, including topological sorts and planarity testing.

Pseudocode

procedure DFS is
label v as explored
for all edges e in G.incidentEdges do
if edge e is unexplored then
wG.adjacentVertex
if vertex w is unexplored then
label e as a discovered edge
recursively call DFS
else
label e as a back edge

Breadth-first search

A breadth-first search is another technique for traversing a finite graph. BFS visits the sibling vertices before visiting the child vertices, and a queue is used in the search process. This algorithm is often used to find the shortest path from one vertex to another.

Pseudocode

procedure BFS is
create a queue Q
enqueue v onto Q
mark v
while Q is not empty do
wQ.dequeue
if w is what we are looking for then
return w
for all edges e in G.adjacentEdges do
xG.adjacentVertex
if x is not marked then
mark x
enqueue x onto Q
return null

Applications

Breadth-first search can be used to solve many problems in graph theory, for example:
The problem of graph exploration can be seen as a variant of graph traversal. It is an online problem, meaning that the information about the graph is only revealed during the runtime of the algorithm. A common model is as follows: given a connected graph with non-negative edge weights. The algorithm starts at some vertex, and knows all incident outgoing edges and the vertices at the end of these edges—but not more. When a new vertex is visited, then again all incident outgoing edges and the vertices at the end are known. The goal is to visit all n vertices and return to the starting vertex, but the sum of the weights of the tour should be as small as possible. The problem can also be understood as a specific version of the travelling salesman problem, where the salesman has to discover the graph on the go.
For general graphs, the best known algorithms for both undirected and directed graphs is a simple greedy algorithm:
A universal traversal sequence is a sequence of instructions comprising a graph traversal for any regular graph with a set number of vertices and for any starting vertex. A probabilistic proof was used by Aleliunas et al. to show that there exists a universal traversal sequence with number of instructions proportional to for any regular graph with n vertices. The steps specified in the sequence are relative to the current node, not absolute. For example, if the current node is vj, and vj has d neighbors, then the traversal sequence will specify the next node to visit, vj+1, as the ith neighbor of vj, where 1 ≤ id.