Graph Depth-First Search (DFS)
A graph is a data structure that represents a network. A graph consists of nodes called vertices (an intersection of connections), and connections between nodes called edges.
Each vertex in a graph tracks its own edges by maintaining an adjacency list of other connected vertices.
DFS vs. BFS
One method of traversing a graph is called Depth-First Search (DFS). In DFS, you search for a target vertex by following each possible unique path through the graph until the end of that path.
This is different from Breadth-First Search (BFS), where you consider each of the immediately-adjacent vertices in turn first before choosing to traverse to any one of them.
BFS looks at all possible immediately-adjacent vertices before choosing one (imagine fungus spreading in a petri dish), and DFS goes as far as it can for a given option before backtracking and considering another (imagine a bolt of lightning).
DFS is ideal when you need to determine whether a complete path to a target vertex exists, because you fully explore each path to see if it qualifies before moving to the next possible path. One example would be to determine if a maze contains a viable path from the entrance to the exit.
BFS is ideal when you’re looking for a target vertex which is likely to be nearby the starting node. An example might be traversing a social network to find someone who knows your friend.
Tracking progress with a set
For DFS, we’ll use an auxiliary data structure to track which nodes we’ve already visited. We need to know if we’ve already seen a vertex because a graph can be cyclical. A cycle in the graph is a path through that forms a loop and returns you to a node you’ve already seen.
A set is ideal for this, because we can check in constant time whether a reference to the current vertex exists in the set. If so, then we’ve already been here before and can skip it.
Pseudocode
- Initialize a set
already_visited
to track vertices we’ve already visited. - For a given starting vertex
V1
and a given target vertexVt
, - Check if
V1
isVt
. If so, return True because we’ve found the target. - Otherwise, check whether we’ve already visited
V1
by checking whether it’s in thealready_visited
set. If so, it’s not the target, so return False. - Otherwise, iterate over the vertices in
V1.adjacency_list
and recursively call this function on each in turn, returning the result.
This guarantees that we either return True if the target node Vt
exists in the graph, or False if it does not.
For an iterative solution, you can use a stack as another auxiliary data structure to simulate the call stack from a recursive approach.
Implementation
class Vertex():
def __init__(self, id, adjacency_list=None):
if adjacency_list is None:
adjacency_list = []
self.id = id
self.adjacency_list = adjacency_list
def add_edge_to(self, vertex):
self.adjacency_list.append(vertex)
class Graph():
def __init__(self):
self.vertices = {}
self.last_vertex_id = 0
def add_vertex(self, adjacency_list=None):
new_vertex_id = self.last_vertex_id + 1
self.last_vertex_id = new_vertex_id
new_vertex = Vertex(new_vertex_id, adjacency_list)
self.vertices[new_vertex_id] = new_vertex
return new_vertex
def remove_vertex_by_id(self, vertex_id):
if vertex_id not in self.vertices.keys():
return False
self.vertices.pop(vertex_id)
return True
def dfs(self, current_vertex, target_vertex_id, already_visited=None):
if already_visited is None:
already_visited = set()
# Are we already looking at the vertex we want?
if current_vertex.id == target_vertex_id:
return True
# Have we been to this vertex already?
if current_vertex.id in already_visited:
return False
# Mark that we've been here,
already_visited.add(current_vertex.id)
# Keep looking for the target ID until we reach the end of a recursive branch.
for vertex in current_vertex.adjacency_list:
if self.dfs(vertex, target_vertex_id, already_visited):
return True
return False