# Depth-First Search (DFS) Algorithm

In this lesson, we'll take a look at one of the two complementary, fundamental and simplest algorithms for Graph traversal - ** Depth-First Search (DFS)**. It's the most commonly used algorithm alongside the related

**given their simplicity. After going over the main idea used for DFS, we'll implement it in Python on a Graph representation - an**

*Breadth-First Search (BFS)***.**

*adjacency list*### Depth-First Search - Theory

** Depth-First Search (DFS)** is an algorithm used to

*traverse*or locate a

*target*node in a graph or tree data structure. It priorities

*depth*and searches along one branch, as far as it can go - until the end of that branch. Once there, it

**backtracks**to the first possible divergence from that branch, and searches until the end of

*that*branch, repeating the process.

Given the nature of the algorithm, you can easily implement it recursively - and you can always implement a recursive algorithm iteratively as well:

The **start node** is the **root node** for *tree data structures*, while with more generic graphs - it can be any node.

DFS is widely-used as a part of many other algorithms that resolve graph-represented problems. From cycle searches, path finding, topological sorting, to finding articulation points and strongly connected components. The reason behind this widespread use of the DFS algorithm lies in its overall simplicity and easy recursive implementation.

### The DFS Algorithm

The DFS algorithm is pretty simple and consists of the following steps:

- Mark the current node as
**visited**. - Traverse the
**neighboring**nodes that*aren't visited*and recursively call the DFS function for that node.

The algorithm stops either when the target node is found, or the whole graph has been **traversed** (all nodes are visited).

**Note:** Since graphs can have **cycles**, we'll need a system to avoid them so we don't fall into infinity loops. That's why we "mark" every node we pass as **visited** by adding them to a `Set`

containing only unique entries.

By marking nodes as "visited", if we ever encounter that node again - we're in a loop! Endless computational power and time have been wasted on loops, lost in the aether.

#### Pseudocode

Given these steps, we can summarize DFS in pseudocode:

```
DFS(G, u):
# Input processing
u.visited = true
for each v in G.adj[u]:
if !v.visited:
DFS(G, v)
# Output processing
```

**Input** and **output** processing is performed depending on the purpose of the graph search. Our input processing for DFS will be checking if the **current** node is equal to the **target** node.

With this view, you can really start to appreciate just how simple yet useful this algorithm is.

### Depth-First Search - Implementation

The first thing we need to consider before diving into the implementation of the DFS algorithm itself is *how to implement a graph.* As in any other article from this series, we've opted to implement it using a fairly basic `Graph`

class. It contains a graph representation and a couple of methods you need to operate with graphs:

```
class Graph:
# Constructor
def __init__(self, num_of_nodes, directed=True):
self.m_num_of_nodes = num_of_nodes
self.m_nodes = range(self.m_num_of_nodes)
# Directed or Undirected
self.m_directed = directed
# Graph representation - Adjacency list
# We use a dictionary to implement an adjacency list
self.m_adj_list = {node: set() for node in self.m_nodes}
# Add edge to the graph
def add_edge(self, node1, node2, weight=1):
self.m_adj_list[node1].add((node2, weight))
if not self.m_directed:
self.m_adj_list[node2].add((node1, weight))
# Print the graph representation
def print_adj_list(self):
for key in self.m_adj_list.keys():
print("node", key, ": ", self.m_adj_list[key])
```

Let's just quickly recap how does the `Graph`

class work. The `__init__()`

method defines a constructor. It consists of *a number of nodes*, *a set of nodes*, and the *graph representation*. In this case, a graph is represented by an *adjacency list* - effectively a Python dictionary with each node of the graph set to be one key. A *set of adjacent nodes* is assigned to each node (key).

**Note:** An *adjacency list* is a type of graph representation in code, it consists of *keys* that represent each node, and a *set* of values for each of them containing nodes that are connected to the key node with an edge.

Using a dictionary for this is the easiest way to quickly represent a graph in Python, though you could also define your own `Node`

classes and add them to a `Graph`

instance instead.

As its name suggests, the `add_edge()`

method is used to add edges to the graph representation. Each edge is represented as in any usual adjacency list. For example, edge `1-2`

is represented by adding `2`

as the adjacent node of node `1`

in the adjacency list. Additionally, our implementation enables you to assign a *weight* to any edge.

Lastly, we've created the method that prints the graph representation - `print_adj_list()`

.

Now we can implement the DFS algorithm in the `Graph`

class. Depth-First Search implementation is usually **recursive** in code given how natural of a pair that is, but it can also be easily implemented non-recursively. We'll be using the recursive method:

```
def dfs(self, start, target, path = [], visited = set()):
path.append(start)
visited.add(start)
if start == target:
return path
for (neighbour, weight) in self.m_adj_list[start]:
if neighbour not in visited:
result = self.dfs(neighbour, target, path, visited)
if result is not None:
return result
path.pop()
return None
```

We added the start node to the beginning of our traversal path and marked it as **visited** by adding it to a set of `visited`

nodes. Then, we traversed the start node's neighbors that **aren't** already visited and called the function recursively for each of them. Recursive calls result in going as deep as we can along one "branch".

We then saved a reference to the result. In the case the function returns `None`

, that means that the target node was not found in this branch and that we should try another. If the recursive call, in fact, does not return `None`

, that means we have found our target node and we return the traversal path as result.

In the end, if we find ourselves outside of the `for`

loop, it means that all the neighbor branches of the current node have been visited and none of them lead to our target node. So, we remove the current node from the path and return `None`

as result.

### Running DFS

Let's illustrate how the code works through an example. As we've stated before, we'll be using `Graph`

class to represent the graph. Internally, it represents a graph as an adjacency list. Here's the graph we'll be using in the following example:

We'll be searching for a path from node `0`

to node `3`

, if it exists, the path will be saved into a set of visited nodes, called `traversal_path`

so we can reconstruct it for printing.

The first thing we need to do is to create an instance of the `Graph`

class. Our example graph is *undirected* and has 5 nodes, so we'll create its representation in the following way:

```
graph = Graph(5, directed=False)
```

This will create the instance of the `Graph`

representing undirected graph with 5 nodes. Next, we need to add all edges from the example graph into our graph representation:

```
graph.add_edge(0, 1)
graph.add_edge(0, 2)
graph.add_edge(1, 3)
graph.add_edge(2, 3)
graph.add_edge(3, 4)
```

Now, we've created the complete representation of the example graph. Let's take a look at how does the `Graph`

class internally store our example graph:

```
graph.print_adj_list()
```

Which outputs the following:

```
node 0 : {(1, 1), (2, 1)}
node 1 : {(0, 1), (3, 1)}
node 2 : {(0, 1), (3, 1)}
node 3 : {(1, 1), (4, 1), (2, 1)}
node 4 : {(3, 1)}
```

This shows the structure of the dictionary used to represent an adjacency list. It's perfectly in line with the implementation of an adjacency list provided in the lesson about graph representations in Python.

Finally, we can find a path from node `0`

to node `3`

:

```
traversal_path = []
traversal_path = graph.dfs(0, 3)
print(traversal_path)
```

This will output the found path:

```
[0, 1, 3]
```

Now it would be useful to take a look at the steps our algorithm took:

- Add node
`0`

to the traversal path and mark it as visited. Check if node`0`

is equal to target node`3`

, since it's not, continue and traverse its neighbors (`1`

and`2`

). - Is neighbor
`1`

visited? - No. Then, the algorithm calls the function recursively for that node.- Recursive call for node
`1`

: Add node`1`

to the traversal path and mark it as visited. Is`1`

equal to our target node`3`

? - No, continue and traverse its neighbors (`0`

and`3`

). - Is neighbor
`0`

visited? - Yes, move on to the next one. - Is neighbor
`3`

visited? - No, call the function recursively for this node.- Recursive call for node
`3`

: Add node`3`

to the traversal path and mark it as visited. Is`3`

equal to our target node`3`

? - Yes, the target node has been found, return the traversal path.

- Recursive call for node

- Recursive call for node

Current Node | Path | Visited |
---|---|---|

0 | [0] | {0} |

1 | [0, 1] | {0, 1} |

3 | [0, 1, 3] | {0, 1, 3} |

The algorithm stops and our program prints out the resulting traversal path from node `0`

to node `3`

:

```
[0, 1, 3]
```

After the search, the marked nodes on the graph represent the path we took to get to the target node:

In case there was no path between the start and target node, the traversal path would be **empty**.

**Note**: Graphs can also be **disconnected**, meaning that there are at least two nodes that cannot be connected by a path. In this case, DFS would ignore the nodes that it can't get to.

For example in this graph, if we were to start DFS from node `0`

to node `4`

, there would be no such path because it has no way of getting to the target node.