# Breadth-First Search (BFS) Algorithm

In this lesson, we will go over the theory behind the algorithm and the Python implementation of *Breadth-First Search and Traversal*. First, we'll be focusing on **node search**, before delving into **graph traversal** using the BFS algorithm, as the two main tasks you can employ it for.

**Note:** We're assuming an adjacency-list implemented graph in the lesson.

### Breadth-First Search - Theory

**Breadth-First Search (BFS)** traverses the graph systematically, *level by level*, forming a BFS tree along the way.

If we start our search from node `v`

(the root node of our graph or tree data structure), the BFS algorithm will first visit all the **neighbors** of node `v`

(it's child nodes, on **level one**), *in the order that is given in the adjacency list*. Next, it takes the child nodes of those neighbors (**level two**) into consideration, and so on.

This algorithm can be used for both graph **traversal** and **search**. When searching for a node that satisfies a certain condition (target node), the path with the **shortest distance** from the starting node to the target node. The distance is defined as the number of branches traversed.

*Breadth-First Search* can be used to solve many problems such as finding the shortest path between two nodes, determining the levels of each node, and even solving puzzle games and mazes.

While it's not the most efficient algorithm for solving large mazes and puzzles - and it's outshined by algorithms such as Dijkstra's Algorithm and A* - it still plays an important role in the bunch and depending on the problem at hand - DFS and BFS can outperform their heuristic cousins.

### Breadth-First Search - Algorithm

When implementing BFS, we usually use a **FIFO** structure like a `Queue`

to store nodes that will be visited next.

**Note:** To use a `Queue`

in Python, we need to import the corresponding `Queue`

class from the `queue`

module.

We need to pay attention to not fall into *infinity loops* by revisiting the same nodes over and over, which can easily happen with graphs that have **cycles**. Having that in mind, we'll be keeping track of the nodes that have been **visited**. That information doesn't have to be explicitly saved, we can simply keep track of the **parent** nodes, so we don't accidentally go back to one after it's been visited.

To sum up the logic, the BFS Algorithm steps look like this:

- Add the root/start node to the
`Queue`

. - For every node, set that they don't have a defined parent node.
- Until the
`Queue`

is empty:- Extract the node from the beginning of the
`Queue`

. - Perform output processing.
- For every neighbor of the current node that
**doesn't**have a defined parent (is not visited), add it to the`Queue`

, and set the current node as their parent.

- Extract the node from the beginning of the

**Output processing** is performed depending on the purpose behind the graph search. When searching for a target node, output processing is usually testing if the current node is equal to the target node. This is the step on which you can get creative!

### Breadth-First Search Implementation - Target Node Search

Let's first start out with *search* - and search for a target node. Besides the target node, we'll need a start node as well. The expected output is a **path** that leads us from the start node to the target node.

With those in mind, and taking the steps of the algorithm into account, we can implement it. We'll define a `Graph`

class to "wrap" the implementation of the BFS search.

`Graph`

contains a graph representation - in this case an adjacency matrix, and all methods you might need when working with graphs. We'll implement both BFS search and BFS traversal as methods of that class:

```
from queue import Queue
class Graph:
# Constructor
def __init__(self, num_of_nodes, directed=True):
self.m_num_of_nodes = num_of_nodes
self.m_nodes = range(self.m_num_of_nodes)
# Directed or Undirected
self.m_directed = directed
# Graph representation - Adjacency list
# We use a dictionary to implement an adjacency list
self.m_adj_list = {node: set() for node in self.m_nodes}
# Add edge to the graph
def add_edge(self, node1, node2, weight=1):
self.m_adj_list[node1].add((node2, weight))
if not self.m_directed:
self.m_adj_list[node2].add((node1, weight))
# Print the graph representation
def print_adj_list(self):
for key in self.m_adj_list.keys():
print("node", key, ": ", self.m_adj_list[key])
```

After implementing a *wrapper* class, we can implement BFS search as one of its methods:

```
def bfs(self, start_node, target_node):
# Set of visited nodes to prevent loops
visited = set()
queue = Queue()
# Add the start_node to the queue and visited list
queue.put(start_node)
visited.add(start_node)
# start_node has not parents
parent = dict()
parent[start_node] = None
# Perform step 3
path_found = False
while not queue.empty():
current_node = queue.get()
if current_node == target_node:
path_found = True
break
for (next_node, weight) in self.m_adj_list[current_node]:
if next_node not in visited:
queue.put(next_node)
parent[next_node] = current_node
visited.add(next_node)
# Path reconstruction
path = []
if path_found:
path.append(target_node)
while parent[target_node] is not None:
path.append(parent[target_node])
target_node = parent[target_node]
path.reverse()
return path
```

When we're reconstructing the path (if it is found), we're going backward from the target node, through its parents, re-tracing all the way to the start node. Additionally, we **reverse** the path for our own intuition of going from the `start_node`

towards the `target_node`

, though, this step is optional.

On the other hand, if there is no path, the algorithm will return an empty list.

With the previously explained implementation in mind, we can test it by running the BFS search on the example graph:

Let's recreate this graph using our `Graph`

class. It is an undirected graph with 6 nodes, so we'll instantiate it as:

```
graph = Graph(6, directed=False)
```

Next, we need to add all edges of the graph to the instance of the `Graph`

class we've created:

```
graph.add_edge(0, 1)
graph.add_edge(0, 2)
graph.add_edge(0, 3)
graph.add_edge(0, 4)
graph.add_edge(1, 2)
graph.add_edge(2, 3)
graph.add_edge(2, 5)
graph.add_edge(3, 4)
graph.add_edge(3, 5)
graph.add_edge(4, 5)
```

Now, let's see how the `Graph`

class internally represents our example graph.

```
graph.print_adj_list()
```

This will print the adjacency list used to represent a graph we've created:

```
node 0 : {(3, 1), (1, 1), (4, 1), (2, 1)}
node 1 : {(0, 1), (2, 1)}
node 2 : {(0, 1), (1, 1), (5, 1), (3, 1)}
node 3 : {(0, 1), (5, 1), (4, 1), (2, 1)}
node 4 : {(0, 1), (5, 1), (3, 1)}
node 5 : {(3, 1), (4, 1), (2, 1)}
```

At this moment, we have created a graph and understand how it's stored as an adjacency matrix. With all that in mind, we can perform a search itself. Say we'd like to search for *node* `5`

starting from *node* `0`

:

```
path = []
path = graph.bfs(0, 5)
print(path)
```

Running this code results in:

```
[0, 3, 5]
```

After taking a quick look at the example graph, we can see that the shortest path between `0`

and `5`

is *indeed* `[0, 3, 5]`

. Though, you could also traverse `[0, 2, 5]`

and `[0, 4, 5]`

. These alternative paths are, fundamentally, the same distance as `[0, 3, 5]`

- however, consider how BFS compares nodes. It "scans" from *left to right* and `3`

is the first node on the left-hand side of the adjacency list that leads to `5`

, so this path is taken instead of the others.

This is a characteristic of BFS you'll want to anticipate. It'll search from left to right - and *won't* find an equally valid path if it's found "after" the first one.

**Note:** There are cases in which a path between two nodes cannot be found. This scenario is typical for **disconnected** graphs, where there are at least two nodes that are not connected by a path.

Here's what a disconnected graph looks like:

If we were to try and perform a search for a path between nodes `0`

and `3`

in this graph, that search would be unsuccessful, and an empty path would be returned.

### Breadth-First Implementation - Graph Traversal

**Breadth-First Traversal** is a special case of Breadth-First Search that traverses the **whole** graph, instead of searching for a target node. The algorithm stays the same as we've defined it before, the difference being that we don't check for a target node and we don't need to find a path that leads to it.

This simplifies the implementation significantly - let's just print out each node being traversed to gain an intuition of how it passes through the nodes:

```
def bfs_traversal(self, start_node):
visited = set()
queue = Queue()
queue.put(start_node)
visited.add(start_node)
while not queue.empty():
current_node = queue.get()
print(current_node, end = " ")
for (next_node, weight) in self.m_adj_list[current_node]:
if next_node not in visited:
queue.put(next_node)
visited.add(next_node)
```

**Note:** This method should be implemented as part of the `Graph`

class implemented before.

Now, let's define the following example graph in the previously shown way:

```
# Create an instance of the `Graph` class
# This graph is undirected and has 5 nodes
graph = Graph(5, directed=False)
# Add edges to the graph
graph.add_edge(0, 1)
graph.add_edge(0, 2)
graph.add_edge(1, 2)
graph.add_edge(1, 4)
graph.add_edge(2, 3)
```

Finally, let's run the code:

```
graph.bfs_traversal(0)
```

Running this code will print nodes in the order that BFS scanned them:

```
0 1 2 4 3
```

#### Step by step

Let's dive into this example a bit deeper and see how the algorithm works step by step. As we start the traversal from the start node `0`

, it is put into the `visited`

set and into the `queue`

as well. While we still have nodes in the `queue`

, we extract the first one, print it, and check all of its neighbors.

When going through the neighbors, we check if each of them is visited, and if not we add them to the `queue`

and mark them as visited:

Steps | Queue | Visited |
---|---|---|

Add start node 0 | [0] | {0} |

Visit 0, add 1 & 2 to Queue | [1, 2] | {0} |

Visit 1, add 4 to Queue | [2, 4] | {0, 2} |

Visit 2, add 3 to Queue | [4, 3] | {0, 1, 2} |

Visit 4, no unvisited neighbours | [3] | {0, 1, 1, 4} |

Visit 3, no unvisited neighbours | [ ] | {0, 1, 2, 4, 3} |

#### Time complexity

During **Breadth-First Traversal**, every node is visited exactly *once*, and every branch is also viewed *once* in case of a **directed** graph, that is, *twice* if the graph is **undirected**. Therefore, the time complexity of the BFS algorithm is ** O(|V| + |E|)**, where

*V*is a set of the graph's nodes, and

*E*is a set consisting of all of its branches (edges).

### Conclusion

In this lesson, we've explained the theory behind the Breadth-First Search algorithm and defined its steps.

We've depicted the Python implementation of both Breadth-First Search and Breadth-First Traversal, and tested them on example graphs to see how they work step by step. Finally, we've explained the time complexity of this algorithm.