Graphs in Java: Minimum Spanning Trees - Prim's Algorithm

Graphs in Java: Minimum Spanning Trees - Prim's Algorithm

Introduction

Graphs are a convenient way to store certain types of data. The concept was ported from mathematics and appropriated for the needs of computer science.

Due to the fact that many things can be represented as graphs, graph traversal has become a common task, especially used in data science and machine learning.

How does Prim's Algorithm Work?

Prim's Algorithm was designed to find a Minimum Spanning Tree (MST) for a connected, weighted undirected graph. This means that the algorithm finds a "tree"(a structure that has no cycles) that connects all of the vertices via a subset of all available edges that have the smallest weight.

Alike Dijkstra's algorithm, Prim's is a greedy algorithm, but Prim's allows negative-weighted edges.

At the end of the algorithm, we'll loop through our array that contains the lowest-cost edges and add them up, getting the value of the MST within our graph.

We'll be discussing how every step of this algorithm works, but a rough sketch of the algorithm can be laid out. Assuming we have a weighted graph G with a set of vertices (nodes) V and a set of edges E:

  • We choose one of the nodes s as the starting node, and set the distance from s to s as 0.
  • We'll assign a number from node s to every other node, marking it as infinity at the beginning. This number will change and update as we progress along the algorithm.
  • Every node s will also have a number representing the "parent" node, from which we connect it in the MST. This number is initialized as -1, and every other node except the starting node will have a number different from -1 associated with it by the end of Prim's algorithm.
  • For every node s we'll find the minimum edge connecting a node that is not already included in the MST. Since Prim's is a greedy algorithm, once we enter the node we're sure that we've chosen the shortest path connecting it to it's parent. We repeat this step until all of the nodes are added to the MST.
  • Finally, we loop through the our MST array and add up the edges, getting the value of the MST.

Visualizing Prim's Algorithm

Let's quickly visualize a simple example - and manually use Prim's Algorithm to find a Minimum Spanning Tree on the following graph:

We'll have 5 nodes, numbered 0 through 4, and on each of the edges the number represents the weight of that edge. Let's describe the INF/-1 pair: -1 at the beginning represents the parent from which there is an edge connecting to the current node that is of weight INF. Of course, as the algorithm progresses, these values will also get updated.

Let's say that 0 will be our starting node. We mentioned earlier that when we choose our starting node, we need to set the distance from itself as 0. Since 0 is the node with the minimal edge to itself, we can safely assume that 0 belongs in the MST and we'll add it. After that little change the graph looks as follows:

White nodes represent the ones we added to the MST.

The next step is the one that makes Prim's algorithm what it is. We loop through all of the neighbors of the node 0, checking for a few things along the way:

  1. If the edge exists at all
  2. If the neighbor node is already added to the MST
  3. If the cost of the edge leading to the neighbor is lower than the current smallest-cost edge leading to that neighbor

The first neighbor of 0 is 1. The edge connecting them has a weight of 1. The edge exists, and the current node 1 is not in the MST, so the only thing left is to check if the edge from 0 to 1 is the smallest weighted edge leading to node 1. Obviously, 1 is less than INF, so we update the distance/parent pair of node 1 to 1/0.

We follow exactly the same steps for every other neighbor of node 0, after which we choose the node with the minimal edge weight to be added to the MST, and mark it blue. That node here is 1.

Now we have the following graph:

The node we're considering now is 1. As we've done with node 0, we check all of the neighbors of node 1.

Node 0 is already added to the MST, so we skip that one.

Node 2 is the next neighbor, and the weight of the edge leading to it from node 1 is 2. This edge has a smaller weight than the one that previously led to that node, which had weight of 5 and came from node 0.

The same is with the other neighbor node 4: the weight of the edge leading to it from node 1 is 1, and previously the smallest weighted edge leading to node 4 from node 0 was 4.

We choose the next node that isn't added to the MST and has the smallest weighted edge from node 1. That node here is node 4.

After the update we have the following graph:

As we consider node 4, we see that we can't update any of the current edges. Namely, both neighbors of node 4 already belong to the MST, so there is nothing to update there, and we just move along in the algorithm without doing anything in this step.

We continue looking for a node that is connected to a node belonging to the MST and has the smallest weighted edge possible. That node is currently 2, and it connects to node 1 via the edge that has the weight of 2. The graph looks as follows:

Both nodes 0 and 1 already belong in the MST, so the only possible node we can go to is 3. The weight of the edge leading to node 3 from node 2 is 4, which is obviously less than the previous 10 leading from node 0. We update that, getting the following graph:

With this, we've visited and added all of the existing nodes to the MST, and because Prim's is a greedy algorithm, this means we've found our MST.

Let's recollect; the edges that were added to the array that keeps track of our MST are the following:

  • Edge 0-1 of weight 1
  • Edge 1-2 of weight 2
  • Edge 1-4 of weight 1
  • Edge 2-3 of weight 4

All that's left is to add up all of the edges making up the MST, after which we get that the value of the MST for the graph in our example is 8, and we wrap up the execution of the algorithm here.

Free eBook: Git Essentials

Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!

The time complexity of Prim's algorithm is O((|E| + |V|)log|V|), where |E| is the number of edges in the graph, and |V| is the number of vertices(nodes) in the graph.

Implementing Prim's Algorithm in Java

With the general idea, and the visualization out of the way - let's implement Prim's algorithm in Java.

As usual, we'll be using the weighted graph implementation from our previous piece: Representing Graphs in Code. However, we'll need to slightly modify it to fit our needs in implementing Prim's algorithm.

In this guide we'll be using the adjacency matrix approach. Note that we can implement Prim's algorithm just as well using adjacency lists, but the matrix approach is just slightly easier, and the code becomes shorter and more readable.

An important thing to note for later on is that, when we've initialized our adjacency matrix, all of the places that don't have a weight assigned to them will automatically be initialized as 0.

Implementation of the Graph Class

First off, we'll start by adding three new arrays to our Graph class:

public class Graph {

    private int numOfNodes;
    private boolean directed;
    private boolean weighted;
    private double[][] matrix;
    
    private double[] edges;
    private double[] parents;
    private boolean[] includedInMST;
    
    private boolean[][] isSetMatrix;
   
	// ...
}

Let's briefly go over what each of these arrays represent:

  • edges represents an array that holds the values of edges belonging to the MST that connect a node to their parent.
  • parents gives us information about the parent of every single node.
  • includedInMST tells us if a node we're checking for already belongs in the MST.

Then, we'll add these to the constructor along with the previously declared variables:

public Graph(int numOfNodes, boolean directed, boolean weighted) {
    this.directed = directed;
    this.weighted = weighted;
    this.numOfNodes = numOfNodes;

    // Simply initializes our adjacency matrix to the appropriate size
    matrix = new double[numOfNodes][numOfNodes];
    isSetMatrix = new boolean[numOfNodes][numOfNodes];
    
    edges = new double[numOfNodes];
    parents = new double[numOfNodes];
    includedInMST = new boolean[numOfNodes];

    for(int i = 0; i < numOfNodes; i++){
        edges[i] = Double.POSITIVE_INFINITY;
        parents[i] = -1;
        includedInMST[i] = false;
    }
}

We've allocated numOfNodes space for each of our individual arrays. An important step here is the initialization:

  • The distance to every single node at the beginning is set to Double.POSITIVE_INFINITY. This essentially means that we haven't yet reached the node from any other node, hence the distance to it is Infinity. This number also represents Infinity as a data type in Java.
  • Since none of the nodes are reached when the algorithm begins, the parent of every single node is set to -1, indicating that the specific node has no parent it's reached from. The reason we can set the value of parents to -1 is that we label the nodes from 0 to n-1 where n is the number of nodes, so it logically makes no sense to have a node -1.
  • At the beginning of the algorithm, none of the nodes belong to the MST, so it's only logical to include none of them, i.e. set the value of every single member in includedInMST to false.

The addEdge() and printMatrix() methods stay the same, as they are both self-explanatory to what they do we won't go into that deeper.

However, we do require additional getters and setters that will allow us to change the aforementioned arrays. Those are the following:

public int getNumOfNodes() {
    return numOfNodes;
}

public double getEdges(int i) {
	return edges[i];
}

public void setEdges(double edge, int node) {
	this.edges[node] = edge;
}

public boolean getIncludedInMST(int i) {
	return includedInMST[i];
}

public void setIncludedInMST(int node) {
	this.includedInMST[node] = true;
}

public double[][] getMatrix() {
	return matrix;
}

public void setParents(double parent, int node) {
	this.parents[node] = parent;
}

public double getParents(int i) { 
   return parents[i]; 
}

If any of these getters/setters isn't intuitive - each of the getters and setters will additionally be explained as we use them when implementing Prim's algorithm.

With this, we've completed the adaptation of the implementation of a weighted Graph, and we can move on to the algorithm itself.

Implementation of Prim's Algorithm

With a Graph ready, we can go ahead and implement the algorithm that'll run on top of it. Let's initialize a Graph with a set of nodes and their edges. We'll use the same set of nodes and edges as in the visualization from an earlier section:

public class Prim {
    public static void main(String[] args){
        Graph graph = new Graph(5, false, true);

        graph.addEdge(0, 1, 1);
        graph.addEdge(0, 2, 5);
        graph.addEdge(0, 3, 10);
        graph.addEdge(0, 4, 4);
        graph.addEdge(1, 2, 2);
        graph.addEdge(1, 4, 1);
        graph.addEdge(2, 3, 4);
     	
        // ...
    }
}

Printing out this matrix using graph.printMatrix() outputs the following:

 /       1.0     5.0    10.0     4.0
 1.0     /       2.0     /       1.0
 5.0     2.0     /       4.0     /
10.0     /       4.0     /       /
 4.0     1.0     /       /       /

We also need a method named minEdgeNotIncluded() that finds the minimum weighted edge leading to a neighbor that isn't already included in the MST:

public static int minEdgeNotIncluded(Graph graph){
    double min = Double.POSITIVE_INFINITY;
    int minIndex = -1;
    int numOfNodes = graph.getNumOfNodes();

    for(int i = 0; i < numOfNodes; i++){
        if(!graph.getIncludedInMST(i) && graph.getEdges(i) < min){
            minIndex = i;
            min = graph.getEdges(i);
        }
    }
    return minIndex;
}

At the beginning, we set min to Infinity indicating that we haven't found the minimum edge yet. Variable minIndex represents the node that the minimum edge we're looking for connects to, and we initialize it to -1 at the beginning. Afterwards, we loop through all the nodes, looking for a node that isn't already included in the MST, after which we check if the edge connecting to that node is smaller than our current min edge.

Finally, we're ready to implement Prim's algorithm:

public class Prim {
    public static void main(String[] args){
        // Initialized and added the graph earlier
        
        int startNode = 0;
        // Distance from the start node to itself is 0
        graph.setEdges(0, startNode); 

        for(int i = 0; i < graph.getNumOfNodes()-1; i++){
            int node = minEdgeNotIncluded(graph);

            graph.setIncludedInMST(node);

            double[][] matrix = graph.getMatrix();
            for(int v = 0; v < graph.getNumOfNodes(); v++){
                if(matrix[node][v] != 0 && 
                   !graph.getIncludedInMST(v) && 
                   matrix[node][v] < graph.getEdges(v)){
                    graph.setEdges(matrix[node][v], v);
                    graph.setParents(node, v);
                }
            }
        }
        
        double cost = 0;
        for(int i = 0; i < graph.getNumOfNodes(); i++){
            if(i != startNode){
                cost += graph.getEdges(i);
            }
        }
        System.out.println(cost);
    }
}

The code itself might be a little confusing, so let's dive into it and explain what every section of it does.

First off, we choose our startNode to be 0. Remember, we need a node to start from, and that node could be any node from the set, but for this example it'll be 0. We set the distance from node 0 to itself to be 0.

In the for loop, for every single i from 0 to n-1 we look for a node s so that the edge i-s is the smallest edge from i. After we've found the corresponding node, since Prim's is a greedy algorithm we're sure that there is no smaller edge from node i to any other node besides s, so we add s to the MST.

The next thing is going through all of the neighbors of node s. Let's recall how non-initialized weights are treated in an adjacency matrix:

All of the places in our adjacency matrix that haven't been assigned a weight will automatically be initialized as 0.

This is important because any (negative or positive) number at the position matrix[i][j] indicates that an edge exists between nodes i and j, while 0 indicates the absence of it.

So, the conditions that need to be met for an edge (and a node) to be added to the MST are the following three:

  1. We check if the value matrix[i][j] is different than 0, and if it is we know the edge exists, and that value represents the weight between nodes i and j.
  2. We check if the neighbor has already been added to the MST. If so, we skip that node and go on to the next neighbor.
  3. If the value of the edge from node i to node j is smaller than the already existing value from a different node to node j, we update the pair of distance/parent to reflect the situation, i.e. distance becomes the value of the edge i-j and the parent from which we arrive at node j is node i.

That about sums up how Prim's algorithm works. All that is left to do is to go through edges array and add up all of the edges that make up the MST, finding its value. That is exactly what the last part of our code does, and stores the result in the cost variable.

Let's wrap up the algorithm with the output of the MST:

System.out.println("MST consists of the following edges:");
    for(int i = 1; i < graph.getNumOfNodes(); i++){
      System.out.println("edge: (" + (int)graph.getParents(i) + ", " + i + "), weight: " + graph.getEdges(i));
}

Let's run it and see the output:

MST consists of the following edges:
edge: (0, 1), weight: 1.0
edge: (1, 2), weight: 2.0
edge: (2, 3), weight: 4.0
edge: (1, 4), weight: 1.0

Conclusion

In this guide, we've covered and explained how to use Prim's Algorithm to find a Minimum-Spanning Tree(MST) in Java.

Prim's, alongside Kruskal's algorithm is one of the two most commonly used to solve this problem, which finds it's use in fields such as designing computer network, telecommunication networks and networks in general.

Last Updated: October 17th, 2021
Was this article helpful?

Improve your dev skills!

Get tutorials, guides, and dev jobs in your inbox.

No spam ever. Unsubscribe at any time. Read our Privacy Policy.

Want a remote job?

    Prepping for an interview?

    • Improve your skills by solving one coding problem every day
    • Get the solutions the next morning via email
    • Practice on actual problems asked by top companies, like:
     
     
     

    Make Clarity from Data - Quickly Learn Data Visualization with Python

    Learn the landscape of Data Visualization tools in Python - work with Seaborn, Plotly, and Bokeh, and excel in Matplotlib!

    From simple plot types to ridge plots, surface plots and spectrograms - understand your data and learn to draw conclusions from it.

    © 2013-2021 Stack Abuse. All rights reserved.