2007-12-01

## One Source Shortest Path: Dijkstra’s Algorithm

Posted by scvalex

In this article I describe Dijkstra’s algorithm for finding the shortest path from one source to all the other vertexes in a graph. Afterwards, I provide the source code in C of a simple implementation.

To understand this you should know what a graph is, and how to store one in memory. If in doubt check this and this.

Another solution for this problem is the Bellman-Ford algorithm.

#### The Problem

Given the following graph calculate the length of the shortest path from **node 1** to every other node.

Let’s take the nodes **1** and **3**. There are several paths (**1 -> 4 -> 3**, **1 -> 2 -> 3**, etc.), but the shortest of them is **1 -> 4 -> 2 -> 3** of length **8** (5 + 2 + 1, using the edge weights from the sample input file). Our job is to find it.

#### The Algorithm

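Dijkstra’s algorithm is one of the most common solutions to this problem. Even so, it only works on graphs which have **no edges of negative weight**, and its running time depends on how the closest unprocessed node is found: **O(n^2)** with a plain array scan, as in this article, or **O((n + m)*lg(n))** with a binary heap, where *m* is the number of edges.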

The idea is somewhat simple:

Take the length of the shortest path to all nodes to be infinity. Mark the length of the shortest path to the source as *0*.

Now, we already know that the graph has no edges of negative weight, so a path of length *0* is the best we can come up with. The path to the source has length *0*, so it’s optimal.

This algorithm works by making the paths to one more node optimal at each step. So, at the *k*th step, you know for sure that there are at least *k* nodes to which you know the shortest path.

At each step, choose the node, which is not yet optimal, but which is closest to the source; i.e. the node to which the current calculated shortest path is smallest. Then, from it, try to optimise the path to every node connected to it. Finally, mark the said node as *optimal* (visited, if you prefer). In the previous example, the node which is closest to the source and is not yet optimal is the source. From it, you can optimise the path to nodes *2* and *4*.

At this point, the only visited/optimal node is *1* (the source). Now we have to redo this step *4* more times (to ensure that all nodes are optimal).

The next node to consider is *4*:

It’s worthwhile to note that at this step, we’ve also found a better path to node *2*.

Next is node *2*:

Finally, we look at nodes *5* and *3* (neither of which offers any optimisations):
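The walkthrough above can be reproduced with a small sketch that records the order in which nodes become optimal. This is only an illustration under assumptions: the names `dijkstra_trace`, `NODES`, `INF` and `w` are mine, and the edge weights are copied from the sample input file given later in the article.

```c
#define NODES 5          /* nodes are numbered 1..NODES; index 0 is unused */
#define INF   1000000L   /* assumed to be larger than any real path length */

/* Edge weights from the sample dist.txt; w[u][v] == 0 means "no edge". */
const long w[NODES + 1][NODES + 1] = {
    {0},
    {0, 0, 10, 0, 5, 0},  /* 1 -> 2 (10), 1 -> 4 (5)             */
    {0, 0, 0, 1, 3, 0},   /* 2 -> 3 (1),  2 -> 4 (3)             */
    {0, 0, 0, 0, 0, 6},   /* 3 -> 5 (6)                          */
    {0, 0, 2, 9, 0, 2},   /* 4 -> 2 (2),  4 -> 3 (9), 4 -> 5 (2) */
    {0, 7, 0, 4, 0, 0},   /* 5 -> 1 (7),  5 -> 3 (4)             */
};

/*
 * Runs the O(n^2) algorithm from source s.
 * d[i] receives the shortest distance to node i;
 * order[k] receives the node that was marked optimal at step k (0-based).
 */
void dijkstra_trace(int s, long d[NODES + 1], int order[NODES])
{
    int visited[NODES + 1] = {0};
    int i, k, mini;

    for (i = 1; i <= NODES; ++i)
        d[i] = INF;
    d[s] = 0;

    for (k = 0; k < NODES; ++k) {
        /* choose the closest node that is not yet optimal */
        mini = -1;
        for (i = 1; i <= NODES; ++i)
            if (!visited[i] && (mini == -1 || d[i] < d[mini]))
                mini = i;

        visited[mini] = 1;
        order[k] = mini;

        /* try to optimise the path to every node connected to it */
        for (i = 1; i <= NODES; ++i)
            if (w[mini][i] && d[mini] + w[mini][i] < d[i])
                d[i] = d[mini] + w[mini][i];
    }
}
```

Calling `dijkstra_trace(1, d, order)` on this graph fills `order` with 1, 4, 2, 5, 3, which is the order of the steps described above. Nodes 2 and 5 are equally close after the second step; the scan keeps the first one it finds, so node 2 goes first.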

The actual code in C looks something like this:

    void dijkstra(int s) {
        int i, k, mini;
        int visited[GRAPHSIZE];

        for (i = 1; i <= n; ++i) {
            d[i] = INFINITY;
            visited[i] = 0; /* the i-th element has not yet been visited */
        }

        d[s] = 0;

        for (k = 1; k <= n; ++k) {
            mini = -1;
            for (i = 1; i <= n; ++i)
                if (!visited[i] && ((mini == -1) || (d[i] < d[mini])))
                    mini = i;

            visited[mini] = 1;

            for (i = 1; i <= n; ++i)
                if (dist[mini][i])
                    if (d[mini] + dist[mini][i] < d[i])
                        d[i] = d[mini] + dist[mini][i];
        }
    }

#### The Programme

Putting the above into context, we get the **O(n^2)** implementation. This works well for most graphs (it will **not** work for graphs with negative weight edges), and it’s quite fast.
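To see why negative weights are a problem, here is a minimal sketch on a small graph I made up for the purpose (it is not the article’s example): edges 1 -> 2 of weight 2, 1 -> 3 of weight 3, 2 -> 4 of weight 1 and 3 -> 2 of weight -2. The names `dijkstra_demo`, `N`, `INF` and `w` are assumptions of this sketch; the loop itself follows the same scheme as the function above.

```c
#include <limits.h>

#define N 4                 /* nodes are numbered 1..N; index 0 is unused */
#define INF (LONG_MAX / 2)  /* "infinity" that can still have a weight added */

/* Hypothetical graph with one negative edge; 0 means "no edge". */
const long w[N + 1][N + 1] = {
    {0},
    {0, 0, 2, 3, 0},   /* 1 -> 2 (2), 1 -> 3 (3) */
    {0, 0, 0, 0, 1},   /* 2 -> 4 (1)             */
    {0, 0, -2, 0, 0},  /* 3 -> 2 (-2)            */
    {0},
};

/* Same O(n^2) scheme as the article's dijkstra(); fills d[1..N] from s. */
void dijkstra_demo(int s, long d[N + 1])
{
    int visited[N + 1] = {0};
    int i, k, mini;

    for (i = 1; i <= N; ++i)
        d[i] = INF;
    d[s] = 0;

    for (k = 1; k <= N; ++k) {
        mini = -1;
        for (i = 1; i <= N; ++i)
            if (!visited[i] && (mini == -1 || d[i] < d[mini]))
                mini = i;
        visited[mini] = 1;

        for (i = 1; i <= N; ++i)
            if (w[mini][i] && d[mini] + w[mini][i] < d[i])
                d[i] = d[mini] + w[mini][i];
    }
}
```

Calling `dijkstra_demo(1, d)` leaves `d[4]` at 3, although the path 1 -> 3 -> 2 -> 4 has length 3 - 2 + 1 = 2. Node 2 is marked optimal with distance 2 and its edge to node 4 is used at that moment; the cheaper route to node 2 through the negative edge is only discovered afterwards, too late to be passed on to node 4.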

Here’s the source code in C (dijkstra.c):

    #include <stdio.h>

    #define GRAPHSIZE 2048
    #define INFINITY (GRAPHSIZE * GRAPHSIZE) /* assumes every path is shorter than this */
    #define MAX(a, b) (((a) > (b)) ? (a) : (b))

    int e; /* The number of nonzero edges in the graph */
    int n; /* The number of nodes in the graph */
    long dist[GRAPHSIZE][GRAPHSIZE]; /* dist[i][j] is the distance between node i and j; or 0 if there is no direct connection */
    long d[GRAPHSIZE]; /* d[i] is the length of the shortest path between the source (s) and node i */

    void printD() {
        int i;

        for (i = 1; i <= n; ++i)
            printf("%10d", i);
        printf("\n");
        for (i = 1; i <= n; ++i) {
            printf("%10ld", d[i]);
        }
        printf("\n");
    }

    void dijkstra(int s) {
        int i, k, mini;
        int visited[GRAPHSIZE];

        for (i = 1; i <= n; ++i) {
            d[i] = INFINITY;
            visited[i] = 0; /* the i-th element has not yet been visited */
        }

        d[s] = 0;

        for (k = 1; k <= n; ++k) {
            /* find the closest node that has not been visited yet */
            mini = -1;
            for (i = 1; i <= n; ++i)
                if (!visited[i] && ((mini == -1) || (d[i] < d[mini])))
                    mini = i;

            visited[mini] = 1;

            /* try to shorten the path to every neighbour of mini */
            for (i = 1; i <= n; ++i)
                if (dist[mini][i])
                    if (d[mini] + dist[mini][i] < d[i])
                        d[i] = d[mini] + dist[mini][i];
        }
    }

    int main(int argc, char *argv[]) {
        int i, j;
        int u, v, w;

        FILE *fin = fopen("dist.txt", "r");
        if (fin == NULL) {
            perror("dist.txt");
            return 1;
        }
        fscanf(fin, "%d", &e);
        /* clear the whole matrix, not only the first e rows and columns */
        for (i = 0; i < GRAPHSIZE; ++i)
            for (j = 0; j < GRAPHSIZE; ++j)
                dist[i][j] = 0;
        n = -1;
        for (i = 0; i < e; ++i) {
            fscanf(fin, "%d%d%d", &u, &v, &w);
            dist[u][v] = w;
            n = MAX(u, MAX(v, n)); /* the largest label seen is the node count */
        }
        fclose(fin);

        dijkstra(1);

        printD();

        return 0;
    }

And here’s a sample input file (dist.txt):

    10
    1 2 10
    1 4 5
    2 3 1
    2 4 3
    3 5 6
    4 2 2
    4 3 9
    4 5 2
    5 1 7
    5 3 4

The graph is given as an edge list:

- the first line contains *e*, the number of edges
- the following *e* lines contain *3* numbers: *u*, *v* and *w*, signifying that there’s an edge from *u* to *v* of weight *w*
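The program above trusts the input file completely. If you want to read this format a little more defensively, something along these lines could work. It is a minimal sketch, not part of dijkstra.c: `read_edge_list` and `MAXNODES` are names I made up, and the small cap on node labels is only there to keep the example short.

```c
#include <stdio.h>

#define MAXNODES 64  /* hypothetical cap on node labels for this sketch */

/*
 * Reads an edge list (first e, then e lines of "u v w") from `in` into w,
 * where w[u][v] is the weight of edge u -> v and 0 means "no edge".
 * Returns the largest node label seen, or -1 if the input is malformed.
 */
int read_edge_list(FILE *in, long w[MAXNODES + 1][MAXNODES + 1])
{
    int e, i, j, u, v, n = 0;
    long weight;

    for (i = 0; i <= MAXNODES; ++i)
        for (j = 0; j <= MAXNODES; ++j)
            w[i][j] = 0;

    if (fscanf(in, "%d", &e) != 1 || e < 0)
        return -1;      /* missing or negative edge count */

    for (i = 0; i < e; ++i) {
        if (fscanf(in, "%d %d %ld", &u, &v, &weight) != 3)
            return -1;  /* fewer than e complete edges */
        if (u < 1 || u > MAXNODES || v < 1 || v > MAXNODES)
            return -1;  /* label would index outside the matrix */
        w[u][v] = weight;
        if (u > n) n = u;
        if (v > n) n = v;
    }
    return n;
}
```

The return value plays the role of *n* in the program: the node count is taken to be the largest label that appears in any edge, so a node with no edges at all and a label above every other one would go unnoticed.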

That’s it. Good luck and have fun. Always open to comments.

#### Finding the shortest path

**UPDATE** In response to **campOs**’ comment.

Now we know the distance between the source node and any other node (the distance to the ith node is remembered in **d[i]**). But suppose we also need the path (which nodes make up the path).

Look at the above code. Where is **d** modified? Where is the recorded distance between the source and a node modified? In two places:

Firstly, **d[s]** is initialised to be *0*.

    d[s] = 0;

And then, when a new shortest path is found, **d[i]** is updated accordingly:

    for (i = 1; i <= n; ++i)
        if (dist[mini][i])
            if (d[mini] + dist[mini][i] < d[i])
                d[i] = d[mini] + dist[mini][i];

The important thing to notice here is that **when you update the shortest distance to node i, you know the previous node in the path to i**. This is, of course, **mini**. This suggests the solution to our problem.

For every node **i** other than the source, remember not only the distance to it, but also the previous node in the path to it. Thus we have a new array, **prev**.

Now, we need to make two modifications.

First, we initialise the value of **prev[i]** to something impossible (say *-1*) at the start of **dijkstra()**.

    for (i = 1; i <= n; ++i) {
        d[i] = INFINITY;
        prev[i] = -1; /* no path has yet been found to i */
        visited[i] = 0; /* the i-th element has not yet been visited */
    }

Secondly, we update the value of **prev[i]** every time a new shortest path is found to i.

    for (i = 1; i <= n; ++i)
        if (dist[mini][i])
            if (d[mini] + dist[mini][i] < d[i]) {
                d[i] = d[mini] + dist[mini][i];
                prev[i] = mini;
            }

Good. For every node reachable from the source we know which node is just before it in the shortest path. For the above example, we would have the following array:

    i - prev[i]
    1 - -1
    2 - 4
    3 - 2
    4 - 1
    5 - 4

Using this, how do you get the path? Let’s say you want to get to *3*. Which node comes right before *3*? Node *2*. Which node comes right before node *2*? Node *4*. Which node comes before *4*? Node *1*. We’ve reached the source, so we’re done. Go through this list backwards and you get the path: *1 -> 4 -> 2 -> 3*. This is easily implemented with recursion.

    void printPath(int dest) {
        if (prev[dest] != -1)
            printPath(prev[dest]);
        printf("%d ", dest);
    }
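If you need the path as data rather than printed output, the same walk backwards through **prev** can be done with a loop. This is a minimal sketch under my own assumptions: `build_path` and `NODES` are made-up names, and **prev** is passed as a parameter instead of being a global as in the article.

```c
#define NODES 5  /* nodes are numbered 1..NODES; index 0 is unused */

/*
 * Writes the path ending at dest into path[], source first, and returns
 * the number of nodes in it. prev[i] is the node just before i, or -1 for
 * the source and for nodes no path was found to.
 * path[] must have room for NODES entries.
 */
int build_path(const int prev[NODES + 1], int dest, int path[NODES])
{
    int len = 0, i, node;

    /* walk backwards from dest, collecting nodes in reverse order */
    for (node = dest; node != -1; node = prev[node])
        path[len++] = node;

    /* reverse in place so that the source comes first */
    for (i = 0; i < len / 2; ++i) {
        int tmp = path[i];
        path[i] = path[len - 1 - i];
        path[len - 1 - i] = tmp;
    }
    return len;
}
```

With the **prev** array from the table above and `dest` set to 3, this fills `path` with 1, 4, 2, 3 and returns 4. Like `printPath()`, it cannot tell the source apart from an unreachable node (both have **prev** equal to -1), so for an unreachable destination it yields just that node; check **d[dest]** against `INFINITY` first if that matters.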

Here is the updated source: dijkstraWithPath.c.

Good luck.