The asymptotic complexity of a loop that does at most \(N\) constant-time operations is \(O(N)\). \(M\) such loops will then take \(O(M * N)\) time. But not always: if you’re careful, you can sometimes do \(M\) such \(O(N)\) loops in only \(O(M)\) total time.
As an example, we’re going to look at a binary counter. Concretely, we have an array of \(N\) bits representing a number in binary form. We will analyze the asymptotic complexity of incr(), which increments the binary counter.
incr(0000_0000) = 0000_0001
incr(0000_0001) = 0000_0010
incr(0000_0010) = 0000_0011
incr(0000_0011) = 0000_0100
The algorithm for incr() is deceptively simple. Start at the right-most digit. As long as the current digit is \(1\), set it to \(0\) and move left. When we reach a \(0\), set it to \(1\) and stop. If we reach the left end of the array before finding a \(0\), we’ve overflowed.
def incr(counter):
    i = len(counter) - 1
    # Clear trailing 1s, moving right to left.
    while i >= 0 and counter[i] == 1:
        counter[i] = 0
        i -= 1
    if i >= 0:
        # Found a 0: set it and stop.
        counter[i] = 1
    else:
        # Walked past the left end: every bit was 1.
        raise OverflowError()
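For instance, a quick way to reproduce the example calls from above is to drive incr() by hand (a minimal sketch; the list-of-bits representation, most significant bit first, is an assumption about how the counter is stored):

counter = [0] * 8                              # 0000_0000
incr(counter)                                  # 0000_0001
incr(counter)                                  # 0000_0010
incr(counter)                                  # 0000_0011
print("".join(str(bit) for bit in counter))    # prints 00000011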
Since the loop executes \(N\) steps in the worst case, the complexity of incr() is \(O(N)\). Therefore, \(M\) calls to incr() will have \(O(M * N)\) complexity, right? Strictly speaking, yes, but by counting more carefully, we can tighten the bound to \(O(M)\).
When figuring out the complexity, we normally count the cost of each operation as \(1\): at each iteration, the loop does \(2\) comparisons (i >= 0 and counter[i] == 1) and at most \(2\) assignments (counter[i] = 0 and i -= 1). So, each loop iteration costs \(O(2 + 2) = O(1)\), and \(N\) of them cost \(O(N * 1) = O(N)\).
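To make this counting concrete, here is a sketch of an instrumented variant of incr() (the name incr_counted and the exact tallying are illustrative, not part of the original algorithm) that adds one unit per comparison and per assignment:

def incr_counted(counter):
    # Tally unit-cost operations: one per comparison, one per assignment.
    ops = 0
    i = len(counter) - 1
    while True:
        ops += 2                     # while check: i >= 0 and counter[i] == 1
        if i < 0 or counter[i] != 1:
            break
        counter[i] = 0
        i -= 1
        ops += 2                     # loop body: counter[i] = 0 and i -= 1
    if i >= 0:
        ops += 2                     # if check i >= 0 and assignment counter[i] = 1
        counter[i] = 1
    else:
        raise OverflowError()
    return ops

# Worst case short of overflow: seven trailing 1s cost 7 * 4 + 4 = 32 units.
print(incr_counted([0, 1, 1, 1, 1, 1, 1, 1]))   # 32

On an \(N\)-bit counter whose \(N - 1\) rightmost bits are \(1\), this reports \(4(N - 1) + 4 = 4N\) units, which matches the \(O(N)\) worst case.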
If we look at the loop more closely, we see that it doesn’t always do \(N\) iterations. In fact, it does exactly as many iterations as there are trailing \(1\)s at the right of the array. In other words, a call to incr() always does zero or more runs through the first sequence i >= 0, counter[i] == 1, counter[i] = 0, i -= 1, and then exactly one run through the second sequence i >= 0, counter[i] == 1, i >= 0, counter[i] = 1. The most important detail is that the number of times it goes through the first sequence is bounded above by the number of times it previously went through the second one, across all earlier calls.
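This claim is easy to check empirically. Below is a sketch (the helper incr_tallied and the tally dictionary are made up for illustration) that counts both kinds of runs across many calls and asserts the bound after each one:

def incr_tallied(counter, tally):
    # Same algorithm as incr(), but count runs through each sequence:
    # "first" is a loop iteration, "second" is the final bit set to 1.
    i = len(counter) - 1
    while i >= 0 and counter[i] == 1:
        counter[i] = 0
        i -= 1
        tally["first"] += 1
    if i >= 0:
        counter[i] = 1
        tally["second"] += 1
    else:
        raise OverflowError()

counter = [0] * 16
tally = {"first": 0, "second": 0}
for _ in range(1000):
    incr_tallied(counter, tally)
    # A 1 can only be cleared after some earlier call set it, so the
    # first sequence can never outrun the second one.
    assert tally["first"] <= tally["second"]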
With this in mind, let’s revise our way of counting operations such that we pay for some of them in advance. More precisely, we now count executing counter[i] = 1 as \(5\) instead of as \(1\). Since \(O(5) = O(1)\), we haven’t changed the asymptotic complexity of incr(), but if we consider multiple invocations of incr(), we can apply the following reasoning. A call to incr() consists of zero or more runs through the first sequence, each of cost \(4\), and one final run through the second sequence of revised cost \(3 + 5 = 8\). But, since we started with an array of \(0\)s and only the second sequence sets elements to \(1\), we know that every run through the first sequence must have been preceded by a run through the second one. Furthermore, since we’re overpaying the cost of the second sequence by exactly the cost of the first one, we claim that every run through the first sequence is prepaid. So, the amortized cost of the loop is just \(O(1)\), and, by extension, the amortized cost of incr() is also \(O(1)\).
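Under this revised cost model we can track the prepaid amount explicitly. The sketch below (the bank dictionary and the constant \(4\) are illustrative; \(4\) is exactly the \(5 - 1\) overcharge on counter[i] = 1) deposits credit whenever a bit is set and spends it on every loop iteration; the balance never goes negative, so across \(M\) calls the total actual work is bounded by a constant times \(M\):

def incr_prepaid(counter, bank):
    # Accounting sketch: each counter[i] = 1 is charged 5 (1 real + 4 banked),
    # and every loop iteration spends 4 credits deposited by an earlier call.
    i = len(counter) - 1
    while i >= 0 and counter[i] == 1:
        counter[i] = 0
        i -= 1
        bank["credit"] -= 4          # the first sequence is paid from the bank
        assert bank["credit"] >= 0   # never spends credit it doesn't have
    if i >= 0:
        counter[i] = 1
        bank["credit"] += 4          # the 5 - 1 overcharge on counter[i] = 1
    else:
        raise OverflowError()

counter = [0] * 16
bank = {"credit": 0}
for _ in range(1000):
    incr_prepaid(counter, bank)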
To recap, we started by analyzing the complexity of a function. We noticed that its execution follows a very specific pattern. We decided to overpay part of the function’s invocation cost such that future invocations will already be partially paid for. In other words, we amortized the cost of the function across multiple calls to it. Finally, we noticed that by doing this, the complexity per call dropped from \(O(N)\) to \(O(1)\).
It’s like magic: a sleight of hand and the \(N\) is gone. But, as with real magic, the \(N\) isn’t actually gone, and we haven’t actually made the function any faster. What we have done is improve our cost model for it.
This is amortized analysis, a way to improve our understanding of a function’s cost. More precisely, the prepayment argument above is the accounting method of amortized analysis, where some operations are charged extra to pay for later ones in advance. For a more formal presentation of this and other kinds of amortized analysis, consult Introduction to Algorithms.