Sunday, December 11, 2011

Recurrences in Computational Complexity

Manish Sharma asked me an interesting question over chat regarding recurrences and computational complexity estimation. The question was interesting enough that I thought that the answer would be of use to many more people in general so I'm posting it to my blog. In any case, a post to my blog was long overdue. So, here it is:

The original question was:
How do you find the computational complexity of the recurrence:
T(n) = a.T(n . r) + b.T(n (1 - r)) + O(n)
or, when pretty printed:

Well, technically speaking, this question is phrased slightly ambiguously, so I'm phrasing it as below:
T(n) = a T(n r) + b T(n (1-r)) + cn + d
a > 0, b > 0, c > 0, d belonging to R


Pretty print:
and while I'm at it, I'll point out to you that without loss of generality, I can impose the additional restriction that r is in the range of 0.5 to 1.0.
Pretty print:

This is possible because I can simply take a recurrence equation where this isn't true and transform it to a recurrence where this is true by swapping the values of a and b and replacing the value of r with (1-r).

Given that the restrictions that I've imposed onto the original recurrence are valid, here are a few observations:
  1. T(n) is atleast linear. This is because it already contains a cn term in the recurrence.
  2. r > (1-r) in all cases given the valid range of r. A corollary of this fact is that n r > n (1- r) is always valid.
  3. Since T(n) is atleast linear (and positive for n > 1 with a positive first derivative - which are the standard assumptions in such a problem), T(n r) > T(n (1 - r))
Now comes the tough part:

Case 1: T(n) is linear. 

Assume T(n) = A n + B with A > 0
Therefore,
T(n) = a T(nr) + b T(n(1-r)) + cn + d = a { A nr + B } + b { A n(1-r) + B} + cn = {Aar + Ab(1-r) + c} n + {aB + bB + d}

Equating parts of the equation:
Aar + Ab(1-r) + c = A
aB + bB + d = B

Simplifying,
A = c/(1 - ar - b(1-r))
B = d/(1 - (a+b))

To satisfy our original assumption, A > 0 - we require that 1 - ar - b(1-r) > 0. That means, if ar + b(1-r) < 1 then T(n) is linear. This, of course, need not be true for all valid values of a and b, so for those values, our linearity assumption is violated.

Case 2: T(n) is superlinear.

If T(n) is a superlinear increasing function of n, then for sufficiently large values of n,
there are 3 subcases:
  1. a T(nr) > b T(n(1-r)) for n > K1
  2. a T(nr) = b T(n(1-r)) for all n > K2
  3. a T(nr) < b T(n(1-r))  for n > K3
Subcase 1: a T(nr) > b T(n(1-r)) when n -> infinity then

T(n) < 2a T(nr) + cn + d

If 2a < 1 then the solution is O(n) - valid?
If 2a = 1 then the solution is O(n log n)
else the solution is O(n ^ log_(1/r)(2a))

Subcase 2: aT(nr) = b T(n(1-r)) for all n then


T(n) = 2a T(nr) + cn + d

If 2a < 1 then the solution is again O(n) (tight bound) - valid?
If 2a = 1 then the solution is again O(n log n) (tight bound)
else the solution is O(n^log_(1/r)(2a)) (tight bound)

Subcase 3: a T(nr) < b T(n(1-r)) then


the solutions are similar to subcase 1 except that r is replaced by 1-r and a is replaced by b.

Special case: When a = (1-r) and b = r.

The complexity should be O(n) under limit.

Note: Case 2 seems sufficient to solve the problem but I've kept Case 1 for clarity. I think there might be bugs in the analysis too. If you notice any, please comment!