Introduction
What do we mean by ‘algorithmic complexity’? A good (but not fully formal) definition would be: “how many resources an algorithm uses in relation to quantitative properties of its input”.
And just to be clear, the definition of an algorithm: any well-defined procedure for solving a given problem.
So what kinds of resources could we quantify? The runtime of an algorithm is of course important, and how much memory/space an algorithm requires to operate often is as well. These are the two main resources considered when designing new algorithms - but we could also count the number of comparisons, array accesses, arithmetic operations performed, and so on.
The quantitative properties of the input matter here as well: the time it takes to find the first 10 digits of $\pi$ is obviously going to be less than the time for the first 100 000 digits of $\pi$.
Therefore, we can view complexity as a mathematical function $T$, which takes an input size $n$ and maps it to $T(n)$: the runtime/space required for an input of size $n$.
In pure mathematics (and computer science), the growth of such functions is described using ‘Big O notation’.
One thing to remember is that in the real world, the input size isn’t the only thing that determines the runtime. For example, two inputs of the same size $n$ don’t always take the same time in a given quicksort algorithm; it also depends on how the input is ordered.
That’s why computer scientists use Worst-case, Best-case, and Average-case complexities.
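To see why the cases differ, here is a minimal sketch in Python (`linear_search` and the sample data are illustrative, not from the text): counting the comparisons a linear search makes shows the best case (target is first) versus the worst case (target is absent).

```python
def linear_search(xs, target):
    """Count the comparisons a linear search makes before stopping."""
    comparisons = 0
    for x in xs:
        comparisons += 1
        if x == target:
            break  # found it: stop early
    return comparisons

data = [3, 1, 4, 1, 5, 9, 2, 6]
print(linear_search(data, 3))  # best case: target is first -> 1
print(linear_search(data, 7))  # worst case: target absent -> 8
```

The average case falls somewhere in between, which is why all three complexities are reported.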
As said before, the exact runtime of a program depends on many things - that’s why we introduce the so-called asymptotic complexity: the order of growth of the complexity function, which is shared by all complexity functions within the same ‘class’. This is usually what computer scientists mean when they say that an algorithm has a complexity of $\dots$.
A Computer Scientist’s Definition of Complexity
Let $f$ and $g$ be functions: $f$ has the order of growth of $g$ if there are positive constants $C$ and $n_0$ such that: $$ f(n) \leq C\ g(n) \text{ for all } n \geq n_0 $$
What this means, if we decipher the math, is that $f(n)$ is eventually bounded by $C\ g(n)$: after a certain point, $n_0$.
With this in our toolkit we can now use the O-notation: $$ f(n) \in \mathcal{O}(g(n)) $$
An Example
Say we have the function $f(n) = 13n + 37$. I claim that the (asymptotic) complexity of this function is $\mathcal{O}(n)$.
Proof
We need a $C$ and an $n_0$ such that $13n + 37 \leq C\ n$ for all $n \geq n_0$.
Pick $C = 14$ and $n_0 = 37$.
Then for $n \geq n_0 = 37$ we have $37 \leq n$, so $13n + 37 \leq 13n + n = 14n$.
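We can spot-check the proof numerically (a throwaway sketch, not part of the formal argument): the inequality $13n + 37 \leq 14n$ holds exactly when $37 \leq n$, so scanning from $n_0$ upward should find no violations.

```python
# Check 13n + 37 <= 14n for every n in [37, 10000);
# the bound holds exactly when 37 <= n, so none should fail.
C, n0 = 14, 37
violations = [n for n in range(n0, 10_000) if 13 * n + 37 > C * n]
print(violations)  # -> []

# Just below n0 the bound fails, so n0 = 37 is tight here:
print(13 * 36 + 37 > C * 36)  # -> True (505 > 504)
```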
Orders of growth
Anyone with a background in calculus soon realizes that the order of these Big O classes, from slowest-growing to fastest, is the following:
$$ \mathcal{O}(1) \newline \mathcal{O}(\log n) \newline \dots \newline \mathcal{O}(n) \newline \mathcal{O}(n \log n) \newline \mathcal{O}(n^2) \newline \dots \newline \mathcal{O}(2^n) \newline \dots $$
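To get a feel for how far apart these classes are, a small sketch (the function list and sample sizes are chosen arbitrarily) evaluates each class at a few input sizes and confirms the ordering:

```python
import math

# Representative functions for the common complexity classes.
fns = [
    ("1",       lambda n: 1),
    ("log n",   lambda n: math.log2(n)),
    ("n",       lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2",     lambda n: n ** 2),
    ("2^n",     lambda n: 2 ** n),
]

for n in (10, 20, 30):
    values = [f(n) for _, f in fns]
    # at these sizes each class already exceeds the previous one
    assert values == sorted(values)
    print(n, [round(v, 1) for v in values])
```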
Rules of (Asymptotic) Complexity
Some arithmetic rules for using the Big O Notation are the following:
Addition
$$ \mathcal{O}(f) + \mathcal{O}(g) = \mathcal{O}(f + g) = \mathcal{O}(\max(f,g)) = \max(\mathcal{O}(f), \mathcal{O}(g)) $$
Multiplication
$$ \mathcal{O}(f) \times \mathcal{O}(g) = \mathcal{O}(f \times g) $$
Multiplication by a constant
$$ C \times \mathcal{O}(f) = \mathcal{O}(f) $$
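As a quick worked example of these rules (a sketch using only classes from the list above): adding an $\mathcal{O}(n)$ step to an $\mathcal{O}(n^2)$ step, nesting an $\mathcal{O}(\log n)$ step inside an $\mathcal{O}(n)$ loop, and repeating an $\mathcal{O}(n)$ step five times give

$$ \mathcal{O}(n) + \mathcal{O}(n^2) = \mathcal{O}(\max(n, n^2)) = \mathcal{O}(n^2) \newline \mathcal{O}(n) \times \mathcal{O}(\log n) = \mathcal{O}(n \log n) \newline 5 \times \mathcal{O}(n) = \mathcal{O}(n) $$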
How to find the Complexity of code
Now that we understand what Big O notation is and how to operate with it, the next step is being able to find the complexity of code! The thorough way is to go through each line/operation in an algorithm and determine the complexity of each. This is very time-consuming, though - so, luckily, there are shortcuts.
A sequence of statements (arithmetic and logical operations, function calls, etc.) gives us addition between them.
func whats_the_complexity(int n):
    n = n + 5; // This takes O(1)
    n = n / 10; // This takes O(1)
    n = do_something(n, 100); // Say this takes O(n)
    return n
The total complexity of whats_the_complexity would in this case be: $\mathcal{O}(1) + \mathcal{O}(1) + \mathcal{O}(n)$, and by our addition rule that means a total (asymptotic) complexity of $\mathcal{O}(n)$.
Nested loops give multiplication.
func whats_the_complexity(int[][] arr):
    // Suppose arr is an n x n matrix
    sum = 0;
    for i in arr: // Takes O(n)
        for j in i: // Takes O(n)
            sum += j;
    for i in arr: // Takes O(n)
        sum += i[0];
    return sum
The total complexity of whats_the_complexity would in this case be: $\mathcal{O}(n) \times \mathcal{O}(n) + \mathcal{O}(n)$, and by our addition and multiplication rules that means a total (asymptotic) complexity of $\mathcal{O}(n^2)$.
Relatives of O-notation
What we have gone through is only a part of a bigger set of notations. The ‘real’ definitions are the following:
Let $f$ and $g$ be functions
$f(n) \in \mathcal{O}(g(n))$ if there are positive constants $C$ and $n_0$ such that: $$ f(n) \leq C\ g(n) \text{ for all } n \geq n_0 $$
Now let’s introduce:
$f(n) \in \Omega(g(n))$ if there are positive constants $c$ and $n_0$ such that: $$ f(n) \geq c\ g(n) \text{ for all } n \geq n_0 $$
$f(n) \in \Theta(g(n))$ if there are positive constants $c$, $C$ and $n_0$ such that: $$ c\ g(n) \leq f(n) \leq C\ g(n) \text{ for all } n \geq n_0 $$
In other words: $f(n) \in \Theta(g(n))$ if and only if $f(n) \in \mathcal{O}(g(n))$ and $f(n) \in \Omega(g(n))$.
What this means:
- $f(n) \in \mathcal{O}(g(n))$: the function $f(n)$ eventually has an upper bound
- $f(n) \in \Omega(g(n))$: the function $f(n)$ eventually has a lower bound
- $f(n) \in \Theta(g(n))$: the function $f(n)$ eventually has both a lower and an upper bound
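Tying this back to the earlier example (a numeric spot-check, reusing the constants from the proof above): $f(n) = 13n + 37$ is in $\Theta(n)$, since it is squeezed between $13n$ and $14n$ for all $n \geq 37$.

```python
def f(n):
    return 13 * n + 37

c, C, n0 = 13, 14, 37  # lower constant, upper constant, threshold
for n in range(n0, 10_000):
    # both bounds hold on the whole range, so f is Theta(n)
    assert c * n <= f(n) <= C * n
print("f(n) is bounded above and below on the tested range")
```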
Conclusion
This concludes the first part of our complexity journey. Complexity is, as we can see, a very powerful tool for identifying what kind of algorithm we’re working with and for getting a somewhat accurate answer to questions one can have about programs, for example the runtime of a program.
The next part will be about dynamic arrays and how we use them, especially how to implement stacks and queues.