
Introduction to Big-O

Asymptotic Time Complexity Analysis is a technique for determining the performance of an algorithm as the amount of data it operates on becomes infinitely large.


Asymptotic Time Complexity Analysis

T(n) for algorithm on an array

Let's analyze the following algorithm:

Source code for displaying odd numbers in an array
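The listing is not reproduced here; based on the later references to displayOdds(), nums.length, and nums[i], the method looks like the following sketch (the enclosing class name is arbitrary):

```java
public class OddPrinter {
    // Prints each odd value in the array. In the worst case (all values
    // odd), the body of the if statement runs n times, where n = nums.length.
    public static void displayOdds(int[] nums) {
        for (int i = 0; i < nums.length; ++i) {
            if (nums[i] % 2 == 1) {
                System.out.println(nums[i]);
            }
        }
    }
}
```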

Suppose that each individual operation in the method (declaring and initializing the loop counter, reading nums.length, comparing, indexing into the array, computing the remainder, incrementing, and printing) takes a fixed amount of time, which we label \( c_1 \) through \( c_9 \).

Let \( n \) be the length of the array. Then:

In general, we will be interested in the run time in the worst case; however, there are times when understanding the best-case or average-case run time is useful.1

In the worst case, all of the numbers are odd. In that case, we can calculate \( T(n) \) as: \[ T(n) = c_1 + c_2 + (c_3 + c_4) \times (n + 1) + (c_5 + c_6 + c_7 + c_8) \times n + (c_9 + c_6) \times n \]

which can be rewritten as:

\[ T(n) = c_1 + c_2 + c_3 + c_4 + (c_3 + c_4 + c_5 + 2 c_6 + c_7 + c_8 + c_9) \times n \]

or:

\[ T(n) = a + bn \]

where \( a = c_1 + c_2 + c_3 + c_4 \) and \( b = c_3 + c_4 + c_5 + 2 c_6 + c_7 + c_8 + c_9 \).
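One way to make the constants concrete is to instrument the method with an operation counter. The sketch below assumes, purely for illustration, that every \( c_i \) costs one unit of work; under that assumption the worst-case count comes out to exactly \( a + bn = 4 + 8n \):

```java
public class OpCounter {
    // Tallies one unit for each constant-time step of displayOdds,
    // mirroring the cost model above with every c_i assumed to be 1 unit.
    public static long countOps(int[] nums) {
        long ops = 2;                      // c1 + c2: declare and initialize i
        int i = 0;
        while (true) {
            ops += 2;                      // c3 + c4: read length, compare
            if (!(i < nums.length)) {
                break;                     // the loop check runs n + 1 times
            }
            ops += 4;                      // c5..c8: index, mod, compare, increment
            if (nums[i] % 2 == 1) {
                ops += 2;                  // c9 + c6: index again and print
            }
            ++i;
        }
        return ops;                        // worst case (all odd): 4 + 8n
    }
}
```

For an all-odd array of length 3 this returns \( 4 + 8 \times 3 = 28 \), and for an empty array it returns the constant \( a = 4 \).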

Big-O Notation

We say that \( T(n) = O(f(n)) \)2 if there exist two positive constants, \( c \) and \( n_0 \), such that:

\[ T(n) \leq c f(n) \]

for all \( n > n_0 \).
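For example, to verify that \( 5 + 3n = O(n) \), take \( f(n) = n \), \( c = 4 \), and \( n_0 = 5 \):

```latex
5 + 3n \leq 4n \iff 5 \leq n
```

so \( T(n) \leq c f(n) \) holds for every \( n > n_0 \). These constants are one choice among many; any \( c > 3 \) works with a suitably large \( n_0 \).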

Determining f(n) based on T(n)

Since Big-O notation relies on analysis as \( n\to\infty \), we will only care about the term that grows most rapidly as \( n\to\infty \). Some examples: \[ a = O(1) \] \[ a + b n = O(n) \] \[ a + b n + c n^2 = O(n^2) \] \[ a + b \log n = O(\log n) \]

Note: when using Big-O notation, we will always simplify \( f(n) \) as much as possible. As a result, you should never say \( O(2n) \) or \(O(1.8 + 3 n) \).

Growth Rates
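From slowest-growing to fastest-growing, the complexity classes that come up most often order as:

```latex
O(1) \;\subset\; O(\log n) \;\subset\; O(n) \;\subset\; O(n \log n) \;\subset\; O(n^2) \;\subset\; O(2^n)
```

An algorithm in a class further to the left will, for large enough \( n \), always outperform one further to the right.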

T(n) for algorithm on a List

Hopefully it is now clear that the displayOdds() method above is \( O(n) \): for any value of \( c \) slightly larger than \( b \), we have \( a + bn \leq cn \) for all sufficiently large \( n \).

Now suppose displayOdds() is modified as follows:

public static void displayOdds(List<Integer> nums) {
    for (int i = 0; i < nums.size(); ++i) {
        if (nums.get(i) % 2 == 1) {
            System.out.println(nums.get(i));
        }
    }
}

How does the analysis change?

It depends on the performance of nums.size() and nums.get(). If both are \( O(1) \) operations, then \( T(n) = a + bn \) as before, although \( a \) and \( b \) differ from the array version: the actual time to call nums.size() on a list differs from the time to read nums.length on an array, and similarly for nums.get(i) versus nums[i].

T(n) for algorithm on an ArrayList

If an ArrayList<Integer> is passed to displayOdds(), the time complexity remains \( O(n) \) because nums.size() is \( O(1) \) and nums.get() is \( O(1) \).
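A small driver sketch (the class name and sample data are illustrative) showing the List version running on an ArrayList; size() just reads a stored field and get(i) indexes the backing array, which is why both are constant time:

```java
import java.util.ArrayList;
import java.util.List;

public class Demo {
    public static void displayOdds(List<Integer> nums) {
        for (int i = 0; i < nums.size(); ++i) {
            if (nums.get(i) % 2 == 1) {
                System.out.println(nums.get(i));
            }
        }
    }

    public static void main(String[] args) {
        // ArrayList keeps its elements in a backing array, so size()
        // and get(i) are both O(1) and the loop remains O(n).
        List<Integer> nums = new ArrayList<>(List.of(4, 7, 10, 13));
        displayOdds(nums);   // prints 7 and 13
    }
}
```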

T(n) when nums.size() is not constant

Suppose the time it takes for nums.size() to complete is \( O(n) \). How does the time complexity for displayOdds() change?

Instead of reading nums.length in a constant \( c_4 \) amount of time, calling nums.size() now requires \( T_{size}(n) = \alpha + \beta n \) amount of time (since it is \( O(n) \)). Replacing each occurrence of \( c_4 \) in our equation for \( T(n) \) (it appears both in the constant term and in the coefficient of \( n \), because the loop check runs \( n + 1 \) times) results in:

\[ T(n) = c_1 + c_2 + c_3 + \alpha + \beta n + (c_3 + \alpha + \beta n + c_5 + 2 c_6 + c_7 + c_8 + c_9) \times n \]

which simplifies to:

\[ T(n) = c_1 + c_2 + c_3 + \alpha + (c_3 + \alpha + \beta + c_5 + 2 c_6 + c_7 + c_8 + c_9) \times n + \beta n^2 \]

or:

\[ T(n) = a' + b'n + \beta n^2 \]

where \( a' = c_1 + c_2 + c_3 + \alpha \) and \( b' = c_3 + \alpha + \beta + c_5 + 2 c_6 + c_7 + c_8 + c_9 \).

As a result, \( T(n) = O(n^2) \).

How do things change if nums.get() is \( O(n) \)?
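For a java.util.LinkedList, get(i) must walk node-by-node from the nearer end of the list, so each call is \( O(n) \); by the same substitution as above, the indexed loop becomes \( O(n^2) \). A sketch of the fix: iterating with an enhanced for loop (which uses the list's iterator) visits each node exactly once, restoring \( O(n) \):

```java
import java.util.LinkedList;
import java.util.List;

public class LinkedListOdds {
    // O(n) on a LinkedList: the iterator advances one node per element,
    // so the whole traversal touches each node exactly once, unlike
    // an indexed loop whose get(i) calls each re-walk from an end.
    public static void displayOdds(List<Integer> nums) {
        for (int num : nums) {
            if (num % 2 == 1) {
                System.out.println(num);
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> nums = new LinkedList<>(List.of(2, 9, 11, 14));
        displayOdds(nums);   // prints 9 and 11
    }
}
```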

1. Actually, the average-case is very relevant. However, doing that analysis is more complicated, so we will leave that for a different course.

2. This is really an abuse of mathematical notation. What we are really saying here is \( T(n) \in O(f(n)) \). That is, \( T(n) \) is in the set of functions for which we can find positive constants, \( n_0 \) and \( c \), such that \( T(n) \leq c f(n) \) for all \( n > n_0 \).