diff options
-rw-r--r-- | 04-08.md | 182 |
1 files changed, 182 insertions, 0 deletions
diff --git a/04-08.md b/04-08.md new file mode 100644 index 0000000..e4c4111 --- /dev/null +++ b/04-08.md @@ -0,0 +1,182 @@ +## Outline +- Concept + - Data Structure + - Code Efficiency +- Big-O + +## What is Data Structure? + +- How data is organized in memory (or on disk) + +|Data|Structure| +|----|---------| +|Atomic or composite data|A set of rules that holds the data together| + +- Examples: + - Linked List + - Array + +## Why do we care about Data Structure? + +- Efficiency can refer to either + - Space (importance depending on architecture) + - or Time +- We will focus on **time**, So, we want our operations efficient in terms of time + +## How to measure efficiency? + +- Wall-clock time? + - Not so great + - Not that accurate + - can't compare across computer or programmers + - dependent on circumstances and what's running + +- Count the number of operations! + +### What kind of Operations? + +- insert (add) +- delete (remove) +- access/retrieval +- min/max +- union/merge (less common) + +## A Simple Example + +- We assume each basic operation takes one step +- Example: Initialize an array (a loop setting `a[i]=0`) +- Count the steps ... how many times is ... + +Using a for loop: +``` +for(i = 0; i<n; i++){ + a[i] = 0; +} +``` +- Break down the steps: + 1. `i = 0` (done once) + 2. `i < n` (done 'n+1' times) + 3. `a[i] = 0` (done 'n' times) + 4. `i++` (done 'n' times) +- The total number of steps is `1 + (n+1) + n + n` = `3n+2` + +- This is tedious and error prone, so we will teach how to count and be lazy! + +## When do we care? + +- Turns out we care about what happens **when n is big**, i.e., how our alogirthm behaves as n gets large + +- When n is a million, 2 doesn't matter, 2000 doesn't matter, c doesn't matter for sufficiently larger n + +- We only care about the one with the highest impact on efficiency + +## Big-O Notation + +- To classify the cost of an algorithm we use the big-O notation + - it's called that because we use a capital 'O' + +- To compute the big-O runtime of a function (mathematical function): + - drop all but the fastest-growing term + - drop any coefficient on that remaining term +- So, e.g., `3n+2` is O(n) + - Drop the 2 (not the fastest growing term) and the 3 (coefficient of remaining term) + +- Example: `5n^2 + 3n + 5` -> `5n^2` -> `n^2` + - Would be represented as O(n^2) + +- Example: `600 + 30n` -> `30n` -> `n` + - Would be represented as O(n) + +- Example: `15n + 12nlog(n) + 3` -> `12nlog(n)` -> `nlog(n)` + - Would be represented as O(nlog(n)) + +- Example: `14n^2 + 0.05n^3 + 18000000` -> `0.05n^3` -> `n^3` + - Would be represented as O(n^3) + +## Apply Big-O on Code Analysis + +- Let's analyze some loops and compute the big-O runtimes + - `for(i=0; i<n; i++) x++;` + - `3n+2` -> `n` -> O(n) + - `for(i=n; i>0; i--) x++;` + - O(n) + - `for(i=0; i<(n/2); i++) x++;` + - cut the threshold by half + - O(n/2) is **not** the finalized runtime + - O(n) is the finalized runtime + - `for(i=0; i<n; i=i+2) x++;` + - Double the step size + - O(n/2) is **not** the finalized runtime + - O(n) is the finalized runtime + +## Big-O Add + +- O(n) + O(n) = ? + - `for(i=0; i<n; i++) x++; O(n)` + - `for(i=0; i<n; i++) x++; O(n)` + - O(n) + O(n) = O(n + n) = O(2n) -> O(n) + +- `for(i = 1; i < n; i=i*2) x++` + - Break it down: + - iteration 1: i=1 + - iteration 2: i=2 + - iteration 3: i=4 + - iteration 4: i=8 + - iteration 5: i=16 + - and so on, until i<n + - Notice: iteration x: i=2^(x-1) + - The total number of iterations we need to run is x = (log2(n)+1) + - O(x) = O(log2(n)+1) -> O(log2(n)) -> O(log(n)) + - log base value doesn't matter + +- Example: `for(i=1; i<n; i=i*100) x++;` + - O(log100(n)) -> O(log(n)) +- Example: `for(i=1; i<n; i=i/5) x++;` + - O(log(n)) + +Example: Nested For Loop + +``` +for(i=0; i<n; i++){ + for(j=0; j<n; j++){ + x++; + } +} +``` + +- How many times does this need to run? + +|iteration number|outer loop|inner loop| +|----------------|----------|----------| +|1 |i=0 |n | +|2 |i=1 |n | +|3 |i=2 |n | +|n |i=n-1 |n | + +- You get O(n * n) = O(n^2) + - Essentially, you multiply a big-O (outer loop) by another big-O (inner loop) + +Example: Another Nested For Loop + +``` +for(i=1; i<=n; i++){ + for(j=1; j<1; j++){ + x++; + } +} +``` + +- These loops are no longer independent + +|iteration number|outer loop|inner loop| +|----------------|----------|----------| +|1 |i=1 |0 | +|2 |i=2 |1 | +|3 |i=3 |2 | +|n |i=n |n-1 | + +- Total number of iterations is x = 0+1+2+...+(n-1) +- What is x? + - x = (n-1)+(n-2)+...+2+1+0 + - 2x = (n-1)\*n -> x = ((n-1)\*n)/2 +- O(x) = O(((n-1)\*n)/2) -> O(n^2) |