Post-class 04/22

author: lshprung <lshprung@yahoo.com> 2020-04-22 10:41:03 -0700
committer: lshprung <lshprung@yahoo.com> 2020-04-22 10:41:03 -0700
commit: b80e8d3b7de4d5bda0d20f36612279fc62d431ef (patch)
tree: c194fd1c0eda364e7c726ad78fec2ef068a27bf7 /04-20.md
parent: 183f93c9c02e10001275229536d789c5c304a1fd (diff)
1 files changed, 57 insertions, 0 deletions
diff --git a/04-20.md b/04-20.md
index 7514236..88e0142 100644
--- a/04-20.md
+++ b/04-20.md
@@ -133,3 +133,60 @@ In practice, we don't know the keys in advance, it is very hard to derive a perf
 
 - Even if we choose a good hash function that does a good job distributing the keys across the entire hash table, we will still have collisions in practice and need to deal with them
 - The next step will be to cover **Collision Resolutions**...
+
+## Probing
+
+- A simple way of resolving collisions: **probing**
+- If a key 'y' collids with a key 'x' that is already in the table, 'x' is **not** displaced. Rather 'y' must be placed somewhere else
+- In probing, the colliding key tries another slot in the table (**probes**), if that slot is empty, then is is placed there, otherwise it keeps trying (**probing**) different slots
+- Probing methods covered in this class:
+	- Linear Probing
+	- Quadratic Probing
+	- Double Hashing
+
+## Linear Probing
+
+- We can think of our hash function as having an extra parameter `h(k, i)`, where 'i' is the **probe number** that starts at zero for each key (i.e., it's the number of the try)
+- Simplest type of probing:
+	- `h(k, i) = (h(k) + i) % m`
+	- This is called **linear probing**
+
+## Linear Probing (Insertion)
+
+- If i=0, then `h(k, i) = h(k)`, i.e., the "home" location of the key. If that is occupied, we use the next location to the right, wrapping around to the front of the array
+- Example, for m=11, let's have number between 0 and 100 and map them to the array
+	- k=20 -> go to index 9
+	- k=40 -> go to index 4
+	- k=15 -> go to index 4 (but that location is already taken by k=40)
+		- i++ -> go to index 5
+
+- Another example:
+	- m = 9
+	- k = 0, 9, 18, 27, 36, 45, 54
+	- k=0 -> go to index 0
+	- k=9 -> go to index 0 (but it's taken)
+		- i++ -> go to index 1
+	- k=18 -> go to index 0 (but it's taken)
+		- i++ -> go to index 1 (but it's taken)
+		- i++ -> go to index 2
+	- k=27 -> go to index 0 (but it's taken)
+		- i++ -> go to index 1 (but it's taken)
+		- i++ -> go to index 2 (but it's taken)
+		- i++ -> go to index 3
+	- If all keys map to the same location -> BAD!
+	- What is the big-O runtime for inserting a new key (assume there are already 'n' keys in the table)
+		- O(n) (pretty bad)
+
+- In the **worst case**, if all keys hash to the same location, insertion is **O(n)**
+- General case **linear probing**:
+	- `h(k, i) = (h(k) + c*i) % m`
+	- e.g. c=2, m=7, `h(k, i) = (h(k) + 2*i) % m`, k=0,2,7
+		- k=0 -> go to index 0
+		- k=2 -> go to index 2
+		- k=7 -> go to index 0 (but it's taken)
+			- i++ -> go to index 2 (but it's taken)
+			- i++ -> go to index 4
+
+---
+
+[04/22 ->](04-22.md)
author	lshprung <lshprung@yahoo.com>	2020-04-22 10:41:03 -0700
committer	lshprung <lshprung@yahoo.com>	2020-04-22 10:41:03 -0700
commit	b80e8d3b7de4d5bda0d20f36612279fc62d431ef (patch)
tree	c194fd1c0eda364e7c726ad78fec2ef068a27bf7 /04-20.md
parent	183f93c9c02e10001275229536d789c5c304a1fd (diff)