summaryrefslogtreecommitdiff
path: root/05-29.md
blob: e67f90c9609e8eb58cb222b1f4f79576cc5330a0 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
[\<- 05/27](05-27.md)

---

## A Brief Review

- Linear list: each element can have only one successor
	- Queue, stack
- Non-linear list: each element can have more than one successors
	- Tree: Each element (i.e., node) can have only one predecessor
		- General tree
		- Binary tree
			- Binary Search tree
				- AVL tree
			- Binary Heap
	- Graph: Each element can have more than one predecessor

## Trees & Graphs

- A tree is a collection of nodes in which each node has at most **one predecessor** (the parent) but **arbitrarily many successors** (the children)
- A graph is a collection of nodes in which each node can have an **arbitrary number of both predecessors and successors**

## Graphs - Categories

- Directed & Undirected
	- vertex is the same as node

![directed and undirected graphs](05-29_img1.png)

- Weighted Graph
	- There will be a metric on each edge

## Graphs - Cycle / Acyclic

- A **cycle** is a **non-empty** sequence of **distinct** edges from a vertex back to itself
- A graph without cycles is called **acyclic**
- Examples
	- A directed graph is acyclic
	- An undirected graph might have a cycle

- Can you have a cycle in a two-node undirected graph? how about a line?
	- No, "distinct edges" remember?

## Graphs

- A directed acyclic graph is called a **dag**
- Dags are often used to model dependencies
- For example, we have 6 courses as A, B, C, D, E, F. As shown below,
	- Course A requires course C as prerequisite
	- Course B requires course C and F as prerequisite
	- Course C requires course D and E as prerequisite,
	- ...

### Example

- Example dependencies:
	- Get up in the morning -- 1
	- Take a Shower -- 2
	- Put on Shoes -- 3
	- Put on Socks -- 4
	- Put on Underwear -- 5
	- Put on Pants -- 6
	- Put on Shirt -- 7

- Draw the Graph:

![the graph visualized](05-29_img2.png)

- A graph is called **planar** if it can be drawn in a single plane with no edges crossing
	- loved by EE majors for obvious reason

- A graph in which each vertex is connected to every other vertex is called a **clique**. Are the following cliques planar?
	- three-clique? **yes**
	- four-clique? **yes**
	- five-clique? **no**

- When discussing graphs we use two variables
	- V = # of vertices
	- E = # of edges
- Is there any relationship between V and E?

- Assume that our graph is directed. What is the max # of edges given V?
	- V^2 (or V(V-1) if you ignore self loops)
- Assuming it's undirected (and no self-loops)?
	- (V(V-1))/2
- So big-O of E is O(V^2)

- A graph is called **sparse** if E is much less than V^2, and is called **dense** if E is close to V^2
- Question: for any airline map between 30 airports in reality, is it dense?
	- Does, for example, Southwest fly 900 routes?

---

# Graph Representations

## Adjacency Matrix

- **adjacency matrix**:
	- `a[i][j] <> 0 iff i -> j`

- Example:

![example graph](05-29_img3.png)

|     |A|B|C|D|E|F|
|-----|-|-|-|-|-|-|
|**A**|0|0|1|0|0|0|
|**B**|0|0|1|0|0|1|
|**C**|0|0|0|1|1|0|
|**D**|0|0|0|0|0|0|
|**E**|0|0|0|0|0|0|
|**F**|0|0|0|0|1|0|

- The meaning of "sparse" should be starkly clear in the matrix

- How about undirected graph?
	- Always symmetric

- How about space complexity?
	- O(V^2)

## Adjacency List

- **Adjacency List**
	- Using the graph above:
	- A: {C}
	- B: {C, F}
	- C: {D, E}
	- D: {}
	- E: {}
	- F: {E}

- How about undirected graph?
	- A: {C}
	- B: {C, F}
	- C: {A, B, D, E}
	- etc.

- What is the space complexity?
	- Notice that the number of entries reflects E
	- O(E) space complexity for adjacency list

## Exercise

|     |A|B|C|D|E|F|
|-----|-|-|-|-|-|-|
|**A**|0|0|1|0|0|0|
|**B**|0|0|1|0|1|1|
|**C**|0|0|0|1|1|0|
|**D**|0|0|1|0|0|0|
|**E**|0|0|0|0|0|0|
|**F**|0|0|1|0|1|0|

- Given the adjacency matrix, answer the following questions
	1. What is the adjacency list?
		- A: {C}
		- B: {C, E, F}
		- C: {D, E}
		- D: {C}
		- E: {}
		- F: {C, E}

	2. Is the graph directed or undirected?
		- directed

	3. Draw the graph:

![Graph Visualization](05-29_img4.png)

Time Complexity:
	- Given a Vertex A, what are the adjacent nodes?
		- Adjacency Matrix: Always O(V)
		- Adjacency list: Worst-case O(V), depends on the graph sparseness
	- Find if edge A-\>B exist in the graph?
		- Adjacency matrix: O(1)
		- Adjacency list: Worst-case O(V)

## Comparison

- Space complexity:
	- Adjacency matrix: O(V^2)
	- Adjacency list: O(E)
	- Which one could save more space for a sparse graph?
- Time Complexity:
	- Determining the presence of an edge
		- **Adjacency matrix** is faster (O(1))
	- Determining the list of all adjacent nodes
		- **Adjacency list** is faster on average, worst-case O(V)

- Recommendations:
	- If sparse: list
	- If dense: matrix
	- Note that most graphs will be sparse

## Requirements/Expectation

- Given a graph, be able to provide its adjacency matrix & adjacency list
- Given an adjacency matrix (or adjacency list), be able to draw the graph
- Understand the differences in adjacency matrices/lists for directed graph & undirected graph
- Compare adjacency matrix & adjacency list (space and time)

---

## Graphs - When to use it?

- Examples:
	- Maps
	- Online Social Networks
	- Neural Networks

# Graph Traversals

## Graph Algorithms - Traversal

- We want to visit all nodes in a graph
- Challenge: unlike trees
	- graphs can have cycles,
	- and more than one way to visit the same node...
- Both issues demand that we **avoid visiting the same node**
	- We need to avoid cycles and avoid repeats
	- We will solve these problems by marking each vertex

- Review of preorder traversal:

```
void visit(n){
	if(n is not empty){
		print n;
		for each child ch of n do visit(ch);
	}
}
```

- This code will not work on a graph
	- Could visit a node multiple times
	- Could loop forever

- Modification of preorder traversal

```
void visit(n){
	label n as marked
	print n
	for each adjacent node adj of n do{
		if(adj is no marked) visit(adj)
	}
}
```

- This code will work on a graph:
	- A node will only be marked once
	- So we cannot visit a node more than once and accidently print it twice or get stuck in a cycle

- We just covered **depth-first search**/traversal. **DFS**
- So called because you down as far as you can before you backtrack and try another option
- The robot in the maze game used this algorithm: it explored as far as it could before backtracking

![example graph](05-29_img5.png)

- Using the example graph above, show the DFS order starting from A
	- A, C, D, B, F, E
- Not do a DFS from B
	- B, E, F, C, A, D
- Now do a DFS from F
	- F, B, C, A, D, E

- DFS was recursive and therefore used a **stack**
- You can either make your function recursive and let the system take care of it
- Or you can make your own stack (that's what we did for the maze game)
- Another logical choice is to use a **queue**
- In this case, we'll have to make our own queue and maintain it as we go along

- Modification of traversal:

```
void visit(n){
	label n as marked and add it to the queue
	
	while(the queue is not empty) do{
		let n be the result of dequeue
		for each adjacent node adj of n do{
			if(adj is not marked) label adj as marked and add it to the queue
		}
	}
}
```

- This code will work on a graph:
	- It will only add a node to the queue if it hasn't already been marked

- We just covered **breadth-first search**/traversal, **BFS**
- It is so called because instead of going down, you go across
- Proceed all the neighbors at distance 1, then all from distance 2, etc.

![example graph](05-29_img5.png)

- Example: BFS starting from A
	- A, C, B, D, F, E
	- Notice as you print nodes, the nodes are farther and farther away from A
		- A is distance 0
		- C is distance 1
		- B, D, E are distance 2
		- F is distance 3
- Example: BFS starting from E
	- E, C, F, A, B, D
- Example: is the traversal (starting from A) BFS or DFS?
	- A, C, D, B, F, E
	- it is DFS
	- can it also be BFS? NO (it would not output F before E)

- Relation to Trees
	- DFS reflects preorder
	- BFS reflects level by level

## Requirements/Expectations

- Given a graph, be able perform a depth-first and breadth-first search
- Understand when you would choose one over other
- What is the complexity of each algorithm
	- We only print out each node once
	- However, we may have to check a node multiple times
	- How many times? That depends on how many edges lead to it
	- Although we print out each node only once, we must examine every edge in the graph
	- So, the complexity of DFS and BFS is O(V+E). (The V comes from the face that we have to initially unmark all vertices before we start a search)