treble37

9/13/2016 - 5:06 AM

Summary of "Algorithms: Design and Analysis, Part 2" course at Coursera.Org

The course will have six weeks of lectures and assignments, followed by a final exam.

- Distributed shortest-path routing -- sending email messages
- DNA sequence alignment, calculate Needleman-Wunsch score [1970]

- The Algorithm Designer’s Mantra -- "Can we do better?"
- Big-O notation, Asymptotic analysis
- Graph search algorithms -- BFS and DFS, Dijkstra’s Algorithm
- Heaps -- supported operations, runtime performance, applications

- No single “silver bullet” for solving problems -- divide & conquer, randomized, greedy, dynamic programming
- Definition -- iteratively make “myopic” decisions, hope everything works out at the end
- Easy running time analysis, hard to establish correctness
- Many greedy algorithms are not correct, e.g.Dijkstra'a algorithm with negative edge lengths
- Proof of correctness --
*induction*and*exchange argument* - The Optimal Caching Algorithm
- Theorem: [Belady 1960s] The “furthest-in-future” algorithm is optimal (i.e., minimizes the number of cache misses)

- One shared resource, many jobs to do -- in what order should we sequence the jobs?
- Each job has a
*weight*w-j and length i-j,*Completion Time*C-j is sum of all jobs up to and including j - Goal is to minimize the weighted sum of all completion times, min E(j=1-n) w-j * C-j
- If weights are equal must schedule shorter jobs first, but if lengths are equal must schedule larger jobs first
- Assign
*scores*to jobs that are increasing in weight and decreasing in length - Order by decreasing ratio w-j / l-j is always correct, runtime is O(n*log(n)) to do the sorting
- Proof by
*Exchange Argument*

- Connect vertices together while minimizing the path cost
- Input is undirected graph G(V,E) (connected) and cost
*e*for each edge, output is minimum cost tree T that spans all vertices - Applications are clustering and network routing
- Prim's and Kruskal's algorithms, O(m*log(n))
- Proof part 1 -- Prim's algorithm always outputs a spanning tree
- Empty Cut Lemma -- graph is not connected <=> there exists a cut (A,B) with no crossing edges
- Double Crossing Lemma -- if cycle C has an edge that crosses a cut (A,B) => there is another edge that crosses that cut
- Lonely Cut Corollary -- if there is only one edge crossing some cut (A,B) => that edge is not in any cycle
- Proof part 2 -- Prim's algorithm always outputs a
**MST**, minimum-cost spanning tree - Cut Property -- in a cut (A,B) where there is an edge e with minimum cost, that edge is part of MST
- MST is unique if edge costs are distinct
- Naive implementation has running time O(n
*m) but with heaps it drops to O(m*log(n))