treble37
9/13/2016 - 5:06 AM

Summary of "Algorithms: Design and Analysis, Part 2" course at Coursera.Org

Summary of "Algorithms: Design and Analysis, Part 2" course at Coursera.Org

Algorithms: Design and Analysis, Part 2

The course will have six weeks of lectures and assignments, followed by a final exam.

Week 1

I. TWO MOTIVATING APPLICATIONS

  • Distributed shortest-path routing -- sending email messages
  • DNA sequence alignment, calculate Needleman-Wunsch score [1970]

II. SELECTED REVIEW FROM PART I

  • The Algorithm Designer’s Mantra -- "Can we do better?"
  • Big-O notation, Asymptotic analysis
  • Graph search algorithms -- BFS and DFS, Dijkstra’s Algorithm
  • Heaps -- supported operations, runtime performance, applications

III. INTRODUCTION TO GREEDY ALGORITHMS

  • No single “silver bullet” for solving problems -- divide & conquer, randomized, greedy, dynamic programming
  • Definition -- iteratively make “myopic” decisions, hope everything works out at the end
  • Easy running time analysis, hard to establish correctness
  • Many greedy algorithms are not correct, e.g.Dijkstra'a algorithm with negative edge lengths
  • Proof of correctness -- induction and exchange argument
  • The Optimal Caching Algorithm
  • Theorem: [Belady 1960s] The “furthest-in-future” algorithm is optimal (i.e., minimizes the number of cache misses)

IV. A SCHEDULING APPLICATION

  • One shared resource, many jobs to do -- in what order should we sequence the jobs?
  • Each job has a weight w-j and length i-j, Completion Time C-j is sum of all jobs up to and including j
  • Goal is to minimize the weighted sum of all completion times, min E(j=1-n) w-j * C-j
  • If weights are equal must schedule shorter jobs first, but if lengths are equal must schedule larger jobs first
  • Assign scores to jobs that are increasing in weight and decreasing in length
  • Order by decreasing ratio w-j / l-j is always correct, runtime is O(n*log(n)) to do the sorting
  • Proof by Exchange Argument

V. PRIM'S MINIMUM SPANNING TREE ALGORITHM

  • Connect vertices together while minimizing the path cost
  • Input is undirected graph G(V,E) (connected) and cost e for each edge, output is minimum cost tree T that spans all vertices
  • Applications are clustering and network routing
  • Prim's and Kruskal's algorithms, O(m*log(n))
  • Proof part 1 -- Prim's algorithm always outputs a spanning tree
  • Empty Cut Lemma -- graph is not connected <=> there exists a cut (A,B) with no crossing edges
  • Double Crossing Lemma -- if cycle C has an edge that crosses a cut (A,B) => there is another edge that crosses that cut
  • Lonely Cut Corollary -- if there is only one edge crossing some cut (A,B) => that edge is not in any cycle
  • Proof part 2 -- Prim's algorithm always outputs a MST, minimum-cost spanning tree
  • Cut Property -- in a cut (A,B) where there is an edge e with minimum cost, that edge is part of MST
  • MST is unique if edge costs are distinct
  • Naive implementation has running time O(nm) but with heaps it drops to O(mlog(n))