jack-zheng
3/24/2018 - 5:58 AM

algorithm, note

algorithm, note

Chapter1

二分查找, 时间复杂度 O(logn) 顺序查找, 时间复杂度 O(n)

二分查找实现:

In [50]: def binary_search(list, guess):
    ...:     low = 0
    ...:     high = len(list) -1
    ...:     while low<=high:
    ...:         mid = int((low+high)/2) # 中位向下取整
    ...:         item = list[mid]
    ...:         if item == guess:
    ...:             return mid
    ...:         if item < guess: # 中值小于猜测值, 提下限
    ...:             low = mid + 1
    ...:         else:
    ...:             high = mid - 1  # 中值大于猜测值, 缩上限
    ...:      return None

In [51]: index = binary_search(a, 7)

In [52]: index
Out[52]: 7

In [53]: index = binary_search(a, 84)

In [54]: index
Out[54]: 84

常见大 O 运行时间:

  • O(logn) 对数时间
  • O(n) 线性时间
  • O(n*logn) 同快速排序
  • O(n2)
  • O(n!)

Chapter2

内存就像抽屉盒 数组和链表的区别:

  • 数组擅长读取, 链表擅长插入删除
  • 数组内存连续, 链表不连续
  • 读取速度对比, 数组O(1), 链表O(n)
  • 插入删除速度对比, 数组O(n), 链表O(1)

选择排序, 时间复杂度 O(n2)

In [55]: def findSmallest(arr):
    ...:     smallest = arr[0]
    ...:     smallest_index = 0
    ...:     for i in range(1, len(arr)):
    ...:         if arr[i] < smallest:
    ...:             smallest = arr[i]
    ...:             smallest_index = i
    ...:     return smallest_index
    ...:

In [56]: def selectionSort(arr):
    ...:     newArr = []
    ...:     for i in range(len(arr)):
    ...:         smallest = findSmallest(arr)
    ...:         newArr.append(arr.pop(smallest))
    ...:     return newArr
    ...:

import numpy as np

In [57]: testArr = np.random.randint(10, size=20)
In [60]: selectionSort(testArr)
Out[60]: -----> sorted list

PS: 对数组操作时, 操作下标更方便

Chapter3

递归要点:

  1. 确定基线条件
  2. 确定递归条件

e.g. impl of countdown function

In [1]: def countdown(num):
   ...:     print(num)
   ...:     if num == 0:  # 基线条件
   ...:         return
   ...:     else:
   ...:         countdown(num-1) 
   ...:

优点: 递归不一定效率高, 但是逻辑会更清晰 缺点: 占内存 解决方案:

  1. 用循环
  2. 尾递归 (查下实现)
  3. lazy 模式怎么样

Chapter4

分而治之: divide and conquer (D&C), 一种递归思想

e.g. 168*64 的长方形, 计算出分割出来的最大正方形
In [22]: def getSquare(len, wid):
    ...:     if len % wid == 0:
    ...:         return wid
    ...:     else:
    ...:         ret = len%wid
    ...:         len = wid
    ...:         wid = ret
    ...:         return getSquare(len, wid)
    ...:
    ...:

In [23]: getSquare(164, 64)
Out[23]: 4

e.g. 使用递归计算数组加和

In [24]: def sum(arr):
    ...:     if len(arr) == 1:
    ...:         return arr[0]
    ...:     else:
    ...:         return arr.pop(0) + sum(arr)
    ...:

In [25]: sum([1,2,3])
Out[25]: 6

e.g. 快速排序实现

快速排序结合和归纳证明和 D&C 的思想方法, 有点绕, 但是很厉害的样子。 时间复杂度平均为 O(n*logn), 很优秀, 最糟糕时 O(log<sup>n</sup>)
In [28]: def quicksort(arr):
    ...:     if len(arr) < 2:
    ...:         return arr
    ...:     else:
    ...:         pivot = arr[0]
    ...:         less = [i for i in arr[1:] if i < pivot]
    ...:         greater = [i for i in arr[1:] if i > pivot]
    ...:         return quicksort(less) + [pivot] + quicksort(greater)
    ...:

In [29]: import numpy as np

In [30]: arr = np.random.randint(20, size=20)

In [31]: arr
Out[31]:
array([ 7,  5, 18, 17, 12, 18, 10,  0,  2,  5, 18, 12,  9,  0,  6, 14, 14,
       17, 13,  0])

In [32]: quicksort(arr)
Out[32]: [0, 2, 5, 6, 7, 9, 10, 12, 13, 14, 17, 18]

快速排序(quick sort) VS 合并排序(merge sort)

  1. 平时我们讨论算法时间复杂度时是忽略常亮的
  2. 快速排序的时间常量小于合并排序
  3. 快速排序效率取决于你使用的基准值
    • 最糟糕的类似有序数组那第一个数字做基准 O(logn)
    • 最优的类似有序数组, 中值做基准 O(n * log n)

Chapter5

散列表 - hash table - map, dict

决定散列表好坏的因素:

1. 较低的填装因子, 填装因子>0.7, 扩容
2. 良好的散列函数

优秀的散列表相当于集数组链表优势于一身, 差的散列表就是集两者缺点于一身

Chapter6

广度优先搜索(breadth-first search BFS), 用于计算最短距离, 各种意义上的

图 = 节点 + 边 BFS: 用于图的查找算法, 解决两方面问题,

1. 冲节点A 出发, 有前往B的路径吗
2. 冲A出发, 前往B的最短路径

搜索效果怎么都感觉和那个什么 六度分隔理论 很像 图通过dict 实现

BFS 算法实现: 使用dict 存储友人关系, 通过队列存储搜索对象, 先从维度为1的查找, 如果1中没有, 把1中对象背后的2维度关系加入队列

In [38]: graph = {}

In [39]: graph['you'] = ['alice', 'bob', 'clark']

In [40]: graph['bob'] = ['anuj', 'peggy']

In [41]: graph['alice'] = ['peggy']

In [42]: graph['clark'] = ['thom', 'jonny']

In [43]: graph['anuj'] = []

In [44]: graph['peggy'] = []

In [45]: graph['thom'] = []

In [46]: graph['jonny'] = []

In [47]: from collections import deque
In [71]: def BFS():
    ...:     searched = []
    ...:     queue = deque(graph['you'])
    ...:     while queue:
    ...:         person = queue.popleft()
    ...:         if not person in searched:
    ...:             if isSeller(person):
    ...:                 return person
    ...:             else:
    ...:                 searched.append(person)
    ...:                 queue.extend(graph[person])
    ...:
    ...:     print('No such friend')
    ...:
In [72]: BFS()
Out[72]: 'thom'

时间复杂度 O(V+E), V:定点数, E: 边数