sundeepblue
4/4/2014 - 2:58 AM

[EPI 19.2] search engine: given a million documents with an average size of 10 KB, design a program that efficiently returns the subset of documents containing a given set of words.

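key: build an inverted index, i.e. a map from each word to the sorted list of IDs of the documents that contain it. A query is answered by intersecting the lists of its words.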
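
A minimal sketch of the idea, assuming documents fit in memory as a dict of id -> text and that words are lowercase alphanumeric runs. The names build_index and search are made up for illustration, not from the book:

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each word to the sorted list of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: a "word" is a run of lowercase letters or digits.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, words):
    """Return the sorted ids of documents that contain every query word."""
    if not words:
        return []
    result = None
    for word in words:
        postings = set(index.get(word.lower(), []))
        result = postings if result is None else result & postings
    return sorted(result)


docs = {
    0: "the quick brown fox",
    1: "the lazy dog",
    2: "quick brown dog",
}
index = build_index(docs)
print(search(index, ["quick", "brown"]))
```

The index is built once, so each query only touches the posting lists of its own words instead of scanning all documents.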

tips:
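    compression: keep each list of document IDs sorted and store the gaps between consecutive IDs, which are small numbers
    caching: cache the results of frequent queries
    frequency-based optimization: most queries only need the top results, so keep a smaller index of the highest-quality documents and search it first
    intersection order (*): intersect the shortest lists first so intermediate results stay small
    build multi-level index to improve accuracy: for high-priority documents, index paragraphs and sentences individually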
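
Two of the tips above (intersection order and compression) can be sketched as follows. This is an illustration under assumptions: posting lists are sorted lists of integer document IDs, and the function names are hypothetical:

```python
def intersect_sorted(a, b):
    """Intersect two sorted id lists with a linear two-pointer merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def intersect_all(posting_lists):
    """Intersect many lists, shortest first, stopping once the result is empty."""
    if not posting_lists:
        return []
    ordered = sorted(posting_lists, key=len)
    result = list(ordered[0])
    for postings in ordered[1:]:
        if not result:
            break
        result = intersect_sorted(result, postings)
    return result


def delta_encode(postings):
    """Replace sorted ids by the gaps between consecutive ids."""
    gaps = []
    prev = 0
    for doc_id in postings:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def delta_decode(gaps):
    """Rebuild the original sorted ids from their gaps."""
    postings = []
    total = 0
    for gap in gaps:
        total += gap
        postings.append(total)
    return postings
```

Starting from the shortest list bounds the size of every intermediate result by that list's length. The gaps from delta_encode are small for dense lists, so a variable-length integer encoding (not shown here) can store them in fewer bytes than the raw IDs.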