Search-related knowledge
IndexSearcher articleSearcher = ...
IndexSearcher commentSearcher = ...
String fromField = "id";
boolean multipleValuesPerDocument = false;
String toField = "article_id";
// This query should yield article with id 2 as result
BooleanQuery fromQuery = new BooleanQuery();
fromQuery.add(new TermQuery(new Term("title", "byte")), BooleanClause.Occur.MUST);
fromQuery.add(new TermQuery(new Term("title", "norms")), BooleanClause.Occur.MUST);
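// The Lucene 4.0 signature quoted below takes a ScoreMode as its last argument;
// ScoreMode.None means no scores are carried over from the article side
Query joinQuery = JoinUtil.createJoinQuery(fromField, multipleValuesPerDocument, toField, fromQuery, articleSearcher, ScoreMode.None);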
TopDocs topDocs = commentSearcher.search(joinQuery, 10);
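To make clear what the snippet above computes, here is a minimal sketch in plain Java with no Lucene dependency. The `Article` and `Comment` classes, the sample data, and the helper names `collectJoinValues` and `join` are all hypothetical stand-ins chosen for illustration; only the field roles (`id` on the from side, `article_id` on the to side) and the two title terms come from the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class QueryTimeJoinSketch {

    /** Hypothetical stand-in for a document in the article index. */
    static final class Article {
        final String id;
        final String title;
        Article(String id, String title) { this.id = id; this.title = title; }
    }

    /** Hypothetical stand-in for a document in the comment index. */
    static final class Comment {
        final String articleId;
        final String text;
        Comment(String articleId, String text) { this.articleId = articleId; this.text = text; }
    }

    /** Phase 1: run the "from" query and collect the unique values of the from field ("id"). */
    static Set<String> collectJoinValues(List<Article> articles, String... requiredTitleWords) {
        Set<String> joinValues = new HashSet<>();
        for (Article a : articles) {
            List<String> words = Arrays.asList(a.title.toLowerCase().split("\\s+"));
            // Every required word must occur in the title, like the two MUST clauses above
            if (words.containsAll(Arrays.asList(requiredTitleWords))) {
                joinValues.add(a.id);
            }
        }
        return joinValues;
    }

    /** Phase 2: keep the "to" documents whose to field ("article_id") is a collected value. */
    static List<Comment> join(List<Comment> comments, Set<String> joinValues) {
        List<Comment> hits = new ArrayList<>();
        for (Comment c : comments) {
            if (joinValues.contains(c.articleId)) {
                hits.add(c);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Article> articles = Arrays.asList(
            new Article("1", "Query time joining in Lucene"),
            new Article("2", "Byte norms explained"));
        List<Comment> comments = Arrays.asList(
            new Comment("1", "Nice overview"),
            new Comment("2", "Norms finally make sense"),
            new Comment("2", "What about payloads?"));

        Set<String> joinValues = collectJoinValues(articles, "byte", "norms");
        for (Comment c : join(comments, joinValues)) {
            System.out.println(c.articleId + ": " + c.text);
        }
    }
}
```

With this sample data only article 2 matches both title words, so the two comments whose `article_id` is 2 are returned. This is the same shape as the Lucene call: the from query runs against the article side, and the result is a query over the comment side.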
Looking at the Lucene 4.0 source, this is how it is implemented:
/**
* Method for query time joining.
* <p/>
* Execute the returned query with a {@link IndexSearcher} to retrieve all documents that have the same terms in the
* to field that match with documents matching the specified fromQuery and have the same terms in the from field.
* <p/>
* In the case a single document relates to more than one document the <code>multipleValuesPerDocument</code> option
* should be set to true. When the <code>multipleValuesPerDocument</code> is set to <code>true</code> only the
* score from the first encountered join value originating from the 'from' side is mapped into the 'to' side.
* Even in the case when a second join value related to a specific document yields a higher score. Obviously this
* doesn't apply in the case that {@link ScoreMode#None} is used, since no scores are computed at all.
* </p>
* Memory considerations: During joining all unique join values are kept in memory. On top of that when the scoreMode
* isn't set to {@link ScoreMode#None} a float value per unique join value is kept in memory for computing scores.
* When scoreMode is set to {@link ScoreMode#Avg} also an additional integer value is kept in memory per unique
* join value.
*
* @param fromField The from field to join from
* @param multipleValuesPerDocument Whether the from field has multiple terms per document
* @param toField The to field to join to
* @param fromQuery The query to match documents on the from side
* @param fromSearcher The searcher that executed the specified fromQuery
* @param scoreMode Instructs how scores from the fromQuery are mapped to the returned query
* @return a {@link Query} instance that can be used to join documents based on the
* terms in the from and to field
* @throws IOException If I/O related errors occur
*/
public static Query createJoinQuery(String fromField,
                                    boolean multipleValuesPerDocument,
                                    String toField,
                                    Query fromQuery,
                                    IndexSearcher fromSearcher,
                                    ScoreMode scoreMode) throws IOException {
  switch (scoreMode) {
    case None:
      TermsCollector termsCollector = TermsCollector.create(fromField, multipleValuesPerDocument);
      fromSearcher.search(fromQuery, termsCollector);
      // termsCollector.getCollectorTerms() returns an efficient hash structure implemented by Lucene itself;
      // it maps each fromField value collected from the matching documents to an id
      return new TermsQuery(toField, fromQuery, termsCollector.getCollectorTerms());
    case Total:
    case Max:
    case Avg:
      TermsWithScoreCollector termsWithScoreCollector =
          TermsWithScoreCollector.create(fromField, multipleValuesPerDocument, scoreMode);
      fromSearcher.search(fromQuery, termsWithScoreCollector);
      return new TermsIncludingScoreQuery(
          toField,
          multipleValuesPerDocument,
          termsWithScoreCollector.getCollectedTerms(),
          termsWithScoreCollector.getScoresPerTerm(),
          fromQuery
      );
    default:
      throw new IllegalArgumentException(String.format(Locale.ROOT, "Score mode %s isn't supported.", scoreMode));
  }
}
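The memory note in the Javadoc (one float per unique join value, plus one integer for `Avg`) is easier to follow with a small model. The sketch below is not Lucene's `TermsWithScoreCollector`. It is a hypothetical `JoinScoreSketch` class that uses ordinary `HashMap`s in place of Lucene's internal structures, and it only assumes the behaviour the Javadoc describes: scores from the from side are combined per join value according to the score mode.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class JoinScoreSketch {

    /** Mirrors the names of the ScoreMode constants used in the switch above. */
    enum Mode { None, Avg, Max, Total }

    private final Mode mode;
    // All unique join values are kept in memory, whatever the mode
    private final Set<String> terms = new LinkedHashSet<>();
    // One float per unique join value, only filled when scores are computed
    private final Map<String, Float> scores = new HashMap<>();
    // One extra int per unique join value, only filled for Avg
    private final Map<String, Integer> counts = new HashMap<>();

    JoinScoreSketch(Mode mode) { this.mode = mode; }

    /** Called once per matching "from" document with its join value and score. */
    void collect(String joinValue, float score) {
        terms.add(joinValue);
        switch (mode) {
            case None:
                break; // no scores are tracked at all
            case Max:
                scores.merge(joinValue, score, Float::max);
                break;
            case Total:
                scores.merge(joinValue, score, Float::sum);
                break;
            case Avg:
                scores.merge(joinValue, score, Float::sum);
                counts.merge(joinValue, 1, Integer::sum);
                break;
        }
    }

    Set<String> collectedTerms() { return terms; }

    /** Score handed to "to" documents carrying this join value; 0 in this sketch when none is tracked. */
    float scoreFor(String joinValue) {
        if (mode == Mode.None || !scores.containsKey(joinValue)) {
            return 0f;
        }
        float s = scores.get(joinValue);
        return mode == Mode.Avg ? s / counts.get(joinValue) : s;
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values()) {
            JoinScoreSketch collector = new JoinScoreSketch(m);
            collector.collect("2", 1.0f);
            collector.collect("2", 3.0f);
            collector.collect("1", 2.0f);
            System.out.println(m + " -> score for join value 2: " + collector.scoreFor("2"));
        }
    }
}
```

The averaging case is the only one that cannot be computed from a running float alone, which is why it needs the additional per-value counter mentioned in the Javadoc.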