c******n (posts: 4965) | 1
If I search for
cat dog
inside the engine, two doclists are returned, one for "cat" and one for
"dog", each sorted by docId, and the two lists are then intersected. The
documents in the intersection are then ranked by PageRank. Is this correct?
My question is: both doclists can be very long, and their intersection can
also be very long, but for each query maybe only the top 100 docIds in the
intersection are actually used. So do we have to sort the whole intersection
unnecessarily every time?
How exactly does Google solve this?
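
To make the setup concrete, here is a minimal Python sketch of what I am describing. It is only an illustration and not a claim about how Google actually does it: the function names `intersect_sorted` and `top_k`, the sample docIds, and the `score` table standing in for PageRank are all made up. It intersects two docId-sorted lists in one linear pass, then picks the k best documents with a bounded heap, which avoids sorting the whole intersection.

```python
import heapq

def intersect_sorted(a, b):
    """Intersect two docId lists, each sorted ascending, in one linear pass."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def top_k(doc_ids, score, k):
    """Return the k highest-scoring docIds without fully sorting doc_ids."""
    # heapq.nlargest keeps a heap of size k, roughly O(n log k) instead of O(n log n).
    return heapq.nlargest(k, doc_ids, key=lambda d: score[d])

# Hypothetical posting lists and static scores (stand-ins for PageRank).
cat_docs = [1, 3, 5, 8, 9, 12]
dog_docs = [2, 3, 8, 9, 10, 12]
score = {3: 0.2, 8: 0.9, 9: 0.5, 12: 0.7}

both = intersect_sorted(cat_docs, dog_docs)
best = top_k(both, score, 2)
print(both, best)
```

Even with the heap, every docId in the intersection is still visited once, so this only removes the full sort, not the full scan.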
Thanks.