CS board - a question on search engine/Inverted Index
c******n
Posts: 4965
If I search for
cat dog
then inside the engine two doclists are returned, one for "cat" and one for
"dog", each sorted by docId, and the two lists are merged (intersected). The
docs in the intersection are then ranked, e.g. by PageRank. Is this correct?
My question is: both doclists can be very long, and their intersection can
also be very long, but for each query maybe only 100 of the docIds in the
intersection are finally used. So do we end up sorting the whole
intersection unnecessarily on every query?
How exactly is this solved at Google?
Thanks
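
For illustration only, here is a minimal Python sketch of the two steps described above: intersecting two docId-sorted doclists with a linear two-pointer merge, then keeping only the top k results with a bounded heap instead of sorting the whole intersection. This is not a claim about how Google actually does it. The names intersect_sorted, top_k, and static_score, and the toy data, are all made up for the example, and static_score just stands in for a precomputed per-document score such as PageRank.

```python
import heapq
from typing import Dict, Iterable, Iterator, List


def intersect_sorted(a: List[int], b: List[int]) -> Iterator[int]:
    """Yield docIds present in both lists; both must be sorted ascending."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            yield a[i]
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1


def top_k(doc_ids: Iterable[int], score: Dict[int, float], k: int) -> List[int]:
    """Return the k highest-scoring docIds without sorting all candidates.

    heapq.nlargest keeps a heap of at most k items, so the cost is about
    O(n log k) for n candidates instead of O(n log n) for a full sort.
    """
    return heapq.nlargest(k, doc_ids, key=lambda d: score.get(d, 0.0))


# Hypothetical toy data: doclists sorted by docId, plus a precomputed score.
cat = [1, 3, 4, 7, 9, 12]
dog = [2, 3, 7, 8, 12, 15]
static_score = {3: 0.2, 7: 0.9, 12: 0.5}

result = top_k(intersect_sorted(cat, dog), static_score, k=2)
print(result)  # [7, 12]
```

The intersection is streamed straight into the heap, so the full intersection list is never materialized or sorted. Another general technique, again not confirmed for any particular engine, is to store each doclist ordered by a static score rather than by docId, so the scan can stop early once enough good candidates have been found.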