b******7 发帖数: 79 | 1 以前听论坛上有人提过,但是没给过解,不知道哪位大侠能偶赐教啊!
Given a document and a query of K words, how do u find the smallest window
that covers all the words at least once in that document? (given you know
the inverted lists of all K words, that is, for each word, you have a list
of all its occurrrences). This one is really hard. Could someone propose an
algorithm in O(n)? |
p*********9 发帖数: 30 | 2 The question looks not quite clear. If we already know the positions for all
keys, we can find the min cover in O(n). Without the assumption, the
complexity of finding words from document is larger than O(n).
an
【在 b******7 的大作中提到】 : 以前听论坛上有人提过,但是没给过解,不知道哪位大侠能偶赐教啊! : Given a document and a query of K words, how do u find the smallest window : that covers all the words at least once in that document? (given you know : the inverted lists of all K words, that is, for each word, you have a list : of all its occurrrences). This one is really hard. Could someone propose an : algorithm in O(n)?
|
r**u 发帖数: 1567 | 3 my 2 cents:
given: a p o p g o a, find the window that covers all the chars. You can
think it this way:
a p o p g o a
a 1-----------7
p 2---4
o 3-----6
g 5
Here, we have 4 different chars. You can consider the appear of each one
as a segment, as shown above. You need to find a segment which overlaps the
segment of every char. In this example, it should be 1->6. Some
computational geometry problem.
all
【在 p*********9 的大作中提到】 : The question looks not quite clear. If we already know the positions for all : keys, we can find the min cover in O(n). Without the assumption, the : complexity of finding words from document is larger than O(n). : : an
|
p*********9 发帖数: 30 | 4 no. In your example, the min cover should 4->7.
the
【在 r**u 的大作中提到】 : my 2 cents: : given: a p o p g o a, find the window that covers all the chars. You can : think it this way: : a p o p g o a : a 1-----------7 : p 2---4 : o 3-----6 : g 5 : Here, we have 4 different chars. You can consider the appear of each one : as a segment, as shown above. You need to find a segment which overlaps the
|
r**u 发帖数: 1567 | 5 You are right. But, the idea should be correct.
【在 p*********9 的大作中提到】 : no. In your example, the min cover should 4->7. : : the
|
c******a 发帖数: 198 | 6 agree
【在 p*********9 的大作中提到】 : no. In your example, the min cover should 4->7. : : the
|
g*******y 发帖数: 1930 | 7 seems like:
you have K arrays of integers:
example: K=3
arr1: 1, 3, 4, 7, 14
arr2: 2, 5, 8, 15...
arr3: 3, 12, 13, 19 ...
arrK[i] is the position of ith ocurrence of the k-th word in the doc.
question: find min{distance(i,j,k)}
where:
distance(i,j,k) = max{|arr1[i]-arr2[j]|,|arr1[i]-arr3[k]|,|arr2[j]-arr3[k]|}
This is very easy and can be done in one scan of K arrays. Complexity is O(K*N)
Hint: just try plot some dots on the two/three number axis to represent two/three arrays, and keep in mind t |
g*******y 发帖数: 1930 | 8 "Some computational geometry problem."
you are bluffing!
the
【在 r**u 的大作中提到】 : my 2 cents: : given: a p o p g o a, find the window that covers all the chars. You can : think it this way: : a p o p g o a : a 1-----------7 : p 2---4 : o 3-----6 : g 5 : Here, we have 4 different chars. You can consider the appear of each one : as a segment, as shown above. You need to find a segment which overlaps the
|
s******8 发帖数: 4192 | 9 a p o p g o a
a0p0o0g0 count=0
[a] p o p g o a
a1p0o0g0 count=1
[a p] o p g o a
a1p1o0g0 count=2
[a p o] p g o a
a1p1o1g0 count = 3
[a p o p] g o a
a1p2o1g0 count = 3
[a p o p g] o a
a1p2o1g1 count = 4 (mark, pos 0-4, length 5)
a [p o p g] o a
a0p2o1g1 count = 3
a [p o p g o] a
a0p2o2g1 count = 3
a [p o p g o a]
a1p2o2g1 count = 4 (pos 1-6, length 6, worse than previous, ignore)
a p [o p g o a]
a1p1o2g1 count = 4 (pos 2-6, length 5, equal to previous mark, ignore)
a p o [p g o a]
a1p1o |
b******7 发帖数: 79 | 10 谢谢steve888 , 能否把idea讲一下? |
|
|
s*****e 发帖数: 49 | 11 就是一个动态窗口,先移动尾指针,如果覆盖了所有词,则移动头指针,
用全局变量来记录当前找到的最优解。
这个复杂度是O(n^2)吧,O(n*k)的谁来讲讲啊?
【在 b******7 的大作中提到】 : 谢谢steve888 , 能否把idea讲一下?
|
g*******y 发帖数: 1930 | 12 哦,我重新想了下复杂度,是k*logk*n,不过k很小嘛,klogk当作常数好了。
提示你,只往一个方向移动指针。很容易做到k^2*n,如果k也比较大,要提高到 klogk*n,要用两个heap,或者一个红黑树。
【在 s*****e 的大作中提到】 : 就是一个动态窗口,先移动尾指针,如果覆盖了所有词,则移动头指针, : 用全局变量来记录当前找到的最优解。 : 这个复杂度是O(n^2)吧,O(n*k)的谁来讲讲啊?
|
s******8 发帖数: 4192 | 13 我的解好像是n*log(k).
移动指针2n.
数组计数log(k)(如果k很大),或者k(如果k很小)
为什么出来n^2?
【在 s*****e 的大作中提到】 : 就是一个动态窗口,先移动尾指针,如果覆盖了所有词,则移动头指针, : 用全局变量来记录当前找到的最优解。 : 这个复杂度是O(n^2)吧,O(n*k)的谁来讲讲啊?
|
r**u 发帖数: 1567 | 14 Could you explain why "数组计数log(k)(如果k很大),或者k(如果k很小)"?
Thx.
【在 s******8 的大作中提到】 : 我的解好像是n*log(k). : 移动指针2n. : 数组计数log(k)(如果k很大),或者k(如果k很小) : 为什么出来n^2?
|
s******8 发帖数: 4192 | 15 如果k比较小,就建个数组,遇到一个word,就一个个找过去,直到找到这个word对应
的entry. O(k).
如果k比较大,用hash table或者tree,查找entry,就是O(log(k)).
【在 r**u 的大作中提到】 : Could you explain why "数组计数log(k)(如果k很大),或者k(如果k很小)"? : Thx.
|