Twitter 电面 - JobHunting版

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

相关主题
● [teradata面经] hadoop engineer	● 哪个高手能指出我程序的问题（20几行的代码）
● 感恩发面经-Amazon第一轮电面	● Twitter电面未通过
● 被印度人压迫的同胞们，请支持一下 #indiansentiment	● Amazon 面经
● 再来一遍：支持梁警官，支持腿特	● Google电面
● 再说推特，以及Peter Liang #justice4liang	● An Old Question -- Top N Occurance Frequance
● DESPERATE: Twitter Now Trying To Quarantine Alt-Right After Failure To Destroy It	● hash_map 的遍历问题
● 如何有效反对S.386行动指南【9/21最新版】 (转载)	● 问个amazon面试题
● Groupon电面	● How to find 10 most frequent strings in 10 billion string list?

相关话题的讨论汇总
话题: hashtag话题: node话题: frequency话题: hashtags话题: dll

进入JobHunting版参与讨论

(共1页)

c********n
发帖数: 26

一位加拿大香港人面的
given a stream of hashtags, find out most frequent hashtag within last W
number of hashtags
应该就是 max sliding window 的变种
挂了 ...

t*********i
发帖数: 20

哪位牛人能不能给讲讲解法?

U***A
发帖数: 849

题目都没有看懂

c******w
发帖数: 1108

用一个circular array存most recent W hashtags
用一个size不超过W的doubleLinkedList of
DLL_Node{
String hashtag;
int frequency;
DLL_Node prev, next;
}
存circular array里的unique hashtag和对应的frequency。并按frequency排序。
用一个HashMap来通过hashtag找到doubleLinkedList里对应的node。
扫进来一个已有的（或新的）hashtag，找到（或create）对应的node，frequency++，
然后比较prev.frequency。如果 > 就swap。repeat till <=
对被挤走的hashtag，frequency--,然后比较next.frequency。如果 < 就swap。repeat
till >=。为0则remove。

w**********o
发帖数: 140

http://www.geeksforgeeks.org/find-the-maximum-repeating-number-
O(n) time and O(1) extra space.
順便求Twitter內推.

b**********5
发帖数: 7881

not the same problem

【在 w**********o 的大作中提到】

: http://www.geeksforgeeks.org/find-the-maximum-repeating-number-
: O(n) time and O(1) extra space.
: 順便求Twitter內推.

n*****5
发帖数: 984

用heap保持数量，用hashmap找点。

c*****m
发帖数: 271

感觉这样应该是对的，还得加一个queue记录最近的W个hashtag，每次有一个新的
hashtag来时，queue.pop_front()，从hashmap中找到index，再在heap中更新吧？

【在 n*****5 的大作中提到】

: 用heap保持数量，用hashmap找点。

s******d
发帖数: 9806

找frequency的部分用bucket（double linked）更好一些。每个node就是frequency是
某一个值的所有hashtag，这样每次操作只需要移动到相邻的bucket，或者创建新的
bucket（因为frequency只有加减1）。

node。

【在 c******w 的大作中提到】

: 用一个circular array存most recent W hashtags
: 用一个size不超过W的doubleLinkedList of
: DLL_Node{
: String hashtag;
: int frequency;
: DLL_Node prev, next;
: }
: 存circular array里的unique hashtag和对应的frequency。并按frequency排序。
: 用一个HashMap来通过hashtag找到doubleLinkedList里对应的node。
: 扫进来一个已有的（或新的）hashtag，找到（或create）对应的node，frequency++，

n*****5
发帖数: 984

谢谢补充，你说的对。

【在 c*****m 的大作中提到】

: 感觉这样应该是对的，还得加一个queue记录最近的W个hashtag，每次有一个新的
: hashtag来时，queue.pop_front()，从hashmap中找到index，再在heap中更新吧？

相关主题
● DESPERATE: Twitter Now Trying To Quarantine Alt-Right After Failure To Destroy It	● 哪个高手能指出我程序的问题（20几行的代码）
● 如何有效反对S.386行动指南【9/21最新版】 (转载)	● Twitter电面未通过
● Groupon电面	● Amazon 面经
进入JobHunting版参与讨论

c******w
发帖数: 1108

这个优化确实好!之前我就在想要是有一堆hashtags的frequency都一样那update就太
expensive了.
用bucket的话doubleLinkedList里面的node应该就是
DLL_Node{
DLL_Node prev, next;
HashSet hashtags;
int frequency;
}
这样一来每次update都只需要O(1)的操作.time complexity肯定比用heap要好了.

【在 s******d 的大作中提到】

: 找frequency的部分用bucket（double linked）更好一些。每个node就是frequency是
: 某一个值的所有hashtag，这样每次操作只需要移动到相邻的bucket，或者创建新的
: bucket（因为frequency只有加减1）。
:
: node。

U***A
发帖数: 849

是要说以下想法还是要写代码？

g****v
发帖数: 971

先建一个size为W的circular array，然后建立一个heap。
circular array的元素map to heap中的元素。
这样insert新元素的时候，复杂度是O(lgW)。

c********n
发帖数: 26

一位加拿大香港人面的
given a stream of hashtags, find out most frequent hashtag within last W
number of hashtags
应该就是 max sliding window 的变种
挂了 ...

t*********i
发帖数: 20

哪位牛人能不能给讲讲解法?

U***A
发帖数: 849

题目都没有看懂

c******w
发帖数: 1108

w**********o
发帖数: 140

http://www.geeksforgeeks.org/find-the-maximum-repeating-number-
O(n) time and O(1) extra space.
順便求Twitter內推.

b**********5
发帖数: 7881

not the same problem

【在 w**********o 的大作中提到】

: http://www.geeksforgeeks.org/find-the-maximum-repeating-number-
: O(n) time and O(1) extra space.
: 順便求Twitter內推.

n*****5
发帖数: 984

用heap保持数量，用hashmap找点。

相关主题
● Google电面	● 问个amazon面试题
● An Old Question -- Top N Occurance Frequance	● How to find 10 most frequent strings in 10 billion string list?
● hash_map 的遍历问题	● 一道fb的题，clone a graph
进入JobHunting版参与讨论

c*****m
发帖数: 271

: 用heap保持数量，用hashmap找点。

s******d
发帖数: 9806

n*****5
发帖数: 984

谢谢补充，你说的对。

【在 c*****m 的大作中提到】

: 感觉这样应该是对的，还得加一个queue记录最近的W个hashtag，每次有一个新的
: hashtag来时，queue.pop_front()，从hashmap中找到index，再在heap中更新吧？

c******w
发帖数: 1108

U***A
发帖数: 849

是要说以下想法还是要写代码？

g****v
发帖数: 971

先建一个size为W的circular array，然后建立一个heap。
circular array的元素map to heap中的元素。
这样insert新元素的时候，复杂度是O(lgW)。

f********c
发帖数: 147

这样改了DLL_NODE之后，HashMap也要相应修改是吗？

【在 c******w 的大作中提到】

: 这个优化确实好!之前我就在想要是有一堆hashtags的frequency都一样那update就太
: expensive了.
: 用bucket的话doubleLinkedList里面的node应该就是
: DLL_Node{
: DLL_Node prev, next;
: HashSet hashtags;
: int frequency;
: }
: 这样一来每次update都只需要O(1)的操作.time complexity肯定比用heap要好了.

S*******w
发帖数: 24236

太难了题目都看不懂

【在 c********n 的大作中提到】

: 一位加拿大香港人面的
: given a stream of hashtags, find out most frequent hashtag within last W
: number of hashtags
: 应该就是 max sliding window 的变种
: 挂了 ...

i*******e
发帖数: 7

用bucket是不是可以放弃double linked list直接上array就好了，array size = W，
index = frequency - 1. 这样是不是省点事？hashmap works as inverted index,
key = hashtag, value = index of bucket array

【在 c******w 的大作中提到】

c****8
发帖数: 76

想问问如果用heap的话，如何在hashtag在window中消失的时候（也就是frequency为0
的时候）删除head里的这个hashtag节点呢？

【在 c*****m 的大作中提到】

: 感觉这样应该是对的，还得加一个queue记录最近的W个hashtag，每次有一个新的
: hashtag来时，queue.pop_front()，从hashmap中找到index，再在heap中更新吧？

相关主题
● binary tree, sum of 2 nodes == given number	● 感恩发面经-Amazon第一轮电面
● Find top K most frequent numbers?	● 被印度人压迫的同胞们，请支持一下 #indiansentiment
● [teradata面经] hadoop engineer	● 再来一遍：支持梁警官，支持腿特
进入JobHunting版参与讨论

j**********3
发帖数: 3211

没看懂题，谁能解释一下

(共1页)

进入JobHunting版参与讨论

相关主题
● How to find 10 most frequent strings in 10 billion string list?	● 再说推特，以及Peter Liang #justice4liang
● 一道fb的题，clone a graph	● DESPERATE: Twitter Now Trying To Quarantine Alt-Right After Failure To Destroy It
● binary tree, sum of 2 nodes == given number	● 如何有效反对S.386行动指南【9/21最新版】 (转载)
● Find top K most frequent numbers?	● Groupon电面
● [teradata面经] hadoop engineer	● 哪个高手能指出我程序的问题（20几行的代码）
● 感恩发面经-Amazon第一轮电面	● Twitter电面未通过
● 被印度人压迫的同胞们，请支持一下 #indiansentiment	● Amazon 面经
● 再来一遍：支持梁警官，支持腿特	● Google电面

相关话题的讨论汇总
话题: hashtag话题: node话题: frequency话题: hashtags话题: dll

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

boards

未名新帖统计// 7月16日

历史上的今天