y******n 发帖数: 67 | 1 Given an file which consists thousands of lines. each line consists of a
string and several integers.
design an algorithm which take input of several integers and print out the
string of the line that have most matches.
input file
----------
aa 3 4 10 2
bb 9 14 15 21 3
cc 12 1024 200 3 9 4
----------
examples:
input: 3 4 10
output: aa
input: 12 3 4
output: cc
input: 3 9
output: bb
input: 3 9
output: cc
input: 3 4 12
output: cc
Thanks! |
l*********8 发帖数: 4642 | 2 build a hash map:
key is the integer
value is list, which indicates which lines contain this integer.
scan the file and construct the hashmap.
when matching the input integers
create a BST to store the number of lines and how many matches it has.
【在 y******n 的大作中提到】 : Given an file which consists thousands of lines. each line consists of a : string and several integers. : design an algorithm which take input of several integers and print out the : string of the line that have most matches. : input file : ---------- : aa 3 4 10 2 : bb 9 14 15 21 3 : cc 12 1024 200 3 9 4 : ----------
|
m***n 发帖数: 2154 | 3 这不就是two sigma的原题么。我自认为都作对了,他都不理我了。 |
l*********8 发帖数: 4642 | 4 How did you answer the question?
【在 m***n 的大作中提到】 : 这不就是two sigma的原题么。我自认为都作对了,他都不理我了。
|
l*********8 发帖数: 4642 | 5 what kind of company two sigma is?
【在 m***n 的大作中提到】 : 这不就是two sigma的原题么。我自认为都作对了,他都不理我了。
|
y******n 发帖数: 67 | 6 so every time when doing a match, a BST is created? if that has to be the
way, how about creating a max-heap?
can we not creating BST frequently? I am looking for something that is more
efficient in run-time! creating BST in run time is memory-consuming and time
-consuming given large # of data sets.
thanks! |
l*********8 发帖数: 4642 | 7 Maybe we can only create the BST once. And we clear the BST every time
before matching a new group of integers.
more
time
【在 y******n 的大作中提到】 : so every time when doing a match, a BST is created? if that has to be the : way, how about creating a max-heap? : can we not creating BST frequently? I am looking for something that is more : efficient in run-time! creating BST in run time is memory-consuming and time : -consuming given large # of data sets. : thanks!
|