由买买提看人间百态

boards

本页内容为未名空间相应帖子的节选和存档,一周内的贴子最多显示50字,超过一周显示500字 访问原贴
JobHunting版 - programming pearls 上一题讨论
相关主题
请问一道面试题Amazon电面面经
求助:bitmap的问题general solution for missing number(s) problem
两个programming pearls的习题请教Interview street上的一题求助
看programming pearl进行时的感想一道算法题目
问一道排序题着急请大家帮我看看怎么弄
Leetcode书中missing range一题的答案是不是错的?CSC ADV PP Receipt Number 分析
再出一题Bitmap是怎么回事啊?
Amazon(4)昨天面试的题目
相关话题的讨论汇总
话题: range话题: 32话题: numbers话题: number话题: input
进入JobHunting版参与讨论
1 (共1页)
c******n
发帖数: 4965
1
常见的。 为严谨起见我逐字抄下来
given a sequential file that contains ____ at most four billion 32-bit
integers -_____ in random order, find a 32-bit integer that isn't in the
file
简单的办法,如果有4G内存, 就做一个bitmap 可以找到。 书里讨论如果只有几百
byte内存, 但可以写文件, 怎么弄。
下面剧透。
书上说,把要找的range 分两半 , scan 过整个文件,看每一个range 里有多少, 肯
定有一个range 不够满,那就继续找那个range, 这样range 指数变小,最后为size 1,
就是要找的missing number.
我开始理解错,以为是有N numbers,N>2^32, but only 2^32-m of them are
distinct, where 1<=m < 2^32. 这样的话对每个小range 计数就不能认定是否有
missing number. 比如
input is
[0, 1 , 3 ]
(assume input range is 2^2 instead of 2^32 )
最开始range 是0--3
先在 0--1 range 找, 有两个数 (0,1), 再去2--3 找, 有一个数, range 就变
为2--3
然后找到2 missing.
如果是说条件只是限制unique numbers, 总数不管, 那sample input 变成
[0,1,3,3]
这样两个sub range 都有足够的数,就不行了。
用前面bitmap , 新的条件也可以解, 但binary search 的思路办法似乎就不成了。
h**********c
发帖数: 4120
2
算连续数的sha1,n^2
h**********c
发帖数: 4120
3
n! mod \bigpi a_i equal to the missed number.
n! needs nlogn bits
c******n
发帖数: 4965
4
first of all mul and div are too slow
even if u agree to use mul and div, this method is still only as good as the
one given in the book, and can't handle duplicate numbers , which is the
further case I raised

【在 h**********c 的大作中提到】
: n! mod \bigpi a_i equal to the missed number.
: n! needs nlogn bits

c******n
发帖数: 4965
5
more details please ?

【在 h**********c 的大作中提到】
: 算连续数的sha1,n^2
h**********c
发帖数: 4120
6
还没想清楚,先站个坑
不过假设你只有4k内寸,可以先捞1~k
没有missed,再捞k+1~8K

【在 c******n 的大作中提到】
: more details please ?
c******n
发帖数: 4965
7


【在 h**********c 的大作中提到】
: 还没想清楚,先站个坑
: 不过假设你只有4k内寸,可以先捞1~k
: 没有missed,再捞k+1~8K

c******n
发帖数: 4965
8
就思路来说,这个办法肯定可以在面试上加分, 基本上就是“combine/hybrid
existing 2 approaches", 但实际不work , 4B-->4K you have to chop the input
file into 1mil small files and do in-memory search for 1mil times.

【在 h**********c 的大作中提到】
: 还没想清楚,先站个坑
: 不过假设你只有4k内寸,可以先捞1~k
: 没有missed,再捞k+1~8K

c******n
发帖数: 4965
9
常见的。 为严谨起见我逐字抄下来
given a sequential file that contains ____ at most four billion 32-bit
integers -_____ in random order, find a 32-bit integer that isn't in the
file
简单的办法,如果有4G内存, 就做一个bitmap 可以找到。 书里讨论如果只有几百
byte内存, 但可以写文件, 怎么弄。
下面剧透。
书上说,把要找的range 分两半 , scan 过整个文件,看每一个range 里有多少, 肯
定有一个range 不够满,那就继续找那个range, 这样range 指数变小,最后为size 1,
就是要找的missing number.
我开始理解错,以为是有N numbers,N>2^32, but only 2^32-m of them are
distinct, where 1<=m < 2^32. 这样的话对每个小range 计数就不能认定是否有
missing number. 比如
input is
[0, 1 , 3 ]
(assume input range is 2^2 instead of 2^32 )
最开始range 是0--3
先在 0--1 range 找, 有两个数 (0,1), 再去2--3 找, 有一个数, range 就变
为2--3
然后找到2 missing.
如果是说条件只是限制unique numbers, 总数不管, 那sample input 变成
[0,1,3,3]
这样两个sub range 都有足够的数,就不行了。
用前面bitmap , 新的条件也可以解, 但binary search 的思路办法似乎就不成了。
h**********c
发帖数: 4120
10
算连续数的sha1,n^2
相关主题
Leetcode书中missing range一题的答案是不是错的?Amazon电面面经
再出一题general solution for missing number(s) problem
Amazon(4)Interview street上的一题求助
进入JobHunting版参与讨论
h**********c
发帖数: 4120
11
n! mod \bigpi a_i equal to the missed number.
n! needs nlogn bits
c******n
发帖数: 4965
12
first of all mul and div are too slow
even if u agree to use mul and div, this method is still only as good as the
one given in the book, and can't handle duplicate numbers , which is the
further case I raised

【在 h**********c 的大作中提到】
: n! mod \bigpi a_i equal to the missed number.
: n! needs nlogn bits

c******n
发帖数: 4965
13
more details please ?

【在 h**********c 的大作中提到】
: 算连续数的sha1,n^2
h**********c
发帖数: 4120
14
还没想清楚,先站个坑
不过假设你只有4k内寸,可以先捞1~k
没有missed,再捞k+1~8K

【在 c******n 的大作中提到】
: more details please ?
c******n
发帖数: 4965
15


【在 h**********c 的大作中提到】
: 还没想清楚,先站个坑
: 不过假设你只有4k内寸,可以先捞1~k
: 没有missed,再捞k+1~8K

c******n
发帖数: 4965
16
就思路来说,这个办法肯定可以在面试上加分, 基本上就是“combine/hybrid
existing 2 approaches", 但实际不work , 4B-->4K you have to chop the input
file into 1mil small files and do in-memory search for 1mil times.

【在 h**********c 的大作中提到】
: 还没想清楚,先站个坑
: 不过假设你只有4k内寸,可以先捞1~k
: 没有missed,再捞k+1~8K

h**********c
发帖数: 4120
17
Took some time think over this and I made an assumption, suppose there is no
repeating numbers. So we want to find the single missing number ranging
from 0 to 2^32-1 unsinged integers in a random sequence.
First, need background of unique factorial theorem.
Second, search prime numbers 32 bit in Internet, you will find we have total
about 200,xxx,xxx prime numbers in this scope.
Here is the deal.
suppose we have a_1 ... a_m prime numbers. a_1*...*a_m is the ceiling of 2^
32 -1 with any extra prime number needed. I think the count of a_1 to a_m
will be far less than 200,xxx,xxx. Think over it?
So put a_1 and a_m in a hashmap. The value will be the count of given prime
number.
From 0 to 2^32-1 - 1, the hashmap above will have a unique sequence.
Then we process the input, factor each number in the input data. Count the
presence of each prime number (add to the according value in the hashmap).
We will detect the prime numbers fail to present, compare to an total
sequence of 0 to 2^32-1. Their multiple is the single missed numbers.
If there are multiple missed numbers, we can use a second pass, use the
different combinations of the missed primes, screening the input data again.
So it will be good the OS has a local prime table.
Just for discussion for potential improvement of the situation. Please feel
free to comment. Thank off irrelevants!
MIT License.

1,

【在 c******n 的大作中提到】
: 常见的。 为严谨起见我逐字抄下来
: given a sequential file that contains ____ at most four billion 32-bit
: integers -_____ in random order, find a 32-bit integer that isn't in the
: file
: 简单的办法,如果有4G内存, 就做一个bitmap 可以找到。 书里讨论如果只有几百
: byte内存, 但可以写文件, 怎么弄。
: 下面剧透。
: 书上说,把要找的range 分两半 , scan 过整个文件,看每一个range 里有多少, 肯
: 定有一个range 不够满,那就继续找那个range, 这样range 指数变小,最后为size 1,
: 就是要找的missing number.

j**********3
发帖数: 3211
18
这书现在还有看的意义么?
C****t
发帖数: 53
19
Probably we can do in this way.
Build an array rec of length 32 which save the digit sum of all numbers.
for num in nums:
for i in range(32):
rec[i] += num>>i &1
if there is no missing number at all, then rec[i] == 2^31 for all i
if there is some missing numbers, check the first index i such that rec[i] <
2^31, then the missing number if of the form
XXXXXXXXXXXX 1 00000000
where digit 1 is located at the i-th place from the right.
We may repeat this process for 32 times.
1 (共1页)
进入JobHunting版参与讨论
相关主题
昨天面试的题目问一道排序题
贡献几道A家intern的题Leetcode书中missing range一题的答案是不是错的?
多年不打鱼连网都不会用了再出一题
是不是被老印耍了。Amazon(4)
请问一道面试题Amazon电面面经
求助:bitmap的问题general solution for missing number(s) problem
两个programming pearls的习题请教Interview street上的一题求助
看programming pearl进行时的感想一道算法题目
相关话题的讨论汇总
话题: range话题: 32话题: numbers话题: number话题: input