One Phone Interview Problem - JobHunting版

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

JobHunting版 - One Phone Interview Problem

相关主题
● 专家们，find the longest common substring of two strings	● 面试题目：有２个字符串，消除第一个字符串中第二个字符串包含的所有字母。例如： string1: helloworld string2: abcdef output: hlloworld 面
● Amazon	● 问到题
● 给一个大俗之一的面经吧。	● 请教一道leetcode的题目
● 一个基本的string问题	● 请问一个题目
● amazon 第一轮电话面试	● 一道string matching的题目
● 回馈本版，贴GOOGLE电话面经	● g电面
● 关于anagram的老题？	● ［轉載］amazon online test面经，现在真变态，开摄像头，考gre逻辑。
● 一周多了。。。等的太不淡定了。。。说两个面经吧	● 请问facebook末位淘汰是怎样的哦有offer了但是怕去了不能survive

相关话题的讨论汇总
话题: seq1话题: seq2话题: abcdefg话题: problem话题: window

进入JobHunting版参与讨论

(共1页)

s****n
发帖数: 1237

Give you two sequences of length N, how to find the max window of matching
patterns. The patterns can be mutated.
For example, seq1 = "ABCDEFG", seq2 = "DBCAPFG", then the max window is 4. (
ABCD from seq1 and DBCA from seq2). 起始位置无需相同。
我一点头绪都没有，就想出了brutal force的办法。对方说可以利用都是N的特性，可
以sort，还是不懂。请教一下应该怎么做。

h**6
发帖数: 4160

是否一定要相同位置呢？比如第一个字符串的1~4和第二个字符串的5~8可不可以？

b********e
发帖数: 693

下面的方法,可能稍微优化,但是不太想正确答案
1 .如果两个字符串size相同,那么先sort,然后比较, 如果相同,就是最大值, 否则
2 .在seq2里面, 找到第一个不在seq1里面的字符, 假设位置是X
那么最大值就在(0,x)和(x+1,n-1)这两个数组里面, 假设这两个字符串是A和B
如果所有字符都出现了, 那么这2个字符串是相等的

(

【在 s****n 的大作中提到】

: Give you two sequences of length N, how to find the max window of matching
: patterns. The patterns can be mutated.
: For example, seq1 = "ABCDEFG", seq2 = "DBCAPFG", then the max window is 4. (
: ABCD from seq1 and DBCA from seq2). 起始位置无需相同。
: 我一点头绪都没有，就想出了brutal force的办法。对方说可以利用都是N的特性，可
: 以sort，还是不懂。请教一下应该怎么做。

b********e
发帖数: 693

应该是可以的

【在 h**6 的大作中提到】

: 是否一定要相同位置呢？比如第一个字符串的1~4和第二个字符串的5~8可不可以？

s*********t
发帖数: 1663

dp[i] = set up to position i
for i = 0 to n
for j = 0 to i
if dp1[i]-dp1[j] == dp2[i] - dp2[j]
then result.append()
O(n^2)

(

【在 s****n 的大作中提到】

s****n
发帖数: 1237

起始位置不用相同

【在 h**6 的大作中提到】

: 是否一定要相同位置呢？比如第一个字符串的1~4和第二个字符串的5~8可不可以？

s*********t
发帖数: 1663

得，那我做的根本不对

【在 s****n 的大作中提到】

: 起始位置不用相同

s****n
发帖数: 1237

这个问题我觉得他也没有讲清楚，所以我以我的理解来阐述题目。
每个seq里面的字符应该是可以重复的
另外第一步你考虑整个str,只有C(N,N)=1种。
但是如果你说在(0,x)里面的时候，就有C(N,x)种组合，这个复杂度和我brute force有
点象

【在 b********e 的大作中提到】

: 下面的方法,可能稍微优化,但是不太想正确答案
: 1 .如果两个字符串size相同,那么先sort,然后比较, 如果相同,就是最大值, 否则
: 2 .在seq2里面, 找到第一个不在seq1里面的字符, 假设位置是X
: 那么最大值就在(0,x)和(x+1,n-1)这两个数组里面, 假设这两个字符串是A和B
: 如果所有字符都出现了, 那么这2个字符串是相等的
:
: (

b********e
发帖数: 693

还是能有点优化的
假设第一次分开两个数组, 一个X大小, 一个Y大小
原始的SEQ1是N大小.
当我们比较SEQ1和X的时候, 我们就开是否SEQ1里面的字符都在X出现, 否则找到一个没
出现的字符, 对SEQ1进行分化成两个数组

【在 s****n 的大作中提到】

: 这个问题我觉得他也没有讲清楚，所以我以我的理解来阐述题目。
: 每个seq里面的字符应该是可以重复的
: 另外第一步你考虑整个str,只有C(N,N)=1种。
: 但是如果你说在(0,x)里面的时候，就有C(N,x)种组合，这个复杂度和我brute force有
: 点象

b********e
发帖数: 693

或者我猜,会不会是DP阿

(

【在 s****n 的大作中提到】

相关主题
● 回馈本版，贴GOOGLE电话面经	● 面试题目：有２个字符串，消除第一个字符串中第二个字符串包含的所有字母。例如： string1: helloworld string2: abcdef output: hlloworld 面
● 关于anagram的老题？	● 问到题
● 一周多了。。。等的太不淡定了。。。说两个面经吧	● 请教一道leetcode的题目
进入JobHunting版参与讨论

s****n
发帖数: 1237

问题是如果不一样长，就不能直接判断了。
假设seq1 = ABCDE， X=ACE。
X里面的字符都出现在seq1里面，但是max window size不是3是1.

【在 b********e 的大作中提到】

: 还是能有点优化的
: 假设第一次分开两个数组, 一个X大小, 一个Y大小
: 原始的SEQ1是N大小.
: 当我们比较SEQ1和X的时候, 我们就开是否SEQ1里面的字符都在X出现, 否则找到一个没
: 出现的字符, 对SEQ1进行分化成两个数组

s****n
发帖数: 1237

应该是DP一类的，但是怎么构建这个DP我当时没有想出

【在 b********e 的大作中提到】

: 或者我猜,会不会是DP阿
:
: (

b********e
发帖数: 693

这个时候就是用X去比较Seq1,从头开始
Seq1之后下面几个组合
ABC
BCD
CDE
每次比较都可以重复前面的2步骤, sort, 然后分割
回避bruceforce少不少的
应为最后的结果是递归出来的, 所以在每个分叉上找到最大值就行了

【在 s****n 的大作中提到】

: 问题是如果不一样长，就不能直接判断了。
: 假设seq1 = ABCDE， X=ACE。
: X里面的字符都出现在seq1里面，但是max window size不是3是1.

b********e
发帖数: 693

看到DP头疼...

【在 s****n 的大作中提到】

: 应该是DP一类的，但是怎么构建这个DP我当时没有想出

s****n
发帖数: 1237

你的那个解法应该就是DP的思路了，牛啊。

【在 b********e 的大作中提到】

: 看到DP头疼...

b********e
发帖数: 693

我的哪个有点divide and conque呵呵
回头再想想

【在 s****n 的大作中提到】

: 你的那个解法应该就是DP的思路了，牛啊。

b********h
发帖数: 119

你这个难道不可以扩展到二维吗？
dp[i][j] = set of chars from pos i to pos j
for len in [n, 1]
for i in [0, n-len)
if dp1[i][i+len] == dp2[i][i+len]
return len

【在 s*********t 的大作中提到】

: dp[i] = set up to position i
: for i = 0 to n
: for j = 0 to i
: if dp1[i]-dp1[j] == dp2[i] - dp2[j]
: then result.append()
: O(n^2)
:
: (

B*****t
发帖数: 335

先找出两个seq的suffix array，然后在SA上找max window，找max window的过程和
已知中序，后续遍历，重建二叉树的方法很象。复杂度O(nlogn)

of matching
window is 4. (

【在 s****n 的大作中提到】

y*********e
发帖数: 518

这个题目我能想到的就是Dynamic Programming了。
比较2个bag of chars，可以把它们转化成一种shorten form，which are sorted
alphabets with frequencies.
比如：
AABEDEF -> A2 B1 D1 E2 F1
FFACBDX -> A1 B1 C1 D1 X1 F2
Create, sort and copare two shorten forms would take O(N). Also, find
differences between two shorten forms would take O(logN)。
现在，把2个string各自转化成shorten form。如果是一样的，那么答案就已经出来了。
否则，从string1里面找不存在于string2的字符，把string1截取成若干个小的字符串
。同样地，从string2里面找不存在于string1的字符，把string2截成若干个小的字符
串。这样，我们就有2个字符串数组，比较每一个pair，（这是和本题同样的问题但是
规模小了一些）

f**********w
发帖数: 93

A[n], B[m], 转化成矩阵K[n，m],如果a[i] == b[j], 设k[i,j]=1,否则设0,找矩阵的
最大diagonal为一的连续子矩阵，因该有O（n^2)的算法

相关主题
● 请问一个题目	● ［轉載］amazon online test面经，现在真变态，开摄像头，考gre逻辑。
● 一道string matching的题目	● 请问facebook末位淘汰是怎样的哦有offer了但是怕去了不能survive
● g电面	● shorten url 单机解法抛砖引玉
进入JobHunting版参与讨论

d****t
发帖数: 6

this finds the same substring. not anagrams.
but it can be modified to get an O(n^2) solution.
do the same matrix computation.
but instead of looking for the sub-square that has all 1s in its diagonal,
look for the max sub-square that has no row that is all 0s and no column
that is all 0s.
To give an example: string a = "ABCDEFG", string b = "KDACBPG"
A B C D E F G
K 0 0 0 0 0 0 0
D 0 0 0 1 0 0 0
A 1 0 0 0 0 0

【在 f**********w 的大作中提到】

: A[n], B[m], 转化成矩阵K[n，m],如果a[i] == b[j], 设k[i,j]=1,否则设0,找矩阵的
: 最大diagonal为一的连续子矩阵，因该有O（n^2)的算法

s*******t
发帖数: 248

(

【在 s****n 的大作中提到】

c***2
发帖数: 838

Let’s first simplify the problem:
1) Assume the elements of both sequences contain only 26 upper case letters
(A..Z)
2) Assume both sequences don't have duplicate elements,
for example, "AABBC" is not allowed
With these in mind, it will be quite easy:
1) Map each sequence to a unsigned int32 (lower 26 bits)
seq1 = "ABCDEFG",
maps/hashes to b1= (000000) 00000000000000000001111111
seq2 = "DBCAPFG",
maps/hashes to b2=(000000) 00000000001000000001101111
2) c=b1&b2
3) Find the longest

h**k
发帖数: 3368

可以用二分法来确定最大长度x，x应该在区间（1，min(str1_len, str2_len))里。
对于给定长度x，我们可以用O(n)来判断是否在str1和str2中各存在一个长度为x的子串
互为anagram。基本思路是对于str1和str2中的每个可能长度为x的substring，生成
counter array，记录里面每个字符出现的次数。比如baca的counter table就是count[
a]=2, count[b]=1, count[c]=1。然后把array的值存入hash table。这样就能找出互
为anagram的两个子串。

(

【在 s****n 的大作中提到】

A*********r
发帖数: 564

我怎么觉得没有那么复杂啊，貌似复杂度就是sort的复杂度。。
用楼主的例子:
seq1: ABCDEFG
seq2: DBCAPFG
可以得到seq2中每个字母在seq1对应的坐标依次为， O（n)可以实现吧。
3 1 2 0 -1 5 6
P字母没有在seq1里面出现，所以标记为-1, 然后sort这个坐标组里面足用-1分割开的
每个段，变为
0 1 2 3 -1 5 6
然后找最长的strictly increment 的一组数就可以了。。
关于这个间隔sort的可行性和复杂度，可以再讨论一下。。

(

【在 s****n 的大作中提到】

p********7
发帖数: 549

我觉得这个办法最好，实现起来最简单。平均复杂度和最差复杂度都是N^2，其他办法
最差复
杂度还是N^2，估计平均复杂度少一些

【在 d****t 的大作中提到】

: this finds the same substring. not anagrams.
: but it can be modified to get an O(n^2) solution.
: do the same matrix computation.
: but instead of looking for the sub-square that has all 1s in its diagonal,
: look for the max sub-square that has no row that is all 0s and no column
: that is all 0s.
: To give an example: string a = "ABCDEFG", string b = "KDACBPG"
: A B C D E F G
: K 0 0 0 0 0 0 0
: D 0 0 0 1 0 0 0

c*******w
发帖数: 63

假如:
Seq1: ABXCDEFG
Seq2: DBCAPFGX
Seq中的每个字母坐标:
4 1 3 0 (-1) 6 7 2
sort之后
0 1 3 4 -1 2 6 7
Window[A,B], Window[C,D]等都不是valid的candidate啊?
不知道我有没有正确理解AprilFlower的办法.

【在 A*********r 的大作中提到】

: 我怎么觉得没有那么复杂啊，貌似复杂度就是sort的复杂度。。
: 用楼主的例子:
: seq1: ABCDEFG
: seq2: DBCAPFG
: 可以得到seq2中每个字母在seq1对应的坐标依次为， O（n)可以实现吧。
: 3 1 2 0 -1 5 6
: P字母没有在seq1里面出现，所以标记为-1, 然后sort这个坐标组里面足用-1分割开的
: 每个段，变为
: 0 1 2 3 -1 5 6
: 然后找最长的strictly increment 的一组数就可以了。。

c*******w
发帖数: 63

Counter Example:
Seq1: ABCDEFG
Seq2: ABXCDYZ
maps/hashes to b1= (000000) 00000000000000000001111111
maps/hashes to b2= (000000) 11100000000000000000001111
Is the answer: ABCD? It is not correct.

letters

【在 c***2 的大作中提到】

: Let’s first simplify the problem:
: 1) Assume the elements of both sequences contain only 26 upper case letters
: (A..Z)
: 2) Assume both sequences don't have duplicate elements,
: for example, "AABBC" is not allowed
: With these in mind, it will be quite easy:
: 1) Map each sequence to a unsigned int32 (lower 26 bits)
: seq1 = "ABCDEFG",
: maps/hashes to b1= (000000) 00000000000000000001111111
: seq2 = "DBCAPFG",

c*******w
发帖数: 63

请问, 找max sub square maxtrix的过程, 怎么用O(n^2)的算法实现啊?
谢谢!

【在 p********7 的大作中提到】

: 我觉得这个办法最好，实现起来最简单。平均复杂度和最差复杂度都是N^2，其他办法
: 最差复
: 杂度还是N^2，估计平均复杂度少一些

(共1页)

进入JobHunting版参与讨论

相关主题
● 请问facebook末位淘汰是怎样的哦有offer了但是怕去了不能survive	● amazon 第一轮电话面试
● shorten url 单机解法抛砖引玉	● 回馈本版，贴GOOGLE电话面经
● A onsite被拒,面经,求分析失败原因	● 关于anagram的老题？
● 问一道 C/C++ 题	● 一周多了。。。等的太不淡定了。。。说两个面经吧
● 专家们，find the longest common substring of two strings	● 面试题目：有２个字符串，消除第一个字符串中第二个字符串包含的所有字母。例如： string1: helloworld string2: abcdef output: hlloworld 面
● Amazon	● 问到题
● 给一个大俗之一的面经吧。	● 请教一道leetcode的题目
● 一个基本的string问题	● 请问一个题目

相关话题的讨论汇总
话题: seq1话题: seq2话题: abcdefg话题: problem话题: window

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

boards

未名新帖统计// 7月16日

历史上的今天