由买买提看人间百态

boards

本页内容为未名空间相应帖子的节选和存档,一周内的贴子最多显示50字,超过一周显示500字 访问原贴
JobHunting版 - 问一下Repeated DNA Sequences
相关主题
献码:Repeated DNA Sequences问道G题(4)
问几道面试题Google到底比Microsoft强多少啊?
请教一道 G 家 DNA edit distance的题求解一道面试题 snake sequence
Bloomberg电面题目+ 攒RP for onsite请教一个面试题
L一个电面题问道题
two job openings for research assistant ( next generation sequencing related )T problem
怎样求common divisor 最快问个puzzle
一道google 题,谁给翻译一下意思,多谢。工程类转行cs,一定需要学位吗?
相关话题的讨论汇总
话题: sequences话题: 0x3fffffff话题: dna话题: repeated话题: string
进入JobHunting版参与讨论
1 (共1页)
a***e
发帖数: 413
1
自己做的总是memory exceeded,
看了讨论,
https://oj.leetcode.com/discuss/24478/i-did-it-in-10-lines-of-c
还是不懂为啥m[t = t << 3 & 0x3FFFFFFF | s[i] & 7]++ == 1就能达到rolling hash
的效果呢?
i只是一个char啊,这个是哪里推导出来的?多谢!
Neat idea. The additional 1 bit per letter still encode each substring in
10x3 = 30 bits, just under 4 bytes for a 32-bit integer.
Your code could be further simplified. By observing that s[i] & 7 is never 0
, each of the first nine substrings with length < 10 will have unique hash
key and will never collide with other 10-letter long sequences. Therefore
the first loop could be removed and be compacted into a single loop.
vector findRepeatedDnaSequences(string s) {
unordered_map m;
vector r;
for (int t = 0, i = 0; i < s.size(); i++)
if (m[t = t << 3 & 0x3FFFFFFF | s[i] & 7]++ == 1)
r.push_back(s.substr(i - 9, 10));
return r;
}
s****a
发帖数: 794
2
因为他说ACTG在ASC里面后三位不一样吧
其实你就说A001 C010 T011 G100啥都没有是000 一样的
无非赶得巧省几行代码
p*******e
发帖数: 4
3
你不一定非要用这种方法,一个字母用2个bit表示,就不会memory exceed了
m[t = t << 3 & 0x3FFFFFFF | s[i] & 7]++ == 1
这里的s[i] & 7 是取下一个字母N,
t << 3 & 0x3FFFFFFF 是保留当前10-letter string的后9个letter S,
| 是把 N 合并到S中,这样就更新了key
1 (共1页)
进入JobHunting版参与讨论
相关主题
工程类转行cs,一定需要学位吗?L一个电面题
G 家面经two job openings for research assistant ( next generation sequencing related )
8 queens问题最好解法是什么?时间复杂度?怎样求common divisor 最快
[合集] business casual是什么着装啊?一道google 题,谁给翻译一下意思,多谢。
献码:Repeated DNA Sequences问道G题(4)
问几道面试题Google到底比Microsoft强多少啊?
请教一道 G 家 DNA edit distance的题求解一道面试题 snake sequence
Bloomberg电面题目+ 攒RP for onsite请教一个面试题
相关话题的讨论汇总
话题: sequences话题: 0x3fffffff话题: dna话题: repeated话题: string