boards

本页内容为未名空间相应帖子的节选和存档,一周内的贴子最多显示50字,超过一周显示500字 访问原贴
Programming版 - implement a simple regular expression match?
相关主题
gmail越发难用为毛wildcard之类的search至今不能用?
请教Regular Expression,
请教一个变态的regular expression 替换
怎么用lex处理DFA?
关于regular expression
regular expression
请教怎样尽快看明白同事的程序?
简单的perl正则表达式文本替换多个空行问题求教
有没有谁做 text mining 的?
Python 缩进的syntax
相关话题的讨论汇总
话题: regex话题: nw话题: match话题: expression话题: simple
进入Programming版参与讨论
1 (共1页)
g*********s
发帖数: 1782
1
the regular expression only includes a-z, '*', and '.'
support:
1. "x*", aka 0 or more continuous 'x'
2. ".", aka wildcard matching any letter.
it seems back-trace is inevitable because of the following ambiguity:
match("b", "b*.") == true;
match("b", "b*") == true;
X****r
发帖数: 3557
2
Any formal regular expression can be mapped to finite state
machine, so you don't need to back-trace, i.e. you can scan
input string only once to match for any given regular expression.

【在 g*********s 的大作中提到】
: the regular expression only includes a-z, '*', and '.'
: support:
: 1. "x*", aka 0 or more continuous 'x'
: 2. ".", aka wildcard matching any letter.
: it seems back-trace is inevitable because of the following ambiguity:
: match("b", "b*.") == true;
: match("b", "b*") == true;

g*********s
发帖数: 1782
3
yes, regex -> nfa -> dfa can completely solve the regex issue.
but what i expect here is some simplified solution for this simple regex.
or this is not simple as long as "*" is introduced, and a full fa based
solution is required?

【在 X****r 的大作中提到】
: Any formal regular expression can be mapped to finite state
: machine, so you don't need to back-trace, i.e. you can scan
: input string only once to match for any given regular expression.

X****r
发帖数: 3557
4
For this simple regex you don't need to explicitly convert to DFA.
e.g. you can maintaining a list of possible 'leads' so far.
Of course, this won't be as efficient as pre-compute the
state table for the given regex, but if you want a simple one:
#define MAX_STARS 256
int match(const char *input, const char *regex) {
int p[2][MAX_STARS] = {0};
int c[2] = {1};
int w = 0, nw = 1, i, j;
char ch, nch;
for (; *input; input++) {
c[nw] = 0;
for (i = 0; i < c[w]; i++) {
j = p[w][i];
while (ch = regex[j]) {
nch = regex[j+1];
if (nch == '*') {
if (ch == '.' || ch == *input) {
p[nw][c[nw]++] = j;
}
j += 2;
} else {
if (ch == '.' || ch == *input) {
p[nw][c[nw]++] = j+1;
}
break;
}
}
}
w = nw;
nw = !nw;
}
for (i = 0; i < c[w]; i++) {
for (j = p[w][i]; regex[j] && regex[j+1] == '*'; j+= 2);
if (!regex[j]) {
return 1;
}
}
return 0;
}
(edit: damn BBS ruined the indent format, had to manually space...)

regex.
based

【在 g*********s 的大作中提到】
: yes, regex -> nfa -> dfa can completely solve the regex issue.
: but what i expect here is some simplified solution for this simple regex.
: or this is not simple as long as "*" is introduced, and a full fa based
: solution is required?

g*********s
发帖数: 1782
5
found one in Beautiful Code by K&R's K.
so this is not a trivial question.

【在 X****r 的大作中提到】
: For this simple regex you don't need to explicitly convert to DFA.
: e.g. you can maintaining a list of possible 'leads' so far.
: Of course, this won't be as efficient as pre-compute the
: state table for the given regex, but if you want a simple one:
: #define MAX_STARS 256
: int match(const char *input, const char *regex) {
: int p[2][MAX_STARS] = {0};
: int c[2] = {1};
: int w = 0, nw = 1, i, j;
: char ch, nch;

c*****t
发帖数: 1879
6
The question is, why write this kind of routine when scanf can
already do it?

【在 g*********s 的大作中提到】
: found one in Beautiful Code by K&R's K.
: so this is not a trivial question.

g*********s
发帖数: 1782
7
why u learn classical algorithms when most of them already have been
implemented by stl and boost?

【在 c*****t 的大作中提到】
: The question is, why write this kind of routine when scanf can
: already do it?

c*****t
发帖数: 1879
8
STL/Boost are for people to use. Trying to implement these yourselves
are quite silly.
Like goodbug once said, what was the point of learning algorithms when
one could just google + copy/paste? Having the right google skill
is far more important :)
I'd suggest you to do some real projects than working on some interview
questions or some specific functions.

【在 g*********s 的大作中提到】
: why u learn classical algorithms when most of them already have been
: implemented by stl and boost?

1 (共1页)
进入Programming版参与讨论
相关主题
Python 缩进的syntax
python的一大缺点
有人用Haskell吗
lisper
jun rao说kafka已经开始用Java代码重写部分code了
面向对象的语言
tracing和logging有什么区别?
How does YAHOO calculate RSI? (转载)
请教template和factory有啥区别?
map是用什么data structure来implement的?
相关话题的讨论汇总
话题: regex话题: nw话题: match话题: expression话题: simple