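l*****v Posts: 498 | 1 I've seen several people get regex interview questions, for example finding phone
numbers in a large number of pages. In the interview, do you have to give the actual regex, or is it enough to know that regex is the right tool? |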
|
t******7 Posts: 396 | 2 REGEX
1. Use one REGEX to match the last line of any US address.
Must be able to match all 4 examples below.
Must be able to match with or without the comma.
Ex: Stamford, CT 06901
Stamford, Connecticut 06901
Stamford, CT 06901-2064
Stamford, Connecticut 06901-2064
2. For each date example below, use a REGEX to match the format:
Tue, 10 Dec 2012 19:45:45
2012年12月10日 |
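A minimal Python sketch for question 1, not a definitive answer. It assumes the city is one or more words, the comma is optional, the state is either a two-letter code or a spelled-out name, and the ZIP is 5 digits with an optional 4-digit extension. The name `LAST_LINE` is made up for illustration, and the pattern does not check that the state actually exists.

```python
import re

# Hypothetical pattern: city words, optional comma, state (2-letter code or
# spelled-out name), then ZIP or ZIP+4. It does not validate real state names.
LAST_LINE = re.compile(
    r"^[A-Za-z.'-]+(?: [A-Za-z.'-]+)*"             # city: one or more words
    r",?\s+"                                       # optional comma, then whitespace
    r"(?:[A-Z]{2}|[A-Z][a-z]+(?: [A-Z][a-z]+)*)"   # CT, Connecticut, New York
    r"\s+\d{5}(?:-\d{4})?$"                        # 06901 or 06901-2064
)

examples = [
    "Stamford, CT 06901",
    "Stamford, Connecticut 06901",
    "Stamford, CT 06901-2064",
    "Stamford, Connecticut 06901-2064",
    "Stamford CT 06901",  # same address without the comma
]
for line in examples:
    print(line, bool(LAST_LINE.match(line)))
```

Every quantified part is separated by a required literal (a space, or the ZIP digits), which keeps backtracking small on lines that do not match.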
|
wy Posts: 14511 | 3 (?:regex) Non-capturing parentheses group the regex
so you can apply regex operators, but do not capture anything
and do not create backreferences.
Is there a way to write this in vim? |
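On the vim question: Vim's pattern syntax has a non-capturing group written `\%(...\)` (see `:help /\%(`). The effect described above can be illustrated in Python, whose `re` module uses the same `(?:...)` syntax as the post. The sample string is made up.

```python
import re

text = "2012-12-10"

# Capturing groups: each (...) creates a numbered group / backreference.
capturing = re.match(r"(\d{4})-(\d{2})-(\d{2})", text)
print(capturing.groups())      # ('2012', '12', '10')

# Non-capturing: (?:...) groups "-dd" so {2} applies to it, but stores nothing,
# so only the explicitly captured year shows up.
non_capturing = re.match(r"(\d{4})(?:-\d{2}){2}", text)
print(non_capturing.groups())  # ('2012',)
```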
|
d***a Posts: 316 | 4 [The following text is reposted from the CS board]
From: dunfa (蹲着发财), Board: CS
Subject: help about regex
Posted at: BBS 未名空间站 (Sun May 15 19:02:39 2011, US Eastern)
Why are the following three lines not matched by my regex? I am using egrep.
Thank you very much.
gsaunix:*:1604:megb1,paul
rainlab:*:1608:megb1,michal
mgis:*:1597:megb1,michal,nafisa
my regex
.+:\*:[0-9]+:[a-z]+,[a-z]+ |
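A quick check in Python (whose `re` syntax agrees with egrep for this particular pattern) points at the member name `megb1`: it ends in a digit, so `[a-z]+` cannot reach the comma. The relaxed pattern below is only an assumed fix; whether digits should be allowed in member names depends on the data.

```python
import re

lines = [
    "gsaunix:*:1604:megb1,paul",
    "rainlab:*:1608:megb1,michal",
    "mgis:*:1597:megb1,michal,nafisa",
]

# The pattern from the post, unchanged.
original = re.compile(r".+:\*:[0-9]+:[a-z]+,[a-z]+")
# Assumed fix: also allow digits inside member names.
relaxed = re.compile(r".+:\*:[0-9]+:[a-z0-9]+,[a-z0-9]+")

for line in lines:
    # search() mirrors egrep's "match anywhere in the line" behaviour.
    print(line, bool(original.search(line)), bool(relaxed.search(line)))
```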
|
b***y 发帖数: 2799 | 6 ☆─────────────────────────────────────☆
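nkw (非死非活) wrote on (Sat Jul 19 03:08:26 2008):
Today I wasted a lot of time on a very simple regex; it turned out that grep has no \t (tab). Is there any
option that makes these programs use one unified regex syntax? For people who don't use them often, these subtle differences are hard to remember.
I also ran into something odd. My data has three tab-separated fields, and the third field (a date) is empty on many lines.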
xxxx{tab}yyyy{tab}12/31/2007
xxxx{tab}yyyy{tab}
xxxx{tab}yyyy{tab}11/30/2007
xxxx{tab}yyyy{tab}
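At the time I didn't know how to enter a tab for grep, so I had to use sed to replace \t with | first.
sed 's/\t/|/g' file|grep "[^|]*|[^|]*|.+" returns no results.
sed 's/\t/|/g' file|grep "[^|]*|[^|]*|[0-9]+" does produce output.
What is wrong with the first one?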
☆──────────────────────────────────── |
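Two facts that may be relevant here: in POSIX basic regular expressions, which plain grep uses, an unescaped `+` is an ordinary character, while `grep -E` treats `+` as repetition but also treats an unescaped `|` as alternation. In bash, a literal tab can be passed as `$'\t'`. The post does not give enough detail to say for certain why the two commands behaved differently. As a neutral reference, here is a small Python sketch of the intended filter (keep rows whose third field is non-empty), working directly on tabs; the rows are made up to mirror the layout in the post.

```python
import re

# Made-up rows mirroring the layout in the post (three tab-separated fields).
rows = [
    "xxxx\tyyyy\t12/31/2007",
    "xxxx\tyyyy\t",
    "xxxx\tyyyy\t11/30/2007",
    "xxxx\tyyyy\t",
]

# Two fields without tabs, each followed by a tab, then a non-empty third field.
has_date = re.compile(r"^[^\t]*\t[^\t]*\t.+$")

with_date = [row for row in rows if has_date.match(row)]
print(with_date)
```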
|
|
|
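Posts: 1 | 9 I tried it. The question asks you to parse out dept, course code, semester and year separately.
What you did finds the whole record, which is fine too; from there you could already process it with Python.
It's just that this interview was rather odd: regex only. I'm wondering whether it might have no solution.
Besides, regex isn't really usable for full text search, is it? Say the full text is huge, like the whole
of Wikipedia. |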
|
|
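j******4 Posts: 116 | 11 I remember this was mentioned on the board. It is a simplified regex that only considers * and . and a-z.
Doing an NFA/DFA conversion and so on feels unrealistic for an interview. Could some expert give a few pointers? |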
|
j******4 Posts: 116 | 12 No. That one is wildcard match, where a single loop is enough.
This one is regex.
Nobody interested? Bumping my own thread. |
|
j*****g Posts: 223 | 13 They don't seem any different.
Isn't regex limited to . and * just wildcard matching? And can wildcard match really be done with a single loop?
Think again... |
|
|
t****s Posts: 1 | 15 A Java version of regex match is below; let me know if there is any issue.
boolean isMatch(String s, String p) throws Exception {
    if (p == null || p.length() == 0)
        return s == null || s.length() == 0;
    return isMatch(s, p, 0, 0);
}
boolean isMatch(String s, String p, int is, int ip) throws Exception {
    if (ip >= p.length())
        return s == null || is >= s.length();
    if (ip == p.length() - 1 || p.charAt(ip + 1) != '*') {
        if (p.charAt(ip) == '*')
            throw new Exception("illegal");
        if (is>=s.lengt... |
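The Java post is cut off, so the following is a separate Python sketch of the same recursive idea, not a completion of that code. It assumes the simplified rules discussed in this thread: the pattern contains only a-z, `.` and `*`, and the whole string must match. The function name `is_match` is chosen for illustration.

```python
def is_match(s: str, p: str) -> bool:
    """Plain recursion: full match of s against p ('.' and '*' only)."""
    if not p:
        return not s
    if p[0] == "*":
        # A '*' with nothing before it to repeat is treated as an illegal pattern.
        raise ValueError("illegal pattern")
    first = bool(s) and p[0] in (s[0], ".")
    if len(p) >= 2 and p[1] == "*":
        # Either skip "x*" entirely, or consume one char and keep "x*".
        return is_match(s, p[2:]) or (first and is_match(s[1:], p))
    return first and is_match(s[1:], p[1:])


print(is_match("aab", "c*a*b"))  # True
print(is_match("aa", "a"))       # False
```

Without memoization this can revisit the same (string suffix, pattern suffix) pair many times, which is what the later posts about DP are getting at.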
|
b****o Posts: 387 | 16 The string must not contain a duplicate of a specific word, such as "SYMBOL".
e.g. "a SYMBOL is OK, but a second SYMBOL is not ok".
The string above contains a duplicated "SYMBOL".
How would you design a regex for this? |
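One possible approach, assuming the regex engine supports lookahead (Python's `re` does; POSIX grep patterns do not): a negative lookahead that rejects the string if "SYMBOL" can be found twice. This sketch treats "SYMBOL" as a substring rather than a whole word; the name `AT_MOST_ONE` is made up.

```python
import re

# Negative lookahead: fail if "SYMBOL" can be found twice from the start.
# DOTALL lets '.' cross newlines so multi-line strings are checked as a whole.
AT_MOST_ONE = re.compile(r"^(?!.*SYMBOL.*SYMBOL)", re.DOTALL)

print(bool(AT_MOST_ONE.match("a SYMBOL is OK")))                                 # True
print(bool(AT_MOST_ONE.match("a SYMBOL is OK, but a second SYMBOL is not ok")))  # False
```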
|
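j********2 Posts: 82 | 17 Right. DP can only solve wildcard, not regex. Even for wildcard, it is harder to write than recursion (in any case
I certainly couldn't write it on a whiteboard on the spot). |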
|
h******8 Posts: 55 | 18 I don't think that's right. * in regex can also be done in the following way:
if *(pat+1) == '*', regex_match(str, pat) =
• regex_match(str, pat+2) OR // match 0 occurrences: "ab", "a*ab"
• regex_match(str+1, pat) if equal_char(*str, *pat) // match 1 occurrence: "aab", "a*b"
And the above can be achieved with DP:
match[i][j] is the array (to match string[i] with pattern[j]), which is of size (string length + 1)*(pattern length + 1).
Initialize:
match[0][0] = true
match[*][0] = false if str is not empty... |
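The post is truncated, so here is a hedged Python sketch of one standard way to fill such a table. It uses the prefix convention (`match[i][j]` means the first `i` characters of the string match the first `j` characters of the pattern), which is an assumption about what the post intended; `is_match_dp` is an illustrative name.

```python
def is_match_dp(s: str, p: str) -> bool:
    n, m = len(s), len(p)
    # match[i][j]: does s[:i] match p[:j]?  Size (n + 1) x (m + 1).
    match = [[False] * (m + 1) for _ in range(n + 1)]
    match[0][0] = True
    # Empty string vs. patterns like "a*b*": each "x*" may match zero chars.
    for j in range(2, m + 1):
        if p[j - 1] == "*":
            match[0][j] = match[0][j - 2]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if p[j - 1] == "*":
                if j < 2:
                    continue  # leading '*' has nothing to repeat: no match
                zero = match[i][j - 2]  # "x*" matches zero occurrences
                more = match[i - 1][j] and p[j - 2] in (s[i - 1], ".")
                match[i][j] = zero or more
            else:
                match[i][j] = match[i - 1][j - 1] and p[j - 1] in (s[i - 1], ".")
    return match[n][m]


print(is_match_dp("aab", "c*a*b"))  # True
```

Each cell is computed once from at most two earlier cells, so the running time is proportional to (string length) x (pattern length).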
|
j********2 Posts: 82 | 19 wildcard:
'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).
regex:
'.' Matches any single character.
'*' Matches zero or more of the preceding element.
Assuming recursion is used, what kind of examples would lead to exponential time?
Note that for pattern "******", we can use a loop to skip those. Thus, ("aaaaaaaab", "*****ab") is not such an example. |
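One family of inputs that appears to hurt plain recursion for the regex rules: a string of n letters `a` against n copies of `a*` followed by a character that never appears, so every way of distributing the letters among the stars is tried before failing. The sketch below just counts calls of an unmemoized recursion; `count_calls` is a made-up helper, and the sizes are kept small so it finishes quickly.

```python
def count_calls(s: str, p: str) -> int:
    """Return how many calls plain (unmemoized) recursion makes."""
    calls = 0

    def rec(i: int, j: int) -> bool:
        nonlocal calls
        calls += 1
        if j == len(p):
            return i == len(s)
        first = i < len(s) and p[j] in (s[i], ".")
        if j + 1 < len(p) and p[j + 1] == "*":
            # Skip "x*", or consume one character and stay on "x*".
            return rec(i, j + 2) or (first and rec(i + 1, j))
        return first and rec(i + 1, j + 1)

    rec(0, 0)
    return calls


for n in (4, 8, 10):
    # n copies of "a*" followed by "b", against n letters 'a' (never matches).
    print(n, count_calls("a" * n, "a*" * n + "b"))
```

Caching the result per (i, j) pair caps the work at roughly (string length) x (pattern length) states. For the wildcard rules, an analogous family would interleave stars with literals, e.g. `*a*a*a*b`, since consecutive stars alone can be collapsed as noted above.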
|
r**********g Posts: 22734 | 21 ^(\w+\s*)+\,?\s*\w\w+\s+\d\d\d\d\d(-\d\d\d\d)?
Note: \d{5} may not be supported by all regex libs.
Work out the second one yourself. |
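A short Python check of the posted pattern against the address examples, followed by one possible take on the two date formats from question 2. The date patterns and the names `RFC_LIKE` and `CJK_DATE` are assumptions for illustration; they check the shape only, not whether the date is valid. Note that the nested quantifier in `(\w+\s*)+` can backtrack heavily on long lines that do not match.

```python
import re

# The pattern from the post, unchanged (note: no "$" anchor at the end).
posted = re.compile(r"^(\w+\s*)+\,?\s*\w\w+\s+\d\d\d\d\d(-\d\d\d\d)?")
addresses = [
    "Stamford, CT 06901",
    "Stamford, Connecticut 06901",
    "Stamford, CT 06901-2064",
    "Stamford, Connecticut 06901-2064",
    "Stamford CT 06901",  # without the comma
]
for line in addresses:
    print(line, bool(posted.match(line)))

# Hypothetical patterns for the two date formats in question 2.
RFC_LIKE = re.compile(
    r"^(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun), \d{1,2} "
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"\d{4} \d{2}:\d{2}:\d{2}$"
)
CJK_DATE = re.compile(r"^\d{4}年\d{1,2}月\d{1,2}日$")

print(bool(RFC_LIKE.match("Tue, 10 Dec 2012 19:45:45")))
print(bool(CJK_DATE.match("2012年12月10日")))
```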
|
|
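b**********5 Posts: 7881 | 23 Seen in someone else's interview report...
"Round three was a LeetCode original, the regex match problem. I had done it a week before the interview, but during the interview I was
not quite clear on how to do the DP. I vaguely recalled part of the approach and was halfway through explaining it when the interviewer said DP was not good enough and
asked me to think of another method. I came up with nothing, could not remember how to write the DP either, and ended up writing a crappy recursion. This round
is where I failed."
The interviewer said DP is not good enough??
I only know recursion and DP... what would be a better way than DP for regex
? |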
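One candidate, and this is only a guess at what the interviewer wanted, is to simulate the pattern as an NFA: keep the set of pattern positions that are reachable after each input character. The time is still about (string length) x (pattern length), the same order as DP, but it needs only one set of positions instead of a full table and reads the input in a single pass. The sketch assumes a well-formed pattern of a-z, `.` and `*`; `is_match_nfa` is an illustrative name.

```python
def is_match_nfa(s: str, p: str) -> bool:
    """Simulate the pattern as an NFA over the set of reachable positions."""
    m = len(p)

    def closure(states: set) -> set:
        # From position j, an "x*" element can be skipped without reading input.
        stack, seen = list(states), set(states)
        while stack:
            j = stack.pop()
            if j + 1 < m and p[j + 1] == "*" and j + 2 not in seen:
                seen.add(j + 2)
                stack.append(j + 2)
        return seen

    current = closure({0})
    for ch in s:
        nxt = set()
        for j in current:
            if j < m and p[j] in (ch, "."):
                # "x*" stays at j after consuming; a plain element advances.
                nxt.add(j if j + 1 < m and p[j + 1] == "*" else j + 1)
        current = closure(nxt)
        if not current:
            return False
    # Accept only if the end of the pattern is reachable.
    return m in current


print(is_match_nfa("aab", "c*a*b"))  # True
```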
|
|
g*****g Posts: 34805 | 25 [The following text is reposted from the Programming board]
From: goodbug (好虫), Board: Programming
Subject: Urgent help needed with regex
Posted at: BBS 未名空间站 (Thu Nov 9 17:45:56 2006), forwarded
Can anyone tell me why
body.replaceAll("(?i)(
|
|