5. RE overview
            match foo            replace with bar
Perl        /foo/ (on $_)        s/foo/bar/ (on $_)
JavaScript  /foo/                "foolish".replace(/foo/, "bar")
Vi          /foo/                :s/foo/bar/
TextMate    ⌘F, Find: foo        ⌘F, Find: foo, Replace: bar
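For comparison, the same two operations can be sketched in Python with the standard re module. Python is not one of the tools on the slide; it is used here only as an illustration, and the variable names are made up for the example.

```python
import re

text = "foolish"

# match foo: re.search returns a match object, or None when nothing matches
found = re.search(r"foo", text) is not None

# replace with bar: re.sub replaces every occurrence unless count is given
replaced_all = re.sub(r"foo", "bar", "foolish foo")
replaced_first = re.sub(r"foo", "bar", "foolish foo", count=1)

print(found, replaced_all, replaced_first)
```

Note that JavaScript's replace with a non-global regex changes only the first occurrence, which corresponds to count=1 here.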
15. Example
This reveals that plain text is in fact the
technical user's way to regard a file or a
sequence of bytes. In this sense, there is no
plain text.
/reveal(.*)plain/
/reveal(.*?)plain/
/t.{2,3}t/
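The three patterns differ in how much the quantifier consumes. A minimal sketch in Python (an illustration language, not one used in the slides), with the sample text joined into a single line so that . never has to cross a newline:

```python
import re

text = ("This reveals that plain text is in fact the technical user's way "
        "to regard a file or a sequence of bytes. In this sense, there is no "
        "plain text.")

# Greedy: .* runs as far as possible, so the match ends at the LAST "plain".
greedy = re.search(r"reveal(.*)plain", text).group(1)

# Lazy: .*? stops as early as possible, so the match ends at the FIRST "plain".
lazy = re.search(r"reveal(.*?)plain", text).group(1)

# Bounded: a "t", then two or three arbitrary characters, then another "t".
bounded = re.findall(r"t.{2,3}t", text)

print(repr(lazy))
print(repr(greedy))
print(bounded[:2])
```

The lazy group captures only "s that ", while the greedy group runs on to just before the final "plain text."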
26. Character Classes / Properties
[0-9a-z] (classes)
\+420[0-9]{9} = simplified Czech phone number
don't: [A-z0-] (A-z also spans [ \ ] ^ _ ` and the trailing - is a literal hyphen)
[a-z&&[^j-n]] == [a-io-z]
\p{Upper} (properties)
works great on Unicode text (Latin, Katakana)
[:alnum:], [:^space:] (POSIX bracket)
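A short sketch of these classes in Python. Python's re module is only a stand-in here: it supports plain classes but has no && intersection, no \p{...} properties and no POSIX brackets, so those are approximated as noted in the comments. All names are invented for the example.

```python
import re

# Simplified Czech phone number; the leading + must be escaped.
phone = re.compile(r"\+420[0-9]{9}")
is_phone = phone.fullmatch("+420123456789") is not None

# Pitfall: A-z covers ASCII 65-122, which includes the punctuation
# characters that sit between "Z" and "a" (such as ^ and _).
sloppy_hits = re.findall(r"[A-z]", "a_Z^")

# No && intersection in Python's re, so [a-z&&[^j-n]] is written out by hand.
no_j_to_n = re.findall(r"[a-io-z]", "hijklmnop")

# No \p{Upper} in Python's re; str.isupper() is used as a rough stand-in.
upper_chars = [ch for ch in "\u017dena a K\u016f\u0148" if ch.isupper()]

print(is_phone, sloppy_hits, no_j_to_n, upper_chars)
```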
32. Character Types
. == anything (apart from newline)
\s == space == [\t\n\v\f\r ]
  more in Unicode
\w == word char == approx. [0-9a-zA-Z_]
  is complicated in Unicode
\d == digit == [0-9]
\h == hexadecimal digit == [0-9a-fA-F] (Oniguruma/TextMate; in Perl \h means horizontal whitespace)
\S \W \D == [^\s] [^\w] [^\d]
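How "more in Unicode" plays out can be sketched in Python, chosen only for illustration. In Python's re, \w, \d and \s are Unicode-aware for str patterns and re.ASCII narrows them to the sets listed above; there is no \h for hex digits, so that class is spelled out.

```python
import re

sample = "na\u00efve caf\u00e9"   # "naive cafe" with accented letters

# \w is Unicode-aware by default; re.ASCII narrows it to [0-9a-zA-Z_].
words_unicode = re.findall(r"\w+", sample)
words_ascii = re.findall(r"\w+", sample, flags=re.ASCII)

# \d also matches non-Latin decimal digits (here Arabic-Indic 3 and 4).
digits_unicode = re.fullmatch(r"\d+", "\u0663\u0664") is not None
digits_ascii = re.fullmatch(r"\d+", "\u0663\u0664", flags=re.ASCII) is not None

# \s also covers Unicode spaces such as the no-break space U+00A0.
nbsp_is_space = re.fullmatch(r"\s", "\u00a0") is not None

# No \h for hex digits in Python's re, so the class is written explicitly.
hex_ok = re.fullmatch(r"[0-9a-fA-F]+", "DeadBeef") is not None

# Uppercase forms negate: \D is [^\d].
non_digits = re.findall(r"\D+", "a1b22c")

print(words_unicode, words_ascii, non_digits)
```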
33. Example
This reveals that plain text is in fact the
technical user's way to regard a file or a
sequence of bytes. In this sense, there is no
plain text.
/\b[\w&&[^aA]]+\b/
/\W{2,}\w+\b/
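Both patterns can be tried on the sample text. The sketch below uses Python for illustration; since Python's re has no && intersection, "word character but not a or A" is expressed as the negated class [^\WaA], which is an equivalent rewrite rather than the syntax from the slide.

```python
import re

text = ("This reveals that plain text is in fact the technical user's way "
        "to regard a file or a sequence of bytes. In this sense, there is no "
        "plain text.")

# Whole words that contain no "a" or "A".
words_without_a = re.findall(r"\b[^\WaA]+\b", text)

# A word preceded by two or more non-word characters (e.g. ". " or ", ").
after_punctuation = re.findall(r"\W{2,}\w+\b", text)

print(words_without_a[:4])
print(after_punctuation)
```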
43. Options
/foo/imsx
i - case insensitive
m - multiline (^, $ match at the start/end of every line, not only of the string/file)
s - single line (. matches newlines)
x - extended!
g - global
can be written inline
(?imsx-imsx)
(?imsx-imsx:...)

(?x-i)
#this is cool
(
  foo   #my important value
  |     #don't forget the alternative
  bar
) # result equals to (foo|bar)
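The same options exist in Python's re module under slightly different names, which makes for a compact sketch. Python is used here only as an illustration; it has no g flag, because findall, finditer and sub are global by default.

```python
import re

# i: case insensitive
ci = re.search(r"foo", "FOO", re.I) is not None

# m: ^ and $ also match at line boundaries
lines = re.findall(r"^foo$", "foo\nbar\nfoo", re.M)

# s: . also matches a newline
dot_all = re.search(r"a.b", "a\nb", re.S) is not None

# x: whitespace and # comments inside the pattern are ignored
verbose = re.compile(r"""
    (
      foo   # my important value
      |     # don't forget the alternative
      bar
    )       # result equals (foo|bar)
""", re.X)
verbose_hits = verbose.findall("foo bar baz")

# inline and scoped to a group: only "foo" is case insensitive here
scoped = re.compile(r"(?i:foo)bar")
scoped_hits = [s for s in ("FOObar", "FOOBAR") if scoped.fullmatch(s)]

# "global": findall returns every match without an extra flag
global_hits = re.findall(r"o", "foo boo")

print(ci, lines, dot_all, verbose_hits, scoped_hits, global_hits)
```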
48. Groups/Replacing
(...) - matched group
$1 - $9
alternatively \1 - \9 (not recommended)
nested groups ordered by left bracket
(?:...) - non-capturing group
useful for (?:foo)+ or (?:foo|bar)
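Group numbering and non-capturing groups can be sketched in Python, used here for illustration. Note one flavor difference: Python's replacement strings refer to groups as \1 or \g<1>, not $1.

```python
import re

# Groups are numbered by the position of their opening bracket.
m = re.match(r"((\d+)-(\d+))", "10-20")
nested = m.groups()            # ('10-20', '10', '20')

# Backreferences in the replacement: swap the two captured words.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# (?:...) groups without capturing, so only the digit is captured here.
only_digit = re.match(r"(?:foo|bar)(\d)", "bar7").groups()

# (?:foo)+ repeats the whole group without creating a capture.
repeated = re.fullmatch(r"(?:foo)+", "foofoofoo") is not None

print(nested, swapped, only_digit, repeated)
```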
55. Recursive RE
very important!
quote & bracket matching
technically not part of regular grammar
two styles
\g<name> or \g<n> - TextMate
(?R) - Perl
57. Example
(?x:
  \(                # match the initial opening parenthesis
  # Now make a named group 'balanced' which
  # matches a balanced substring.
  (?<balanced>
    [^()]           # A balanced substring is either something
                    # that is not a parenthesis:
    |               # or a parenthesised string:
    \(              # A parenthesised string begins with an opening parenthesis
    \g<balanced>*   # followed by a sequence of balanced substrings
    \)              # and ends with a closing parenthesis
  )*                # Look for a sequence of balanced substrings
  \)                # Finally, the outer closing parenthesis
)
or: \(([^()]|(?R))*\)
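Python's standard re module has no recursion, so the pattern above cannot be run there directly. As a rough stand-in, the sketch below does by hand what the recursive pattern expresses: it keeps a depth counter, which is exactly the piece of memory a strictly regular grammar lacks. The function name find_balanced is hypothetical and not part of any library.

```python
from typing import Optional


def find_balanced(text: str) -> Optional[str]:
    """Return the leftmost balanced parenthesised substring, or None."""
    start = text.find("(")
    while start != -1:
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
                if depth == 0:
                    # The parenthesis opened at `start` has just been closed.
                    return text[start:i + 1]
        # No balanced match from this opening parenthesis; try the next one.
        start = text.find("(", start + 1)
    return None


print(find_balanced("f(a(b)c) d"))
```

Like the regex, it skips an opening parenthesis that never gets closed and retries from the next one.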