Regular Expression[A-Z]+
PYCON APAC 2016
Minji Yang
MATA COMPANY
minji@matazoo.net
覦 螳
覩殊 / 蟆螳 螳覦
) MATA COMPANY Software Engineer
DEVSISTERS, The Beatpacking Company
NEXON Python 覲伎^螳, Django Girls 貊豺
覦 
 覦 Python3 襯 .
 覦襦 蠏  危危  給
る 伎
Why Regex?
Examples x 3
The re module
一給語 焔 
蠏   蟆
Why regex?
轟 蠏豺 螳讌 覓語伎 讌    
覓語伎 蟆企 豺 ク襴.
100312467WhySoLonelywondergirls3014725201603062016-03-20T12:00:35+09:00
->WhySoLonelywondergirls2016-03-20T12:00
/^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
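Read piece by piece, the pattern above is less intimidating: it is one "octet" alternative repeated four times with literal dots in between. A minimal sketch of that reading in Python, where the names `OCTET`, `IPV4` and `is_ipv4` are hypothetical and not from the slides:

```python
import re

# One octet: 250-255, 200-249, or 0-199 (leading zeros are allowed).
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Three "octet + literal dot" groups, then a final octet.
IPV4 = re.compile(r"^(?:" + OCTET + r"\.){3}" + OCTET + r"$")


def is_ipv4(text):
    """Return True if text looks like a dotted-quad IPv4 address."""
    return IPV4.match(text) is not None
```

With this helper, `is_ipv4("192.168.0.1")` is true, while `is_ipv4("256.1.1.1")` and `is_ipv4("1.2.3")` are false.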
WHAAAAT?
How to learn regex?
豌 覲旧″螻 曙  企糾 蟷伎.
讌襷 蠏 螳覲企 企旧 .
Examples x 3
Example 1
Matching a mobile phone number
010-3333-7777
\d{3}-\d{4}-\d{4}
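A small sketch of the phone number pattern in use. The sample sentence and the names `PHONE`, `text` and `numbers` are made up for illustration:

```python
import re

# Three digits, four digits, four digits, separated by hyphens.
PHONE = re.compile(r"\d{3}-\d{4}-\d{4}")

text = "Call 010-3333-7777 or 010-1234-5678."

# findall returns every non-overlapping match as a list of strings.
numbers = PHONE.findall(text)
print(numbers)
```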
Example 2
Extracting the host name from a website address
http://www.google.com/?q=pycon
http://([^/]*)/\?q=pycon
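The backslash before `?` matters here, because an unescaped `?` means "the previous item is optional" rather than a literal question mark. A short sketch of the difference (the variable names are illustrative only):

```python
import re

url = "http://www.google.com/?q=pycon"

# Escaped "?": matches the literal question mark in the query string.
escaped = re.search(r"http://([^/]*)/\?q=pycon", url)
host = escaped.group(1) if escaped else None

# Unescaped "?": makes the slash optional, so the literal "?" in the URL
# is never consumed and the search fails.
unescaped = re.search(r"http://([^/]*)/?q=pycon", url)

print(host)
print(unescaped)
```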
Example 3
Extracting the ID from an email address
minji@matazoo.net
^([^@]+)@.+$
The re module
re module
In Python, regular expressions are handled with the re module.
import re
re.search(pattern, string)
re module
>>> re.search('abcd', 'abcdef')
<_sre.SRE_Match object at 0x120670cc2>
>>> re.search('zxc', 'abcdef')
None
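Because `re.search` returns `None` when nothing matches, calling code usually checks the result before using it. A minimal sketch, with `first_match` as a hypothetical helper name:

```python
import re


def first_match(pattern, text):
    """Return the matched substring, or None when nothing matches."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.group()


found = first_match("abcd", "abcdef")
missing = first_match("zxc", "abcdef")
print(found, missing)
```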
れ 襷  x 3
re.sub()
import re
phone = '010-1234-5678'
re.sub(
r'(\d{3}-\d{4}-)(\d{4})',
r'\1****',
phone
)
>>> '010-1234-****'
re.match()
import re
link = 'http://www.google.com/?q=pycon'
match = re.match(
r'(http://)([^/]*)(.*)',
link
)
match.group(2)
>>> 'www.google.com'
re.search()
import re
email = 'minji@matazoo.net'
match = re.search('^[^@]*', email)
match.group()
>>> 'minji'
match vs search
import re
sample = '2016pycon'
re.match('[a-z]+', sample)
>>> None
re.search('[a-z]+', sample)
>>> <_sre.SRE_Match object; span=(4, 9), match='pycon'>
re module
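re.search(pattern, string, flags=0)
= finds the first location in string where pattern matches
re.match(pattern, string, flags=0)
= checks whether the beginning of string matches pattern
re.findall(pattern, string, flags=0)
= finds every match of pattern in string and returns them as a list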
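The three functions can be compared side by side on one input. A short sketch using a made-up sample string:

```python
import re

sample = "2016pycon apac"

# search: first match anywhere in the string.
first_word = re.search(r"[a-z]+", sample).group()

# match: only succeeds if the pattern matches at position 0.
at_start = re.match(r"[a-z]+", sample)

# findall: every non-overlapping match, returned as a list.
all_words = re.findall(r"[a-z]+", sample)

print(first_word, at_start, all_words)
```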
Character classes
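. matches any character except a newline
\d matches any digit, [0-9]
\D matches any non-digit character, [^0-9]
\w matches any alphanumeric character or underscore, [a-zA-Z0-9_]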
(伎  )
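\W matches any character that is not alphanumeric or underscore, [^a-zA-Z0-9_]
\s matches any whitespace character
\S matches any non-whitespace character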
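A quick way to get a feel for the character classes is to run each one over the same text. A small sketch with an illustrative sample string:

```python
import re

text = "PyCon APAC 2016\n"

digits = re.findall(r"\d", text)        # each digit on its own
words = re.findall(r"\w+", text)        # runs of word characters
spaces = re.findall(r"\s", text)        # spaces and the trailing newline
only_digits = re.sub(r"\D", "", text)   # drop everything that is not a digit

# "." skips the newline unless re.DOTALL is given.
dot_chars = re.findall(r".", "a\nb")

print(digits, words, spaces, only_digits, dot_chars)
```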
Anchors and Repetition
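^abc$ ^ matches the start of the string, $ matches the end of the string
* 0 or more repetitions
+ 1 or more repetitions
? 0 or 1 occurrence
{x} exactly x repetitions (e.g. {3})
{x,y} from x to y repetitions
[abc] any one of the listed characters
[^abc] any character other than a, b, or c
[a-d] any one of a, b, c or d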
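The anchors, quantifiers and character sets above can each be tried on a tiny input. A sketch with invented sample strings, where `matches` is a hypothetical helper:

```python
import re


def matches(pattern, text):
    """True if pattern is found anywhere in text."""
    return re.search(pattern, text) is not None


results = {
    "anchored": matches(r"^abc$", "abc"),
    "anchored_miss": matches(r"^abc$", "xabc"),
    "star": re.findall(r"ab*", "a ab abbb"),
    "plus": re.findall(r"ab+", "a ab abbb"),
    "optional": re.findall(r"colou?r", "color colour"),
    "exact": re.findall(r"\d{3}", "12 345 6789"),
    "range": re.findall(r"\d{2,4}", "1 22 333 55555"),
    "negated": re.findall(r"[^abc]", "abcde"),
    "char_range": re.findall(r"[a-d]+", "abcdefg"),
}

for name, value in results.items():
    print(name, value)
```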
一給語 企
<html op="news"><head><meta name="referrer" content="origin"><meta
name="viewport" content="width=device-width, initial-
scale=1.0"><link rel="stylesheet" type="text/css" href="news.css?
8h9C3zM9d2ErvunVTkjK">
<link rel="shortcut icon" href="favicon1.ico">
<link rel="alternate" type="application/rss+xml"
title="RSS" href="rss">
<title>Hacker News</title></head><body><center><table
id="hnmain" border="0" cellpadding="0" cellspacing="0" width="85%"
bgcolor="#f6f6ef">
<tr><td bgcolor="#ff6600"><table border="0"
cellpadding="0" cellspacing="0" width="100%" style="padding:
2px"><tr><td style="width:18px;padding-right:4px"><a href="http://
www.ycombinator.com"><img src="y18.gif" width="18" height="18"
style="border:1px white solid;"></a></td>
<td style="line-height:12pt; height:10px;"><span
class="pagetop"><b class="hnname"><a href="news">Hacker News</a></
b>
link: https://bugzilla.mozilla.org/show_bug.cgi?id=1173199#c31
title: Our primary goal is to un-fork the Tor Browser
link: http://siliconangle.com/blog/2016/08/05/watson-correctly-diagnoses-woman-after-doctors-were-stumped/
title: IBM Watson correctly diagnoses a form of leukemia
link: http://gping.io
title: Show HN: Gping.io - Like TinyURL for your car
link: http://bit-player.org/2016/the-39th-root-of-92
title: The 39th Root of 92
link: http://www.sciencealert.com/we-just-got-even-weirder-results-about-the-alien-megastructure-star
title: Tabby's star is dimming at an incredible rate
The output we want
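One possible way to get from HTML to link/title pairs is a single pattern with two capture groups. This is only a sketch: the story markup below, including the `storylink` class, is an assumed simplification and not copied from the slides, and `STORY` and `stories` are hypothetical names.

```python
import re

# Assumed, simplified story markup (not the real page source).
html = (
    '<td class="title"><a href="http://gping.io" class="storylink">'
    'Show HN: Gping.io</a></td>\n'
    '<td class="title"><a href="http://bit-player.org/2016/the-39th-root-of-92"'
    ' class="storylink">The 39th Root of 92</a></td>\n'
)

# Group 1 captures the href value, group 2 the link text.
STORY = re.compile(r'<a href="([^"]+)" class="storylink">([^<]+)</a>')

# With two groups, findall returns a list of (link, title) tuples.
stories = STORY.findall(html)

for link, title in stories:
    print("link:", link)
    print("title:", title)
```

For anything beyond a quick exercise, an HTML parser is usually more robust than a regex against markup that may change.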
regex 郁 貊企慨蠍
re.DOTALL ??
data = '<title>\nPYCON APAC 2016\n\nRegular Expressions[A-Z]+\nMinji Yang\n</title>\n'
re.search('<title>(.*)</title>', data).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
re.search('<title>(.*)</title>', data, re.DOTALL).group(1)
'\nPYCON APAC 2016\n\nRegular Expressions[A-Z]+\nMinji Yang\n'
re.compile
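`re.compile` turns a pattern string into a reusable pattern object with its own `search`, `match`, `findall` and `sub` methods. A minimal sketch of compiling once and reusing the object, with made-up page strings and a hypothetical `TITLE` name:

```python
import re

# Compile once; flags such as re.DOTALL are attached to the pattern object.
TITLE = re.compile(r"<title>(.*)</title>", re.DOTALL)

pages = [
    "<title>\nPyCon\n</title>",
    "<title>Hacker News</title>",
    "no title here",
]

titles = []
for page in pages:
    match = TITLE.search(page)
    # strip() removes the newlines captured around the title text.
    titles.append(match.group(1).strip() if match else None)

print(titles)
```

The `re` module also caches recently used patterns, so the main gain from compiling is often readability rather than speed.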
蠏   蟆
Vim: Find and Replace
:%s/old/new/g
http://vimregex.com/
1033303 -> 1233303, 1033213 -> 1233213

:%s/103\(\d\{4}\)/123\1/g
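The same substitution can be written with `re.sub` in Python, where groups and counts need no backslashes. A short sketch using the two numbers from the slide:

```python
import re

lines = ["1033303", "1033213"]

# "103" followed by four captured digits becomes "123" plus those digits.
updated = [re.sub(r"103(\d{4})", r"123\1", line) for line in lines]

print(updated)
```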
str.find vs re.match vs in
http://stackoverflow.com/questions/4901523/whats-a-faster-operation-re-match-search-or-str-find
str.find: 0.441393852234
re.match: 2.12302494049
in : 0.251421928406
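A comparison like this can be reproduced with `timeit`. The sketch below is not the benchmark from the linked question: the test string, the needle and the repetition count are assumptions, and absolute numbers will differ by machine and Python version.

```python
import re
import timeit

# Assumed test data: the needle sits at the start so that re.match,
# which only looks at position 0, can find it too.
needle = "needle"
text = needle + "abcdef" * 100


def time_all(number=1000):
    """Time three equivalent "does text start with / contain needle" checks."""
    return {
        "str.find": timeit.timeit(lambda: text.find(needle) != -1, number=number),
        "re.match": timeit.timeit(
            lambda: re.match(needle, text) is not None, number=number
        ),
        "in": timeit.timeit(lambda: needle in text, number=number),
    }


timings = time_all()
for name, seconds in timings.items():
    print(name, seconds)
```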
WHAAAAT?
焔
蠏 焔レ 譬讌 
讌襷 貊 ク襴
焔レ 譴 貊 regex 螳 旧
print('Thank You')
