The author's obsession with reading Chinese comics led him to write a web scraping tool that batch downloads comic images so he could read them on his iPad. While building it, he ended up creating PunkyBrowster, a library for testing JavaScript behavior on websites that is now used at his company, Leapfrog. Over time, features such as ignoring SSL errors and form manipulation methods were added to PunkyBrowster.
How My Comic Book Obsession Birthed a New Functional Testing Tool
1. How My Comic Book
Obsession Birthed a New
Functional Testing Tool
Feihong Hsu
Testing in Python Birds of a Feather
March 12, 2011
6. Problem: Chinese comics portals load
slowly and have poor usability
Solution: Write a web scraping tool to
batch download images, so I can
be a proper Apple fanboy and read
comics on my iPad
7. First approach: urllib2 + lxml.html
Verdict: Total fail, couldn't handle
JavaScript and cookies
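To make the JavaScript half of that verdict concrete, here is a minimal sketch of what a static fetch-and-parse scraper can see. It is not the author's code: it is written for Python 3, the standard-library `html.parser` stands in for `lxml.html`, and the page contents, URLs, and the `find_images` helper are all made up for illustration. A parser that never runs scripts only finds images that are already present in the markup.

```python
# Python 3 sketch. html.parser stands in for lxml.html, and SAMPLE_PAGE is a
# hypothetical page, not taken from any real comics portal.
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in the static markup."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_images(html):
    """Return image URLs that appear as real <img> tags in the HTML text."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


# One image is in the markup; the other only exists after a browser runs
# the script, so a static parser never sees it.
SAMPLE_PAGE = """
<html><body>
  <img src="/static/banner.png">
  <div id="viewer"></div>
  <script>
    var img = document.createElement('img');
    img.src = '/comics/ch01/page001.jpg';
    document.getElementById('viewer').appendChild(img);
  </script>
</body></html>
"""

if __name__ == "__main__":
    print(find_images(SAMPLE_PAGE))
```

The comic page image never shows up in the result because it is created by the script, which is the kind of gap that pushes a scraper toward driving a real browser engine instead.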
11. Coincidence: At Leapfrog, we needed
a better way to test JavaScript
behavior on our sites
Result: Leapfrog subsidizes my
comic book addiction
12. Over time, we added some nice stuff
to our spynner fork
- Ability to ignore SSL errors
- Form manipulation methods
- Screen capture (consistently-sized
- Other stuff that I can't remember
13. Question: Hey Feihong, where can I get
this sexy library?
Answer: Nowhere, I'm too busy
reading comics to open source it
14. Real answer: We're working on
open sourcing it, but we ran into
some blocks
(Cast sidelong glance at Terry)
15. However, we are NOT
soliciting suggestions for a
name. We have the
PERFECT name already.
18. from punky import Browster
    browser = Browster(auto_load_images=True)
    ...
        'How do I de-pube-ify waterless urinals?')
    browser.submit('form#hfh', wait_load=True)
    for element in browser.all('#r12 > div'):
        print unicode(element.toPlainText())
19. Random note: I
never want to have
the need to see a
urologist. But if I
do, I hope he's
wearing a badge
like this: