import re

def process_html(str):
    pattern = re.compile('<object ([\w="\d+"]|\s)+>([\x20-\x7E\s])+</object>')
    match = pattern.match(str)
    return match.group()
Refactorings
nicerobot
March 17, 2010 12:46
1. Regular Expressions Are Not A Good Idea for Parsing XML, HTML, or e-mail Addresses http://wiki.tcl.tk/4164
2. Your code refers to matches as 'm' (line 6) but the matches are named 'match' (line 5).
3. Groups are referenced by their (one-based) group number: group(1) is the first parenthesized group, while group() or group(0) returns the entire match.
Note: I didn't change your re. I just changed lines 5 and 6.
import re

def process_html(str):
    pattern = re.compile('<object ([\w="\d+"]|\s)+>([\x20-\x7E\s])+</object>')
    m = pattern.match(str)
    return m.group(1)
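To make the group numbering concrete, here is a small sketch that runs the same pattern against a shortened, hypothetical input (the `sample` string is made up for illustration and is not from the original post). It also shows a caveat worth knowing: when a group is repeated with `+`, it keeps only the text of its last repetition, so `group(1)` and `group(2)` each hold a single character here, while `group(0)` holds the whole match.

```python
import re

# Hypothetical sample input, much shorter than a real embed snippet.
sample = '<object width="400" height="300">player</object>'

# Same pattern as in the post, written as a raw string so the
# backslashes reach the regex engine unchanged.
pattern = re.compile(r'<object ([\w="\d+"]|\s)+>([\x20-\x7E\s])+</object>')

m = pattern.match(sample)

whole = m.group(0)   # the entire match (same as m.group())
first = m.group(1)   # last character consumed by the first repeated group
second = m.group(2)  # last character consumed by the second repeated group

print(repr(whole))
print(repr(first))
print(repr(second))
```

If the whole `<object>` element is what is needed, `group(0)` (or a single group wrapped around the repetition) is probably the better choice than `group(1)`.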
rullon.myopenid.com
March 17, 2010 13:33
@nicerobot, thanks for the reply!
The goal was to clean up the embed player code from Vimeo (or any other service), so I decided not to parse anything.
we have:
--------
<object width="400" height="300"><param name="allowfullscreen" value="true" />
<param name="allowscriptaccess" value="always" /><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=9851483&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1" />
<embed src="http://vimeo.com/moogaloop.swf?clip_id=9851483&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"></embed>
</object>
<p><a href="http://vimeo.com/9851483">Gorillaz - Stylo</a> from <a href="http://vimeo.com/uccimaru">mario ucci</a> on <a href="http://vimeo.com">Vimeo</a>.</p>

we want:
--------
<object width="400" height="300"><param name="allowfullscreen" value="true" />
<param name="allowscriptaccess" value="always" /><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=9851483&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1" />
<embed src="http://vimeo.com/moogaloop.swf?clip_id=9851483&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"></embed>
</object>
Is this a good way to extract a pattern match from a string?
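One possible way to get from "we have" to "we want" is sketched below. It is not the approach from the post: it swaps the character-class pattern for a non-greedy `.*?` with `re.DOTALL`, and uses `search` instead of `match` so the element does not have to sit at the very start of the string. The names `OBJECT_RE`, `extract_object`, and the shortened `embed` input are hypothetical, chosen only for this illustration.

```python
import re

# Non-greedy match from the opening <object ...> tag to the first </object>.
# re.DOTALL lets "." cross the line breaks inside the embed code.
OBJECT_RE = re.compile(r'<object\b.*?</object>', re.DOTALL | re.IGNORECASE)

def extract_object(html):
    """Return the first <object>...</object> block in html, or None."""
    m = OBJECT_RE.search(html)
    return m.group(0) if m else None

# Hypothetical, shortened input in the shape of the example above.
embed = (
    '<object width="400" height="300">'
    '<param name="allowfullscreen" value="true" />\n'
    '<embed src="http://vimeo.com/moogaloop.swf?clip_id=9851483"></embed>\n'
    '</object>\n'
    '<p><a href="http://vimeo.com/9851483">Gorillaz - Stylo</a></p>'
)

cleaned = extract_object(embed)
print(cleaned)
```

This stops at the first `</object>`, so it would cut a nested `<object>` element short, and it returns `None` when nothing matches instead of raising on `.group()`. For embed code that can nest or is otherwise irregular, an HTML parser is likely the safer tool.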