require 'net/http'
require 'uri'
require 'strscan'

html = Net::HTTP.get URI.parse("http://www.apple.com")
s = StringScanner.new(html)

while true
  txt = s.scan_until(/\.jpg/)
  if not s.matched?
    break
  end
  p /src=.(.*jpg)/.match(txt)[1]
end
Refactorings
Jordan Glasner
January 30, 2008 16:21
I'm sure my regex could be tuned a bit...
require 'open-uri'

jpg = /img src="(.[^"]*\.jpg)"/

open('http://www.apple.com') do |response|
  html = response.read
  p jpg.match(html)[1]
end
gregory.barborini.myopenid.com
January 30, 2008 20:27
Nice!
And do you know if there's a way to display all the JPG images on the page?
Jordan Glasner
January 31, 2008 01:51
Somehow I've wrecked my Ruby install today, but I think the following should work:
require 'open-uri'

jpg = /img src="(.[^"]*\.jpg)"/

open('http://www.apple.com') do |response|
  match = jpg.match(response.read)
  match.captures.each { |x| print x }
end
gregory.barborini.myopenid.com
January 31, 2008 03:52
Hmm, I still got only one result, but thank you so much. I did some searching, and here's what I came up with:
require 'open-uri'

open('http://www.apple.com') do |response|
  response.read.scan(/img src="(.[^"]*\.jpg)"/).each { |x| p x.to_s }
end
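A note on the scan approach: `MatchData#captures` only ever holds the groups of the first match, which is why the earlier version printed a single result, while `String#scan` walks the whole string. Because the pattern has a capture group, `scan` returns an array of one-element arrays, and on Ruby 1.9 and later calling `to_s` on one of those yields the array's inspect form rather than the bare string. Below is a minimal sketch of one way around that, using `flatten`. The helper name `jpg_sources`, the constant `JPG_SRC`, and the sample HTML are made up for illustration, and the pattern drops the leading `.` from the thread's regex so the capture cannot start with a quote.

```ruby
# Hypothetical helper (not from the thread): collect every .jpg src
# found in an HTML string.
JPG_SRC = /img src="([^"]*\.jpg)"/

def jpg_sources(html)
  # With one capture group, String#scan returns [["a.jpg"], ["b.jpg"], ...],
  # so flatten turns that into a plain array of strings.
  html.scan(JPG_SRC).flatten
end

# Made-up sample input: two JPGs and one PNG.
sample = '<img src="/a.jpg"><img src="b.png"><img src="http://x.test/c.jpg">'

puts jpg_sources(sample)
# prints:
# /a.jpg
# http://x.test/c.jpg
```

Keeping the extraction in a function that takes a string also makes it easy to try out without hitting the network.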
getopenid.com/ihack
March 19, 2008 20:53
require 'open-uri'

open('http://www.apple.com') { |r| r.read.scan(/img src="(.[^"]*\.jpg)"/).each { |x| p x.to_s } }
This code displays the JPG image URLs from a web site.
How can we make the code smarter and better?
Thanks!
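One possible direction for "smarter", offered as a sketch rather than a definitive answer: tolerate other attributes before `src`, accept single or double quotes and `.jpeg` as well as `.jpg`, resolve relative paths against the page URL, and drop duplicates. The helper name `absolute_jpg_urls`, its `base_url` parameter, and the sample HTML are assumptions introduced here, not anything from the posts. A regex is still fragile on real-world markup, so a proper HTML parser would likely be more robust.

```ruby
require 'uri'

# Hypothetical helper: returns the unique, absolute JPG URLs referenced by
# <img> tags in html. base_url is the address the page was fetched from.
def absolute_jpg_urls(html, base_url)
  # Allow other attributes before src, either quote style, and .jpg/.jpeg
  # in any letter case.
  pattern = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+\.jpe?g)["']/i

  html.scan(pattern)
      .flatten                                  # [["a"], ["b"]] -> ["a", "b"]
      .map { |src| URI.join(base_url, src).to_s } # relative -> absolute
      .uniq                                     # drop repeated images
  # Note: URI.join raises URI::InvalidURIError on malformed src values
  # (for example, ones containing spaces); this sketch does not rescue that.
end

# Made-up sample input, so the helper can be tried without a network call.
sample_html = %q(<img alt="x" src='/images/hero.JPG'>) +
              %q(<img src="logo.jpeg"><img src="logo.jpeg">) +
              %q(<img src="icon.png"><img src="http://cdn.test/a.jpg">)

puts absolute_jpg_urls(sample_html, 'http://www.apple.com/')

# To run it against a live page (assumes Ruby 2.5+, where open-uri provides
# URI.open; calling plain open on a URL no longer works as of Ruby 3.0):
#   require 'open-uri'
#   puts absolute_jpg_urls(URI.open('http://www.apple.com').read, 'http://www.apple.com/')
```

Separating the fetch from the extraction is a deliberate choice in this sketch: the parsing logic can be checked against a fixed string, and the network call stays a one-liner.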