642d333608cc3483f64dba38cb81bdd2

Is there a better way for me to collect the information I want, then how I have it in my while loop ?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
require 'rubygems'
require 'hpricot'
require 'open-uri'

doc = Hpricot(open('http://www.mpi.mb.ca/salvage/auction.asp?salenm=1694'))

doc.search("tr > td > font").each do |i|
  cars << i.inner_html  
end

b = 0
car_array=[]
while b < cars.length do
car_array << {
    :num => cars[b],
    :year => cars[b+=1],
    :make_model => cars[b+=1],
    :vin => cars[b+=1],
    :engine => cars[b+=1],
    :status => cars[b+=1],
    :km => cars[b+=1]
  }
    b+=1
end

Refactorings

No refactoring yet !

Avatar

bob

April 26, 2010, April 26, 2010 05:35, permalink

2 ratings. Login to rate!
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
require 'rubygems'
require 'hpricot'
require 'open-uri'

doc = Hpricot(open('http://www.mpi.mb.ca/salvage/auction.asp?salenm=1694'))

cars = doc.search("tr > td > font").collect { |i| i.inner_html }

car_array=[]
cars.each_slice(7) do |a,b,c,d,e,f,g|
car_array << {
    :num => a,
    :year => b,
    :make_model => c,
    :vin => d,
    :engine => e,
    :status => f,
    :km => g
  }
end
E8f0d6e9bc8bca695b3c5bdf75cdcc03

Martin Plöger

April 26, 2010, April 26, 2010 06:50, permalink

2 ratings. Login to rate!

I'm not quite satisfied with the indexes inside the #collect-method. But I think this should be resolved by improving the scraping of the html-code before the loop, not by the loop later on.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
require 'rubygems'
require 'hpricot'
require 'open-uri'

doc = Hpricot(open('http://www.mpi.mb.ca/salvage/auction.asp?salenm=1694'))

cars = doc.search("tr > td > font").collect { |i| i.inner_html }

car_array = cars.each_slice(7).collect do |car|
  :num => car[0], :year => car[1],
  :make_model => car[2], vin: => car[3]
  :engine => car[4], :status => car[5],
  :km => car[6]
end
0ba52af63f34c9e680a535fa8a3ae1bb

LucaB

April 26, 2010, April 26, 2010 07:45, permalink

1 rating. Login to rate!

you are assuming that a car's values are implicitly organised in groups of 7, so you could try:

Ruby

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
require 'rubygems'
require 'hpricot'
require 'open-uri'

doc = Hpricot(open('http://www.mpi.mb.ca/salvage/auction.asp?salenm=1694'))

cars=[] # => you can't see the variable cars outside of the search block

doc.search("tr > td > font").each do |i|
  cars << i.inner_html
end

car_keys = [:num, :year, :make_model, :vin, :engine, :status, :km]

grouped_cars = cars.each_slice(7).collect do |car|
  car_keys.zip(car).inject({}){|hash, pair| hash[pair[0]] = pair[1]; hash}
end
Avatar

drTheMan

May 18, 2010, May 18, 2010 08:53, permalink

1 rating. Login to rate!

Note the shorter syntax in the doc.search.collect; when just calling a single method without parameters on the original value each loop; you can use this syntax.
Then I also took the liberty to shorten LucaB's inject loop a little bit (and use the original variable names).

1
2
3
4
5
6
7
8
9
10
11
require 'rubygems'
require 'hpricot'
require 'open-uri'

doc = Hpricot(open('http://www.mpi.mb.ca/salvage/auction.asp?salenm=1694'))

cars = doc.search("tr > td > font").collect(&:inner_html)

car_array = cars.each_slice(7).collect do |car|
  [:num, :year, :make_model, :vin, :engine, :status, :km].zip(car).inject({}){|hash, pair| hash.merge({pair[0] => pair[1]})}
end
36ad787d1b5306e65e89b441e0cade8f

Dragan Cvetinovic

June 18, 2010, June 18, 2010 13:40, permalink

1 rating. Login to rate!

The collect(&:inner_html) is great! I'll remember it.
I would just tighten the code in the block a little bit...

1
2
3
4
5
6
7
8
9
10
11
require 'rubygems'
require 'hpricot'
require 'open-uri'

doc = Hpricot(open('http://www.mpi.mb.ca/salvage/auction.asp?salenm=1694'))

cars = doc.search("tr > td > font").collect(&:inner_html)

car_array = cars.each_slice(7).collect do |car|
  Hash[*[:num, :year, :make_model, :vin, :engine, :status, :km].zip(car).flatten]
end

Your refactoring





Format Copy from initial code

or Cancel