
I have large text reports from a legacy ERP system that need parsing. Basically, I need to convert the data to CSV for further analysis, and I came up with the following code for it. What are the most obvious improvements I should make?

I will include an example of the source data in a comment, but it can also be found here: http://www.ruby-forum.com/topic/167270

def writefile(file, *linedata)
	linedata.each do |line| 	
		file << line.join(",") + "\n"
	end
end

def readfile(file, outputfile)
	out = File.new(outputfile, "w+")
	info = []
	wline = ['year', 'week', 'day', 'customerid', 'customer', 'subcustid', 'subcust', 'prodid', 'prod', 'qty']
	IO.foreach(file){|line|
		if line =~ /Customer/
			wline[3] = line.split(":")[1].scan(/\d+/)
			wline[4] = line.split(":")[1].scan(/[a-zA-Z]+/).join(" ")
		elsif line =~ /Sub-cust/
			wline[5] = line.split(":")[1].scan(/\d+/)
			wline[6] = line.split(":")[1].scan(/[a-zA-Z]+/).join(" ")
		elsif line =~ /Year/
			wline[0] = line.scan(/\d+/)[0]
		elsif line =~ /Week/
			wline[1] = line.scan(/\d+/)[0]
		elsif line.strip! =~ /\A\d/
			wline[7] = line.scan(/\A\d+/)
			temp = line.scan(/[A-Za-z]+/)
			wline[2] = temp.pop #later used for delivery day
			wline[8] = temp.join(" ")
			wline[9] = line.scan(/\d+/)[line.scan(/\d+/).length-13].to_i
			wline[9] > 0 ? writefile(out, wline) : wline
		elsif 	line =~ /\AMON/ || line =~ /\ATUE/ || line =~ /\AWED/ || 
				line =~ /\ATHU/ || line =~ /\AFRI/ || line =~ /\ASAT/ || 
				line =~ /\ASUN/
			wline[2] = line.scan(/[A-Za-z]+/)
			wline[9] = line.scan(/\d+/)[line.scan(/\d+/).length-13].to_i
			wline[9] > 0 ? writefile(out, wline) : wline
		end
		
	}
	out.close
end
readfile('source_exmaple', 'output.txt')




=begin

Example of source data

Customer  :     97  CUSTOMER A                  
Year       :   2008
Week      : 39 ..  39
Sub-cust  :    999  DEPARTMENT A                       

------------------------------------------------------------------------------------
ARTIKEL                          Dag     V E R B R U I K                 
nr.    omschrijving                       39      0      0      0      0      0      0      0      0      0  Totaal Gemiddeld Norm       
------------------------------------------------------------------------------------

   1234  PRODUCT A Beeee      MON    150    0  0  0  0  0  0  0  0  0  150  150    0
                              TUE     50    0  0  0  0  0  0  0  0  0   50   50    0

   
 P E R I O D E   O V E R Z I C H T   A B S O L U T E   L E V E R I N G   P E R   D A G 
                       =====================================================================================
=end

Refactorings



danielharan

October 6, 2008 13:54

Separate the parsing logic into its own easily tested class, and do the same for the CSV formatting. Right now you're mixing those two concerns, along with the writing of the new file, in a single method.

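class ReportReader
  # Expose @lines so the parsing methods below can call `lines`
  attr_reader :lines

  def initialize(file_name)
    # File.read opens, reads and closes the file in one call
    @lines = File.read(file_name).split("\n")
  end

  def customer
    lines.detect {|line| line =~ /Customer/}.split(":")[1].scan(/\d+/)
  end
  ...
  def to_csv

  end 
end

# File.open (unlike File.new) yields the handle to the block and closes it afterwards
File.open('output.csv', "w+") do |f|
  f.puts ReportReader.new('source_exmaple').to_csv
end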
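
To make the "easily tested" part concrete, here is a rough sketch of how the split could look. It is not a drop-in replacement: ReportParser, its constants and its regexes are names I made up for illustration, and it assumes the quantity you want is the first number after the day name, which holds for the sample data but may not hold for every report. The parser takes an array of lines instead of a file name, so you can feed it strings in a test, and it uses Ruby's csv library for the output so that names containing commas get quoted.

```ruby
require "csv"

# Hypothetical sketch: ReportParser and its regexes are illustrative names,
# not part of the original code. It only parses; it never touches the disk.
class ReportParser
  HEADER = %w[year week day customerid customer subcustid subcust prodid prod qty].freeze
  DAY = /MON|TUE|WED|THU|FRI|SAT|SUN/
  # Assumption: the quantity of interest is the first number after the day name.
  PRODUCT_LINE = /\A(\d+)\s+(.+?)\s+(#{DAY})\s+(\d+)/
  DAY_LINE = /\A(#{DAY})\s+(\d+)/

  def initialize(lines)
    @lines = lines
  end

  # Returns one array per delivery line that has a positive quantity.
  def rows
    ctx = {}      # values carried over from the report header and product lines
    result = []
    @lines.each do |raw|
      case raw.strip
      when /\ACustomer\s*:\s*(\d+)\s+(.*)/
        ctx[:customerid], ctx[:customer] = $1, $2
      when /\ASub-cust\s*:\s*(\d+)\s+(.*)/
        ctx[:subcustid], ctx[:subcust] = $1, $2
      when /\AYear\s*:\s*(\d+)/
        ctx[:year] = $1
      when /\AWeek\s*:\s*(\d+)/
        ctx[:week] = $1
      when PRODUCT_LINE
        ctx[:prodid], ctx[:prod], day, qty = $~.captures
        result << build_row(ctx, day, qty.to_i) if qty.to_i > 0
      when DAY_LINE
        # Continuation line: same product as the previous product line
        day, qty = $~.captures
        result << build_row(ctx, day, qty.to_i) if qty.to_i > 0
      end
    end
    result
  end

  # Formatting is kept apart from parsing; CSV.generate handles quoting.
  def to_csv
    CSV.generate do |csv|
      csv << HEADER
      rows.each { |row| csv << row }
    end
  end

  private

  def build_row(ctx, day, qty)
    [ctx[:year], ctx[:week], day, ctx[:customerid], ctx[:customer],
     ctx[:subcustid], ctx[:subcust], ctx[:prodid], ctx[:prod], qty]
  end
end

sample = <<~REPORT
  Customer  :     97  CUSTOMER A
  Year       :   2008
  Week      : 39 ..  39
  Sub-cust  :    999  DEPARTMENT A

     1234  PRODUCT A Beeee      MON    150    0  0  0  0  0  0  0  0  0  150  150    0
                                TUE     50    0  0  0  0  0  0  0  0  0   50   50    0
REPORT

puts ReportParser.new(sample.lines).to_csv
```

Reading the source file and writing the output file then stay outside the class, for example File.write('output.csv', ReportParser.new(File.readlines('source_exmaple')).to_csv), and the parsing rules can be checked without creating any files.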
