######iniparser.py
import sys
import re

inifile = sys.argv[1]
quotes = ["'", '\"']
f = file(inifile)
d = {}

def get_quote_char(line):
    for char in line:
        if char in quotes:
            return char

def getkey(line):
    #swallow everything up to the =
    return line[ : line.find('=') ].strip()

def getval(line):
    #swallow everything after the =
    line = line[ line.find('=') + 1 : ].strip()
    q = get_quote_char(line)
    startq = line.find(q)
    #start scanning the line from the quote onwards
    position = 0
    for char in line[ startq : ]:
        if char not in quotes or line[ position - 1 ] == '\\':
            pass
        else:
            #might hit some remote corner-case with this
            if position > 0:
                return line[ startq + 1 : position ]
        position+=1

for line in f:
    line = line.strip()
    #skip comments and empty lines
    if line.startswith(';') or line=='':
        pass
    #store sections as dicts
    elif line.startswith('['):
        section_name = line[ 1 : len(line) - 1 ].strip()
        section_dict = { section_name : {} }
        d.update(section_dict)
    else:
        k = getkey(line)
        v = getval(line)
        #print k,v
        try:
            d[section_name].update( {k:v} )
        except TypeError:
            print 'The ini file contains invalid characters'

print d
#########test.ini
[foo]
greeting = 'hello' ;this is a comment
name = 'Eddie'

[bar]
lastname = 'Vedder';that's another comment;

[ malformed section ]
city='Prague'
country="\'Czech Republic\'"
whatever='this ; is nasty'

[bad]
dog='bau'
cat = 'miao'
mouse = "squeak"

[tabbed section]
dogname = '\Oliver'
catname = 'Barbara'

[one more]
appliance='lcd \'monitor\''
car = "Alfa \"Romeo\" - Giulietta";"foo"
Refactorings
jaredgrubb
November 11, 2007 20:50
I know this may not be the answer you're looking for, but if someone asked me that in an interview, I would say "Well, I can't give the complete code off the top of my head, but it would start with 'import ConfigParser', a built-in module for Python."
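For reference, here is a minimal sketch of the standard-library route. It assumes Python 3, where the module is named `configparser` (`ConfigParser` is the Python 2 name). The sample text and the `parse_ini_text` helper are made up for illustration and are not from the thread.

```python
import configparser

# Illustrative input only; a well-formed subset of the kind of file discussed here.
SAMPLE = """\
[foo]
greeting = 'hello'
name = 'Eddie'

[bar]
; a full-line comment
lastname = 'Vedder'
"""


def parse_ini_text(text):
    """Parse INI text into a plain {section: {key: value}} dict."""
    parser = configparser.ConfigParser(
        comment_prefixes=(";",),  # treat ';' at the start of a line as a comment
        interpolation=None,       # keep '%' characters verbatim
    )
    parser.read_string(text)
    return {name: dict(parser.items(name)) for name in parser.sections()}


result = parse_ini_text(SAMPLE)
print(result)
```

Note that the values come back with their quotes still attached (for example `"'hello'"`), so a quote-stripping step would still be needed. The module may also be stricter about malformed input than the exercise calls for.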
lbolognini
November 13, 2007 09:18
Hi Jared,
that solution wouldn't apply. It didn't even cross my mind to say something like "I'll use a library", because the point of the question, as I assumed, was to see how I would solve a problem that I was unlikely to have solved before (because of the availability of libraries).
Besides, I believe that my version, while not perfect, goes to some length to ensure that no matter how badly formatted the ini file is, it will be parsed anyway ;)
Thanks anyway,
L.
jaredgrubb
November 17, 2007 18:34
If what you're looking for is robustness... then I would recommend using regular expressions (which it looks like you had in mind, given the 'import re'). You can trim this program down to a dozen lines that way... Maybe if I get ambitious and no one else beats me to it, I'll give it a shot soon.
John
January 9, 2008 06:04
import sys
import re

SECTION = re.compile('^\s*\[\s*([^\]]*)\s*\]\s*$')
PARAM = re.compile('^\s*(\w+)\s*=\s*(.*)\s*$')
COMMENT = re.compile('^\s*;.*$')

d = {}
f = open(sys.argv[1])
for line in f:
    if COMMENT.match(line): continue
    m = SECTION.match(line)
    if m:
        section, = m.groups()
        d[section] = {}
    m = PARAM.match(line)
    if m:
        key, val = m.groups()
        d[section][key] = val

for k, v in d.items():
    print k, v
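The PARAM pattern above captures everything after the = sign, so the surrounding quotes and any trailing comment end up in the stored value. Below is a minimal sketch of one way the pattern could be tightened, written for Python 3 (the listings in this thread are Python 2). `QUOTED_PARAM` and `parse_param` are hypothetical names, and keeping backslash escapes verbatim is an assumption about what "taken verbatim" means.

```python
import re

# Hypothetical pattern, not from the thread.
# Group 1: the key. Group 2: the opening quote character.
# Group 3: everything up to the matching unescaped quote, escapes kept as-is.
QUOTED_PARAM = re.compile(r"""^\s*([^=;\[]+?)\s*=\s*(['"])((?:\\.|(?!\2).)*)\2""")


def parse_param(line):
    """Return (key, value) for a quoted assignment, or None if the line does not match."""
    m = QUOTED_PARAM.match(line)
    if m is None:
        return None
    key, _quote, value = m.groups()
    return key, value


print(parse_param("whatever='this ; is nasty'"))
print(parse_param('car = "Alfa \\"Romeo\\" - Giulietta";"foo"'))
print(parse_param("; just a comment"))
```

Because the value ends at the first unescaped quote that matches the opening one, anything after it (such as `;"foo"`) is ignored, and a semicolon inside the quotes is left alone.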
This is an exercise to implement an .ini parser that I was given at an interview. It needs to be able to read even some pretty malformed .ini files.
Things to note:
1) ; (semicolon) is the start of a comment
2) string values need to be taken verbatim from the file
3) quotes can be either 'single' or "double"
4) test.ini is your test file of course ;)
Can you make it smarter/more robust?
Thanks,
Lorenzo
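Notes 1 and 3 interact: judging by test.ini (for example `whatever='this ; is nasty'`), a semicolon presumably starts a comment only when it sits outside a quoted string. Here is a minimal Python 3 sketch of that rule. `strip_comment` is a hypothetical helper, and treating a backslash as escaping the next character everywhere on the line is an assumption.

```python
def strip_comment(line):
    """Drop a ';' comment unless the semicolon sits inside a quoted string."""
    quote = None  # the quote character we are currently inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if quote is None:
            if ch == ";":
                return line[:i].rstrip()  # comment starts here
            if ch in "'\"":
                quote = ch  # entering a quoted string
        elif ch == quote:
            quote = None  # leaving the quoted string
        i += 1
    return line.rstrip()


print(strip_comment("greeting = 'hello' ;this is a comment"))  # greeting = 'hello'
print(strip_comment("whatever='this ; is nasty'"))             # whatever='this ; is nasty'
```

Tracking which quote character opened the string means a `'` inside a "double-quoted" value does not end it early, which is one of the corner cases a scan that accepts either quote as a terminator can trip over.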