1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
def sort(words): sortedList = [] for i in range(len(words)): word = words[i] if not sortedList: sortedList.append(word) else: for j in range(len(sortedList)): sortedWord = sortedList[j] if isLeftWordBeforeRight(word, sortedWord): sortedList.insert(j, word) break elif j == (len(sortedList) - 1): sortedList.append(word) return sortedList def isLeftWordBeforeRight(left, right): isBefore = False if right.startswith(left): isBefore = True else: for n in range(len(left)): if n < len(right): comparison = compareChars(left[n], right[n]) if comparison != 0: isBefore = comparison < 0 break return isBefore def compareChars(left, right): return alphabet.find(left) - alphabet.find(right) alphabet = "zyxwvutsrqpomnlkjihgfedcba" inputWords = ["england", "france", "spain", "italy", "greece", "portugal", "canada", "usa", "mexico", "peru", "cuba", "chile", "argentina", "zimbabwe", "uganda", "congo", "zambia", "namibia", "ghana"] print sort(inputWords)
Refactorings
No refactoring yet !
jaredgrubb
December 11, 2007, December 11, 2007 04:24, permalink
You can use the sort() method of Python and change the comparison function. Then you should be able to run inputWords.sort(my_cmp) and it should do the same thing. I havent tested the following code, but I think it works... :)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
def my_cmp(a, b): global alphabet # Check if either input is zero-length, which is considered "smallest" if 0 in (len(a),len(b)): return cmp(a,b) # Try to see if the first element is in the alphabet; get its index try: a_i = alphabet.index(a[0]) except ValueError: a_i = None try: b_i = alphabet.index(b[0]) except ValueError: b_i = None if None in (a_i, b_i): # If exactly one is None, then the non-None comes earlier if a_i!=None: return -1 if b_i!=None: return 1 # They're both None, so use standard compare; if equal, keep comparing return cmp(a[0],b[0]) if a[0]!=b[0] else my_cmp(a[1:],b[1:]) return cmp(a_i,b_i) if a_i!=b_i else my_cmp(a[1:],b[1:])
armandino.myopenid.com
December 12, 2007, December 12, 2007 03:00, permalink
Hi Jared,
There seems to be a problem with the last two return statements. I'm not that familiar with python syntax to fix it tho :)
An unrelated question.. what's the benefit of using alphabet.index(...) instead of the find method? The latter, I understand, doesn't throw an error.
jaredgrubb
December 14, 2007, December 14, 2007 21:56, permalink
As for the difference between .index() and .find().... index raises an exception on no match, whereas find returns -1. You could remove the try..except's and test against -1 instead of None if you want to use find instead of index. Choice of style.
The last two return statements are like the ternary operator in C. For example, instead of the last return statement, you could replace it with:
1 2 3
if a_i!=b_i then: return cmp(a_i, b_i) return my_cmp(a[1:], b[1:])
jaredgrubb
December 14, 2007, December 14, 2007 21:57, permalink
(ugh, my ancient BASIC background just popped through! Yikes. Forget that "then" in there lol)
orip
January 18, 2008, January 18, 2008 01:12, permalink
armandino - I liked how you subtracted the two indexes to get the cmp, cool!
How about this version? (append to your original code)
alphabet_cmp
1 2 3 4 5 6 7 8 9 10 11 12 13 14
from itertools import izip def alphabet_cmp(a, b): # Make sure no letters are missing. For performance in final code, comment out or remove with 'python -O' assert set(a).issubset( set(alphabet) ) and set(b).issubset( set(alphabet) ) # find the first non-equal character for c_a, c_b in izip(a,b): if c_a != c_b: return alphabet.index(c_a) - alphabet.index(c_b) # one word is a prefix of another, sort by length return len(a) - len(b) print sorted(inputWords, alphabet_cmp)
Bjoern Brandenburg
May 24, 2008, May 24, 2008 01:59, permalink
This can be done much more concise and without any global variables. Python offers higher-order functions and closures. Use them. :-)
1 2 3 4 5 6 7 8 9 10
from collections import defaultdict as dd def list_cmp(order): "Return a list comparison function that sorts by order." idx = dd(lambda: len(order), [(order[i], i) for i in xrange(len(order))]) cmp = lambda a, b, i=0: len(a) - len(b) if i == min(len(a), len(b)) else _cmp(a, b, i) _cmp = lambda a, b, i: cmp(a, b, i + 1) if idx[a[i]] == idx[b[i]] else idx[a[i]] - idx[b[i]] return cmp inputWords.sort(cmp=list_cmp(alphabet))
Tom Lynn
July 17, 2008, July 17, 2008 09:43, permalink
1 2 3 4 5 6 7
alphabet = "zyxwvutsrqpomnlkjihgfedcba" inputWords = ["england", "france", "spain", "italy", "greece", "portugal", "canada", "usa", "mexico", "peru", "cuba", "chile", "argentina", "zimbabwe", "uganda", "congo", "zambia", "namibia", "ghana"] print sorted(inputWords, key=lambda word: [alphabet.index(c) for c in word])
My second python script. :)
This one sorts a list of words in a particular order, which depends on the order of characters in the 'alphabet' string.
Any suggestions on how to improve / simplify it?