Split by comma and strip whitespace in Python

Use a list comprehension: it is simpler, and just as easy to read as a for loop.

my_string = "blah, lots  ,  of ,  spaces, here "
result = [x.strip() for x in my_string.split(',')]
# result is ["blah", "lots", "of", "spaces", "here"]

See: Python docs on List Comprehension
A good 2 second explanation of list comprehension.
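
For comparison, here is the same operation written both as an explicit for loop and as a list comprehension. This is only an illustrative sketch: the function names split_and_strip_loop and split_and_strip are made up for this example and do not come from the answer above.

```python
def split_and_strip_loop(text, sep=","):
    """Split text on sep and strip each piece, using an explicit for loop."""
    result = []
    for piece in text.split(sep):
        result.append(piece.strip())
    return result


def split_and_strip(text, sep=","):
    """Same behaviour, written as a list comprehension."""
    return [piece.strip() for piece in text.split(sep)]


sample = "blah, lots  ,  of ,  spaces, here "
print(split_and_strip_loop(sample))
print(split_and_strip(sample))
```

Both calls print the same list, so the choice between them is mostly a matter of readability.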

I came to add:

map(str.strip, string.split(','))

but saw it had already been mentioned by Jason Orendorff in a comment.

Reading Glenn Maynard's comment on the same answer, which suggests list comprehensions over map, I started to wonder why. I assumed he meant for performance reasons, but of course he might have meant for stylistic reasons, or something else (Glenn?).

So a quick (possibly flawed?) test on my box (Python 2.6.5 on Ubuntu 10.04) applying the three methods in a loop revealed:

$ time ./list_comprehension.py  # [word.strip() for word in string.split(',')]
real    0m22.876s

$ time ./map_with_lambda.py     # map(lambda s: s.strip(), string.split(','))
real    0m25.736s

$ time ./map_with_str.strip.py  # map(str.strip, string.split(','))
real    0m19.428s

making map(str.strip, string.split(',')) the winner, although it seems they are all in the same ballpark.

Certainly, though, map (with or without a lambda) should not necessarily be ruled out for performance reasons, and for me it is at least as clear as a list comprehension.
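
If you want to repeat this kind of comparison yourself, a rough sketch using the standard timeit module might look like the following. The function names and the iteration count are assumptions chosen for illustration, not the scripts that were timed above. Note that in Python 3 map returns a lazy iterator, so each map call is wrapped in list() to do the same amount of work as the comprehension. Timings depend on the interpreter version and the machine, so do not expect to see the numbers quoted above.

```python
import timeit

text = "blah, lots  ,  of ,  spaces, here "


def with_comprehension():
    return [word.strip() for word in text.split(",")]


def with_map_lambda():
    # list() forces evaluation; map is lazy in Python 3.
    return list(map(lambda s: s.strip(), text.split(",")))


def with_map_str_strip():
    return list(map(str.strip, text.split(",")))


candidates = [with_comprehension, with_map_lambda, with_map_str_strip]
for func in candidates:
    # number=100_000 is an arbitrary choice; raise it for steadier results.
    seconds = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {seconds:.3f}s")
```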

Split using a regular expression. Note that I made the case more general by adding leading spaces. The list comprehension removes the empty strings that the split leaves at the front and back.

>>> import re
>>> string = "  blah, lots  ,  of ,  spaces, here "
>>> pattern = re.compile(r"^\s+|\s*,\s*|\s+$")
>>> print([x for x in pattern.split(string) if x])
['blah', 'lots', 'of', 'spaces', 'here']

This works even if ^\s+ doesn't match:

>>> string = "foo,   bar  "
>>> print([x for x in pattern.split(string) if x])
['foo', 'bar']
>>>

Here's why you need ^\s+:

>>> string = "  blah, lots  ,  of ,  spaces, here "
>>> pattern = re.compile(r"\s*,\s*|\s+$")
>>> print([x for x in pattern.split(string) if x])
['  blah', 'lots', 'of', 'spaces', 'here']

See the leading spaces in '  blah'?

Clarification: above uses the Python 3 interpreter, but results are the same in Python 2.
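
As a variation on the regex approach (not the pattern used above), you could strip the ends of the string first and then split on a comma with any surrounding whitespace. A minimal sketch, where the names COMMA and split_on_commas are hypothetical:

```python
import re

# Separator: a comma together with any whitespace on either side.
COMMA = re.compile(r"\s*,\s*")


def split_on_commas(text):
    """Strip both ends first, then split on commas plus adjacent whitespace."""
    return COMMA.split(text.strip())


print(split_on_commas("  blah, lots  ,  of ,  spaces, here "))
print(split_on_commas("foo,   bar  "))
```

One difference in behaviour: the version above filters out every empty string, while this sketch keeps empty fields between consecutive commas, so "a,,b" gives three items rather than two. Which one you want depends on whether empty fields are meaningful in your data.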
