Thursday, 15 January 2015

python - Regex to help split up list into two-tuples




Given a list of actors, with their character names in brackets, separated by either a semicolon (;) or a comma (,):

shelley winters [ruby]; millicent martin [siddie]; julia foster [gilda]; jane asher [annie]; shirley ann field [carla]; vivien merchant [lily]; eleanor bron [woman doctor], denholm elliott [mr. smith; abortionist]; alfie bass [harry]

How can I parse this into a list of two-tuples of the form [(actor, character), ...]?

--> [('shelley winters', 'ruby'), ('millicent martin', 'siddie'), ('denholm elliott', 'mr. smith; abortionist')]

I had:

actors = [item.strip().rstrip(']') for item in re.split(r'\[|,|;', data['actors'])]
data['actors'] = [(actors[i], actors[i + 1]) for i in range(0, len(actors), 2)]

But it doesn't quite work: it also splits items within brackets.
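
To see where the split goes wrong, here is a small sketch on a shortened sample in the same format. The names `sample` and `parts` are made up for illustration and are not part of the original attempt.

```python
import re

# Shortened sample in the same format as the list above (illustrative only).
sample = "eleanor bron [woman doctor], denholm elliott [mr. smith; abortionist]; alfie bass [harry]"

# Same split as in the attempt: every '[', ',' and ';' acts as a separator,
# including the ';' that sits inside the brackets.
parts = [item.strip().rstrip(']') for item in re.split(r'\[|,|;', sample)]
print(parts)
```

Here 'mr. smith' and 'abortionist' come out as two separate items, so pairing items two at a time is off by one from that point on.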

You can do it like this:

>>> re.findall(r'(\w[\w\s\.]+?)\s*\[([\w\s;\.,]+)\][,;\s$]*', s)
[('shelley winters', 'ruby'), ('millicent martin', 'siddie'), ('julia foster', 'gilda'), ('jane asher', 'annie'), ('shirley ann field', 'carla'), ('vivien merchant', 'lily'), ('eleanor bron', 'woman doctor'), ('denholm elliott', 'mr. smith; abortionist'), ('alfie bass', 'harry')]

One can simplify things with the non-greedy .*?:

re.findall(r'(\w.*?)\s*\[(.*?)\][,;\s$]*', s)
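
For reference, the simplified pattern can be wrapped into a small self-contained script. This is only a sketch: the function name `parse_cast` and the constant `PAIR_RE` are hypothetical, and it assumes that every actor name starts with a word character and that character names never contain a closing bracket.

```python
import re

# The simplified pattern from above: a lazily matched actor name,
# optional whitespace, then the character name between '[' and the next ']'.
PAIR_RE = re.compile(r'(\w.*?)\s*\[(.*?)\][,;\s$]*')

def parse_cast(text):
    """Return a list of (actor, character) tuples found in text."""
    return PAIR_RE.findall(text)

s = ("shelley winters [ruby]; millicent martin [siddie]; julia foster [gilda]; "
     "jane asher [annie]; shirley ann field [carla]; vivien merchant [lily]; "
     "eleanor bron [woman doctor], denholm elliott [mr. smith; abortionist]; "
     "alfie bass [harry]")

cast = parse_cast(s)
for actor, character in cast:
    print(actor, '->', character)
```

Note that inside a character class `$` is a literal dollar sign rather than an end-of-string anchor, so the trailing `[,;\s$]*` only consumes separators between entries.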

