python - Regex to help split up a list into two-tuples
Given a list of actors, each followed by their character name in brackets, with entries separated by either a semicolon (;) or a comma (,):
shelley winters [ruby]; millicent martin [siddie]; julia foster [gilda]; jane asher [annie]; shirley ann field [carla]; vivien merchant [lily]; eleanor bron [woman doctor], denholm elliott [mr. smith; abortionist]; alfie bass [harry]
How can I parse this into a list of two-tuples in the form [(actor, character), ...]?
--> [('shelley winters', 'ruby'), ('millicent martin', 'siddie'), ('denholm elliott', 'mr. smith; abortionist')]
I had:
actors = [item.strip().rstrip(']') for item in re.split(r'\[|,|;', data['actors'])]
data['actors'] = [(actors[i], actors[i + 1]) for i in range(0, len(actors), 2)]
But this doesn't quite work: it also splits on the separators inside the brackets.
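To make the failure concrete, here is a minimal sketch of the same split applied to a shortened slice of the data. The names sample and parts are only for illustration and are not from the original code.

```python
import re

# Shortened slice of the cast list; the second character name contains a ';'.
sample = ("eleanor bron [woman doctor], "
          "denholm elliott [mr. smith; abortionist]; alfie bass [harry]")

# Same split as in the question: break on '[', ',' or ';' anywhere in the string.
parts = [item.strip().rstrip(']') for item in re.split(r'\[|,|;', sample)]

# The ';' inside the brackets is treated as a separator too, so
# 'mr. smith; abortionist' ends up as two separate items.
print(parts)
```

Because the bracketed name is cut in two, every item after it shifts by one position, so pairing items by index no longer lines actors up with characters.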
You can do something like:
>>> re.findall(r'(\w[\w\s\.]+?)\s*\[([\w\s;\.,]+)\][,;\s$]*', s)
[('shelley winters', 'ruby'), ('millicent martin', 'siddie'), ('julia foster', 'gilda'), ('jane asher', 'annie'), ('shirley ann field', 'carla'), ('vivien merchant', 'lily'), ('eleanor bron', 'woman doctor'), ('denholm elliott', 'mr. smith; abortionist'), ('alfie bass', 'harry')]
One can simplify things with .*?:
re.findall(r'(\w.*?)\s*\[(.*?)\][,;\s$]*', s)
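As a self-contained sketch of the lazy-match approach, the snippet below runs it on the sample string from the question. The helper name parse_cast and the constant PAIR are hypothetical, chosen only for this example. It also drops the trailing [,;\s$]* part, on the assumption that it is not needed: findall skips over the separators by itself when looking for the next match, and a $ inside a character class is a literal dollar sign rather than an end-of-string anchor.

```python
import re

s = ("shelley winters [ruby]; millicent martin [siddie]; julia foster [gilda]; "
     "jane asher [annie]; shirley ann field [carla]; vivien merchant [lily]; "
     "eleanor bron [woman doctor], denholm elliott [mr. smith; abortionist]; "
     "alfie bass [harry]")

# Group 1: actor name, starting at a word character and matched lazily
#          up to the optional whitespace before '['.
# Group 2: everything between '[' and the first ']' (may contain ';' or ',').
PAIR = re.compile(r'(\w.*?)\s*\[(.*?)\]')

def parse_cast(text):
    """Return a list of (actor, character) tuples found in text."""
    return PAIR.findall(text)

pairs = parse_cast(s)
for actor, character in pairs:
    print(actor, '->', character)
```

This assumes character names never contain a closing bracket themselves; a nested ']' would end the second group early.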
python regex