Sunday, 15 March 2015

javascript - Python: RegEx pattern to parse working hours string -




I am writing a Python library to parse different working-hours strings and produce a standard format of hours. I am stuck on the following case:

For "mon - fri 7am - 5pm sat 9am - 3pm" my regex should return the groups ['mon - fri 7am - 5pm ', 'sat 9am - 3pm'], but if there is a comma between the first and the second group it should return [].

A comma can appear anywhere else; it just must not sit between two weekday-and-duration groups. E.g. "mon - fri 7am - 5pm sat 9am - 3pm , available upon email, phone call" should return ['mon - fri 7am - 5pm ', 'sat 9am - 3pm'].

This is what I have tried:

    import re

    pattern = r"""(
        (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|m|w|f|thurs)            # start weekday
        \s*[-|to]+\s*                                                                # separator
        (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|^(?![ap])m|w|f|thurs)?  # end weekday
        \s*[from]*\s*                                                                # separator
        (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)                                 # start hour
        \s*[-|to]+\s*                                                                # separator
        (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)                                 # close hour
    )"""

    # re.VERBOSE is what allows the whitespace and the # comments inside the pattern
    regex = re.compile(pattern, re.IGNORECASE | re.VERBOSE)

    print(re.findall(regex, "mon - fri 7am - 5pm sat 9am - 3pm"))
    # output: ['mon - fri 7am - 5pm ', 'sat 9am - 3pm']

    print(re.findall(regex, "mon - fri 7am - 5pm sat - sun 9am - 3pm"))
    # output: ['mon - fri 7am - 5pm ', 'sat - sun 9am - 3pm']

    print(re.findall(regex, "mon - fri 7am - 5pm, sat 9am - 3pm"))
    # expected output: []
    # actual output:   ['mon - fri 7am - 5pm,', 'sat 9am - 3pm']

    print(re.findall(regex, "mon - fri 7am - 5pm , sat 9am - 3pm"))
    # expected output: []
    # actual output:   ['mon - fri 7am - 5pm ', 'sat 9am - 3pm']

I also tried a negative lookahead in the regex:

    pattern = r"""(
        (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|m|w|f|thurs)
        \s*[-|to]+\s*
        (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|^(?![ap])m|w|f|thurs)?
        \s*[from]*\s*
        (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)
        \s*[-|to]+\s*
        (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)
        (?![^,])
    )"""

But it didn't give the expected result. Should I explicitly write code to check this condition? Is there a way to change the regex instead of writing an explicit condition check?
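For comparison, the explicit-check route could look roughly like the sketch below. It is not the code from the post: the day and time sub-patterns are simplified, and DAY, TIME, SLOT and parse_hours are names made up for this illustration. The idea is to collect every weekday/duration match with finditer and return an empty list as soon as a comma shows up in the gap between two consecutive matches. Commas after the last match are ignored.

```python
import re

# Simplified, assumed sub-patterns (not the full alternation from the post).
DAY = r"(?:thurs|mon|tue|wed|thu|fri|sat|sun)"
TIME = r"\d{1,2}(?::\d{2})?\s*[ap]\.?m\.?"

# One weekday (or weekday range) followed by a start and end time.
SLOT = re.compile(
    r"{day}(?:\s*(?:-|to)\s*{day})?\s*(?:from\s*)?{time}\s*(?:-|to)\s*{time}".format(
        day=DAY, time=TIME
    ),
    re.IGNORECASE,
)


def parse_hours(text):
    """Return all slots, or [] if a comma separates two consecutive slots."""
    matches = list(SLOT.finditer(text))
    for prev, nxt in zip(matches, matches[1:]):
        # The gap is whatever lies between the end of one slot and the next.
        if "," in text[prev.end():nxt.start()]:
            return []
    return [m.group(0) for m in matches]


print(parse_hours("mon - fri 7am - 5pm sat 9am - 3pm"))
print(parse_hours("mon - fri 7am - 5pm, sat 9am - 3pm"))
print(parse_hours("mon - fri 7am - 5pm sat 9am - 3pm , available upon email, phone call"))
```

Unlike the pattern in the post, this simplified SLOT does not swallow a trailing space, so the first call prints ['mon - fri 7am - 5pm', 'sat 9am - 3pm']; the second prints [] and the third again prints both slots.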

Another way would be to insert a comma between two weekday-duration groups whenever one is missing, and then change the regex to group by (or split on) the comma: "mon - fri 7am - 5pm sat 9am - 3pm" => "mon - fri 7am - 5pm, sat 9am - 3pm"
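A minimal sketch of that normalisation step, under the assumption that a group always ends in an am/pm marker and the next one starts with a weekday name. BOUNDARY, insert_commas and split_slots are hypothetical names, and the weekday list is shortened compared with the post.

```python
import re

# An am/pm marker, then whitespace, then (lookahead only) a weekday name.
BOUNDARY = re.compile(
    r"([ap]\.?m\.?)\s+(?=(?:thurs|mon|tue|wed|thu|fri|sat|sun)\b)",
    re.IGNORECASE,
)


def insert_commas(text):
    """Put ', ' between two groups that are separated only by whitespace."""
    return BOUNDARY.sub(r"\1, ", text)


def split_slots(text):
    """Normalise first, then split on commas."""
    return re.split(r"\s*,\s*", insert_commas(text))


print(insert_commas("mon - fri 7am - 5pm sat 9am - 3pm"))
# mon - fri 7am - 5pm, sat 9am - 3pm
print(split_slots("mon - fri 7am - 5pm sat 9am - 3pm"))
# ['mon - fri 7am - 5pm', 'sat 9am - 3pm']
```

A string that already has the comma passes through insert_commas unchanged. Trailing free text such as ", available upon email" would still come back as extra pieces from split_slots, so those pieces would need to be filtered separately.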

I think you can do this by matching the whole expression, so that a comma (or any other disallowed character) prevents a match:

    pattern = r"""^(
        (
            (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|m|w|f|thurs)            # start weekday
            \s*[-|to]+\s*                                                                # separator
            (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|^(?![ap])m|w|f|thurs)?  # end weekday
            \s*[from]*\s*                                                                # separator
            (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)                                 # start hour
            \s*[-|to]+\s*                                                                # separator
            (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)                                 # close hour
        )
    )+$"""

This outputs:

    [('sat 9am - 3pm', 'sat 9am - 3pm')]
    [('sat - sun 9am - 3pm', 'sat - sun 9am - 3pm')]
    []
    []

Hope this helps.
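Only the last group shows up in that output because a repeated capturing group keeps just its final iteration. One possible way around this, sketched here with simplified sub-patterns and hypothetical names (SLOT, WHOLE, EACH, parse_strict), is to use the anchored pattern purely as a validity check and then extract the individual groups with a second, unanchored findall.

```python
import re

# Simplified, assumed sub-patterns (not the full alternation from the post).
DAY = r"(?:thurs|mon|tue|wed|thu|fri|sat|sun)"
TIME = r"\d{1,2}(?::\d{2})?\s*[ap]\.?m\.?"

# No capturing groups, so findall returns the full text of each match.
SLOT = (
    DAY + r"(?:\s*(?:-|to)\s*" + DAY + r")?"
    + r"\s*(?:from\s*)?" + TIME + r"\s*(?:-|to)\s*" + TIME
)

# The whole string must be one or more slots separated only by whitespace.
WHOLE = re.compile(r"^(?:\s*" + SLOT + r")+\s*$", re.IGNORECASE)
EACH = re.compile(SLOT, re.IGNORECASE)


def parse_strict(text):
    """Return every slot if the entire string is valid, else []."""
    if not WHOLE.match(text):
        return []
    return EACH.findall(text)


print(parse_strict("mon - fri 7am - 5pm sat 9am - 3pm"))
# ['mon - fri 7am - 5pm', 'sat 9am - 3pm']
print(parse_strict("mon - fri 7am - 5pm, sat 9am - 3pm"))
# []
```

Because the whole string has to match, this variant also rejects input with trailing free text such as ", available upon email", which is stricter than the second requirement in the question.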

javascript python regex datetime python-2.7
