Tuesday, 15 April 2014

regex - Validating and grouping Excel formula format




I'm trying to validate an Excel-style formula with the following regex:

=sum\(((?:\w+\d+)(?::\w+\d+)?)((?:,\w+\d+)(?::\w+\d+)?)*\)

against this source:

should pass

=sum(a1,a11:a212,a12:a56,a342:a12,a3)
=sum(a11:a12,a12:a12,a34:a3)
=sum(a1,a2,a3)
=sum(a1)

should fail

=sum(a11:a212:a2,a12:a56,a4,a342:a12)

I have the validation part working, but I can't figure out how to capture each comma-separated value in its own group.

How I want them grouped:

=sum(a1,a11:a12,a12:a56,a3) // groups: $1 = a1 $2 = a11:a12 $3 = a12:a56 $4 = a3
=sum(a11:a12,a10:a12,a34:a3) // groups: $1 = a11:a12 $2 = a10:a12 $3 = a34:a3
=sum(a1,a2,a3) // groups: $1 = a1 $2 = a2 $3 = a3
=sum(a1) // groups: $1 = a1

How they are actually grouped:

=sum(a1,a11:a12,a12:a56,a3) // groups: $1 = a1 $2 = a3
=sum(a11:a12,a10:a12,a34:a3) // groups: $1 = a11:a12 $2 = a34:a3
=sum(a1,a2,a3) // groups: $1 = a1 $2 = a3
=sum(a1) // groups: $1 = a1

Notice that only the first and last values are captured. I'm pretty new to regex, so if I'm doing something awful here, please point me in the right direction. Thank you!

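That's not possible with a single pattern: (...)(?:,(...))+ contains 2 capturing groups, so it produces at most 2 captures, no matter how many times the + repeats. A repeated group keeps only the text from its last iteration.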
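To make this concrete, here is a minimal sketch that runs the regex from the question, unchanged, against one of the sample formulas. The variable names ($pattern, $subject, $m) are only illustrative. Because the comma sits inside the second group in that pattern, the text it retains includes the leading comma.

```php
<?php
// The pattern from the question, unchanged. Group 2 is the repeated group.
$pattern = '/=sum\(((?:\w+\d+)(?::\w+\d+)?)((?:,\w+\d+)(?::\w+\d+)?)*\)/';
$subject = '=sum(a1,a11:a12,a12:a56,a3)';

preg_match($pattern, $subject, $m);

// $m[0] is the whole match, $m[1] the first value, and $m[2] only what
// the repeated group matched on its final iteration.
print_r($m);
```

The middle values (a11:a12 and a12:a56) were matched but are not retrievable from $m, which is why a second pass over the value list is needed.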

You'll need to do it in (at least) 2 steps:

value := /\w+\d+(?::\w+\d+)?/
value_list := /value(?:,value)*/
expression := /=sum\((value_list)\)/

Now match the expression, take group 1 (the value_list), and find all value occurrences within that match.

A quick demo in PHP:

$text = 'should pass
=sum(a1,a11:a212,a12:a56,a342:a12,a3)
=sum(a11:a12,a12:a12,a34:a3)
=sum(a1,a2,a3)
=sum(a1)
should fail
=sum(a11:a212:a2,a12:a56,a4,a342:a12)';

// A single value: a cell such as a1, optionally extended to a range such as a11:a212
$value = "\w+\d+(?::\w+\d+)?";
// One or more comma-separated values
$value_list = "$value(?:,$value)*";
// The whole formula; group 1 captures the value list
$expression = "=sum\(($value_list)\)";

// Step 1: find every valid formula in the text
preg_match_all("/$expression/", $text, $matches);

// Step 2: iterate over the $value_list part of each $expression match (group 1)
foreach ($matches[1] as $group1) {
    preg_match_all("/$value/", $group1, $m);
    print_r($m);
}

Prints:

Array
(
    [0] => Array
        (
            [0] => a1
            [1] => a11:a212
            [2] => a12:a56
            [3] => a342:a12
            [4] => a3
        )

)
Array
(
    [0] => Array
        (
            [0] => a11:a12
            [1] => a12:a12
            [2] => a34:a3
        )

)
Array
(
    [0] => Array
        (
            [0] => a1
            [1] => a2
            [2] => a3
        )

)
Array
(
    [0] => Array
        (
            [0] => a1
        )

)
