Tuesday, 15 March 2011

Parsing a C family language out of order




C has to be parsed strictly in order, i.e. everything has to be declared before it is used; in particular, types must be declared before variables of those types. This makes sense because the grammar would be ambiguous if you didn't know what was the name of a type and what wasn't, e.g. a * b depends on whether a names a type.
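To make the a * b ambiguity concrete, here is a minimal Python sketch. The helper `classify` and its `type_names` parameter are hypothetical names made up for illustration; a real C parser consults its symbol table in a similar spirit but handles far more than this one three-token form.

```python
def classify(tokens, type_names):
    """Classify a three-token statement of the form `X * Y`.

    Hypothetical helper: `type_names` stands in for the set of
    type names the parser has seen declared so far.
    """
    first, star, second = tokens
    if star != "*":
        raise ValueError("expected a statement of the form X * Y")
    if first in type_names:
        # `a` names a type, so `a * b` declares b as a pointer to a.
        return f"declaration: {second} is a pointer to {first}"
    # Otherwise `a` is an ordinary identifier and `*` is multiplication.
    return f"expression: {first} multiplied by {second}"


print(classify(["a", "*", "b"], type_names={"a"}))
print(classify(["a", "*", "b"], type_names=set()))
```

The same three tokens come out as a declaration in the first call and as an expression in the second, which is why a one-pass C parser needs the type declared first.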

On the other hand, some C family languages have the desirable property of relaxing this restriction (thus eliminating the manual juggling of header files). I'm writing a parser for a C-superset language that is intended to likewise have that restriction relaxed, so I need to figure out how to do it.

One method that occurs to me is to do two passes. The first pass goes through everything, taking advantage of the fact that everything at top level must be a declaration, not a statement, and picks up all the types. At this stage function bodies are left unexamined, just picked up as token streams delimited by matching braces. The second pass parses the function bodies. Local declarations within a function would have to be in order, but that's not really a problem.
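The two-pass idea can be sketched in Python for a toy language. Everything here is an assumption made for illustration: the top level contains only `typedef ... Name;` declarations and function definitions, function bodies contain no nested braces, and the names `tokenize`, `first_pass`, `second_pass` and `BUILTIN_TYPES` are invented for this sketch.

```python
import re

# Toy tokenizer: identifiers, integers, and a handful of punctuation marks.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[{}();*=,]")
BUILTIN_TYPES = {"int", "char", "void"}


def tokenize(src):
    return TOKEN.findall(src)


def first_pass(tokens):
    """Collect top-level type names; store function bodies unparsed."""
    type_names, bodies = set(BUILTIN_TYPES), {}
    i, current_fn = 0, None
    while i < len(tokens):
        tok = tokens[i]
        if tok == "typedef":
            end = tokens.index(";", i)        # declared name precedes `;`
            type_names.add(tokens[end - 1])
            i = end + 1
        elif tok == "(":
            current_fn = tokens[i - 1]        # identifier before `(`
            i += 1
        elif tok == "{":
            depth, start = 1, i + 1
            i += 1
            while depth:                      # skip to the matching brace
                depth += {"{": 1, "}": -1}.get(tokens[i], 0)
                i += 1
            bodies[current_fn] = tokens[start:i - 1]
        else:
            i += 1
    return type_names, bodies


def second_pass(type_names, bodies):
    """Classify each statement now that every top-level type is known."""
    result = {}
    for fn, body in bodies.items():
        stmts, cur = [], []
        for tok in body:
            if tok == ";":
                stmts.append(cur)
                cur = []
            else:
                cur.append(tok)
        result[fn] = ["declaration" if s and s[0] in type_names
                      else "expression" for s in stmts]
    return result


# T is used inside f before its typedef appears in the source.
src = "void f() { T * p; n * m; }  typedef int T;"
types, bodies = first_pass(tokenize(src))
print(second_pass(types, bodies))
```

Because the body of `f` is only stored as a token list in the first pass, `T * p;` can be recognised as a declaration in the second pass even though the typedef comes later in the file, while `n * m;` is still treated as an expression.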

Are there any stumbling blocks in that method that I haven't thought of?

How do compilers for C++, Java, C# etc. typically handle the parts of those languages that don't require declarations to be in order?

You don't have to do name resolution as you parse. First, if you are designing a "C-like" language (as opposed to a new C implementation), you can define the syntax for declarations, expressions, methods, etc. to be unambiguous. Then parsing order simply doesn't matter. (This would cure the preprocessor disease, too, by integrating the preprocessor into the language in a structured way.)

If you insist on C-like syntax, you can use a parser that tolerates ambiguity, e.g., one that is happy to process "x*y;" and hold it as both an expression and a declaration until it gets further data. In the extreme case, think of this as constraint-based resolution. C and C++ insisted on knowing the definitions first because compiler memory space was pretty constrained and you couldn't just keep everything; that's no longer true. You don't have to insist on knowing the answer as you parse.
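As a rough illustration of holding both readings, the Python sketch below parses "x*y;" into a node that keeps the declaration and the expression side by side, and picks one only after parsing is finished. This is not a GLR parser, just a hand-rolled stand-in for the idea; the classes `Decl`, `Mul`, `Ambiguous` and the functions `parse_statement` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Decl:            # reading 1: `x *y;` declares y as pointer to x
    type_name: str
    var_name: str


@dataclass
class Mul:             # reading 2: `x * y;` multiplies x by y
    left: str
    right: str


@dataclass
class Ambiguous:       # both readings, kept until names are resolved
    alternatives: list


def parse_statement(tokens):
    """Parse `X * Y` with no symbol table, keeping both readings."""
    x, star, y = tokens
    if star != "*":
        raise ValueError("expected a statement of the form X * Y")
    return Ambiguous([Decl(x, y), Mul(x, y)])


def resolve(node, type_names):
    """Post-parse name resolution: keep the reading that fits the types."""
    if not isinstance(node, Ambiguous):
        return node
    for alt in node.alternatives:
        is_decl = isinstance(alt, Decl)
        if is_decl and alt.type_name in type_names:
            return alt
        if not is_decl and alt.left not in type_names:
            return alt
    raise ValueError("no alternative is consistent with the symbol table")


tree = parse_statement(["x", "*", "y"])     # parsed before any types known
print(resolve(tree, type_names={"x"}))      # x turned out to be a type
print(resolve(tree, type_names=set()))      # x turned out to be a variable
```

The parse step never asks whether `x` is a type; that question is answered later by `resolve`, once the whole program has been read and the set of type names is complete.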

We use GLR parsers in our DMS Software Reengineering Toolkit, and it is happy to parse C and C++11 just fine. We do name resolution after parsing; this isolates parsing from name resolution, making for a much cleaner, easier-to-manage front end.

c parsing compiler-construction programming-languages
