Tuesday, 15 May 2012

parsing - Replace comment in JavaScript AST with subtree derived from the comment's content



I'm the author of doctest, quick and dirty doctests for JavaScript and CoffeeScript. I'd like to make the library less dirty by using a JavaScript parser rather than regular expressions to locate comments.

I'd like to use Esprima or Acorn to do the following:

1. Create an AST.
2. Walk the tree, and for each comment node:
   a. Create an AST from the comment node's text.
   b. Replace the comment node in the main tree with this subtree.

Input:

    !function() {

      // > toUsername("jesper nøhr")
      // "jespernhr"
      var toUsername = function(text) {
        return ('' + text).replace(/\W/g, '').toLowerCase()
      }
    }()

Output:

    !function() {
      doctest.input(function() {
        return toUsername("jesper nøhr")
      });
      doctest.output(4, function() {
        return "jespernhr"
      });
      var toUsername = function(text) {
        return ('' + text).replace(/\W/g, '').toLowerCase()
      }
    }()
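At the text level, the mapping from one comment to its replacement source might look like the sketch below. It assumes the convention visible in the example: a comment beginning with ">" is an input, any other comment is the expected output, and the first argument to doctest.output (4 in the example) is the comment's line number. The helper name commentToSource is hypothetical and is not part of doctest.

```javascript
// Hypothetical helper: turn one doctest comment into replacement source text.
// `text` is the comment body without the leading "//";
// `line` is assumed to be the comment's 1-based line number.
function commentToSource(text, line) {
  var body = text.trim();
  if (body.charAt(0) === '>') {
    // "> expr" marks an input expression.
    var expr = body.slice(1).trim();
    return 'doctest.input(function() { return ' + expr + ' });';
  }
  // Anything else is treated as the expected value of the preceding input.
  return 'doctest.output(' + line + ', function() { return ' + body + ' });';
}

console.log(commentToSource(' > toUsername("jesper nøhr")', 3));
console.log(commentToSource(' "jespernhr"', 4));
```

The generated text would still have to be parsed into a subtree and grafted into the main tree, which is the part the question is about.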

I don't know how to do this. Acorn provides a walker which takes a node type and a function, and walks the tree, invoking the function each time a node of the specified type is encountered. This seems promising, but it doesn't apply to comments.
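To make the idea of a type-dispatching walker concrete, here is a minimal sketch over ESTree-style nodes. It is not Acorn's actual API; the function walk and the hand-written tree are illustrative assumptions only.

```javascript
// Minimal type-dispatching walker over ESTree-style nodes (not Acorn's API).
// `visitors` maps a node type to a callback, e.g. { Literal: fn }.
function walk(node, visitors) {
  if (Array.isArray(node)) {
    node.forEach(function (child) { walk(child, visitors); });
    return;
  }
  if (!node || typeof node !== 'object') return;
  if (typeof node.type === 'string' &&
      Object.prototype.hasOwnProperty.call(visitors, node.type)) {
    visitors[node.type](node);
  }
  // Recurse into every property; non-node values are ignored above.
  Object.keys(node).forEach(function (key) {
    walk(node[key], visitors);
  });
}

// Hand-written tree for `f("a")`; a parser would normally produce this.
var tree = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: 'f' },
      arguments: [{ type: 'Literal', value: 'a' }]
    }
  }]
};

var seen = [];
walk(tree, {
  Identifier: function (n) { seen.push('Identifier ' + n.name); },
  Literal: function (n) { seen.push('Literal ' + n.value); }
});
console.log(seen);
```

A walker like this can only reach what is in the tree, so if the parser does not emit comment nodes, no visitor will ever be called for a comment.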

With Esprima I can use esprima.parse(input, {comment: true, loc: true}).comments to get the comments, but I'm not sure how to update the tree.
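One possible way to combine a separate comments array with the tree is to use line numbers to decide where each replacement belongs. The sketch below assumes every statement and comment carries loc.start.line and loc.end.line, and it handles only a single flat statement list; comments inside nested bodies would need the same treatment applied to the innermost block containing them. The names spliceAtComments and makeStatement are hypothetical, and the data is hand-written rather than real parser output.

```javascript
// Sketch: merge replacement statements into a statement list at the
// positions where comments appeared, using line numbers only.
function spliceAtComments(body, comments, makeStatement) {
  var result = [];
  var pending = comments.slice().sort(function (a, b) {
    return a.loc.start.line - b.loc.start.line;
  });
  body.forEach(function (stmt) {
    // Emit replacements for every comment that ends before this statement starts.
    while (pending.length && pending[0].loc.end.line < stmt.loc.start.line) {
      result.push(makeStatement(pending.shift()));
    }
    result.push(stmt);
  });
  // Comments after the last statement.
  pending.forEach(function (c) { result.push(makeStatement(c)); });
  return result;
}

// Simplified stand-ins for the function body and comments of the example.
var body = [
  { type: 'VariableDeclaration', loc: { start: { line: 5 }, end: { line: 7 } } }
];
var comments = [
  { type: 'Line', value: ' > toUsername("jesper nøhr")',
    loc: { start: { line: 3 }, end: { line: 3 } } },
  { type: 'Line', value: ' "jespernhr"',
    loc: { start: { line: 4 }, end: { line: 4 } } }
];

var merged = spliceAtComments(body, comments, function (comment) {
  // Placeholder node; a real version would parse the comment text instead.
  return { type: 'ExpressionStatement', fromComment: comment.value.trim() };
});
console.log(merged.map(function (n) { return n.type; }));
```

The input body is left unmodified; the function returns a new list that could then be assigned back to the enclosing block.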

Most AST-producing parsers throw away comments. I don't know what Esprima or Acorn do, but that might be the issue.

... In fact, Esprima lists comment capture as a current bug: http://code.google.com/p/esprima/issues/detail?id=197

... Acorn's code is right there on GitHub. It appears to throw comments away, too.

So, it looks like you will have to fix either parser to capture the comments first, at which point your task should be straightforward; otherwise, you're stuck.

Our DMS Software Reengineering Toolkit has JavaScript parsers that capture comments in the tree. It also has language substring parsers, which could be used to parse the comment text into JavaScript ASTs of whatever type the comment represents (e.g., function declaration, expression, variable declaration, ...), and the support machinery to graft such new ASTs into the main tree. If you are going to manipulate ASTs, this substring capability is likely to be important: most parsers won't parse arbitrary language fragments; they are wired to parse only "whole programs". In DMS, there are no comment nodes to replace; comments are associated with AST nodes, so the grafting process is a little trickier than just "replace comment nodes". It is still pretty easy.

I'll observe that most parsers (including these) read the source and break it into tokens by applying regular expressions or their equivalent. So, if you are already using regular expressions to locate comments (which means also using them to locate *non*comments to throw away, e.g., you need to recognize string literals that contain comment-like text and ignore them), you are doing as well as the parsers would do anyway in terms of finding the comments. And if all you want is to replace the comments with their content, then echoing the source stream with the comment prefix/suffix (/* */) stripped is apparently what you want, and all the parsing machinery seems like overkill.
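As a rough illustration of that point, a hand-written scanner can find line comments while ignoring comment-like text inside string literals. This is a simplified sketch, not what Esprima or Acorn do internally: it handles only // comments and quoted strings, and it makes no attempt to handle /* */ comments, regular expression literals, template literals, or line continuations inside strings. The function name findLineComments and the sample source are illustrative.

```javascript
// Sketch of a scanner that finds `//` comments while skipping string literals.
function findLineComments(source) {
  var comments = [];
  var i = 0;
  var line = 1;
  while (i < source.length) {
    var ch = source.charAt(i);
    if (ch === '\n') {
      line++;
      i++;
    } else if (ch === '"' || ch === "'") {
      // Skip a string literal, honouring backslash escapes.
      i++;
      while (i < source.length && source.charAt(i) !== ch) {
        if (source.charAt(i) === '\\') i++;
        i++;
      }
      i++; // closing quote
    } else if (ch === '/' && source.charAt(i + 1) === '/') {
      // Line comment: runs to the end of the line.
      var end = source.indexOf('\n', i);
      if (end === -1) end = source.length;
      comments.push({ line: line, text: source.slice(i + 2, end) });
      i = end;
    } else {
      i++;
    }
  }
  return comments;
}

var src = [
  'var url = "http://example.com"; // real comment',
  '// > f(1)',
  "var s = '// not a comment';"
].join('\n');
console.log(findLineComments(src));
```

Only the two real comments are reported here; the // sequences inside the string literals on the first and third lines are skipped.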

javascript parsing abstract-syntax-tree
