ECMAScript 5 parser, by Peter van der Zee

by Peter van der Zee, September 2010 (Please note: this project is not a usable parser, but it does produce the _complete_ parse tree for a valid ES5 script. This was my first attempt at a JS parser and I didn't really know how to get started. For my better and faster parser attempts, check ZeParser, ZeParser2 and finally Tenko.)

CFG

rewrite Optional parts (safe)
rewrite epsilon rules (safe)
rewrite recursion (safe)
rewrite Any parts (safe)
rewrite unit rules (unsafe!)
rewrite non-productive rules
rewrite single rules
rewrite non-reachable rules
remove duplicate rules (safe)
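
Two of the rewrites listed above, removing duplicate rules and removing non-reachable rules, can be sketched roughly as follows. This is only an illustration, not the code this page runs: the grammar shape (an object mapping each rule name to an array of productions) and the function names `removeDuplicateProductions` and `removeUnreachableRules` are assumptions made for the example.

```javascript
// Hypothetical grammar shape (an assumption, not this project's format):
//   { RuleName: [ [symbol, symbol, ...], ... ] }
// A symbol that is a key of the grammar is a non-terminal; anything else is a terminal.

// Drop productions that occur more than once within the same rule.
function removeDuplicateProductions(grammar) {
  const result = {};
  for (const name of Object.keys(grammar)) {
    const seen = new Set();
    result[name] = grammar[name].filter((production) => {
      const key = JSON.stringify(production);
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    });
  }
  return result;
}

// Keep only the rules that can be reached from the start rule.
function removeUnreachableRules(grammar, start) {
  const reachable = new Set();
  const pending = [start];
  while (pending.length > 0) {
    const name = pending.pop();
    const isRule = Object.prototype.hasOwnProperty.call(grammar, name);
    if (!isRule || reachable.has(name)) continue;
    reachable.add(name);
    for (const production of grammar[name]) {
      for (const symbol of production) pending.push(symbol);
    }
  }
  const result = {};
  for (const name of Object.keys(grammar)) {
    if (reachable.has(name)) result[name] = grammar[name];
  }
  return result;
}
```

Neither rewrite changes the language the grammar accepts: a duplicate production adds no new derivation, and a rule that the start rule can never reach cannot take part in any parse.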

Parser

output parser debug trace (LOT of output!)
enable unit tests (slows down parsing)
enable pruning (speeds up parsing)
directive scanner (post parsing)
fix parse tree (fix ranges wrt comments and whitespace)
remove artifacts (anon, recursive and empty nodes)

Pruning

reduce parse tree (removes subtrees of "tokens")
flatten parse tree (replace nodes with single child by that child)
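
The two pruning options above might look something like the sketch below. The node shape (`name` plus either `children` or a leaf `value`) and the function names are assumptions for illustration, and `reduceTokens` reflects one reading of "removes subtrees of tokens": a token node is collapsed into a single leaf holding its source text.

```javascript
// Hypothetical node shape (an assumption, not this project's format):
//   inner node: { name: "Statement", children: [ ... ] }
//   leaf node:  { name: "char", value: "a" }

// Concatenate the source text covered by a node.
function textOf(node) {
  if (!node.children) return node.value;
  return node.children.map(textOf).join("");
}

// "Reduce": collapse the subtree below every token node into one leaf.
function reduceTokens(node, tokenNames) {
  if (!node.children) return node;
  if (tokenNames.has(node.name)) {
    return { name: node.name, value: textOf(node) };
  }
  return {
    ...node,
    children: node.children.map((child) => reduceTokens(child, tokenNames)),
  };
}

// "Flatten": replace every node that has exactly one child by that child.
function flatten(node) {
  if (!node.children) return node;
  const children = node.children.map(flatten);
  if (children.length === 1) return children[0];
  return { ...node, children };
}
```

Flattening matters most for expressions, where a grammar taken straight from the spec wraps even a lone identifier in a long chain of single-child nodes.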

Whitespace

don't clean
clean (remove whitespace and comment)
clean plus (also remove line terminators)
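
As a rough sketch, the three whitespace modes above amount to a filter over the token list. The token shape and the type names `"whitespace"`, `"comment"` and `"lineterminator"`, as well as the mode names, are assumptions for this example, not the identifiers used by this project.

```javascript
// Hypothetical token shape: { type: "whitespace", value: " " }.
// mode: "none"  = don't clean
//       "clean" = remove whitespace and comments
//       "plus"  = also remove line terminators
function cleanTokens(tokens, mode) {
  if (mode === "none") return tokens.slice();
  const dropped = new Set(["whitespace", "comment"]);
  if (mode === "plus") dropped.add("lineterminator");
  return tokens.filter((token) => !dropped.has(token.type));
}
```

Line terminators are kept apart from other whitespace because they take part in automatic semicolon insertion, so dropping them from the source itself could change how a script parses.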

Misc

generate copy-paste html-escaped output
add group parentheses (experimental, surrounds any node with parens)
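
The html-escaped output option boils down to replacing the characters that have a special meaning in HTML. A minimal sketch, assuming a helper named `escapeHtml` (a hypothetical name, not necessarily what this page uses):

```javascript
// Escape text so it can be pasted into an HTML document verbatim.
// The ampersand must be replaced first; otherwise the entities
// produced by the later replacements would be escaped a second time.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}
```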

This is a parser for ES5, the current version of ECMAScript (the governing language of JavaScript and JScript).

This project is not a usable parser, but it does produce the _complete_ parse tree. For my faster parsers, check ZeParser and ZeParser2.

Simply enter your source in the textarea and click parse. If a certain snippet doesn't parse and you are SURE it should according to the ES5 specification, please let me know!

You can also run the test suite (takes a few seconds) or a fuzzy test (original by Jesse Ruderman, blatantly stripped by me, so it no longer always spits out proper JS).

2010-09-28 Update: fixed parsing an empty script (caused by a bug). Fixing that allowed me to use the exact CFG from the spec with just one minor edit now :D Also fixed trailing whitespace/comments not being processed (they're normally ignored by a parser).

2010-12-30 Update: fixed a minor parse bug (forgot what it was :). Added simple variable inspector. Checks var declarations, shadowed vars and whether used vars are in fact declared. Sort of a work in progress. Modified code to use with my pragma pre-processor. Created own (simpler) fuzzy test suite. Made unit tests optional. Added pruning. Added simple (regex, non-parsing) syntax highlighter. Added parse tree cleanup to remove parser artifacts. Flat token tree now adds returns after each statement (want to do this for visual model as well).

2011-01-05 Update: fixed error output to no longer duplicate the string and to be html escaped. Editor is hidable. Entire settings are "exportable" in the form of a hash, which can be shared with others (settings that are missing are unchanged), also easy for my own debugging ;). Some code refactoring and cleanup. Moved highlighting css into the appropriate js files. Added an output to html option. Added small tool to do a HEREDOC kind of string-to-js-string conversion, might remove that later. Added cleaned up version of the simple highlighter.
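
The "settings that are missing are unchanged" behaviour of the exported hash could work roughly as in the sketch below. The `#key=value&key=value` format and the function name `applySettingsHash` are assumptions for illustration; the actual hash format of this page is not documented here.

```javascript
// Merge settings from a shared hash into the current settings.
// Hypothetical hash format: "#key=value&key=value" (an assumption).
// Settings that do not appear in the hash keep their current value.
function applySettingsHash(current, hash) {
  const result = { ...current };
  const body = hash.replace(/^#/, "");
  if (body === "") return result;
  for (const pair of body.split("&")) {
    const [rawKey, rawValue = ""] = pair.split("=");
    const key = decodeURIComponent(rawKey);
    // Ignore keys that are not known settings.
    if (!Object.prototype.hasOwnProperty.call(result, key)) continue;
    result[key] = decodeURIComponent(rawValue);
  }
  return result;
}
```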

??: fixed loop buttons for the fuzzy tests. Added option to break down visual parse tree model on source elements (which is probably what you want).

Planned: strict mode, variable inspector improvements, property inspector, type inferring, built in option for using a post-pragma-processed version of the script, ~~parse by url-argument,~~ various code cleanups, getting rid of recursion limits.

(This is a finalized project :)