PHP 5.6 grammar with C# (Sharwell runtime) and Java runtime by Ivan Kochurkin (kvanttt@gmail.com), Positive Technologies. PHP keywords are case-insensitive, but the tokens in the grammar are written in lower case, so a CaseInsensitiveInputStream should be used. C# or Java code actions are used for context-sensitive features such as Heredoc.
The parser grammar is based on the Phalanger grammar by Jakub Míšek (jakubmisek). The HTML mode is based on the ANTLR HTML grammar by Tom Everett (@teverett).
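The case-insensitivity requirement above can be illustrated with a small, self-contained sketch. It is not the CaseInsensitiveInputStream shipped with the grammar and does not use the ANTLR API; the class name `CaseInsensitiveChars` and its methods are hypothetical. It only shows the usual idea: the lexer's lookahead sees lower-cased characters, while token text keeps the original casing.

```java
/** Hypothetical stand-in for a case-insensitive character stream (not the ANTLR API). */
final class CaseInsensitiveChars {
    static final int EOF = -1;

    private final char[] original; // characters exactly as written in the source
    private final char[] lowered;  // lower-cased copy used only for lookahead
    private int pos = 0;

    CaseInsensitiveChars(String input) {
        original = input.toCharArray();
        lowered = new char[original.length];
        for (int i = 0; i < original.length; i++) {
            lowered[i] = Character.toLowerCase(original[i]);
        }
    }

    /** Returns the i-th character ahead (1-based), lower-cased, or EOF past the end. */
    int la(int i) {
        int idx = pos + i - 1;
        return (i >= 1 && idx < lowered.length) ? lowered[idx] : EOF;
    }

    /** Advances by one character, stopping at the end of input. */
    void consume() {
        if (pos < original.length) {
            pos++;
        }
    }

    /** Returns the text in the inclusive range [start, stop] with its original casing. */
    String getText(int start, int stop) {
        return new String(original, start, stop - start + 1);
    }
}
```

With this arrangement a lower-case token rule such as `echo` would match `ECHO`, `Echo`, or `echo`, while the text reported for the token stays as the programmer wrote it.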
Supported features:
- Different modes (because PHP is an island grammar):
  - HTML
  - Script
  - CSS
  - PHP
  - Heredoc
- Alternative syntax.
- Heredoc.
- Interpolation strings (not fully completed, see TODO).
- Deep expressions handling (such as very long concatenation).
- aspTags.
- Improved lexer error processing with artificial string fragments (for example, a doubled closing quote at the end: <div attr='value'' />).
The PHP parser has been successfully tested (parsing without errors) on the following projects.
This parser has also been tested on a large number of PHP files from different CMSs (~70000 files). The run took approximately 1 hour and 15 minutes, with about 70% of the time spent in the lexer and 30% in the parser.