-
Notifications
You must be signed in to change notification settings - Fork 744
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
some updates to the PHP lexer #1397
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the submission! Some comments within.
@julp Sorry for the delay in getting back to you. I've resolved the completed issues and left comments on the ones that remain outstanding. Thanks for all your work on this! |
No problem ;) |
@julp Sorry again for the delay. Added two comments! |
+ `<?php` + keywords + function/method names * updates to the language: + support `_` in (binary, decimal, hexadecimal, ...) numbers (7.4.0) + support for binary numbers (`0b...`) (5.4.0) + Unicode codepoints escape syntax (`\u{...}`) (7.0.0) + add some missing keywords: * `fn` keyword (7.4.0) * type declarations (PHP 7, including `void` and nullable types from PHP 7.1) * 7.0.0: `class` (anonymous classes), `yield from` * 5.4.0: `callable`, `insteadof`, `trait`, `__TRAIT__` * 5.3.0: `goto`, `__NAMESPACE__`, `__DIR__` * others: casts, `instanceof`, `__CLASS__`, `__FUNCTION__`, `__METHOD__`, `__halt_compiler` * `self` even if it's not really a reserved word
we already had a rule to handle their lower case form, just make it case insensitive.
* remove E_* and PHP_* constants: these are not keywords and are case sensitive (keywords are not) * `not` doesn't exist * neither `this` * `empty` should be treated in the same way as `isset` and `unset` (currently functions) * `virtual` is a function, not a keyword * `php_user_filter` is a class, not a keyword
Don't worry. Huge thanks to you for taking care of rouge and the guidance/help! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these!
Would you be able to add something to the visual sample so that we can see the effects of some of the changes? I'm thinking especially of things like the number rules. Something also using yield
would also be good, I think.
@julp I realise this is still in 'draft' but I'm ready to merge this if there's nothing you have left to add :) |
I messed up, I didn't see your commit. Anyway, I think It's good now except if you see something else ;) Next PR: fixing the |
@julp This is the commit message I want to use:
Could you correct anything you think is wrong? |
I think people will misunderstand that point, it will be more obvious written as
To be precise: There are few removals from keywords (I just write them here if needed as a reminder):
I think the list is exhaustive and well summarized 👍 EDIT:
Partial typehinting, limited to (a few) builtin types. I don't think we can do better, handling nullable instances ( And before merging, let me check if |
typehinting should be placed before the one whith `?` and which recognizes, among others, operators like `?`, `??`, `??=`, ... In clear, `?int` is actually tokenized as: `['Operator', '?'], ['Keyword.Type', 'int']` Instead of: `['Keyword.Type', '?int']`
@julp Thanks for all your help on this one! I think there are a lot of good fixes in there. The next version of Rouge will come out on Tuesday 14 April and this will be part of that release 🎉 |
…y#1397) This commit adds a number of fixes to the PHP lexer to both fix outstanding issues and to add support for features introduced in versions of PHP up to 7.4.0. In particular, this commit: - makes the following terms match case-insensitively: `<?php`, `as`, `use`, the strings in `@keywords` and the strings in `@builtins`; - fixes an issue with heredoc syntax where `"` around the starting label would lex incorrectly; - adds support for: - binary numbers; - use of `_` as a separator in numbers; - `yield from` keyword; - Unicode codepoint escape syntax; and - partial type hinting; - adds the following words to `@keywords`: `__CLASS__`, `__DIR__`, `__FUNCTION__`, `__halt_compiler`, `__METHOD__`, `__NAMESPACE__`, `__TRAIT__`, `callable`, `class`, `fn`, `goto`, `instanceof`, `insteadof`, `self`, `trait`; - removes the following words from `@keywords`: `__sleep`, `__wakeup`, `empty`, `php_user_filter`, `stdClass`, `this`, `virtual`; and - simplifies rules for `E_*` and `PHP_*` constants.
Hi,
I would like to submit some updates and improvements to the PHP lexer:
<?php
as
inuse
list@keywords
)@builtins
)"
around the starting label was not recognized as is_
in (binary, decimal, hexadecimal, ...) numbers (7.4.0)0b...
) (5.4.0)\u{...}
) (7.0.0)fn
void
and nullable types from PHP 7.1)class
(anonymous classes),yield from
callable
,insteadof
,trait
,__TRAIT__
goto
,__NAMESPACE__
,__DIR__
instanceof
,__CLASS__
,__FUNCTION__
,__METHOD__
,__halt_compiler
self
even if it's not really a reserved wordI haven't deleted them but I am in favor of the removal of the current builtin/predefined constants (E_* and PHP_*) and class
stdClass
from@keywords
:Let me know if you agree and if I missed or forgot something.
Also, is there a reason for
TRUE
/FALSE
/NULL
to be in@keywords
? I mean shouldn'trule %r/(true|false|null)\b/, Keyword::Constant
be made case insensitive instead?Thanks.