Open
Description
We need a way to allow comments in a grammar. It seems somewhat standard to use `;` to indicate the start of a comment and `\n` to close it. I think it'd be a good addition to make that work.
A couple of initial questions:
- Do we look for `;` all along the way?
- Do we make a "first pass" that strips all comments before parsing?
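To make the "first pass" option concrete, here is a minimal sketch of a pre-pass that drops everything from `;` to the end of each line. The function name `strip_comments` is hypothetical and not part of the crate. It assumes that a `;` inside a single- or double-quoted terminal is literal text rather than the start of a comment.

```rust
/// Hypothetical pre-pass: remove `;` comments from a grammar before parsing.
/// Assumption: a `;` inside a quoted terminal is literal text, not a comment.
fn strip_comments(grammar: &str) -> String {
    let mut out = String::with_capacity(grammar.len());
    for line in grammar.lines() {
        // The quote character we are currently inside, if any.
        let mut quote: Option<char> = None;
        // Byte index where the comment starts (defaults to end of line).
        let mut end = line.len();
        for (i, ch) in line.char_indices() {
            match (quote, ch) {
                // Closing quote matches the one that opened the terminal.
                (Some(q), c) if c == q => quote = None,
                // Anything else inside a terminal is kept verbatim.
                (Some(_), _) => {}
                // Opening quote of a terminal.
                (None, '"') | (None, '\'') => quote = Some(ch),
                // First `;` outside a terminal starts the comment.
                (None, ';') => {
                    end = i;
                    break;
                }
                (None, _) => {}
            }
        }
        out.push_str(line[..end].trim_end());
        out.push('\n');
    }
    out
}
```

A line that contains only whitespace and a comment comes out as an empty line, so the line numbers in later parse errors still match the original input.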
Activity
Carlyle-Foster commented on Jan 1, 2025
i don't think we'd have to look for `;` all the time, unless there's some de facto standard for closing comments within a line. comments can never appear as interjections, so we'd only have to check
here <a> ::= "b" here <c> here | <d> and here
Carlyle-Foster commented on Jan 1, 2025
now that i think of it, we should be able to confine comment parsing entirely within whitespace parsing. can you think of any exceptions?
Carlyle-Foster commented on Jan 2, 2025
wait a second, that's wrong. it's actually just
here <a> ::= "b" <c> | <d> and here
that have to be checked
Carlyle-Foster commented on Jan 5, 2025
i'm not sure if this should be closed yet. i found this out in the wild. it's VERY old but its use of comments really makes sense: if newlines are allowable WS then any whitespace could have comments inside
Carlyle-Foster commented on Jan 5, 2025
i might just have to bite the bullet on this one and replace all the simple WS parsing with a custom function that parses WS while transparently skipping comments
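A rough sketch of what such a function could look like, using only the standard library. The name `skip_ws_and_comments` is hypothetical and this is not the crate's actual parser code. It assumes newlines count as whitespace and that a comment runs from `;` to the next newline or the end of input.

```rust
/// Hypothetical helper: skip whitespace and `;` comments at the start of
/// `input` and return the remaining text.
/// Assumption: newlines are whitespace, and a comment ends at `\n` or EOF.
fn skip_ws_and_comments(input: &str) -> &str {
    let mut rest = input;
    loop {
        // Drop leading spaces, tabs, and newlines.
        rest = rest.trim_start();
        match rest.strip_prefix(';') {
            // A comment: discard it up to (not including) the newline,
            // then loop so the newline and any following comment are skipped.
            Some(comment) => match comment.find('\n') {
                Some(newline) => rest = &comment[newline..],
                // The comment runs to the end of input.
                None => return "",
            },
            // Not a comment: this is the next real token.
            None => return rest,
        }
    }
}
```

Calling this wherever plain whitespace is skipped today would let comments appear in any whitespace position, including blank lines between rules.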
shnewto commented on Jan 5, 2025
one note here is that I don't think comments are actually allowed anywhere except at the end of a rule so rather than
here <a> ::= "b" <c> | <d> and here
it's just
<a> ::= "b" <c> | <d> here
From section 2.8 of the spec linked above:
shnewto commented on Jan 5, 2025
which effectively means that anything that follows a `;` until a newline can be treated as a comment, and we can expect that there can be lines that are only whitespace + comments
shnewto commented on Jan 5, 2025
also 🤔 I think we already handle `;` to terminate lines, so it might just be that we want to eat anything that follows a `;` until the newline rather than allowing `;` as a delimiter, which I think is how it's behaving currently.
Carlyle-Foster commented on Jan 5, 2025
`;` isn't the main delimiter, we do normally just consume until we hit a newline. `;` is only a delimiter because for some reason i thought having two comments on the same line should be rejected, i think because i interpreted that as trying to close the comment before the newline
in retrospect, that doesn't make much sense
shnewto commented on Jan 5, 2025
I think we're talking about the same thing but just in case, here's the current behavior for bnf grammars
is equivalent to
i.e.
`;` is (incorrectly) just acting as an alternative to `\n` as a delimiter and they both result in the same object after parsing.
but if we want to handle comments correctly, I believe we want
to be equivalent to
Carlyle-Foster commented on Jan 5, 2025
the problem is currently all WS could include newlines so all WS has to handle comments. take this example from the RFC u quoted for instance, it idiomatically puts a NL in the whitespace before a `/`
Carlyle-Foster commented on Jan 5, 2025
oh, that's what ur talking about, yeah i thought that was a little weird, i've never seen that used out in the wild for that matter, not sure why anyone would want 2 production rules on 1 line