Revert "Permit more control characters in comments (toml-lang#924)"
This reverts commit ab74958.

I'm a simple guy. Someone reports a problem, I drink coffee and fix it. No one reports a problem? There is nothing to fix and I go drink beer.

No one really reported this as a problem, but it *does* introduce needless churn for all TOML implementations and the test suite. Do we need to forbid *anything* in comments? Probably not, and in strings we probably only need to forbid \x00. But at least before it was consistent with strings, and more importantly, what everyone wrote code for, which is tested, and already works.

[None of the hypotheticals](toml-lang#567 (comment)) on why this is "needed" are practical issues people reported, and most aren't even fixed: a comment can still invalidate the file, you must still parse each character in a comment as some are still forbidden, the performance benefits are so close to zero that they might as well be zero, and you still can't "dump whatever you like" in comments.

So it doesn't *actually* change anything, it just changes "disallow this set of control characters" to ... "disallow this set of control characters" (but for a different set). That's not really a substantial or meaningful change. The only (minor) real-world issue that was reported (from the person doing the Java implementation) was that "it's substantially more complicated to parse out control characters in comments and raise an error, and this kind of strictness provides no real advantage to users". And that's not addressed at all with this, so...
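To make the "different set" point concrete, here is a minimal Python sketch that transcribes the two grammar rules quoted in the `toml.abnf` diff further down and lists which ASCII code points each one rejects. The function names (`is_non_ascii`, `allowed_comment_char`, `non_eol`, `rejected_ascii`, `first_bad_char`) are illustrative only and are not taken from any TOML implementation.

```python
from typing import Callable, List, Optional, Tuple


def is_non_ascii(cp: int) -> bool:
    # non-ascii = %x80-D7FF / %xE000-10FFFF
    return 0x80 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


def allowed_comment_char(cp: int) -> bool:
    # Rule removed by this revert:
    # allowed-comment-char = %x01-09 / %x0E-7F / non-ascii
    return 0x01 <= cp <= 0x09 or 0x0E <= cp <= 0x7F or is_non_ascii(cp)


def non_eol(cp: int) -> bool:
    # Rule restored by this revert:
    # non-eol = %x09 / %x20-7E / non-ascii
    return cp == 0x09 or 0x20 <= cp <= 0x7E or is_non_ascii(cp)


def rejected_ascii(rule: Callable[[int], bool]) -> List[int]:
    # ASCII code points that the given rule does not accept inside a comment.
    return [cp for cp in range(0x80) if not rule(cp)]


def first_bad_char(comment: str, rule: Callable[[int], bool]) -> Optional[Tuple[int, int]]:
    # Either rule forces a scan of every character in the comment;
    # only the set of rejected code points differs.
    for index, ch in enumerate(comment):
        if not rule(ord(ch)):
            return index, ord(ch)
    return None


if __name__ == "__main__":
    for name, rule in (("allowed-comment-char", allowed_comment_char),
                       ("non-eol", non_eol)):
        codes = " ".join(f"U+{cp:04X}" for cp in rejected_ascii(rule))
        print(f"{name} rejects: {codes}")
```

Either way a parser still has to inspect each character of a comment; the two rules differ only in the range checks. Note that U+000A and U+000D appear in both rejected sets simply because they end the comment as part of a newline.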

---

And while I'm at it, let me have a complaint about how this was merged:

1. Two people, both of whom actually maintain implementations, say they don't like this change.
2. This is basically ignored.
3. Three people continue to write a fairly large number of large comments, so anyone who wasn't already interested in this change unsubscribes and/or goes 🤷
4. "Consensus".

Sometimes I feel TOML attracts people who like to argue things from a mile-high ivory tower with abstract arguments that have only passing familiarity with any actual pragmatic reality.

Fixes toml-lang#995
arp242 committed Oct 1, 2023
1 parent 23c3fb7 commit 8132f34
Showing 3 changed files with 5 additions and 6 deletions.
1 change: 0 additions & 1 deletion CHANGELOG.md
@@ -3,7 +3,6 @@
## unreleased

- Clarify Unicode and UTF-8 references.
-- Relax comment parsing; most control characters are again permitted.
- Allow newline after key/values in inline tables.
- Allow trailing comma in inline tables.
- Clarify where and how dotted keys define tables.
5 changes: 3 additions & 2 deletions toml.abnf
@@ -37,10 +37,11 @@ newline =/ %x0D.0A ; CRLF

;; Comment

-comment = comment-start-symbol *allowed-comment-char
 comment-start-symbol = %x23 ; #
-allowed-comment-char = %x01-09 / %x0E-7F / non-ascii
 non-ascii = %x80-D7FF / %xE000-10FFFF
+non-eol = %x09 / %x20-7E / non-ascii
+
+comment = comment-start-symbol *non-eol

;; Key-Value pairs

5 changes: 2 additions & 3 deletions toml.md
@@ -57,9 +57,8 @@ key = "value" # This is a comment at the end of a line
another = "# This is not a comment"
```

-Comments may contain any Unicode code points except the following control codes
-that could cause problems during editing or processing: U+0000, and U+000A to
-U+000D.
+Control characters other than tab (U+0000 to U+0008, U+000A to U+001F, U+007F)
+are not permitted in comments.

## Key/Value Pair
