Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Golang syntax highlighting ignores newlines #1097

Closed
calini opened this issue Mar 16, 2019 · 9 comments · Fixed by #1122
Closed

Golang syntax highlighting ignores newlines #1097

calini opened this issue Mar 16, 2019 · 9 comments · Fixed by #1122

Comments

@calini
Copy link

calini commented Mar 16, 2019

Hello! I wanted to post some Go snippets on my Jekyll powered website, but I see the syntax highlighter just ignores the newlines and renders the code on one line.
This does not reproduce for other programming languages, or when not mentioning the programming language after the opening ~~~.
(Fun fact, Go is rendered in a decent fashion if marked as C)

@calini calini changed the title syntax highlighting newline Golang syntax highlighting ignores newline Mar 16, 2019
@calini calini changed the title Golang syntax highlighting ignores newline Golang syntax highlighting ignores newlines Mar 16, 2019
@pyrmont
Copy link
Contributor

pyrmont commented May 18, 2019

@calini Our sample file is rendering without a problem and I can't reproduce this problem. Do you have more details?

@calini
Copy link
Author

calini commented May 19, 2019

Not really, since I have not tinkered with the configuration, especially nothing related to code highlighting/rendering. I can provide links to places where it shows up: link on my website, link to .md source

@pyrmont
Copy link
Contributor

pyrmont commented May 19, 2019

Thanks! That's helpful. Will have an investigate!

@pyrmont
Copy link
Contributor

pyrmont commented May 19, 2019

Hmmm, it looks fine in this snippet. And it worked fine when I used rougify.

I noticed that the symbol used for the code fences differs between Go and the other languages. If you use ~ or ` for all three are the results the same?

@pyrmont
Copy link
Contributor

pyrmont commented May 20, 2019

Hmmm. I cloned the Git repo and the problem seems to be the way in which the HTML is being output.

The default rendering puts all tokens in their own <span> tags. With Go code, the newlines in the code block are also wrapped in these tags and that's causing them not to display. If you look at the HTML for the Java and C code, you can see the newlines there are outside the <span> tags.

Not sure how to fix it yet but I think we're getting closer.

@pyrmont
Copy link
Contributor

pyrmont commented May 20, 2019

It just occurred to me that newlines are significant in Go in a way they aren't in Java and C. The fact the lexer is matching these newlines explicitly is probably what's causing the difference.

@pyrmont
Copy link
Contributor

pyrmont commented May 20, 2019

rule(WHITE_SPACE, Other)

The 'problem' is here. Other lexers tokenise whitespace as Text whereas the Go lexer uses Other.

This becomes an issue when it's being rendered as Text tokens are not wrapped in tags at all. This is important because of this CSS rule.

I'm not sure which is correct; arguably a newline at the end of a statement isn't really 'text'. Plus the problem is in many ways the CSS rule; you can 'fix' the rendering problem by just changing how it works. What do you think?

@calini
Copy link
Author

calini commented May 20, 2019

I could just fix it in my fork but that means whoever is pulling a theme that uses Rouge will deal with the same thing. Should I fix it and do a pull request on it?

pyrmont added a commit to pyrmont/rouge that referenced this issue May 20, 2019
As discussed in rouge-ruby#1097, the Go lexer tokenises whitespace using the
`Other` token. This causes issues when rendering in HTML the code parsed
by Rouge because:

1. the use of `Other` will result in the whitespace being wrapped in
   separate HTML tags, which in turn causes
2. all newlines to be collapsed (due to CSS rules that expect newlines
   not to be wrapped in seperate HTML tags).

The `Other` token's intended use is described as follows:

> Token for data not matched by a parser (e.g. HTML markup in PHP code)

This is not a correct characterisation of whitespace in Go and is not
consistent with other lexers. This commit changes the token to `Text`,
the token generally used by other lexers. It fixes rouge-ruby#1097.
@pyrmont
Copy link
Contributor

pyrmont commented May 20, 2019

I had a look at the description of the tokens in the wiki and Other is described like this:

Token for data not matched by a parser (e.g. HTML markup in PHP code)

Pygments has the same description.

To my mind, that's not an accurate description of whitespace in Go. That makes me feel confident that changing the token to Text is the proper approach (there is actually a Text::Whitespace token but I think at this stage it's enough to have the Go lexer be consistent with other lexers).

I'll update and submit a PR.

pyrmont added a commit that referenced this issue May 27, 2019
As discussed in #1097, the Go lexer tokenises whitespace using the
`Other` token. This causes issues when rendering in HTML the code parsed
by Rouge because the use of `Other` will result in the whitespace being
wrapped in separate HTML tags, which in turn causes all newlines are
collapsed (due to CSS rules that expect newlines not to be wrapped in
seperate HTML tags).

The `Other` token's intended use is described as follows:

> Token for data not matched by a parser (e.g. HTML markup in PHP code)

This is not a correct characterisation of whitespace in Go and is not
consistent with other lexers. This commit changes the token to `Text`,
the token generally used by other lexers. It fixes #1097.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants