Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the matching of $'...' strings. #776

Merged
merged 4 commits into from
Oct 20, 2023

Conversation

danfuzz
Copy link
Contributor

@danfuzz danfuzz commented Oct 17, 2023

Requirements for Contributing a Bug Fix

Please see "Verification Process" below for caveat on this PR.

Identify the Bug

#775

Description of the Change

[Update: Replaced with new description]

Added rules to highlights.scm:

  • to account for ansi_c_string (which is what $'...' is categorized as).
  • to mark the punctuation of raw_string (which is what '...' is categorized as).

Alternate Designs

[UPDATE: Replaced with new description]

It seems like there are several possible ways to mark the punctuation, but the overall shape of the code is going to be about the same.

Possible Drawbacks

None, pretty sure.

Verification Process

[Update: Now verified!]

NOT ACTUALLY VERIFIED. Very sorry.

As I write this, I am attempting to get a Pulsar build environment set up (on macOS), and it is not exactly straightforward. I figured it'd be most expedient to get this simple fix in the pipeline sooner rather than later. Apologies in advance if you would have preferred me to 100% validate.

Release Notes

Fixed scoping/highlighting of single-quoted ('...') and C-style ($'...') strings in shell scripts.

@confused-Techie
Copy link
Member

Thanks a ton for contributing!

Love to see this one put together! I'm going ahead right now to give this one a test, and make sure it's resolve the issue you've described, so fingers crossed!

@confused-Techie
Copy link
Member

Alright, so I've gone ahead and tested the Windows binary built via your PR. As well as I'm using the example of the broken text from here:

echo $'hello'

With that said, unfortunately I don't think this fixes it. Here's my results:

TextMate (Should be working, not effected by this PR)

image

Legacy Tree-Sitter This is what this PR attempts to fix

image

Modern Tree-Sitter (Not effected by this PR)

image


So we can see the only grammar where this does work properly is TextMate. So it seems that there must be something broken in the upstream tree-sitter parser of this grammar. Which I assume is also what's being used to generate the WASM files of Modern Tree-Sitter. Which unfortunately it seems the upstream parser for both Legacy Tree-Sitter and Modern Tree-Sitter tree-sitter-bash has no open issues that reflect this behavior.

Lastly I did check for any errors shown in the console and I'm seeing none. So it's prudent to ask @savetheclocktower any ideas here? Could this be in our implementation? Or is more than likely it's an error upstream?

@confused-Techie
Copy link
Member

Although what is interesting, is according to the original error report, it made sense that this would be the fix, and using tree-sitter-tools I can confirm that this item is still being correctly identified:

image

But I can also confirm no scopes at all are being applied:

image

Although beyond this is really where my TreeSitter knowledge ends, I hate to say

@danfuzz
Copy link
Contributor Author

danfuzz commented Oct 18, 2023

@confused-Techie Just to be clear, in the bug report I noted the output of "Log Cursor Syntax Tree Scope" and not "Log Cursor Scope." Here's both of those, for me (in 1.110.0; I still haven't gotten my own build working, alas):

scopes

BTW I didn't know about tree-sitter-tools before; thanks for (implicitly) pointing out its existence.

@danfuzz
Copy link
Contributor Author

danfuzz commented Oct 18, 2023

Ok, having finally gotten a local build working (see https://github.com/orgs/pulsar-edit/discussions/225), I can now sadly confirm that this PR doesn't actually seem to do anything to address the problem. :(

@danfuzz
Copy link
Contributor Author

danfuzz commented Oct 18, 2023

BTW, I hadn't realized I was only (ostensibly) addressing the non-modern tree-sitter implementation with my original commit: I searched the codebase for something that looked anything like ansi_c_string and only found that one location.

In any case, I just added a new commit to this PR that really-actually-i-tested-it-even gets $'...' strings to get a string-ish scope in the modern tree-sitter implementation.

However, I have a question: In the "STRINGS" section of highlights.scm, there are a couple of lines that seem to get the open and close quote of a double-quoted string to get marked as punctuation. I made a few attempts to duplicate that pattern for ansi_c_string, and everything I tried caused the highlighting to totally fail (like, no scopes at all for anything), but even so I couldn't spot any related-looking spew to the console. Do you have a suggestion as to what's going on or better yet how to actually write clauses that would work in this case?

Incidentally, though double-quoted strings get punctuation scopes, it doesn't look like single-quoted strings do. I also tried adding extra rules for that case and was met with a similar completely-scopeless fate.

One final note: It took me a while to figure out that neither yarn install nor yarn build were actually picking up the changes I was making to highlights.scm; I had to copy the file manually into the build output (node_modules/...) in order to test. Unclear to me if this is a deficiency just with my setup, or just on macOS, or generally with the build.

@danfuzz
Copy link
Contributor Author

danfuzz commented Oct 18, 2023

One mystery solved: The version of tree-sitter-bash that the non-modern tree-sitter uses itself has the same typo for the name ansii_c_string. It got fixed here: tree-sitter/tree-sitter-bash@385a8e4

So my original commit would only make sense if the non-modern setup switched the version of tree-sitter-bash it depends on.

…tup.

This is how `tree-sitter-bash` marks `$'...'` strings.
@confused-Techie
Copy link
Member

@danfuzz I'm glad you were able to get a fix! And get the fix in for Modern Tree-Sitter (As that is our current focus) So super awesome all around!

We very well should still fix this for Legacy Tree-Sitter which it looks like could be fixed by updating our version of tree-sitter-bash to include the commit you found, as well as include your previously attempted change here, so glad we know what's coming up there!

And if your new set of changes are actually fixing things I'm happy to hear it and will try and test soon (Unless someone else beats me).

As for writing the queries, I'll be honest that's not my area of expertise. Hopefully @savetheclocktower could point out what may be going on, but besides that, here's the documentation they wrote when they created this behavior. So maybe that can be of some help?

@danfuzz danfuzz force-pushed the danfuzz-fixes-ansi branch from 3b31d38 to b906f01 Compare October 18, 2023 18:28
@danfuzz
Copy link
Contributor Author

danfuzz commented Oct 18, 2023

Just force-pushed to remove the original commit.

@danfuzz
Copy link
Contributor Author

danfuzz commented Oct 18, 2023

Though I don't know why the original delimiter-matching patterns caused catastrophic failure, I was able to figure out something that worked, based on the tree-sitter docs, and I used the same form to also add delimiter-matching for single-quoted strings.

One more question: The modern tree-sitter uses string scope names string.quoted.single.shell and string.quoted.double.shell. The legacy tree-sitter just uses the scope name string for both. But the old TextMate syntax uses the long forms, which includes string.quoted.single.dollar.shell for $'...'. In my PR, I just used string.quoted.single.shell for $'...' (that is, no .dollar., because I hadn't spotted that difference until just now). Wondering if I should go ahead and update this PR to reflect that original scope name.

Thanks for all the help!

@savetheclocktower
Copy link
Contributor

@danfuzz A few things to address here.

However, I have a question: In the "STRINGS" section of highlights.scm, there are a couple of lines that seem to get the open and close quote of a double-quoted string to get marked as punctuation. I made a few attempts to duplicate that pattern for ansi_c_string, and everything I tried caused the highlighting to totally fail (like, no scopes at all for anything), but even so I couldn't spot any related-looking spew to the console. Do you have a suggestion as to what's going on or better yet how to actually write clauses that would work in this case?

If you make a change that results in a syntax error in a file — like if your parentheses are unbalanced — Pulsar should beep and show an error message in the console. So this must've been something else. If you remember exactly what change you tried to make, I could try to reproduce your outcome on my machine; that sounds like a bad outcome that we should try to prevent.

One final note: It took me a while to figure out that neither yarn install nor yarn build were actually picking up the changes I was making to highlights.scm; I had to copy the file manually into the build output (node_modules/...) in order to test. Unclear to me if this is a deficiency just with my setup, or just on macOS, or generally with the build.

You shouldn't need to do any of this to see changes reflected when you change a highlights.scm. If you set the ATOM_DEV_RESOURCE_PATH environment variable to the path of the checked-out Atom source, you can launch Pulsar with the --dev flag and it should just work without requiring a full rebuild (just make sure no Pulsar instance is already running). This is how I do all my development — start with the latest stable Pulsar release, but launch it with ATOM_DEV_RESOURCE_PATH set (I use direnv for this). I don't think I've actually needed to build Pulsar yet from my local machine.

When Pulsar is run in dev mode, all changes to query files take effect immediately once a query file is saved.

When you do a yarn install with the source checked out locally, you might then have to run npx electron-rebuild -v 12.2.3 to rebuild native modules to work with Electron.

@confused-Techie It doesn't quite explain your screenshots, but just for reference: the Log Cursor Scope command shows the scopes at the position to the immediate right of the cursor. If you want to check for the presence of a punctuation scope on a quote character, the cursor should be on the left side of the quote.

The PR looks good; let me try to verify it myself. @danfuzz, your tenacity in the face of lack of documentation is inspiring to me.

Copy link
Contributor

@savetheclocktower savetheclocktower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggested change (suggested by @danfuzz himself!) and then this is good to go.

Thanks so much for this PR, and for expending so much effort for such a small change!

Co-authored-by: Andrew Dupont <andrew@andrewdupont.net>
@danfuzz
Copy link
Contributor Author

danfuzz commented Oct 19, 2023

If you remember exactly what change you tried to make, I could try to reproduce your outcome on my machine; that sounds like a bad outcome that we should try to prevent.

One of my attempts to address non-$ single-quoted strings was this (below), which was patterned after the equivalent for a double-quote string. To be clear, I now understand why this doesn't work — because as you mention, the bash parser doesn't produce delimiter tokens for single-quoted strings — but indeed it would have helped to see a prominent error message (etc.).

(raw_string "'" @punctuation.definition.string.begin.shell
  (#is? test.first true))

It took me a while to figure out that neither yarn install nor yarn build were actually picking up the changes I was making to highlights.scm; I had to copy the file manually [...]

[...] If you set the ATOM_DEV_RESOURCE_PATH environment variable to the path of the checked-out Atom source, you can launch Pulsar with the --dev flag [...]

Understood. I intentionally wasn't trying to use pulsar --dev because I figured there was a good enough chance that it would fail in an inscrutable way and would be more of a distraction than just doing the thing-that-i-just-got-working.

I don't think I've actually needed to build Pulsar yet from my local machine.

Ah, it wasn't clear from the instructions (that I saw) that I didn't even have to run yarn install etc. before attempting to use pulsar --dev. I thought I had to have a working build as a baseline.

you might then have to run npx electron-rebuild -v 12.2.3 to rebuild native modules to work with Electron.

Thanks! I'll give that a whirl.

@danfuzz
Copy link
Contributor Author

danfuzz commented Oct 19, 2023

you might then have to run npx electron-rebuild -v 12.2.3 to rebuild native modules to work with Electron.

Thanks! I'll give that a whirl.

Turns out that's what happens when you say yarn build:

$ yarn build
yarn run v1.22.19
$ electron-rebuild
✔ Rebuild Complete
✨  Done in 0.74s.

So, I guess there's something else (perhaps) which is responsible for copying modified files from locally-defined modules into node_modules.

Copy link
Contributor

@savetheclocktower savetheclocktower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this PR!

@savetheclocktower savetheclocktower merged commit 6f1c772 into pulsar-edit:master Oct 20, 2023
@danfuzz danfuzz deleted the danfuzz-fixes-ansi branch October 21, 2023 01:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants