Boost in tokenizer performance with 3 liner #896

Merged
merged 1 commit into from
Mar 25, 2022


jgmdev
Member

@jgmdev jgmdev commented Mar 24, 2022

While working on improving the responsiveness of lite-xl during highlighting in #885, I noticed that the slowdowns in the tokenization process were occurring on lines with long runs of consecutive spaces. It seems the tokenizer was trying to apply each of its rules to each of the spaces it found, which slowed things down a lot.

Adding a %s+ rule that maps to "normal" at the beginning of every syntax table hugely improves the performance of the tokenizer, basically for free! I haven't measured the gains yet, but tokenization is noticeably faster now.
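
To illustrate why a leading whitespace rule helps, here is a minimal sketch in Python rather than Lua. It is not lite-xl's tokenizer: the rule list, the `tokenize` helper and the attempt counter are all hypothetical, and the regex `\s+` stands in for the Lua pattern %s+. It only counts how many rule matches are attempted on a line with a long run of spaces, with and without the whitespace rule placed first.

```python
import re

# Hypothetical rule list of (compiled regex, token type) pairs.
# This is an illustration only, not lite-xl's real syntax table.
RULES = [
    (re.compile(r"--.*"), "comment"),
    (re.compile(r'"[^"]*"'), "string"),
    (re.compile(r"\d+"), "number"),
    (re.compile(r"[A-Za-z_]\w*"), "symbol"),
]

# Python analogue of the Lua pattern %s+ mapped to the "normal" type.
WHITESPACE_RULE = (re.compile(r"\s+"), "normal")


def tokenize(line, rules):
    """Return (tokens, attempts).

    tokens is a list of (type, text) pairs; attempts counts every
    rule match that was tried while scanning the line.
    """
    tokens = []
    attempts = 0
    pos = 0
    while pos < len(line):
        for regex, kind in rules:
            attempts += 1
            m = regex.match(line, pos)
            if m and m.end() > pos:
                tokens.append((kind, m.group()))
                pos = m.end()
                break
        else:
            # No rule matched here: emit one character as "normal".
            tokens.append(("normal", line[pos]))
            pos += 1
    return tokens, attempts


# A line with a long run of consecutive spaces.
line = "x" + " " * 200 + "42"

slow_tokens, slow_attempts = tokenize(line, RULES)
fast_tokens, fast_attempts = tokenize(line, [WHITESPACE_RULE] + RULES)

print("attempts without whitespace rule:", slow_attempts)
print("attempts with whitespace rule first:", fast_attempts)
```

In this sketch every space costs one failed attempt per rule unless the whitespace rule consumes the whole run in a single match, which is the effect described above. The actual gain in lite-xl depends on its real rule sets and was not measured in this thread.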

@adamharrison
Member

Tested this; seems legit.

Of course, this means we can't consider whitespace to be part of token starts. I don't think any language currently does this, so this is probably fine. And given that you're right that this would improve performance by a significant degree, I think it's worth it.

Any objections? Otherwise, let's merge. We can always take it out later if there is some language that this conflicts with (but I can't think of any offhand).
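
The trade-off mentioned above can be sketched as follows, again in Python with regexes standing in for Lua patterns. The `first_match` helper and the indented-code rule are hypothetical and not taken from any lite-xl language plugin. They only show that a rule whose pattern begins with whitespace stops matching once a whitespace rule is tried before it.

```python
import re


def first_match(line, rules):
    """Return (type, text) for the first rule that matches at the start of line."""
    for regex, kind in rules:
        m = regex.match(line)
        if m and m.end() > 0:
            return kind, m.group()
    # Nothing matched: treat the first character as "normal".
    return "normal", line[:1]


# Hypothetical rule whose pattern starts with whitespace
# (four leading spaces mark the rest of the line as one token).
INDENTED_CODE = (re.compile(r"\s{4}.*"), "string")

# Python analogue of the Lua pattern %s+ mapped to "normal".
WHITESPACE = (re.compile(r"\s+"), "normal")

line = "    print('hi')"

before = first_match(line, [INDENTED_CODE])
after = first_match(line, [WHITESPACE, INDENTED_CODE])

print("without whitespace rule:", before)
print("with whitespace rule first:", after)
```

With the whitespace rule first, the leading spaces are consumed as a "normal" token, so the indented-code rule never sees them. Any syntax that relies on whitespace at the start of a pattern would need that rule adjusted or the whitespace rule skipped.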

@jgmdev
Member Author

jgmdev commented Mar 24, 2022

Without this change, opening the plugins readme as shown in #885 looked like this:

highlight-no-lazy-mode.mp4

and with this change it now looks like this:

with-match-spaces-rule.mp4

@adamharrison
Member

Oh wow, that is quite noticeable! We'll merge tomorrow if no one has any further comments.

@adamharrison adamharrison merged commit 951f091 into lite-xl:master Mar 25, 2022
jgmdev added a commit to jgmdev/lite-xl that referenced this pull request Mar 29, 2022
* mainly the language_md got affected which has some exotic rules
* some other languages are also using spaces at start of pattern
  and even if not affected this change tackles that
jgmdev added a commit that referenced this pull request Mar 29, 2022
adamharrison pushed a commit to adamharrison/lite-xl that referenced this pull request Apr 1, 2022
adamharrison pushed a commit to adamharrison/lite-xl that referenced this pull request Apr 1, 2022
sprainbrains pushed a commit to sprainbrains/lite-xl that referenced this pull request Apr 1, 2022