Create julia lexer and tags parser #2584

getzze · 2020-09-11T18:22:06Z

resolves #434

[Edit2] Lexer

[Edit] Link to official Julia parser: https://github.com/JuliaLang/julia/blob/master/src/julia-parser.scm
and the C library used in Julia for unicode parsing: https://github.com/JuliaStrings/utf8proc

eht16 · 2020-09-12T17:38:09Z

I didn't review the code, just an early note:
the Scintilla lexer should be first submitted to and accepted by the Scintilla project (https://www.scintilla.org/).
Also the Ctags parser should be first submitted to and accepted by the Universal Ctags project (https://github.com/universal-ctags/ctags).
See also universal-ctags/ctags#1566 (comment) where someone also started a parser.

We want to stay inline with the upstream projects and so the code should be accepted there first.

getzze · 2020-09-14T14:04:29Z

Ok, thanks.
They will need to be tested by julia+geany users first anyway.

elextr · 2020-09-15T01:12:00Z

They will need to be tested by julia+geany users first anyway.

And a reminder to those who test it to comment, that lets @getzze know its working (or what needs fixing :) and it lets Geany project devs know there is a demand.

getzze · 2020-10-03T15:57:43Z

Upstream merge:

Scintilla: https://sourceforge.net/p/scintilla/code/merge-requests/25/ -> Improve LexJulia ScintillaOrg/lexilla#13
Universal-ctags: Add julia parser universal-ctags/ctags#2654 -> Julia parser TODO list universal-ctags/ctags#2723

AndiMD · 2020-12-24T20:01:32Z

Thanks for your efforts so far @getzze. I'd be happy to see this merged.
I wanted to test, but have trouble building the branch julia_lexxer in a standard Kubuntu 20.04 environment (gcc 9.3.0):

  CXX      lexers/LexMake.lo
lexers/LexJulia.cxx: In static member function ‘static Scintilla::ILexer* LexerJulia::LexerFactoryJulia()’:
lexers/LexJulia.cxx:123:25: error: invalid new-expression of abstract class type ‘LexerJulia’
  123 |   return new LexerJulia();
      |                         ^
lexers/LexJulia.cxx:90:7: note:   because the following virtual functions are pure within ‘LexerJulia’:
   90 | class LexerJulia : public DefaultLexer {
      |       ^~~~~~~~~~
In file included from lexers/LexJulia.cxx:25:
./include/ILexer.h:92:34: note:         ‘virtual const char* Scintilla::ILexerWithIdentity::PropertyGet(const char*)’
   92 |  virtual const char * SCI_METHOD PropertyGet(const char *key) = 0;
      |                                  ^~~~~~~~~~~
  CXX      lexers/LexMarkdown.lo

getzze · 2020-12-25T23:21:39Z

Thanks for testing it, I commited a correction, you should be able to compile it now. Happy to hear your feedback.

AndiMD · 2020-12-27T03:19:30Z

Thanks, that fixed it. Looking pretty good already. I will continue to test this in the coming days and report back.

Two points I noticed instantly:

Highlighting interpolated strings seems inconsistent: Depending on using $(var) vs $var and single quote vs triple quote strings, I see different results. Example:
"($scanPosX,$scanPosY): (k±Δk)=$(kLaserApprox) $(begin x+5 end)"
"""($scanPosX,$scanPosY): (k±Δk)=$(kLaserApprox) $(begin x+5 end)"""
Macros confuse the lexer (possibly we have to live with that, since macros can have arbitrary syntax)
Example:

Base.@kwdef struct A
    a::Float64 = 1E-19 
    b::Float64 = 1E-22 
end

shows that the struct contains Symbols E

AndiMD · 2020-12-30T20:24:52Z

@elextr @getzze I have not discovered any further issues. If more testers or feedback on the level of demand is required, I suggest announcing this on https://discourse.julialang.org/.

elextr · 2020-12-30T22:59:01Z

@AndiMD thats entirely up to @getzze, remember no Geany devs are Juliaists, so they have no knowledge or need to support this.

Also the failing CI needs fixing.

getzze · 2020-12-31T14:09:10Z

@AndiMD Thanks for reporting these problems! I think it would be great to announce it on Julia discourse.

1. Highlighting interpolated strings seems inconsistent: Depending on using `$(var)` vs `$var` and single quote vs triple quote strings, I see different results. Example:
   `"($scanPosX,$scanPosY): (k±Δk)=$(kLaserApprox) $(begin x+5 end)"`
   `"""($scanPosX,$scanPosY): (k±Δk)=$(kLaserApprox)  $(begin x+5 end)"""`

Interpolation highlighting is only done for single-quote strings, but I am planning to allow it in triple-quote and commands also.
For $(var) vs $var, I didn't find a way to highlight $( in the former so I ended up highlighting only what is inside the parenthesis. Because in the latter case there are no parenthesis, nothing is highlighted. I can try to find a way to highlight both, including the $.

2. Macros confuse the lexer (possibly we have to live with that, since macros can have arbitrary syntax)
   Example:
Base.@kwdef struct A
a::Float64 = 1E-19
b::Float64 = 1E-22
end

Thanks for reporting, that are actually two bugs:

defined values for the attributes are not parsed at all
Base.@kwdef is not recognized as a macro (@kwdef is well recognized):

@eval fun1(x) = $pi        # recognizes `fun1` short function definition
Base.@eval fun1(x) = $pi   # does not recognize `fun1` short function definition

logankilpatrick · 2021-01-01T23:24:18Z

Love this initiative! Thank you all.

AndiMD · 2021-03-18T18:59:02Z

Found another edge case: (2E9) / (2e9)/ (2f9): the right bracket is highlighted as part of the number.

getzze · 2021-03-26T14:27:08Z

Thanks for the feedback.
I pushed a patch for it, along with some other improvements:

merge ctags from upstream
do not highlight interpolation for raw strings
correct folding
add REPL autocompletion using snippets (needs to change wordchars)

oscardssmith · 2021-05-19T17:27:05Z

What's the status on this? I love both Julia and Geany and this looks like a really good improvement.

elextr · 2021-05-19T20:17:56Z

@oscardssmith three problems:

Geany regular contributors are very quiet ATM for various reasons, but then none of them are Juliaists anyway
As noted in the first comment, the lexer and parser need to be upstreamed, even when 1. improves Geany doesn't have the effort to maintain its own versions of those
Scintilla is going through a major evolution, and is still classed as unstable by its maintainer, so any lexer submitted there will be a while before its added to Geany (after Geany is updated to the eventually stable new Scintilla)
it has merge conflicts which prevent CI building to see if its ok

oops thats 4, anyway at least you could build your own test if the merge was fixed.

getzze · 2021-05-20T12:02:40Z

I just pushed new commits to solve conflicts.
As @elextr said, julia ctag parser has been merged upstream but from Scintilla it will take much longer (they only reviewed aesthetic aspects of the code so far).
But you can build from source applying this PR.

getzze · 2021-06-02T16:03:31Z

LexJulia has been merged upstream by Scintilla (ScintillaOrg/lexilla@6ee6b08 and later commits for minor corrections).
I will push a clean rebase of this PR so it can be merged to geany.

getzze · 2021-06-02T23:24:05Z

The PR has been rebased.
The ctags parser file is identical to upstream and the scintilla lexer has minimal change (3 lines) because upstream is using lexilla v5 and geany is still at v4.
Geany-specific files need to be reviewed.

elextr

First glance at it only a few comments, also for the Julia lexer could you add a diff that applies the three lines (I think you said) to backport it from Lexilla, probably best in the same place as scintilla_changes.patch and add a prompt to update_scintilla.sh to use it so the changes are known and can be re-applied if another backport is needed.

[Edit: to be clear something to update_scintilla.sh like echo "Don't forget to manually copy the julia lexer from Lexilla and apply 'julia_lexer.patch' to it no point in trying to automate getting it from Lexilla, leave that for Scintilla 5 update]

data/filedefs/filetypes.common

data/snippets.conf

data/filedefs/filetypes.common

data/filedefs/filetypes.julia

getzze · 2021-06-21T23:39:05Z

There are still some open issues with Lexilla: ScintillaOrg/lexilla#13

elextr · 2021-06-22T02:21:08Z

I didn't look at the lexer code, but most lexers load the "keyword" lists from Geany. Not sure how you can approach the two word keywords (curse Jeff and his femtolisp parser :-) where each word is a valid variable in other contexts. Unless you simply build it into the lexer, have it know and find the two word keywords first and do not test any of the keyword list keywords in that case.

A similar (unsolved) issue happens with C++ where names like override which are keywords in one context and variable or type names in other contexts are currently keyword highlighted in all contexts since the lexer doesn't have sufficient capability to distinguish the context, it also happens in other IDEs where highlighting is separate from parsing (eg Eclipse).

So its not unprecedented if some edge cases are wrongly highlighted, just raise an issue after this is committed.

elextr · 2021-06-23T04:51:06Z

@getzze can you push some inconsequential change (or a big change if you have one :) to see if CI has been fixed?

elextr · 2021-06-24T00:31:51Z

Yay CI works again!!

One thing you might like to consider (can be later) is a test for the ctags parser, see tests/ctags so CI would notice if a ctags infrastructure update like #2830 broke your parser.

elextr · 2021-06-24T00:35:05Z

Have approved the changes (pending build-menu :) and LGTM, but havn't tested it yet, might have time in a few days then should be mergable.

getzze · 2021-06-24T09:02:36Z

Yay CI works again!!

One thing you might like to consider (can be later) is a test for the ctags parser, see tests/ctags so CI would notice if a ctags infrastructure update like #2830 broke your parser.

That's a good idea. What is the format of this file? There are unit test for julia in the ctag repository, these files can be used maybe.

elextr · 2021-06-24T10:16:52Z

IIUC most of the tests are simply from ctags anyway, many of their names relate to ctags bug reports

IIUC the tests are just one (or more) source file(s) mentioned in Makefile.am and tags file(s) with the same name but extension .tags (generated by Geany, they are tagmanager format, which is binary, so not manually generated). I think mentioning the source files in Makefile.am sources list makes the existing test harness feed the source to Geany and generate and compare to the tags files.

getzze · 2021-06-24T23:22:56Z

I pushed some changed from Lexilla (like using a new keyword list for coloring builtin functions).

I added a patch file in the scintilla folder, so the lexer works with lexilla v4.
Also added a ctags test file (I didn't manage to generate the binary file).

scintilla/julia_lexilla_v4.patch

getzze · 2021-06-28T21:49:10Z

The issue with Lexilla is solved. It's ready to merge.
If needed, I can rebase to one commit.

getzze · 2021-07-06T13:08:58Z

I just rebased to two commits. I left the ctags test file in a separate commit because I am not sure it is needed, julia parser is well tested in ctags upstream.

elextr · 2021-07-20T04:06:16Z

WFM, although testing limited, thanks @getzze

oscardssmith · 2021-07-20T05:19:51Z

Thanks so much for getting this pushed through!

VarLad · 2021-08-04T12:47:00Z

@elextr Any idea when's the next release?

eht16 · 2021-08-07T08:26:14Z

Hopefully it will happen in September, if not at least this year :).

czylabsonasa · 2022-08-01T18:42:43Z

Hello,
in the following snippet,
the 4th line after the # char constant parsed as comment:

ok=true
for r=2:R+1,c=2:C+1
  if table[r,c]=='#'
    all(table[r+d[1],c+d[2]]!='#' for d=[(1,0),(-1,0),(0,1),(0,-1)])&&(ok=false;break)
  end
end

it can be resolved by putting an extra space after the operator.
geany version is 1.39.

oscardssmith · 2022-08-01T20:53:46Z

That's not a correct fix. The correct fix would be to check whether you're inside or outside quotation marks.

elextr · 2022-08-01T21:44:23Z

Its actually lexing the first ' as an operator, possibly a postfix transpose, then the # is indeed a comment.

Lexers come from the Lexilla project. Since Geany folks don't know Julia, please post the issue there so the Julia lexer maintainer can talk to you directly.

Post a link to the issue here.

czylabsonasa · 2022-08-01T22:15:30Z

@oscardssmith: i did not mean it as a fix, just as a how to get rid of it. :-)
@elextr: thx, i'll file an issue.

getzze · 2022-08-03T09:45:57Z

Thanks for catching this bug!
I submitted a fix to lexilla: ScintillaOrg/lexilla#98

getzze changed the title ~~Create julia lexer, no ctags~~ Create julia lexer and tags parser Sep 12, 2020

codebrainz mentioned this pull request Sep 26, 2020

Julia support #434

Closed

getzze mentioned this pull request Oct 2, 2020

Add julia parser universal-ctags/ctags#2654

Merged

elextr mentioned this pull request Apr 22, 2021

Request: support for Julia #2795

Closed

getzze force-pushed the julia_lexer branch from 9b377e2 to 3b9238b Compare June 2, 2021 23:06

elextr mentioned this pull request Jun 20, 2021

Add support for Kotlin tags #2778

Closed

elextr requested changes Jun 20, 2021

View reviewed changes

data/filedefs/filetypes.common Outdated Show resolved Hide resolved

data/snippets.conf Outdated Show resolved Hide resolved

data/filedefs/filetypes.common Outdated Show resolved Hide resolved

data/filedefs/filetypes.julia Outdated Show resolved Hide resolved

elextr approved these changes Jun 24, 2021

View reviewed changes

getzze closed this Jun 24, 2021

getzze reopened this Jun 24, 2021

elextr reviewed Jun 24, 2021

View reviewed changes

scintilla/julia_lexilla_v4.patch Outdated Show resolved Hide resolved

getzze added 2 commits July 6, 2021 14:02

Create julia parser, ctags and lexilla

b3dab3b

add ctags test file

a8343e4

getzze force-pushed the julia_lexer branch from 897ae6a to a8343e4 Compare July 6, 2021 13:03

elextr merged commit 90c6096 into geany:master Jul 20, 2021

rdipardo mentioned this pull request Dec 13, 2021

RFE: please add support for Perl6 (Raku) #3054

Closed

FelipeLema mentioned this pull request Jun 7, 2022

use tree-sitter JuliaEditorSupport/julia-emacs#174

Open

czylabsonasa mentioned this pull request Aug 2, 2022

Julia ScintillaOrg/lexilla#97

Closed

elextr mentioned this pull request Jun 30, 2023

[Enhancement] Lexer-integration for the Nim/Nimrod-language #3520

Closed

elextr mentioned this pull request Sep 24, 2023

SystemVerilog filetype syntax highlighting #3511

Closed

Create julia lexer and tags parser #2584

Create julia lexer and tags parser #2584

Conversation

getzze commented Sep 11, 2020 • edited Loading

eht16 commented Sep 12, 2020

getzze commented Sep 14, 2020

elextr commented Sep 15, 2020

getzze commented Oct 3, 2020 • edited Loading

AndiMD commented Dec 24, 2020

getzze commented Dec 25, 2020

AndiMD commented Dec 27, 2020 • edited Loading

AndiMD commented Dec 30, 2020

elextr commented Dec 30, 2020

getzze commented Dec 31, 2020

logankilpatrick commented Jan 1, 2021

AndiMD commented Mar 18, 2021

getzze commented Mar 26, 2021

oscardssmith commented May 19, 2021

elextr commented May 19, 2021

getzze commented May 20, 2021

getzze commented Jun 2, 2021

getzze commented Jun 2, 2021

elextr left a comment • edited Loading

Choose a reason for hiding this comment

getzze commented Jun 21, 2021

elextr commented Jun 22, 2021

elextr commented Jun 23, 2021

elextr commented Jun 24, 2021

elextr commented Jun 24, 2021

getzze commented Jun 24, 2021 • edited Loading

elextr commented Jun 24, 2021

getzze commented Jun 24, 2021

getzze commented Jun 28, 2021

getzze commented Jul 6, 2021

elextr commented Jul 20, 2021

oscardssmith commented Jul 20, 2021

VarLad commented Aug 4, 2021

eht16 commented Aug 7, 2021

czylabsonasa commented Aug 1, 2022 • edited Loading

oscardssmith commented Aug 1, 2022

elextr commented Aug 1, 2022

czylabsonasa commented Aug 1, 2022 • edited Loading

getzze commented Aug 3, 2022

getzze commented Sep 11, 2020 •

edited

Loading

getzze commented Oct 3, 2020 •

edited

Loading

AndiMD commented Dec 27, 2020 •

edited

Loading

elextr left a comment •

edited

Loading

getzze commented Jun 24, 2021 •

edited

Loading

czylabsonasa commented Aug 1, 2022 •

edited

Loading

czylabsonasa commented Aug 1, 2022 •

edited

Loading