Request: Prolog Parser #2628

Closed
TrentSe opened this issue Aug 25, 2020 · 19 comments

@TrentSe

TrentSe commented Aug 25, 2020

Creating an issue for extending universal-ctags to handle Prolog [1].

... as per Masatake's request [2]

[1] https://en.wikipedia.org/wiki/Prolog
[2] #1566 (comment)

@TrentSe
Author

TrentSe commented Aug 25, 2020

SWI-Prolog provides a clone of Emacs that (I believe*) parses Prolog:

https://www.swi-prolog.org/pldoc/man?section=pceemacs

https://en.wikipedia.org/wiki/SWI-Prolog#PceEmacs

* I haven't used it; I'm familiar with Vim.

@masatake
Member

Thank you.

I will write an initial, minimal version of a Prolog parser.
I expect you to complete it. I would like you to read #2622.
That issue shows how important design is in parser development.

Taken from [1]:

/* input.pl */
mother_child(trude, sally).
 
father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).
 
sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).
 
parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

What kind of tags output do you expect?
What should be tagged? What "kinds" should be assigned to the tags?

@TrentSe
Author

TrentSe commented Aug 25, 2020

Hey Masatake,

Cheers for coming back so quickly. I'll read #2622.

Thanks also for your efforts writing the minimum parser. I'll do what I can to complete it, though will probably need assistance.

@TrentSe
Author

TrentSe commented Aug 25, 2020

I'm not that familiar with ctags, so may not answer your question correctly.

Tags:

predicates: mother_child, father_child etc.
atoms: tom, sally etc.
variables: X, Y

Translation from Prolog speak:

"predicates" -> method (Prolog has no functions)
"atom" -> static value (distinct from a string)
"variable" -> :)

String -> string (text surrounded with ", for example "this is a SWI-Prolog string")

Text surrounded with ' is treated as an atom.
abc and 'abc' are equivalent.
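
The token categories listed above can be illustrated with a small sketch. This is only an approximation written in Python for illustration, not part of ctags: `classify_term` is a hypothetical helper, and the rule that a leading underscore also marks a variable is an assumption taken from standard Prolog syntax rather than from this thread.

```python
import re

def classify_term(token: str) -> str:
    """Classify one Prolog token using the rules sketched in this comment."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return "string"          # text surrounded with "
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        return "atom"            # quoted atom: 'abc' names the same atom as abc
    if re.fullmatch(r"[A-Z_][A-Za-z0-9_]*", token):
        return "variable"        # assumption: uppercase letter or _ starts a variable
    if re.fullmatch(r"[a-z][A-Za-z0-9_]*", token):
        return "atom"            # plain lowercase identifier
    return "other"               # numbers, operators, and anything not covered here

for tok in ["tom", "'abc'", "X", '"this is a SWI-Prolog string"']:
    print(tok, "->", classify_term(tok))
```

A real parser would also need to handle escapes inside quotes, numbers, and operators, which this sketch deliberately ignores.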

@TrentSe
Author

TrentSe commented Aug 25, 2020

I note that Prolog defines modules [1], for example:

36 :- module(charsio,
37 [ format_to_chars/3, % +Format, +Args, -Codes
38 format_to_chars/4, % +Format, +Args, -Codes, ?Tail
39 write_to_chars/2, % +Term, -Codes
40 write_to_chars/3, % +Term, -Codes, ?Tail
41 atom_to_chars/2, % +Atom, -Codes
42 atom_to_chars/3, % +Atom, -Codes, ?Tail
43 number_to_chars/2, % +Number, -Codes

This is the start of the definition of the module 'charsio' [2].

Predicates in the export list (from line 37 onwards) are exported: once the module is loaded they can be called without the module prefix. Predicates not listed are still accessible by prefixing the module name, for example charsio:some_other_predicate.

Fyi, the number following the predicate name (e.g. atom_to_chars/3) is the arity; it specifies the number of arguments....

[1] https://www.swi-prolog.org/pldoc/man?section=modules
[2] https://www.swi-prolog.org/pldoc/doc/_SWI_/library/charsio.pl?show=src
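
To make the name/arity notation concrete, here is a hedged Python sketch that pulls (name, arity) pairs out of a few export-list lines copied from the excerpt above. `parse_exports` is a hypothetical helper, and cutting each line at the first % is a simplification that would break on quoted atoms containing %.

```python
import re

# A few entries copied from the charsio export list quoted above.
EXPORTS = """\
format_to_chars/3, % +Format, +Args, -Codes
format_to_chars/4, % +Format, +Args, -Codes, ?Tail
write_to_chars/2, % +Term, -Codes
"""

def parse_exports(text: str) -> list[tuple[str, int]]:
    """Return (name, arity) for every name/arity indicator found in text."""
    pairs = []
    for line in text.splitlines():
        code = line.split("%", 1)[0]      # naive: drop the trailing % comment
        for name, arity in re.findall(r"([a-z][A-Za-z0-9_]*)/(\d+)", code):
            pairs.append((name, int(arity)))
    return pairs

for name, arity in parse_exports(EXPORTS):
    print(f"{name} takes {arity} arguments")
```

Note that format_to_chars appears twice: in Prolog, the same name with a different arity is a different predicate.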

@TrentSe
Author

TrentSe commented Aug 25, 2020

If you have the time, here is a quick intro to Prolog:

https://www.youtube.com/watch?v=SykxWpFwMGs

It's 1 hour long, but you wouldn't need to watch anything like that much to see most of the syntax...

@masatake
Member

Thank you, but I have enough Prolog knowledge.
What I really need is the expected tags output.
That is the area where I cannot help you.

@masatake
Member

Let's focus on smaller input:

mother_child(trude, sally).

What should be tagged?
If we had a perfect Prolog parser in ctags, which tokens should ctags capture?

@TrentSe
Author

TrentSe commented Aug 25, 2020

Here's the syntax documentation for GNU Prolog:

http://gprolog.org/manual/html_node/gprolog019.html

@TrentSe
Author

TrentSe commented Aug 25, 2020

> mother_child(trude, sally).
>
> If we had a perfect Prolog parser in ctags, which tokens should ctags capture?

I'm not that familiar with ctags yet. I can read more to be more helpful.

From what I know now, "mother_child" would be tagged as the equivalent of a C method, trude and sally as the equivalent of C strings.

... would it be helpful for me to write an "equivalent" C program and run it through u-ctags xref...?

My C is rusty, but it would be something like:

void mother_child( "trude", "sally" );

I know this isn't valid C; but trude and sally aren't variables here.

By comparison,

void mother_child( "trude", char *Name ) {
    printf( "%s\n", Name );
}

would be:

mother_child( trude, Name ) :- writeln( Name ).

in Prolog.

@masatake
Member

OK, I wrote a minimal version of a Prolog parser.

$ cat input.pl 
mother_child(trude, sally).
$ cat prolog.ctags
--langdef=Prolog
--map-Prolog=.pl
--kinddef-Prolog=p,predicate,predicates
--regex-Prolog=/^([a-zA-Z_]+)\([^.]+\)./\1/p/
$ ./ctags --options=./prolog.ctags --languages=Prolog -o - input.pl 
mother_child	input.pl	/^mother_child(trude, sally).$/;"	p	language:Prolog

Is this the same as what you expect?
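
For readers who want to see what the pattern in --regex-Prolog picks up without building ctags, the following Python sketch applies the same expression line by line. Python's `re` module is only a stand-in here and is not the regex engine ctags uses; for a pattern this simple the two should agree, but that is an assumption. `tagged_names` and the `p1(a).` sample line are made up for illustration.

```python
import re

# Same expression as in the --regex-Prolog option above.
PATTERN = re.compile(r"^([a-zA-Z_]+)\([^.]+\).")

def tagged_names(source: str) -> list[str]:
    """Return the predicate name captured on each line the pattern matches."""
    names = []
    for line in source.splitlines():
        m = PATTERN.match(line)
        if m:
            names.append(m.group(1))
    return names

SAMPLE = """\
mother_child(trude, sally).
parent_child(X, Y) :- father_child(X, Y).
p1(a).
"""

print(tagged_names(SAMPLE))
```

The sketch reports mother_child and parent_child but skips p1, because the name part `[a-zA-Z_]+` does not allow digits. It also shows that a rule with a body is matched the same way as a fact, as long as the whole clause sits on one line.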

@TrentSe
Author

TrentSe commented Aug 25, 2020

That was quick!

@TrentSe
Author

TrentSe commented Aug 25, 2020

... I also think I see what you're doing. I can build on that...

@masatake
Member

Let's extend the input a bit.

parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

The first question: should parent_child be tagged twice or once?

I guess you may want to have an arity: field like:

parent_child input.pl  /^...$/;"  p arity:2

Am I correct? If so, what should I do for this input:

mother_child( trude, Name ) :- writeln( Name ).

Is the arity for mother_child 1 or 2?

@TrentSe
Author

TrentSe commented Aug 25, 2020

> parent_child(X, Y) :- father_child(X, Y).
> parent_child(X, Y) :- mother_child(X, Y).
>
> The first question: should parent_child be tagged twice or once?

It should be tagged twice; they're multiple clauses defining the predicate (method).

> I guess you may want to have an arity: field like:
>
> parent_child input.pl /^...$/;" p arity:2

Maybe, though I'm not sure how that would be used. Arity is just the count of predicate arguments...

> mother_child( trude, Name ) :- writeln( Name ).
>
> Is the arity for mother_child 1 or 2?

2 - the size of the argument list [ trude, Name ].

Everything after the :- is the "implementation" (in C terms... 😄 ).
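
The arity rule described here (count the arguments in the head, ignore everything after :-) can be sketched in a few lines of Python. `head_arity` is a hypothetical helper for illustration only; it ignores quoted atoms and strings that might themselves contain commas, brackets, or `:-`.

```python
def head_arity(clause: str) -> int:
    """Count the arguments in the head of a clause (the part before ':-')."""
    head = clause.split(":-", 1)[0]
    start = head.find("(")
    if start == -1:
        return 0                      # a head with no argument list
    depth = 0
    commas = 0
    for ch in head[start:]:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
            if depth == 0:
                break                 # closing paren of the head's argument list
        elif ch == "," and depth == 1:
            commas += 1               # only top-level commas separate arguments
    return commas + 1

print(head_arity("mother_child( trude, Name ) :- writeln( Name )."))
print(head_arity("mother_child(trude, sally)."))
```

Both calls give 2: trude counts as an argument even though it is an atom rather than a variable.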

@masatake
Member

$ cat input.pl
/* input.pl */
mother_child(trude, sally).

father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).
 
sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).
 
parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

% dummy0()
/* dummy1() */

$ cat optlib/prolog.ctags
--langdef=Prolog

--map-Prolog=.pl

###
# kind definitions
#
--kinddef-Prolog=p,predicate,predicates
--kinddef-Prolog=v,variable,variables

###
# table declarations
#
--_tabledef-Prolog=main
--_tabledef-Prolog=args
--_tabledef-Prolog=impl
--_tabledef-Prolog=comment
--_tabledef-Prolog=comment_multiline
--_tabledef-Prolog=comment_oneline
--_tabledef-Prolog=any
--_tabledef-Prolog=ignoreWhiteSpace

###
# utilities
#
--_mtable-regex-Prolog=any/.//
--_mtable-regex-Prolog=ignoreWhiteSpace/[ \t\n]+//

###
# comment
#
--_mtable-regex-Prolog=comment/\/\*//{tenter=comment_multiline}
--_mtable-regex-Prolog=comment/\%//{tenter=comment_oneline}

--_mtable-regex-Prolog=comment_multiline/\*\///{tleave}
--_mtable-extend-Prolog=comment_multiline+any

--_mtable-regex-Prolog=comment_oneline/\n//{tleave}
--_mtable-extend-Prolog=comment_oneline+any


###
# main
#
--_mtable-extend-Prolog=main+comment
--_mtable-extend-Prolog=main+ignoreWhiteSpace
--_mtable-regex-Prolog=main/([a-zA-Z_][a-zA-Z_0-9]*)/\1/p/{scope=push}
--_mtable-regex-Prolog=main/\(//{tenter=args}
--_mtable-regex-Prolog=main/\.//{scope=pop}
--_mtable-regex-Prolog=main/:-//{tenter=impl}
--_mtable-extend-Prolog=main+any

###
# args
#
--_mtable-extend-Prolog=args+comment
--_mtable-extend-Prolog=args+ignoreWhiteSpace
--_mtable-regex-Prolog=args/\)//{tleave}
--_mtable-regex-Prolog=args/[a-z,]+//
--_mtable-regex-Prolog=args/([A-Z][A-Za-z]*)/\1/v/{scope=ref}
--_mtable-extend-Prolog=args+any

###
# impl
#
--_mtable-extend-Prolog=impl+comment
--_mtable-extend-Prolog=impl+ignoreWhiteSpace

# If . is found, push it back so the upper table (main) can handle it.
--_mtable-regex-Prolog=impl/\.//{tleave}{_advanceTo=0start}
--_mtable-extend-Prolog=impl+any

$ ./ctags --sort=no --options=./optlib/prolog.ctags --languages=Prolog -o - input.pl 
mother_child	input.pl	/^mother_child(trude, sally).$/;"	p	language:Prolog
father_child	input.pl	/^father_child(tom, sally).$/;"	p	language:Prolog
father_child	input.pl	/^father_child(tom, erica).$/;"	p	language:Prolog
father_child	input.pl	/^father_child(mike, tom).$/;"	p	language:Prolog
sibling	input.pl	/^sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).$/;"	p	language:Prolog
X	input.pl	/^sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).$/;"	v	language:Prolog	predicate:sibling
Y	input.pl	/^sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).$/;"	v	language:Prolog	predicate:sibling
parent_child	input.pl	/^parent_child(X, Y) :- father_child(X, Y).$/;"	p	language:Prolog
X	input.pl	/^parent_child(X, Y) :- father_child(X, Y).$/;"	v	language:Prolog	predicate:parent_child
Y	input.pl	/^parent_child(X, Y) :- father_child(X, Y).$/;"	v	language:Prolog	predicate:parent_child
parent_child	input.pl	/^parent_child(X, Y) :- mother_child(X, Y).$/;"	p	language:Prolog
X	input.pl	/^parent_child(X, Y) :- mother_child(X, Y).$/;"	v	language:Prolog	predicate:parent_child
Y	input.pl	/^parent_child(X, Y) :- mother_child(X, Y).$/;"	v	language:Prolog	predicate:parent_child
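
To read the output above programmatically, each line can be split on tabs: tag name, file, search pattern, then the extension fields after `;"`. The Python sketch below is a simplified reader, not an official parser; `parse_tag_line` is a hypothetical helper and it assumes the search pattern contains no tab characters, which holds for this sample.

```python
def parse_tag_line(line: str) -> dict:
    """Split one extended-format tags line into a dict of its fields."""
    name, path, rest = line.rstrip("\n").split("\t", 2)
    pattern, _, ext = rest.partition(';"\t')
    fields = {"name": name, "path": path, "pattern": pattern}
    for part in (ext.split("\t") if ext else []):
        key, sep, value = part.partition(":")
        if sep:
            fields[key] = value       # e.g. language:Prolog, predicate:sibling
        else:
            fields["kind"] = part     # a bare letter such as p or v is the kind
    return fields

LINE = 'X\tinput.pl\t/^parent_child(X, Y) :- father_child(X, Y).$/;"\tv\tlanguage:Prolog\tpredicate:parent_child'
tag = parse_tag_line(LINE)
print(tag["name"], tag["kind"], tag["predicate"])
```

The `predicate:parent_child` field is the scope recorded by `{scope=push}` and `{scope=ref}` in the option file: it says which predicate's head the variable X was found in.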

@masatake
Member

I guess the arity field and signature field should be filled in.

@masatake
Member

Feel free to reopen this.
