[cs] CPD: Replace C# tokenizer by an Antlr-based one #2280

maikelsteneker · 2020-02-12T11:08:18Z

Before submitting a PR, please check that:

The PR is submitted against master. The PMD team will merge back to support branches as needed.
./mvnw clean verify passes. This will build and test PMD, execute PMD and checkstyle rules. Check this for more info

PR Description:
This replaces the C# tokenizer with one based on Antlr, in line with how it's already done for many other languages.

This has a couple of advantages, including adding column information and fixing #2139.

In order to correctly filter using directives as the previous tokenizer did, I had to extend the BaseTokenFilter class. Existing subclasses should not be impacted by this, but the new functionality might be useful for other languages as well.

Fixes #2139

When filtering tokens, the analyzeToken method can be overridden to access the current token, which can then be used to implement isLanguageSpecificDiscarding. However, it may be desirable to "look ahead" and base the filtering decision on multiple tokens. To support this use case, a new extension point, analyzeTokens, is provided; it has access to the current token and can also iterate over the upcoming tokens. Iteration over the remaining tokens is implemented with Guava. Since pmd-core targets Java 7, the Android flavour of Guava is used. To stay consistent, pmd-apex-jorje has also been switched to the Android flavour. For PMD 7.0, the jre flavour can be used instead.
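To illustrate why lookahead matters for this kind of filtering, here is a minimal, self-contained sketch. It does not reproduce PMD's BaseTokenFilter API: the class `LookaheadFilterSketch`, the helper `startsUsingDirective`, and the use of plain strings as tokens are all hypothetical. The rule it applies (a `using` followed by `(` is a using statement and is kept, anything else up to the next `;` is treated as a directive and dropped) is a simplifying assumption, not necessarily the rule this PR implements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: not PMD's actual filter classes or signatures.
class LookaheadFilterSketch {

    // Decides from the current token plus the upcoming tokens whether a
    // using directive starts here. Looking at "using" alone is not enough:
    // "using (" opens a using statement, which should not be discarded.
    static boolean startsUsingDirective(String current, Iterable<String> upcoming) {
        if (!"using".equals(current)) {
            return false;
        }
        Iterator<String> it = upcoming.iterator();
        return it.hasNext() && !"(".equals(it.next());
    }

    // Returns the tokens with every using directive (up to and including
    // its terminating ";") removed.
    static List<String> filterUsingDirectives(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        boolean discarding = false;
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            // The sub-list is a read-only view of the tokens after the current one.
            if (!discarding && startsUsingDirective(current, tokens.subList(i + 1, tokens.size()))) {
                discarding = true;
            }
            if (!discarding) {
                kept.add(current);
            } else if (";".equals(current)) {
                discarding = false; // the directive ends at the semicolon
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "using", "System", ";", "using", "(", "r", ")", "{", "}");
        System.out.println(filterUsingDirectives(tokens));
    }
}
```

In this sketch the first `using System ;` is dropped while `using ( r ) { }` survives, which a filter that only sees the current token could not decide. Constructs such as `using var x = ...;` would also be dropped here; handling them is outside the scope of this illustration.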

This is based on the Antlr grammar from https://github.com/antlr/grammars-v4/tree/master/csharp. This adds column information for C# and fixes pmd#2139.

pmd-test · 2020-02-12T11:31:49Z

✅	This changeset introduces 0 new violations, 0 new errors and 0 new configuration errors, removes 0 violations, 0 errors and 0 configuration errors. Full report

Generated by 🚫 Danger

adangel

Thanks, this seems to improve C# a lot.

However, I'm especially unsure about the licensing - see the comment.

pmd-core/src/test/java/net/sourceforge/pmd/cpd/token/internal/BaseTokenFilterTest.java

pmd-apex-jorje/pom.xml

pmd-core/pom.xml

pmd-cs/src/main/antlr4/net/sourceforge/pmd/lang/cs/antlr4/CSharpLexer.g4

adangel

Thanks!

maikelsteneker added 2 commits February 12, 2020 11:12

C# tokenizer is now Antlr-based.

bdfbfae

This is based on the Antlr grammar from https://github.com/antlr/grammars-v4/tree/master/csharp. This adds column information for C# and fixes pmd#2139.

adangel changed the title ~~[C#][cpd] Replace C# tokenizer by an Antlr-based one~~ [cs] CPD: Replace C# tokenizer by an Antlr-based one Feb 13, 2020

adangel reviewed Feb 13, 2020


Rewrite to avoid Guava dependency.

4bd5a15

maikelsteneker requested a review from adangel February 27, 2020 11:05

adangel added this to the 6.22.0 milestone Feb 29, 2020

adangel approved these changes Feb 29, 2020


adangel added a commit that referenced this pull request Feb 29, 2020

[doc] Update release notes, fixes #2139, refs #2280.

b65a6a5

adangel merged commit 4bd5a15 into pmd:master Feb 29, 2020

This was referenced Mar 6, 2020

[cs] CPD: fixes in filtering of using directives #2338

Merged

[cs] CPD: Fixed CPD --ignore-usings option #2339

Merged
