Improve Python lexer (#1919)
* Improve Python lexer

This includes three changes to the Python lexer:

1) Move the `decorator` rule before the `operator` rule.

   Before this change, decorators were never recognized, because any `@`
   was matched by the `operator` rule first. Checking the `decorator`
   pattern before the `operator` pattern fixes this.

2) Recognize functions and classes when they are called.

   Previously, functions and classes were only recognized when they were
   defined. With this change, they will also be recognized and styled
   when they are called.

3) Don't recognize imported modules as `Namespace`s.

   It is not desirable to highlight imported modules like namespaces.
   With this change, they are simply recognized as the general `Name`
   token.

* Add more python visual samples

* Add python class visual sample
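
To illustrate change 1, here is a minimal sketch of why rule order matters when the first matching rule wins. It is not Rouge's implementation: `first_token`, the `OLD_ORDER`/`NEW_ORDER` lists, and the `DOTTED_IDENTIFIER` pattern are hypothetical stand-ins, and only the operator pattern is copied from the diff.

```ruby
require 'strscan'

# Assumed approximation of the lexer's dotted_identifier (hypothetical).
DOTTED_IDENTIFIER = /[a-z_][\w.]*/i

DECORATOR = [/@#{DOTTED_IDENTIFIER}/, :decorator]
# Operator pattern as it appears in the diff; note that it contains `@`.
OPERATOR  = [%r/[-~+\/*%=<>&^|@]=?|!=/, :operator]

OLD_ORDER = [OPERATOR, DECORATOR] # operator rule tried first
NEW_ORDER = [DECORATOR, OPERATOR] # decorator rule tried first

# Return [type, text] for the first rule that matches at the start of source.
def first_token(rules, source)
  scanner = StringScanner.new(source)
  rules.each do |pattern, type|
    text = scanner.scan(pattern) # nil when the pattern does not match here
    return [type, text] if text
  end
  nil
end

p first_token(OLD_ORDER, "@abstractmethod") # prints [:operator, "@"]
p first_token(NEW_ORDER, "@abstractmethod") # prints [:decorator, "@abstractmethod"]
p first_token(NEW_ORDER, "@= y")            # prints [:operator, "@="]
```

The last call shows that, in this sketch, `@=` still falls through to the operator rule, since no identifier follows the `@`.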

---------

Co-authored-by: Tan Le <tan.le@hey.com>
dunkmann00 and tancnle authored Feb 1, 2023
1 parent a4ed658 commit a218f22
Showing 2 changed files with 26 additions and 6 deletions.
13 changes: 8 additions & 5 deletions lib/rouge/lexers/python.rb
@@ -40,7 +40,7 @@ def self.builtins
end

def self.builtins_pseudo
-@builtins_pseudo ||= %w(self None Ellipsis NotImplemented False True)
+@builtins_pseudo ||= %w(None Ellipsis NotImplemented False True)
end

def self.exceptions
@@ -86,20 +86,22 @@ def current_string
rule %r/\\\n/, Text
rule %r/\\/, Text

+rule %r/@#{dotted_identifier}/i, Name::Decorator
+
rule %r/(in|is|and|or|not)\b/, Operator::Word
rule %r/(<<|>>|\/\/|\*\*)=?/, Operator
rule %r/[-~+\/*%=<>&^|@]=?|!=/, Operator

rule %r/(from)((?:\\\s|\s)+)(#{dotted_identifier})((?:\\\s|\s)+)(import)/ do
groups Keyword::Namespace,
Text,
-Name::Namespace,
+Name,
Text,
Keyword::Namespace
end

rule %r/(import)(\s+)(#{dotted_identifier})/ do
-groups Keyword::Namespace, Text, Name::Namespace
+groups Keyword::Namespace, Text, Name
end

rule %r/(def)((?:\s|\\\s)+)/ do
@@ -112,6 +114,9 @@ def current_string
push :classname
end

+rule %r/([a-z_]\w*)[ \t]*(?=(\(.*\)))/m, Name::Function
+rule %r/([A-Z_]\w*)[ \t]*(?=(\(.*\)))/m, Name::Class
+
# TODO: not in python 3
rule %r/`.*?`/, Str::Backtick
rule %r/([rfbu]{0,2})('''|"""|['"])/i do |m|
@@ -120,8 +125,6 @@ def current_string
push :generic_string
end

-rule %r/@#{dotted_identifier}/i, Name::Decorator
-
# using negative lookbehind so we don't match property names
rule %r/(?<!\.)#{identifier}/ do |m|
if self.class.keywords.include? m[0]
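
The two call-site rules added in `lib/rouge/lexers/python.rb` can be tried in isolation. The sketch below copies the two patterns from the diff and applies them at the start of a string with `StringScanner`. The `classify_call` helper and the `:function`/`:class` symbols are hypothetical stand-ins for the `Name::Function` and `Name::Class` tokens, and the inputs are illustrative only.

```ruby
require 'strscan'

# Patterns copied from the diff: an identifier followed by a parenthesized
# argument list (checked with a lookahead, so the parentheses are not consumed).
CALL_FUNCTION = %r/([a-z_]\w*)[ \t]*(?=(\(.*\)))/m # starts lowercase or with `_`
CALL_CLASS    = %r/([A-Z_]\w*)[ \t]*(?=(\(.*\)))/m # starts uppercase or with `_`

# Hypothetical helper: classify the call at the start of source, trying the
# function pattern first, in the same order as the rules in the diff.
def classify_call(source)
  scanner = StringScanner.new(source)
  return [:function, scanner[1]] if scanner.scan(CALL_FUNCTION)
  return [:class, scanner[1]] if scanner.scan(CALL_CLASS)
  nil
end

p classify_call("evaluate(1, 2)") # prints [:function, "evaluate"]
p classify_call("Spam()")         # prints [:class, "Spam"]
p classify_call("_get_info()")    # prints [:function, "_get_info"]
p classify_call("spam = Spam()")  # prints nil
```

Because `_` appears in both character classes and the function pattern is tried first, a name with a leading underscore is classified as a function in this sketch. The last call returns `nil` because nothing at the start of the string is immediately followed by `(`.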
19 changes: 18 additions & 1 deletion spec/visual/samples/python
@@ -150,9 +150,26 @@ x @= y
f'{hello} world {int(x) + 1}'
f'{{ {4*10} }}'
f'result: {value:{width}.{precision}}'
-f'{value!r}
+f'{value!r}'

# Unicode identifiers
α = 10
def coöperative(б):
    return f"{б} is Russian"
+
+def __init__(self, input_dim: list, output_dim: int, **kwargs):
+    super(AverageEmbedding, self).__init__(**kwargs)
+    self.input_dim = input_dim
+    self.output_dim = output_dim
+
+@abstractmethod
+def evaluate(self, *args, **kwargs):
+    raise NotImplementedError
+
+def _get_metadata(self, input_samples: list[InputSample]) -> Tuple[str, str, int]:
+    project_name, id = self._get_info()
+
+class Spam:
+    pass
+
+spam = Spam()
