Releases: pyparsing/pyparsing
Pyparsing 3.2.1
- Updated generated railroad diagrams to make non-terminal elements links to their related sub-diagrams. This greatly improves navigation of the diagram, especially for large, complex parsers.
- Simplified railroad diagrams emitted for parsers using infix_notation, by hiding lookahead terms. Renamed internally generated expressions for clarity, and improved diagramming.
- Improved performance of cpp_style_comment, c_style_comment, common.fnumber and common.ieee_float Regex expressions. PRs submitted by Gabriel Gerlero, nice work, thanks!
- Added missing type annotations to match_only_at_col, replace_with, remove_quotes, with_attribute, and with_class. Issue #585 reported by rafrafrek.
- Added generated diagrams for many of the examples.
- Replaced old examples/0README.html file with examples/README.md file.
pyparsing 3.2.0
Version 3.2.0 - October, 2024
- Discontinued support for Python 3.6, 3.7, and 3.8. Adopted new Python features from Python versions 3.7-3.9:
  - Updated type annotations to use built-in container types instead of names imported from the typing module (e.g., list[str] vs List[str]).
  - Reworked portions of the packrat cache to leverage insertion-preserving ordering in dicts (including removal of uses of OrderedDict).
  - Changed pdb.set_trace() call in ParserElement.set_break() to breakpoint().
  - Converted typing.NamedTuple to dataclasses.dataclass in railroad diagramming code.
  - Added from __future__ import annotations to clean up some type annotations. (with assistance from ISyncWithFoo, issue #535, thanks for the help!)
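As a small illustration of the annotation style mentioned above, the sketch below uses built-in container types directly in annotations, which works at runtime on Python 3.9 and later. It is generic Python rather than code taken from pyparsing, and split_fields and count_fields are hypothetical names used only for this example.

```python
def split_fields(line: str, sep: str = ",") -> list[str]:
    """Return the stripped fields of a delimited line."""
    return [field.strip() for field in line.split(sep)]


def count_fields(lines: list[str]) -> dict[str, int]:
    """Map each line to its number of fields.

    A plain dict preserves insertion order, so no OrderedDict is needed.
    """
    return {line: len(split_fields(line)) for line in lines}


counts = count_fields(["a, b, c", "x, y"])
print(split_fields("a, b, c"))
print(list(counts))  # keys come back in the order they were inserted
```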
- POSSIBLE BREAKING CHANGES
  The following bugfixes may result in subtle changes in the results returned or exceptions raised by pyparsing.
  - Fixed code in ParseElementEnhance subclasses that replaced detailed exception messages raised in contained expressions with a less-specific and less-informative generic exception message and location.
    If your code has conditional logic based on the message content in raised ParseExceptions, this bugfix may require changes in your code.
  - Fixed bug in transform_string() where whitespace in the input string was not properly preserved in the output string.
    If your code uses transform_string, this bugfix may require changes in your code.
  - Fixed bug where an IndexError raised in a parse action was incorrectly handled as an IndexError raised as part of the ParserElement parsing methods, and reraised as a ParseException. Now an IndexError raised inside a parse action will properly propagate out as an IndexError. (Issue #573, reported by August Karlstedt, thanks!)
    If your code raises IndexErrors in parse actions, this bugfix may require changes in your code.
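To check how the IndexError change above affects existing code, a probe along the following lines may help. It is a minimal sketch: pick_missing_item and classify are hypothetical helpers, and the parse action contains a deliberate indexing bug. Which branch is taken depends on the installed pyparsing version.

```python
import pyparsing as pp


def pick_missing_item(tokens):
    # Deliberate bug: tokens holds a single word, so index 5 is out of range.
    return tokens[5]


word = pp.Word(pp.alphas).add_parse_action(pick_missing_item)


def classify(text: str) -> str:
    """Report which exception type escapes from parse_string()."""
    try:
        word.parse_string(text)
    except IndexError:
        # Behavior described in the 3.2.0 note: the IndexError propagates.
        return "IndexError"
    except pp.ParseException:
        # Earlier behavior: the error was reported as a ParseException.
        return "ParseException"
    return "parsed"


outcome = classify("hello")
print(outcome)
```

Code that previously caught ParseException to handle this kind of failure may need an additional except IndexError clause.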
- FIXES AND NEW FEATURES
  - Added type annotations to remainder of pyparsing package, and added mypy run to tox.ini, so that type annotations are now checked as part of pyparsing's CI. Addresses Issue #373, raised by Iwan Aucamp, thanks!
  - Exception message format can now be customized, by overriding ParseBaseException.formatted_message:

        def custom_exception_message(exc) -> str:
            found_phrase = f", found {exc.found}" if exc.found else ""
            return f"{exc.lineno}:{exc.column} {exc.msg}{found_phrase}"

        ParseBaseException.formatted_message = custom_exception_message

    (PR #571 submitted by Odysseyas Krystalakos, nice work!)
  - run_tests now detects if an exception is raised in a parse action, and will report it with an enhanced error message, with the exception type, string, and parse action name.
  - QuotedString now handles translation of escaped integer, hex, octal, and Unicode sequences to their corresponding characters.
  - Fixed the displayed output of Regex terms to deduplicate repeated backslashes, for easier reading in debugging, printing, and railroad diagrams.
  - Fixed (or at least reduced) elusive bug when generating railroad diagrams, where some diagram elements were just empty blocks. Fix submitted by RoDuth, thanks a ton!
  - Fixed railroad diagrams that get generated with a parser containing a Regex element defined using a verbose pattern - the pattern gets flattened and comments removed before creating the corresponding diagram element.
  - Defined a more performant regular expression used internally by common_html_entity.
  - Regex instances can now be created using a callable that takes no arguments and just returns a string or a compiled regular expression, so that creating complex regular expression patterns can be deferred until they are actually used for the first time in the parser.
  - Added optional flatten Boolean argument to ParseResults.as_list(), to return the parsed values in a flattened list.
  - Added indent and base_1 arguments to pyparsing.testing.with_line_numbers. When using with_line_numbers inside a parse action, set base_1=False, since the reported loc value is 0-based. indent can be a leading string (typically of spaces or tabs) to indent the numbered string passed to with_line_numbers. Added while working on #557, reported by Bernd Wechner.
- NEW/ENHANCED EXAMPLES
  - Added query syntax to mongodb_query_expression.py with:
    - better support for array fields ("contains", "contains all", "contains any", and "contains none")
    - "like" and "not like" operators to support SQL "%" wildcard matching and "=~" operator to support regex matching
    - text search using "search for"
    - dates and datetimes as query values
    - a[0] style array referencing
  - Added lox_parser.py example, a parser for the Lox language used as a tutorial in Robert Nystrom's "Crafting Interpreters" (http://craftinginterpreters.com/). With helpful corrections from RoDuth.
  - Added complex_chemical_formulas.py example, to add parsing capability for formulas such as "3(C₆H₅OH)₂".
  - Updated tag_emitter.py to use new Tag class, introduced in pyparsing 3.1.3.
pyparsing 3.2.0rc1
Changes since 3.2.0b3:
- Fixed handling of IndexError raised in a parse action.
- QuotedString parser now handles \xnn, \ooo, and \unnnn characters when convert_whitespace_escapes is True.
- Reformatted CHANGES file for final release.
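A quick way to see the QuotedString escape handling mentioned above is to parse a string containing hex, Unicode, and octal escapes and inspect the result. This is a sketch under the assumption that esc_char is set to a backslash and convert_whitespace_escapes is left at its default; the sample input is made up, and the printed value depends on the installed pyparsing version.

```python
import pyparsing as pp

# A double-quoted string with backslash as the escape character.
quoted = pp.QuotedString('"', esc_char="\\")

# Raw string so the backslashes reach the parser unchanged:
# \xe9 is a hex escape, \u2603 a Unicode escape, \101 an octal escape.
raw = r'"caf\xe9 \u2603 \101"'

value = quoted.parse_string(raw)[0]
print(repr(value))
```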
pyparsing 3.2.0b3
(This is the final beta release before 3.2.0.)
- QuotedString now handles translation of escaped integer, hex, octal, and Unicode sequences to their corresponding characters.
pyparsing 3.2.0b2
- Added type annotations to remainder of pyparsing package, and added mypy run to tox.ini, so that type annotations are now checked as part of pyparsing's CI. Addresses Issue #373, raised by Iwan Aucamp, thanks!
- Exception message format can now be customized, by overriding ParseBaseException.formatted_message:

      def custom_exception_message(exc) -> str:
          found_phrase = f", found {exc.found}" if exc.found else ""
          return f"{exc.lineno}:{exc.column} {exc.msg}{found_phrase}"

      ParseBaseException.formatted_message = custom_exception_message

  (PR #571 submitted by Odysseyas Krystalakos, nice work!)
- POSSIBLE BREAKING CHANGE: Fixed bug in transform_string() where whitespace in the input string was not properly preserved in the output string.
  If your code uses transform_string, this bugfix may require changes in your code.
- Fixed railroad diagrams that get generated with a parser containing a Regex element defined using a verbose pattern - the pattern gets flattened and comments removed before creating the corresponding diagram element.
- Defined a more performant regular expression used internally by common_html_entity.
- Regex instances can now be created using a callable that takes no arguments and just returns a string or a compiled regular expression, so that creating complex regular expression patterns can be deferred until they are actually used for the first time in the parser.
- Added optional flatten Boolean argument to ParseResults.as_list(), to return the parsed values in a flattened list.
Pyparsing 3.2.0b1
- Discontinued support for Python 3.6, 3.7, and 3.8. Adopted new Python features from Python versions 3.7-3.9:
  - Updated type annotations to use built-in container types instead of names imported from the typing module (e.g., list[str] vs List[str]).
  - Reworked portions of the packrat cache to leverage insertion-preserving ordering in dicts.
  - Changed pdb.set_trace() call in ParserElement.set_break() to breakpoint().
  - Converted typing.NamedTuple to dataclasses.dataclass in railroad diagramming code.
  - Added from __future__ import annotations to clean up some type annotations.
- POSSIBLE BREAKING CHANGE: Fixed code in ParseElementEnhance subclasses that replaced detailed exception messages raised in contained expressions with a less-specific and less-informative generic exception message and location.
  If your code has conditional logic based on the message content in raised ParseExceptions, this bugfix may require changes in your code.
- Fixed the displayed output of Regex terms to deduplicate repeated backslashes, for easier reading in debugging, printing, and railroad diagrams.
- Fixed (or at least reduced) elusive bug when generating railroad diagrams, where some diagram elements were just empty blocks. Fix submitted by RoDuth, thanks a ton!
- Added indent and base_1 arguments to pyparsing.testing.with_line_numbers. When using with_line_numbers inside a parse action, set base_1=False, since the reported loc value is 0-based. indent can be a leading string (typically of spaces or tabs) to indent the numbered string passed to with_line_numbers. Added while working on #557, reported by Bernd Wechner.
- Added query syntax to mongodb_query_expression.py with better support for array fields ("contains", "contains all", "contains any", and "contains none"); and "like" and "not like" operators to support SQL "%" wildcard matching and "=~" operator to support regex matching. Also:
  - added support for dates and datetimes as query values
  - added support for a[0] style array referencing
- Added lox_parser.py example, a parser for the Lox language used as a tutorial in Robert Nystrom's "Crafting Interpreters" (http://craftinginterpreters.com/). With helpful corrections from RoDuth.
- Added complex_chemical_formulas.py example, to add parsing capability for formulas such as "3(C₆H₅OH)₂".
Pyparsing 3.1.4
- Fixed a regression introduced in pyparsing 3.1.3: the addition of a type annotation that referenced re.Pattern. Since this type was introduced in Python 3.7, using this type definition broke Python 3.6 installs of pyparsing 3.1.3. PR submitted by Felix Fontein, nice work!
Pyparsing 3.1.3
- Added new Tag ParserElement, for inserting metadata into the parsed results. This allows a parser to add metadata or annotations to the parsed tokens. The Tag element also accepts an optional value parameter, defaulting to True. See the new tag_metadata.py example in the examples directory.

  Example:

      from pyparsing import Tag, Word, alphas

      # add tag indicating mood
      end_punc = "." | ("!" + Tag("enthusiastic"))
      greeting = "Hello" + Word(alphas) + end_punc

      result = greeting.parse_string("Hello World.")
      print(result.dump())

      result = greeting.parse_string("Hello World!")
      print(result.dump())

  prints:

      ['Hello', 'World', '.']

      ['Hello', 'World', '!']
      - enthusiastic: True

- Added example mongodb_query_expression.py, which converts human-readable infix query expressions (such as a==100 and b>=200) into the equivalent query argument for the pymongo package ({'$and': [{'a': 100}, {'b': {'$gte': 200}}]}). Supports many equality and inequality operators - see the docstring for the transform_query function for more examples.
- Fixed issue where PEP8 compatibility names for ParserElement static methods were not themselves defined as staticmethods. When called using a ParserElement instance, this resulted in a TypeError exception. Reported by eylenburg (#548).
- To address a compatibility issue in RDFLib, added a property setter for the ParserElement.name property, to call ParserElement.set_name.
- Modified ParserElement.set_name() to accept a None value, to clear the defined name and corresponding error message for a ParserElement.
- Updated railroad diagram generation for ZeroOrMore and OneOrMore expressions with stop_on expressions, while investigating #558, reported by user Gu_f.
- Added <META> tag to HTML generated for railroad diagrams to force UTF-8 encoding with older browsers, to better display Unicode parser characters.
- Fixed some cosmetics/bugs in railroad diagrams:
  - fixed groups being shown even when show_groups=False
  - show results names as quoted strings when show_results_names=True
  - only use integer loop counter if repetition > 2
- Some type annotations added for parse action related methods, thanks August Karlstedt (#551).
- Added exception type to trace_parse_action exception output, while investigating SO question posted by medihack.
- Added set_name calls to internal expressions generated in infix_notation, for improved railroad diagramming.
- delta_time, lua_parser, decaf_parser, and roman_numerals examples cleaned up to use latest PEP8 names and add minor enhancements.
- Fixed bug (and corresponding test code) in delta_time example that did not handle weekday references in time expressions (like "Monday at 4pm") when the weekday was the same as the current weekday.
- Minor performance speedup in trim_arity, to benefit any parsers using parse actions.
- Added early testing support for Python 3.13 with JIT enabled.
Pyparsing 3.1.2
- Support for Python 3.13.
- Added ieee_float expression to pyparsing.common, which parses float values, plus "NaN", "Inf", "Infinity". PR submitted by Bob Peterson (#538).
- Updated pep8 synonym wrappers for better type checking compatibility. PR submitted by Ricardo Coccioli (#507).
- Fixed empty error message bug, PR submitted by InSync (#534). This should return pyparsing's exception messages to a former, more helpful form. If you have code that parses the exception messages returned by pyparsing, this may require some code changes.
- Added unit tests to test for exception message contents, with enhancement to pyparsing.testing.assertRaisesParseException to accept an expected exception message.
- Updated example select_parser.py to use PEP8 names and added Groups for better retrieval of parsed values from multiple SELECT clauses.
- Added example email_address_parser.py, as suggested by John Byrd (#539).
- Added example directx_x_file_parser.py to parse DirectX template definitions, and generate a Pyparsing parser from a template to parse .x files.
- Some code refactoring to reduce code nesting, PRs submitted by InSync.
- All internal string expressions using '%' string interpolation and str.format() converted to f-strings.
Pyparsing 3.1.1
- Fixed regression in Word(min), reported by Ricardo Coccioli, good catch! (Issue #502)
- Fixed bug in bad exception messages raised by Forward expressions. PR submitted by Kyle Sunden, thanks for your patience and collaboration on this (#493).
- Fixed regression in SkipTo, where ignored expressions were not checked when looking for the target expression. Reported by catcombo, Issue #500.
- Fixed type annotation for enable_packrat, PR submitted by Mike Urbach, thanks! (Issue #498)
- Some general internal code cleanup. (Instigated by Michal Čihař, Issue #488)