Multiline still not working with single parsable lines in statement #11
Comments
Update: This still needs work, because multiline statements containing lines that are parsable on their own (with ast) crash the program. |
Maybe this is covered by the last sentence above (I couldn't understand it), but keep in mind cases like this:

if (
        3 +
        "String"):
    foo()
bar()

FYI, the way stack_data works is that it parses the entire file and then extracts what it needs from there, so it never parses partial broken code. Are you still planning on collapsing multiple lines into one? That could lead to some very long lines. |
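To make the failure mode concrete, here is a minimal sketch using only the standard library's ast module. The sample source mirrors the snippet above, and the helper name `parses_alone` is made up for illustration. It checks which physical lines parse on their own and confirms that the snippet as a whole parses without trouble.

```python
import ast

# Same shape as the example above: a condition spread over three lines.
SOURCE = '''if (
        3 +
        "String"):
    foo()
bar()
'''


def parses_alone(line: str) -> bool:
    """Return True if a single physical line is valid Python on its own."""
    try:
        ast.parse(line.strip())
        return True
    except SyntaxError:
        return False


# Per-line results: the three lines of the condition fail, the calls succeed.
results = [parses_alone(line) for line in SOURCE.splitlines()]

# Parsing the whole source never sees a partial statement.
whole_tree = ast.parse(SOURCE)
```

Note that `foo()` parses fine in isolation, but parsing it alone loses the fact that it belongs to the body of the `if` statement, which is only visible in the tree built from the full source.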
@alexmojaki I tried using stack_data to extract the code from the tracebacks, but didn't have much success. The problem was that the traceback I get contains frames of type FrameSummary, which either stack_data or its dependencies could not handle. Or maybe I am using the lib wrong.
To recap: the program crashes in a multiline statement. We get the traceback and therefore the current line (the line of the crash) and its line number. We then call ast.parse(line) to inspect the variables used. This crashes because Python only gives us the first line in which the runtime error appears.
@alexmojaki can you provide me with a little example of how stack_data can be useful here? You say that it parses the entire file. Can we then deduce the full statement from a given single line?
And yes, the whole multiline statement is collapsed into one line. The ConsoleWriter is not (yet) sophisticated enough to allow |
Indeed, you need to pass an actual traceback object, e.g. the last argument of
Yes, that's basically it. We know where each node starts and ends so we look for the one containing the current line.
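A rough standard-library approximation of "look for the node containing the current line" might look like the following. This is not stack_data's implementation; it only relies on the `lineno` and `end_lineno` attributes that `ast` attaches to statement nodes (Python 3.8+), and the helper name `statement_at_line` is invented for this sketch.

```python
import ast

SOURCE = '''import json
print(
    json.loads(" s s")
)
'''


def statement_at_line(tree: ast.AST, lineno: int):
    """Return the smallest statement whose line span contains lineno, or None."""
    best = None
    for node in ast.walk(tree):
        if not isinstance(node, ast.stmt):
            continue
        if node.lineno <= lineno <= node.end_lineno:
            span = node.end_lineno - node.lineno
            # Prefer the statement covering the fewest lines (the innermost one).
            if best is None or span < best.end_lineno - best.lineno:
                best = node
    return best


tree = ast.parse(SOURCE)

# Line 3 is the middle of the multiline print(...) call.
stmt = statement_at_line(tree, 3)

# Recover the full text of the statement from the original source.
statement_text = ast.get_source_segment(SOURCE, stmt)
```

Given only the line number reported by the traceback, this recovers the full multiline statement together with its AST node, so nothing has to be parsed from a fragment.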
Right, this isn't just a technical problem, it's a design problem that may have no solution. better-exceptions has the same look and faces the same problem. I tried to brainstorm solutions in https://github.com/Qix-/better-exceptions/issues/92 but didn't get anywhere. Your collapsing idea is a neat and clever solution but it can easily get out of hand. |
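On the FrameSummary problem raised earlier in the thread: the difference between a real traceback object and the summaries produced by `traceback.extract_tb` can be seen with the standard library alone. The function name `capture` below is just an illustration.

```python
import traceback
import types


def capture():
    """Raise and catch an error, returning both views of the same traceback."""
    try:
        int("not a number")
    except ValueError as exc:
        tb = exc.__traceback__                 # the real traceback object
        summaries = traceback.extract_tb(tb)   # pre-digested FrameSummary entries
        return tb, summaries


tb, summaries = capture()

is_real_traceback = isinstance(tb, types.TracebackType)
is_summary = isinstance(summaries[-1], traceback.FrameSummary)

# The traceback object still references the live frame (code object, locals).
has_live_frame = hasattr(tb, "tb_frame")

# A FrameSummary only carries filename, line number, name and source text.
summary_has_frame = hasattr(summaries[-1], "tb_frame")
```

A FrameSummary has already thrown away the frame and its code object, which is presumably why a library that inspects frames cannot work from it; keeping `exc.__traceback__` around avoids the problem.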
There's a new

import stack_data

formatter = stack_data.Formatter(
    options=stack_data.Options(before=0, after=0),
)
formatter.set_hook()

# Trigger an exception
import json
print(
    json.loads(" s s")
)

I do need to improve the documentation and such, but motivation is limited because the library is intended for a very niche audience, mostly library writers like yourself. I'm happy to help you integrate it though. |
Okay, after some experimenting I think I got what I want.

frame_info = stack_data.FrameInfo(traceback_)
t = frame_info.executing.source.statements_at_line(last_stack.lineno)
t = t.pop()
print(ast.dump(t))  # Contains the statement I am looking for

I dug a little through stack_data and found out that your
@alexmojaki Can you comment on the code snippet and tell me if there is a cleaner way to get the AST nodes of the statement? |
The problem is you don't really want a whole statement. If the statement is a for loop or something else massive you're in trouble. That's why stack_data has the concept of a piece. Here's some code:

import ast
import stack_data

class FlatFormatter(stack_data.Formatter):
    def format_frame(self, frame):
        yield self.format_frame_header(frame)
        collapsed = "".join(line.text for line in frame.lines)
        yield f"{collapsed}\n"
        cumulative_offset = 0
        for line in frame.lines:
            for var, node in frame.variables_by_lineno[line.lineno]:
                if isinstance(node, ast.Name):
                    token = node.first_token
                elif isinstance(node, ast.Attribute):
                    token = node.last_token
                else:
                    # Not clear how to point to subscripts
                    continue
                offset = cumulative_offset + token.start[1]
                yield " " * offset + f"^ = {var.value!r}\n"
            cumulative_offset += len(line.text)

formatter = FlatFormatter(
    options=stack_data.Options(before=0, after=0),
)
formatter.set_hook()

# Trigger an exception
import json
formatter.j = " s s"
try:
    for _ in (
        json.loads(formatter.j)
    ):
        pass
except:
    formatter.print_exception()

Result:

Traceback (most recent call last):
File "/home/alex/.config/JetBrains/PyCharm2020.2/scratches/scratch_979.py", line 37, in <module>
for _ in ( json.loads(formatter.j) ):
^ = <__main__.FlatFormatter object at 0x7f7a09111a60>
^ = ' s s'
File "/home/alex/.pyenv/versions/3.8.5/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
^ = ' s s'
^ = <json.decoder.JSONDecoder object at 0x7f7a08e44d30>
File "/home/alex/.pyenv/versions/3.8.5/lib/python3.8/json/decoder.py", line 337, in JSONDecoder.decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^ = ' s s'
^ = ' s s'
^ = <json.decoder.JSONDecoder object at 0x7f7a08e44d30>
^ = <built-in method match of re.Pattern object at 0x7f7a08828930>
File "/home/alex/.pyenv/versions/3.8.5/lib/python3.8/json/decoder.py", line 355, in JSONDecoder.raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
^ = ' s s'
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)
|
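The offset arithmetic in the formatter above can be isolated in a small standard-library sketch. It is not the stack_data code: it uses `col_offset` from plain `ast` nodes instead of asttokens token positions, the function `collapse_with_markers` and the names `load` and `payload` are hypothetical, and it assumes ASCII source (`col_offset` counts UTF-8 bytes, not characters).

```python
import ast

SOURCE = '''for _ in (
    load(payload)
):
    pass
'''


def collapse_with_markers(source: str, first: int, last: int):
    """Join lines first..last into one line and locate each ast.Name in it."""
    piece = source.splitlines()[first - 1:last]
    collapsed = "".join(piece)

    # Offset at which each physical line starts inside the collapsed string.
    starts = {}
    running = 0
    for lineno, text in enumerate(piece, start=first):
        starts[lineno] = running
        running += len(text)

    markers = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and first <= node.lineno <= last:
            # Column in the collapsed line = start of its line + column within it.
            markers.append((node.id, starts[node.lineno] + node.col_offset))
    return collapsed, sorted(markers, key=lambda m: m[1])


collapsed, markers = collapse_with_markers(SOURCE, 1, 3)

# Render one caret line per variable, similar in spirit to the output above.
caret_lines = [" " * offset + "^ " + name for name, offset in markers]
```

Each marker offset is the running length of the preceding lines plus the node's column on its own line, which is the same bookkeeping `cumulative_offset` does in the formatter.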
I could not really wrap my head around the formatters from stack_data, but I managed to use the "pieces".

def extract_statement_piece(traceback_: TracebackType, last_stack) -> List[List[Token]]:
    """Get frame infos and get code pieces (from stack_data) by line
    of crash """
    frame_info = stack_data.FrameInfo(traceback_)
    pieces = frame_info.executing.source.pieces
    tokens = frame_info.executing.source.tokens_by_lineno
    statement_piece_tokens = []
    for piece in pieces:
        if last_stack.lineno in list(piece):
            for line in list(piece):
                statement_piece_tokens.append(tokens[line])
    return statement_piece_tokens

This can probably be cleaned up, but it now works with multiline statements as well as for loops, if statements, etc. For now I will mark this issue as solved. |
It worries me that all the AST info has been lost and now you just have a list of tokens, especially if you're going to be using that to point to variables. Working with tokens doesn't go well. Some examples of my experience with this:
Here is the same code from before without formatters:

import ast
import stack_data

def format_frame(tb):
    frame = stack_data.FrameInfo(tb, stack_data.Options(before=0, after=0))
    collapsed = "".join(line.text for line in frame.lines)
    yield f"{collapsed}\n"
    cumulative_offset = 0
    for line in frame.lines:
        for var, node in frame.variables_by_lineno[line.lineno]:
            if isinstance(node, ast.Name):
                token = node.first_token
            elif isinstance(node, ast.Attribute):
                token = node.last_token
            else:
                # Not clear how to point to subscripts
                continue
            offset = cumulative_offset + token.start[1]
            yield " " * offset + f"^ = {var.value!r}\n"
        cumulative_offset += len(line.text)

# Trigger an exception
def main():
    import json
    json.j = " s s"
    try:
        for _ in (
            json.loads(json.j)
        ):
            pass
    except Exception as e:
        for line in format_frame(e.__traceback__):
            print(line, end='')

main()

By the way, |
I am only indirectly pointing to variables. All I want are the tokens of the statement, or of the piece of the statement. Those are then formatted, and afterwards we apply the value annotations, so there are no offset calculations on the actual tokens. You are probably right that there will be some drawbacks, maybe if we want to do more sophisticated stuff, but for now this should work. |
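The concern about losing AST information when working from tokens alone can be illustrated with the standard library's tokenize and ast modules. This is only a sketch on a made-up line of code, not the token type or data that stack_data provides.

```python
import ast
import io
import tokenize

# `value` appears twice: once as an attribute, once as a variable.
LINE = "result = obj.value + value\n"

# The token stream reports four NAME tokens with no further distinction.
name_tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(LINE).readline)
    if tok.type == tokenize.NAME
]

# The AST has only three Name nodes, each tagged as a load or a store;
# the attribute `.value` is stored on an ast.Attribute node instead.
name_nodes = sorted(
    (node.id, type(node.ctx).__name__)
    for node in ast.walk(ast.parse(LINE))
    if isinstance(node, ast.Name)
)
```

From the tokens alone, the attribute `value` and the variable `value` look identical, so annotating variables by token text could attach a value to the wrong occurrence; the AST keeps them apart.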
Tested in cmd