-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New generate_query_graph tool. #9212
Conversation
…package. Realized this is useful in case people who use the python utility want to inspect join graphs This reverts commit 9661e2c.
…extra 'nice' features
scripts/generate_querygraph.py
Outdated
text = f.read() | ||
|
||
if open_output: | ||
os.system('open "' + output.replace('"', '\\"') + '"') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the webbrowser module here instead
tools/pythonpkg/duckdb/__init__.py
Outdated
_exported_symbols.extend([ | ||
"typing", | ||
"functional" | ||
"functional", | ||
"duckdb_query_graph" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the module name needs to include duckdb
total_timing_sum += self.get_summary_phase_timings(phase).time | ||
return total_timing_sum | ||
|
||
QUERY_TIMINGS = AllTimings() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this being reset anywhere, can the graph generator only be used once per interpreter session?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the idea yea, but it's probably good to just reset this every time a new profile output is passed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to not use a global to begin with probably?
scripts/generate_querygraph.py
Outdated
@@ -0,0 +1,45 @@ | |||
# generate_querygraph.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's include this code and ship it as an executable module instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that means it runs as python3 -m generate_querygraph explain_output.json
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like
python -m duckdb.query_graph ...
But basically yeah. Could even publish it an standalone executable on the user machines that way too
scripts/generate_querygraph.py
Outdated
|
||
sys.path.insert(0, 'benchmark') | ||
|
||
arguments = sys.argv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using the argparse
library makes this much less cumbersome
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
Pressed the re-review button too early. Wait a couple of minutes |
text = f.read() | ||
|
||
if open_output: | ||
webbrowser.open('file:///Users/tomebergen/scripts/run_imdb/tpch_new_html/q08.html', new=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to replace this?
translate_json_to_html(input, output) | ||
|
||
with open(output, 'r') as f: | ||
text = f.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this for?
'chart_script': '' | ||
} | ||
|
||
# ???? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
????
# if detailed profiling is enabled, then information on the timing of each stage of the query is available | ||
# that is "top level timing information" | ||
def get_top_level_timings(json): | ||
return [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to contain something?
|
||
def open_utf8(fpath, flags): | ||
import sys | ||
if sys.version_info[0] < 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't support python 2 anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just inline this function, no warning required
scripts/generate_querygraph.py
Outdated
@@ -0,0 +1,45 @@ | |||
# generate_querygraph.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file can be deleted now right?
…e unused code, remove support for python2
|
||
def open_utf8(fpath, flags): | ||
import sys | ||
if sys.version_info[0] < 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just inline this function, no warning required
# add up all of the times | ||
# measure each time as a percentage of the total time. | ||
# then you can return a list of [phase, time, percentage] | ||
child_timings = get_child_timings(json['children'][0], query_timings) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: this variable is never used, just call the function
f.write(html) | ||
|
||
def main(): | ||
sys.path.insert(0, 'benchmark') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have you added this line?
@Mause Some of those lines are leftover from the old script which I dropped in. Sorry for not thoroughly checking it. Thank you for the reviews 🙏 ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just need to run the make format-fix
script and update for the stubs test
Ready from my side 👍 |
Thanks! |
The generate query graph code was taken out in #6972. Here I'm attempting to add it back in the generate query graph code. Using some pastel colors but it can honestly be anything.
Some goals for this version of the query graph
Things to improve
open to some initial feedback right now about colors/design.
You can check out the branch and generate the query graph with the following commands while in the duckdb directory.
Old query graph generator output
New query graph generator output