Clarify the semantics of "query timeouts" #13156
Comments
Good question. I think we should define a timeout from the user's side, as this gives us maximum flexibility for the solution. So explicitly: a timeout is when a user starts a query but doesn't get a result back, either because the query was too slow or because ClickHouse cancelled it in the belief that it would be too slow. The user very likely ends up seeing "There was an error completing this query". If we don't already differentiate between queries that failed because they were too slow and queries that failed for other reasons, we'd likely want to track both, but for this OKR focus only on the failures caused by slowness. |
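To make the "too slow vs. other failure" split concrete, here is a minimal sketch of how failures could be bucketed for tracking. It assumes the ClickHouse error code is available when a query fails, and that codes 159 (`TIMEOUT_EXCEEDED`) and 160 (`TOO_SLOW`) are the relevant ones; both should be checked against the ClickHouse version in use. `QueryOutcome` and `classify_query_outcome` are hypothetical names, not existing code in this repo.

```python
from enum import Enum
from typing import Optional

# Assumed ClickHouse server error codes; verify against the deployed version.
TIMEOUT_EXCEEDED = 159  # query ran past max_execution_time
TOO_SLOW = 160          # ClickHouse estimated the query would take too long


class QueryOutcome(Enum):
    OK = "ok"
    TIMEOUT = "timeout"              # user got no result because the query was too slow
    OTHER_FAILURE = "other_failure"  # any other error


def classify_query_outcome(error_code: Optional[int]) -> QueryOutcome:
    """Map a ClickHouse error code (None on success) to a user-facing outcome."""
    if error_code is None:
        return QueryOutcome.OK
    if error_code in (TIMEOUT_EXCEEDED, TOO_SLOW):
        return QueryOutcome.TIMEOUT
    return QueryOutcome.OTHER_FAILURE
```

This only covers cancellations reported by ClickHouse itself. A request that is cut off elsewhere (for example by a proxy or the browser) would never produce an error code and would need separate tracking to count as a timeout under the user-side definition.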
On a side note: a quick win I've seen is that if a user sees "There was an error completing this query", we should likely guide them on how to reduce the query complexity to speed it up. In reproducing the timeout, I was able to "fix it" by shortening the time range from "All time" to "past 180 days". It's likely many users wouldn't consider this, e.g. because they aren't familiar with SQL or aren't thinking about the mechanics under the hood. |
@macobo how are we doing with deciding on this OKR? Would be great to get this finalized by EOD tomorrow (Thursday) if feasible |
Discussed in sync, and the conclusion: "Aligned that timeouts are very important and a cause of frustration - but we don't need to have this at an OKR level, as it will take a sprint to properly set up all the analytics for it" |
@macobo up to you if you want to keep this issue open or close it |
I'll keep this open - we should definitely fix the problems listed here in the upcoming sprints. I'd love to get help on it from someone more UI/UX focussed than I am, though. |
Raw notes from chat with Paul:
|
Background context
When talking about query performance, a topic that frequently comes up is "timeouts". However, I argue that the current system of timeouts is haphazard and ill-specified.
We should clarify what we want out of "timeouts" in our product and how to action them.
Notes on the existing system
max_execution_time=180
into our settings on US and EU cloud: 1, 2.
Which of these is a "timeout"? Should we throw all of this out and rethink it all from scratch? I propose so.
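For reference, `max_execution_time` does not have to be set only at the settings level. ClickHouse also accepts it per query through a trailing `SETTINGS` clause. A minimal sketch of that, assuming the query is a plain `SELECT` string with no `SETTINGS` or `FORMAT` clause of its own; `with_execution_limit` is a hypothetical helper, not something that exists in the codebase:

```python
def with_execution_limit(sql: str, seconds: int = 180) -> str:
    """Append a per-query max_execution_time limit to a SELECT statement.

    Naive by design: it assumes `sql` has no existing SETTINGS or FORMAT clause.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Drop trailing whitespace and a trailing semicolon before appending.
    base = sql.rstrip().rstrip(";").rstrip()
    return f"{base} SETTINGS max_execution_time = {seconds}"
```

A per-query limit like this is yet another place a "timeout" can come from, which is part of why the semantics need to be pinned down before adding more of them.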
Additional context
Not proposing a solution now - this is enough of a mess.
cc @lharries and @pauldambra - we need to clarify all of these semantics before we settle on any OKRs.