-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create or use user's Session
in WorkflowsService
to reuse connections
#1183
Create or use user's Session
in WorkflowsService
to reuse connections
#1183
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1183 +/- ##
=======================================
+ Coverage 81.7% 81.9% +0.2%
=======================================
Files 54 61 +7
Lines 4208 4750 +542
Branches 889 1015 +126
=======================================
+ Hits 3439 3892 +453
- Misses 574 625 +51
- Partials 195 233 +38 ☔ View full report in Codecov by Sentry. |
src/hera/events/service.py
Outdated
self.namespace = namespace or global_config.namespace | ||
|
||
def request(self, method, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd mark this private; I don't think we want this added to the public API. (Same for WorkflowsService.)
def request(self, method, **kwargs): | |
def _request(self, method, **kwargs): |
src/hera/events/service.py
Outdated
@@ -47,6 +47,7 @@ def __init__( | |||
token: Optional[str] = None, | |||
client_certs: Optional[Tuple[str, str]] = None, | |||
namespace: Optional[str] = None, | |||
use_session: Optional[bool] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it not be more flexible to pass in a session
directly? (Same for WorkflowsService.)
Just out of curiosity - is there any reason not to enable it by default, i.e. is it not a strict enhancement to go from Thinking about client usage of with WorkflowsService(...) as ws:
ws.create(...)
ws.list(...) This would implicitly use a single session. If we want to also allow users to pass in the session for flexibility as per @alicederyn's comment, then they can follow a similar pattern. with request.Session(...) as session:
ws = WorkflowsService(session=session, ...)
ws.create(...)
ws.list(...) It seems reusing a "closed" session just reopens it, and you can close multiple times, so using a "dead" session doesn't seem to be a problem: >>> import requests
>>> session = requests.Session()
>>> session.get("https://example.com/")
<Response [200]>
>>> session.close()
>>> session.get("https://example.com/")
<Response [200]>
>>> session.close()
>>> session.close() |
I was unable to find a clear statement that sessions were thread-safe in requests. I did find a closed issue asking for it to be documented one way or another but I couldn't tell what the close status was intended to be. I'm also not sure if the service classes are intended to be thread-safe themselves? |
Oh, I fully agree :) and I think its fits anyway the "spirit" of having a WorkflowsService: it makes sense for it to hold a session to the Argo Server for the duration of its existence. I just initially thought of introducing this change as opt-in, but if we all agree then it can be the default. Note that the Workflow class creates a default WorkflowsService if not provided, so this would also implicitly initialize and keep one session alive for the duration of the Workflow object (i.e. keep connections opened if calls were made) - and this would also not call it as context manager. So maybe the default service for workflows shouldn't use a session?
I can add a
Perhaps it would be better to let the WorkflowService open and close the user provided Session, rather than relying on the user to open/close it. Also because we need to do it anyway if the user doesn't pass a custom Session and we use the regular one. session = MyCustomSession()
with WorkflowsService(session=session) as ws:
# will open/close the session
ws.list(...)
with Workflow(..., workflows_service=ws) as wf1:
...
with Workflow(..., workflows_service=ws) as wf2:
...
with Workflow(...) as wf3:
# wf3.workflows_service != ws
... It is probably better to still have to explicitly pass the service to any Workflow created under the context manager block (
It's a long-standing question indeed, I've always sided with the "better safe than sorry" approach and used one session per thread. But since the urllib3 pool is thread safe in the newer versions (2022+), it is probably fine
I feel like it should be thread safe: If you have tens of Workflows to submit at once, it makes sense to use one service and reuse connections. Else you just end up with the current behavior of opening one session thus one connection per call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw the checks failing with Code is not up-to-date. Please run 'make codegen'
- I'll run this with the proposed fix in __exit__
, then LGTM! 🚀
scripts/service.py
Outdated
|
||
def __exit__(self, *_): | ||
\"\"\"Close the service.\"\"\" | ||
self.session.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case we add any more teardown logic, I think we should use close
self.session.close() | |
self.close() |
|
||
def __exit__(self, *_): | ||
"""Close the service.""" | ||
self.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you had it before but didn't regenerate the code?
Signed-off-by: Timothé Perez <achille.ash@gmail.com>
Signed-off-by: Timothé Perez <achille.ash@gmail.com>
Signed-off-by: Elliot Gunton <elliotgunton@gmail.com>
b179740
to
adad63e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! 🚀
Session
in WorkflowsService
to reuse connections
Pull Request Checklist
Description of PR
The WorkflowService uses direct calls to requests library, which under the hood creates a Session, makes the request, then closes the session.
This means that the TCP connection is opened and closed for each call instead of being reused - which causes unnecessary overhead if the service is making multiple calls.
This PR adds the possibility to use a Session object in the Service, by default not enabled to keep the current behavior as-is.
From a performance point of view, if Argo is running locally the advantage will be virtually non-existent ; however, we can simulate the outcome with a remote Argo server using toxiproxy:
Using ipython and the magic command %timeit:
In toxiproxy logs (and the kubectl port-forward logs) , we can see that the tcp open is done now only once.
Thanks!