Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ringhash: fix a couple of flakes in e2e style tests #7784

Merged
merged 1 commit into from
Oct 29, 2024

Conversation

easwars
Copy link
Contributor

@easwars easwars commented Oct 25, 2024

Fixes #7464
Fixes #7777

Summary of fixes:

  • To simulate unresponsive backends, we used to create a listener and immediately close it. What was happening was that the port was being returned to the operating system and was getting used by some other test that was running in parallel. (FYI: even though we don't do t.Parallel in our tests, the go test command runs tests from different packages in parallel, and therefore the port could get picked by a test from a different package)
    • To fix this, I switched to using a RestartableListener that holds on to the port, but closes the connection as soon as it accepts one.
  • Some tests were calling cc.GetState to verify that the state of the grpc client. I switched them to use testutils.AwaitState to make them more forgiving.
    • Note that I have used testutils.AwaitState in some places where cc.GetState might do the job. I did this before the former uses less lines of code and does the same job anyways.

RELEASE NOTES: none

Copy link

codecov bot commented Oct 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.62%. Comparing base (52d7f6a) to head (300bd26).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7784      +/-   ##
==========================================
- Coverage   81.74%   81.62%   -0.12%     
==========================================
  Files         374      374              
  Lines       38149    38149              
==========================================
- Hits        31185    31141      -44     
- Misses       5689     5723      +34     
- Partials     1275     1285      +10     

see 17 files with indirect coverage changes

@easwars easwars added this to the 1.68 Release milestone Oct 25, 2024
Copy link
Member

@arvindbr8 arvindbr8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, modulo one comment

// We don't close these listeners here, to make sure ports are
// not reused across them, and across tests.
lis.Stop()
t.Cleanup(func() { lis.Close() })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was concerned if this would slow down our tests. But seems like we have ports ~ 49152 to 65535 available to use

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't seen us running out of ports so far. So, I'm not worried about it at this point.

Comment on lines +240 to +244
go func() { lis.Accept() }()

// We don't close these listeners here, to make sure ports are
// not reused across them, and across tests.
lis.Stop()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These statements are basically contesting for the lock in RestartableListener.
Could we potentially avoid the lis.Accept call altogether and simply call stop directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the call to Accept, connections from the client will hang instead of fail (which is what we want). In fact, I didn't have the Accept in there and I spent a while debugging the test, until I figured out that we need an Accept in there :(

@arvindbr8
Copy link
Member

GA checks are lost in limbo?

@easwars
Copy link
Contributor Author

easwars commented Oct 29, 2024

GA checks are lost in limbo?

Looks like force push is the way out. I will give that a try.

@easwars easwars merged commit 2e3f547 into grpc:master Oct 29, 2024
15 checks passed
@easwars easwars deleted the ringhash_flakes branch October 29, 2024 20:00
misvivek pushed a commit to misvivek/grpc-go that referenced this pull request Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants