Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use network buffer less aggressive in ping probe #634

Merged
merged 2 commits into from
Nov 10, 2023

Conversation

jumpojoy
Copy link
Contributor

@jumpojoy jumpojoy commented Nov 9, 2023

When have higher number of targets (seen on 600+), no matter in single
probe or in different probes, we may cause network buffer overflow as
we add packets to the buffer much faster than NIC can handle them.
To avoid this we may increase net.core.wmem sysctl, or add 1 milisecond
dalay between buffer writes (which is done by this patch). This allows
to have stable results for even 2k targets in single probe.

Also patch improves logging

  • Added debug log when probe is started.
  • Do not print duplicate p.name as it is always added by logger,
    add p.runCnt instead
  • Improve debug message for a case when packet does not match runId
    this is usually happening when we received packet after probe timeout.

@manugarg
Copy link
Contributor

manugarg commented Nov 9, 2023

Thanks @jumpojoy!

This changes makes sense to me. Can you please sync your repo and send it again? There is a Dockerfile.dev in the PR causing conflicts.

@jumpojoy jumpojoy force-pushed the fix_buffer_overflow branch from 6fc0728 to 0ef6a08 Compare November 10, 2023 07:42
@jumpojoy
Copy link
Contributor Author

Thanks for review @manugarg, done.

When have higher number of targets (seen on 600+), no matter in single
probe or in different probes, we may cause network buffer overflow as
we add packets to the buffer much faster than NIC can handle them.
To avoid this we may increase net.core.wmem sysctl, or add 1 milisecond
dalay between buffer writes (which is done by this patch). This allows
to have stable results for even 2k targets in single probe.

Also patch improves logging
 - Added debug log when probe is started.
 - Do not print duplicate p.name as it is always added by logger,
   add p.runCnt instead
 - Improve debug message for a case when packet does not match runId
   this is usually happening when we received packet after probe timeout.
@jumpojoy jumpojoy force-pushed the fix_buffer_overflow branch from 0ef6a08 to 2477b78 Compare November 10, 2023 08:33
Copy link
Contributor

@manugarg manugarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of minor suggestions.

probes/ping/ping.go Outdated Show resolved Hide resolved
probes/ping/ping.go Outdated Show resolved Hide resolved
Co-authored-by: Manu Garg <manugarg@gmail.com>
@manugarg manugarg merged commit eb9d596 into cloudprober:master Nov 10, 2023
8 checks passed
@jumpojoy jumpojoy deleted the fix_buffer_overflow branch November 10, 2023 09:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants