We are experiencing the following issues with our Kubeshark worker deployment:
High Packet Loss:
Traffic monitoring shows significant packet loss, impacting data accuracy.
TCP Stream Timeout Problems:
TCP streams are truncated even after increasing TCP_STREAM_CHANNEL_TIMEOUT_MS to 10000.
We also have limited clarity on the purpose of TCP_STREAM_CHANNEL_TIMEOUT_SHOW.
Worker Restarts:
Workers are being restarted with events such as: Found failed daemon pod kubeshark/kubeshark-worker-daemon-set-5q89h on node, will try to kill it
Actions Taken
Increased TCP_STREAM_CHANNEL_TIMEOUT_MS to 10 seconds (see the sketch after this list).
Enabled TCP_STREAM_CHANNEL_TIMEOUT_SHOW for additional insight (its behavior is still unclear to us).
Adjusted the resource requests and limits for the worker pods.
Updated the health probes for better container monitoring.
Enabled the "-enable-resource-guard" flag.
I've also attached the "kubeshark-worker-daemon" YAML and a "Received Packets vs. Dropped Packets" graph for more insight. Any pointers, advice, or resources would be super helpful!
kubeshark-daemon.yaml.zip
Thanks.
Hi @nitindhiman314e
Usually you shouldn't see any packet loss as long as the Worker has sufficient resources. In these situations you can either increase the Worker's resources or use backend filters to reduce the amount of traffic that is processed.
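To check whether the restarting Workers are actually short on resources, plain kubectl can show their live usage next to the restart counts (a quick sketch; kubectl top needs metrics-server installed):
# Per-container CPU/memory usage of the worker pods (compare against their limits).
kubectl -n kubeshark top pods --containers
# Restart counts and the node each worker pod is scheduled on.
kubectl -n kubeshark get pods -o wide
The backend filters themselves are configured on the Kubeshark side, for example by narrowing the targeted namespaces or pods; the exact settings depend on the Kubeshark version you're running, so please check the configuration docs for that release.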
Thank you for your reply. I currently have 4 nodes, with the corresponding 4 Kubeshark workers running. Of these, 2 workers have been running for the last 35 hours without any dropped packets, while the other 2 are restarting intermittently and showing a high packet-loss count. Below are the worker DaemonSet resource limits we are using:
I have also attached the current resource usage of the Kubeshark workers. Could you please suggest any tuning recommendations?
Regarding the worker restarts, we encountered the following error:
Back-off restarting failed container sniffer in pod kubeshark-worker-daemon-set-8tk5l_kubeshark (ee32d21b-85be-49e6-8120-7fab7e986ad2)
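For reference, the termination reason behind this kind of back-off (OOMKilled, failed health probes, or a crash) can be checked with standard kubectl; a sketch, with the pod name taken from the event above:
# Last state / termination reason of the sniffer container plus recent pod events.
kubectl -n kubeshark describe pod kubeshark-worker-daemon-set-8tk5l
# Logs from the previous, crashed instance of the sniffer container.
kubectl -n kubeshark logs kubeshark-worker-daemon-set-8tk5l -c sniffer --previous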