Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: monitor: hubble: wireguard: add TRACE_{FROM/TO}_CRYPTO #34958

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

smagnani96
Copy link
Contributor

@smagnani96 smagnani96 commented Sep 19, 2024

  1. bpf: monitor: hubble: add TRACE_{FROM,TO}_CRYPTO observation point: add needed structure to support these new trace points, metrics, and hubble/monitor parsing logic for these new types of events.
  2. bpf: wireguard: add trace notification in cil_to_wireguard: enable the TRACE_TO_CRYPTO from (1) in cil_to_wireguard (cilium_wg0 egress).
  3. bpf: wireguard: test: add wireguard egress hook metrics check: add new bpf test file for metrics in (3).
  4. bpf: wireguard: TRACE_FROM_CRYPTO in bpf_host: enable TRACE_FROM_CRYPTO from (1) in cil_from_netdev (cilium_wg0 ingress).
  5. bpf: wireguard: test: add wireguard ingress hook metrics check: extend existing bpf tests for (5). Will be reverted and placed the logic in (4) when moving to a dedicated Wireguard program cil_from_wireguard.
  6. bpf: nodeport: retrieve src_sec_id when available in native routing: enable logging the source endpoint ID when possible in the nodeport codepath.
  7. bpf: nodeport: retrieve src_sec_id when available in tunnel routing: similar as (7) but for when using overlay.

An example of output with both ingress/egress programs attached on Wireguard (native routing, KPR, nodeEncryption):

<- endpoint 1418 flow 0x0 , identity 53498->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.3.84 -> 10.244.1.28 EchoRequest
-> network flow 0x0 , identity 53498->30469 state new ifindex eth0 orig-ip 0.0.0.0: 10.244.3.84 -> 10.244.1.28 EchoRequest
-> crypto flow 0x0 , identity 53498->unknown state unknown ifindex cilium_wg0 orig-ip 0.0.0.0: 40:00:3f:01:53:89 -> 45:00:00:54:cd:c8 UnknownEthernetType
-> network encrypted  flow 0x0 , identity unknown->remote-node state unknown ifindex eth0 orig-ip 0.0.0.0: 172.18.0.2:51871 -> 172.18.0.4:51871 udp
<- network encrypted  flow 0x0 , identity remote-node->unknown state new ifindex eth0 orig-ip 0.0.0.0: 172.18.0.4:51871 -> 172.18.0.2:51871 udp
<- crypto flow 0x0 , identity 30469->unknown state unknown ifindex cilium_wg0 orig-ip 0.0.0.0: 00:00:3f:01:2b:bb -> 45:00:00:54:35:97 UnknownEthernetType
-> endpoint 1418 flow 0x0 , identity 30469->53498 state reply ifindex lxcc04fd937fd43 orig-ip 10.244.1.28: 10.244.1.28 -> 10.244.3.84 EchoReply
~ cilium bpf metrics list
REASON                   DIRECTION   PACKETS   BYTES    LINE   FILE
...
Interface Decrypted      INGRESS     212       80412    1199   bpf_host.c
Interface Decrypting     INGRESS     160       58754    1199   bpf_host.c # <-- from cilium_wg0, will be cilium_from_wireguard in future
Interface Decrypting     INGRESS     42        3655     1156   bpf_host.c
Interface Encrypting     EGRESS      46        3904     267    nodeport_egress.h
Interface Encrypting     EGRESS      47        2941     601    nodeport_egress.h

NOTE: I opened #37095 while working on this PR. As you can see from above tracing output, packets being traced from an L3 device (ex. Wireguard) are erroneously decoded. The problem has been identified and discussed in the linked issue, and will be tackled on a separate PR.

Fixes: #35280

Add TRACE_{FROM/TO}_CRYPTO observation point and bpf metrics for packets forwarded-to/received-from Wireguard.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 19, 2024
@smagnani96 smagnani96 force-pushed the pr/wg-trace-reason-encrypt branch from 14ca639 to 00ceb20 Compare September 19, 2024 09:42
@smagnani96 smagnani96 force-pushed the pr/wg-trace-reason-encrypt branch 3 times, most recently from 80ed9a1 to f352aa8 Compare September 27, 2024 15:40
@smagnani96 smagnani96 force-pushed the pr/wg-trace-reason-encrypt branch from f352aa8 to 49cc912 Compare September 30, 2024 06:57
@smagnani96
Copy link
Contributor Author

/test

@smagnani96 smagnani96 marked this pull request as ready for review September 30, 2024 14:44
@smagnani96 smagnani96 requested review from a team as code owners September 30, 2024 14:44
Copy link
Member

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding from-wireguard to trace decryption events makes sense to me. But I think we can use TRACE_{FROM,TO}_CRYPTO_DEV in from-wireguard and to-wireguard instead of anywhere in bpf_host. Those two hooks see the plain-text packets which should be easy to retrieve src_id for trace.

If the above point also makes sense to you, then maybe we don't have to add new mark MARK_MAGIC_WG_DECRYPTED.

bpf/bpf_overlay.c Outdated Show resolved Hide resolved
@smagnani96 smagnani96 marked this pull request as draft October 2, 2024 09:38
@smagnani96 smagnani96 changed the title wireguard: mark and trace encrypted Ingress/Egress traffic adding TRACE_{FROM/TO}_CRYPTO_DEV for tracing (do-not-review) Oct 4, 2024
Copy link

github-actions bot commented Nov 7, 2024

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Nov 7, 2024
@smagnani96 smagnani96 changed the title adding TRACE_{FROM/TO}_CRYPTO_DEV for tracing (do-not-review) wip - adding TRACE_{FROM/TO}_CRYPTO_DEV for tracing Nov 11, 2024
@smagnani96 smagnani96 added the dont-merge/preview-only Only for preview or testing, don't merge it. label Nov 11, 2024
@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Nov 12, 2024
Copy link

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Dec 12, 2024
Copy link

This pull request has not seen any activity since it was marked stale.
Closing.

@smagnani96 smagnani96 force-pushed the pr/wg-trace-reason-encrypt branch 5 times, most recently from 37d872a to 7466413 Compare January 29, 2025 09:08
@smagnani96 smagnani96 removed request for a team and rgo3 January 29, 2025 10:44
@smagnani96 smagnani96 force-pushed the pr/wg-trace-reason-encrypt branch from 7466413 to 9ed1102 Compare January 29, 2025 11:11
@smagnani96
Copy link
Contributor Author

/test

@smagnani96
Copy link
Contributor Author

smagnani96 commented Jan 29, 2025

Added new commits (6), (7) to improve tracing, plus I added comments in the code where things might not be 100% clear otherwise (ex. why we should check THIS_INTERFACE_INDEX == WG_IFINDEX in bpf_host).

Could I have a 2nd look from reviewers please? 🙏

@rolinh rolinh requested review from kaworu and removed request for rolinh January 29, 2025 17:12
@kaworu kaworu requested a review from gandro January 29, 2025 18:07
@smagnani96 smagnani96 force-pushed the pr/wg-trace-reason-encrypt branch from 9ed1102 to 4cc1766 Compare January 30, 2025 16:55
@smagnani96
Copy link
Contributor Author

/test

This commit introduces the trace TRACE_{FROM_TO}_CRYPTO observation point
to indicate whether a packet is coming/going through the encryption/decryption
path. This can turn useful for both WireGuard and IPSec, as showed in
subsequent commits.
These values differentiate from the current one as they will be used to
update different types of metrics REASON_{EN,DE}CRYPT that will refers to
different files/lines. This provides more information concerning the flow
of the packet when using `cilium-dbg bpf metrics list`,
`cilium-dbg monitor`, and `hubble observe.
The old TRACE_REASON_ENCRYPTED (along with the REASON_DECRYPT metric) is
still valid and refers to observed encrypted packets prior/after the
crypto process. This patch wants to provide additional metrics and does
not aim to replace those already existing.

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
This commit adds send_trace_notify into the cil_to_wireguard.
With no monitor aggregations, the notification is logged using the new
TRACE_TO_CRYPTO observation point.
With monitor aggregation, the trace event will be displayed according to
the retrieved `monitor` value. In any case, the code updated the respective
bpf metric, and it shows up with `cilium bpf metrics list`, meaning that
packets are going through the WireGuard encryption hook.

Note that we attach the `cil_to_wireguard` program only given a set of
condition. In all the other cases, no metrics updates nor trace events
are expected from this program. Always-injecting wg-related program will
be considered for future enhancements.

An example of a trace event would look as follows:

```bash
-> crypto flow 0x0 , identity unknown->unknown state unknown ifindex cilium_wg0 orig-ip 0.0.0.0: 40:00:3f:01:48:2f -> 45:00:00:54:d9:f3 UnknownEthernetType
```

The bpf metrics map would contain also the following entry:

```bash
REASON                   DIRECTION   PACKETS   BYTES       LINE   FILE
...
Interface Encrypting     EGRESS      1218      103428      57    bpf_wireguard.c
...
```

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
This commit introduces a new bpf test file to make sure that `cil_to_wireguard`
correctly updates the metrics map with REASON_ENCRYPTED.
Here we do not explicitly test the whole egress hook logic (red-DNAT in
handle_nat_fwd) as already covered in other test files. For this reason,
the remaining aspect to cover is the correct metric update.

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
This commit modifies the tracing observation point in cil_from_netdev
in case the program is injected into the Wireguard device. We do that
to attach `cil_from_netdev` to the ingress hook of the wg device given
a set of conditions. In that case, the tracing system would still use
TRACE_FROM_NETWORK as observation point: let's update it with the new
TRACE_FROM_CRYPTO one (a from-network even would already be traced from
the native device).

This commit also adds the minimal logic in `trace.h` to handle
`{FROM,TO}_CRYPTO` events either coming from bpf_wireguard.c or bpf_host.c.
Using these observation points outside the expected code path
(ENABLE_WIREGUARD && (IS_BPF_HOST || IS_BPF_WIREGUARD)) would result in
a build bug.

An example of a trace event would look as follows:

```bash
<- crypto flow 0x0 , identity unknown->unknown state unknown ifindex cilium_wg0 orig-ip 0.0.0.0: 40:00:3f:01:48:2f -> 45:00:00:54:d9:f3 UnknownEthernetType
```

The bpf metrics map would contain also the following entry:

```bash
REASON                   DIRECTION   PACKETS   BYTES       LINE   FILE
...
Interface Decrypting     INGRESS     7303      470892      1199   bpf_host.c
...
```

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
This commit modifies the current tc_nodeport_l3_dev logic to be executed
from the wireguard device (IS_BPF_WIREGUARD and ENABLE_WIREGUARD).
Unfortunately, our Wireguard ingress hook is represented by `cil_from_netdev`,
therefore here we cannot add such a check in the apposite wireguard_metrics.c
file. Once we'll move to a custom `cil_from_wireguard` then we can simply
test this logic there and restore the current file by decoupling-it from
IS_BPF_WIREGUARD.

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
This commit adds to the current nodeport functions the possibility to
retrieve and use the src_sec_id while tracing packets. This only holds
true when the packet mark has the MARK_MAGIC_IDENTITY set. An example of
a trace before and after this patch is reported next.

Before:

```bash
-> crypto flow 0x0 , identity unknown->unknown state unknown ifindex cilium_wg0 orig-ip 0.0.0.0: 40:00:3f:01:48:2f -> 45:00:00:54:d9:f3 UnknownEthernetType
```

After:

```bash
-> crypto flow 0x0 , identity 21567->unknown state unknown ifindex cilium_wg0 orig-ip 0.0.0.0: 40:00:3f:01:48:2f -> 45:00:00:54:d9:f3 UnknownEthernetType
```

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
This commit adds to the current nodeport functions the possibility to
retrieve and use the src_sec_id while tracing packets. This only holds
true when the packet mark has the MARK_MAGIC_OVERLAY set. An example of
a trace before and after this patch is reported next.

Before:

```bash
-> overlay flow 0x0 , identity unknown->unknown state unknown ifindex cilium_vxlan orig-ip 0.0.0.0: 10.244.3.84 -> 10.244.1.28 EchoRequest
```

After:

```bash
-> overlay flow 0x0 , identity 53498->unknown state unknown ifindex cilium_vxlan orig-ip 0.0.0.0: 10.244.3.84 -> 10.244.1.28 EchoRequest
```

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
@smagnani96 smagnani96 force-pushed the pr/wg-trace-reason-encrypt branch from 4cc1766 to c845c97 Compare January 31, 2025 17:35
@smagnani96
Copy link
Contributor Author

/test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. area/metrics Impacts statistics / metrics gathering, eg via Prometheus. feature/wireguard Relates to Cilium's Wireguard feature kind/enhancement This would improve or streamline existing functionality. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

enrich datapath WireGuard tracing from/to cilium_wg0
5 participants