Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bgp control plane: add route aggregation feature #37275

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

romanspb80
Copy link
Contributor

Add route aggregation feature in BGP Control Plane for service advertisements (RFC 4632)
New route aggregation option under the CiliumBGPAdvertisement.

apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
name: bgp-advertisements
labels:
advertise: bgp
spec:
advertisements:
- advertisementType: "Service"
service:
aggregate: true <<<<<<<<<<<<<< New Field

Fixes: #36777

Signed-off-by: Roman Ptitcyn romanspb@yahoo.com

…isements (RFC 4632)

New route aggregation option under the CiliumBGPAdvertisement.
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: bgp-advertisements
  labels:
    advertise: bgp
spec:
  advertisements:
    - advertisementType: "Service"
      service:
        aggregate: true                   <<<<<<<<<<<<<<   New Field

Fixes: cilium#36777

Signed-off-by: Roman Ptitcyn <romanspb@yahoo.com>
New route aggregation option under the CiliumBGPAdvertisement.

Fixes: cilium#36777

Signed-off-by: Roman Ptitcyn <romanspb@yahoo.com>
@romanspb80 romanspb80 requested review from a team as code owners January 27, 2025 00:53
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 27, 2025
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Jan 27, 2025
Copy link
Contributor

@harsimran-pabla harsimran-pabla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @romanspb80 Before jumping into the code review of the implementation, I think we need to agree on the API change. I still think user should specify aggregation length instead of dynamically calculating it based on the boolean flag.

In vast majority of cases there is going to be single IP returned from getServicePrefix
, so with single /32 or /128 prefix there will be nothing to aggregate.

Have you tested this change on kubernetes cluster with sample loadbalancer service ? can you provide some example outputs how aggregation will look like?

@romanspb80
Copy link
Contributor Author

Hi @romanspb80 Before jumping into the code review of the implementation, I think we need to agree on the API change. I still think user should specify aggregation length instead of dynamically calculating it based on the boolean flag.

In vast majority of cases there is going to be single IP returned from getServicePrefix , so with single /32 or /128 prefix there will be nothing to aggregate.

Have you tested this change on kubernetes cluster with sample loadbalancer service ? can you provide some example outputs how aggregation will look like?

Hi @harsimran-pabla . Thanks for the comment.
I agree with you that the most common case would be single prefix. But there are couple of points which are worth discussing.
For example, we can use annotation to assign multiple ips for serice advertisement:

apiVersion: v1
kind: Service
metadata:
  name: nginxservice
  annotations:
    "io.cilium/lb-ipam-ips": "172.16.102.2,172.16.102.8"
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - name: grpc
    port: 8888
    targetPort: 80 

In this case the dynamically calculating works:

bash-5.1# ./gobgp neighbor 172.18.0.2 adj-in
   ID  Network              Next Hop             AS_PATH              Age        Attrs
   0   10.244.1.0/24        172.18.0.2           65002                00:03:05   [{Origin: i}]
   0   172.16.102.0/28      172.18.0.2           65002                00:00:05   [{Origin: i}]
bash-5.1# 

The dynamically calculated mask /28 is more efficiently then if we set the mask /24 manualy.

Also there is another case which I noted in my comment :
"I would also like to note that in this pull request, aggregation is performed only for prefixes within the service. But in this PR I made a feature in which aggregation takes into account all services with this option and the route aggregation is global for all prefixes of the affected services. I suggest discussing this point too."

Thanks

@harsimran-pabla
Copy link
Contributor

But there are couple of points which are worth discussing. For example, we can use annotation to assign multiple ips for serice advertisement

Yes, but the common case is that each service gets single /32 prefix. And if we pick start and end address from various services which have aggregate flag set, resulting aggregated prefix can be very large range - as services can have IPs allocated from different lb-ipam pools.

@harsimran-pabla
Copy link
Contributor

"I would also like to note that in this pull request, aggregation is performed only for prefixes within the service. But in this #36791 I made a feature in which aggregation takes into account all services with this option and the route aggregation is global for all prefixes of the affected services. I suggest discussing this point too."

This approach was modifying LB Pool with a aggregate flag. Why not introduce new BGP advertisement type which will advertise complete LB IPAM pool range ? Implementation would be much simpler and not change existing service reconcilers.

I am trying to understand if the goal is to reduce /32 prefix announcements, can we achieve it with this simpler mechanism. Existing service announcements can be configured if user wants to service with eTP=Local to work.

@romanspb80
Copy link
Contributor Author

romanspb80 commented Jan 30, 2025

"I would also like to note that in this pull request, aggregation is performed only for prefixes within the service. But in this #36791 I made a feature in which aggregation takes into account all services with this option and the route aggregation is global for all prefixes of the affected services. I suggest discussing this point too."

This approach was modifying LB Pool with a aggregate flag. Why not introduce new BGP advertisement type which will advertise complete LB IPAM pool range ? Implementation would be much simpler and not change existing service reconcilers.

I am trying to understand if the goal is to reduce /32 prefix announcements, can we achieve it with this simpler mechanism. Existing service announcements can be configured if user wants to service with eTP=Local to work.

Yes, that makes sense. So instead of dynamic calculation use the cidr (or all cidrs from the LB pool) mask right away?
For example, if cidr from LB pool is 172.16.102.0/25 then bgp advertisement will be same in case of enabled aggregation.
We can also add an option for the case where the pool contains multiple cidrs, then they can be aggregated if they are adjacent.

@harsimran-pabla
Copy link
Contributor

We can also add an option for the case where the pool contains multiple cidrs, then they can be aggregated if they are adjacent.

No, I would still prefer each pool advertisement. Atleast we should start with this common case and see later if it makes sense to improve further upon it. In this approach, feature is not about prefix aggregation. It is more about LB Pool advertisement.

@harsimran-pabla
Copy link
Contributor

There are two approaches to reduce /32 advertisements, I am not leaning towards one or the other.

  1. Introduce new advertisement type, which is LB Pool Advert. This will advertise LB Pool prefix regardless of checking backend services. Much simpler implementation as well as larger prefix block can be advertised. Drawback is that if you'd want services with eTP=Local, this is not enough. And other cases of cluster ip/external ip advertisement, this advertisement type is not the solution for it.
  2. Provide prefix length as previously discussed here.

I am not yet convinced if we need to jump straight into auto-aggregation, we can look at it in the future. At this stage, I think we need something simpler and deterministic.

@romanspb80
Copy link
Contributor Author

Ok. Will do this approach:

  1. Provide prefix length as previously discussed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. kind/community-contribution This was a contribution made by a community member.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: BGP Control Plane - BGP advertisements, route aggregation
2 participants