Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

object: service account authentication via vault agent #9872

Closed
wants to merge 1 commit into from

Conversation

thotz
Copy link
Contributor

@thotz thotz commented Mar 9, 2022

Description of your changes:
Added support for service account authentication via vault agent for RGW
KMS, followed example from vault docs
using sidecar container
Signed-off-by: Jiffin Tony Thottan thottanjiffin@gmail.com

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@thotz
Copy link
Contributor Author

thotz commented Mar 9, 2022

@leseb : Above PR is incomplete, need to add test case, documentation and change the hardcoded values etc. But I have a few questions wrt implementation (added in the right place in the PR), hence posting the PR with workflow gist

@@ -113,8 +113,9 @@ func (c *clusterConfig) makeRGWPodSpec(rgwConfig *rgwConfig) (v1.PodTemplateSpec
controller.DaemonVolumes(c.DataPathMap, rgwConfig.ResourceName),
c.mimeTypesVolume(),
),
HostNetwork: c.clusterSpec.Network.IsHost(),
PriorityClassName: c.store.Spec.Gateway.PriorityClassName,
ServiceAccountName: "rook-ceph-rgw",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are creating a separate service account for RGW, I would first make a PR to do that separately from this.

},
Ports: []v1.ContainerPort{{ContainerPort: 8100}},
VolumeMounts: []v1.VolumeMount{vaultAgentVolMount},
Resources: c.store.Spec.Gateway.Resources,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leseb do you know if this is going to double the resource requests limits of the RGW pod?

}
podSpec.Volumes = append(podSpec.Volumes, vaultAgentVol)
podSpec.Containers = append(podSpec.Containers,
c.vaultAgentContainer(rgwConfig))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this as an init container instead of a sidecar to reduce resource usage?

Copy link
Contributor Author

@thotz thotz Mar 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BlaineEXE : Here RGW needs to send the request to vault-agent for fetching the key instead of vault-server so this need to up and running. Can we set a limit on the resources for sidecar containers??

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are already setting resources for the sidecar here: https://github.com/rook/rook/pull/9872/files/5f71a8171c0048e777ebe60d24348cb7bbeb8d36#r827487726

But I am almost certain the vault agent doesn't need to have the same resource reservation as the RGW.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I will change that into minimal

@@ -253,7 +276,29 @@ func (c *clusterConfig) vaultTokenInitContainer(rgwConfig *rgwConfig) v1.Contain
SecurityContext: controller.PodSecurityContext(),
}
}

func (c *clusterConfig) vaultAgentContainer(rgwConfig *rgwConfig) v1.Container {
Copy link
Member

@BlaineEXE BlaineEXE Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of hard-coding vault support into Rook, can we start suggesting that users use the vault agent injector? https://www.vaultproject.io/docs/platform/k8s/injector and possibly supplying lighter-weight options for RGW pods to get the secret from some location that is determined by the injector?

I think it is pretty cumbersome for us to always be trying to play catch-up to support this or that key management service when I think there must be a better way by now.

@leseb ?

Copy link
Contributor Author

@thotz thotz Mar 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the Vault agent injector directly injects secrets(keys stored) from the vault server to the application pod, which is not required by RGW. RGW sends curl requests to fetch secrets from the vault and the s3 client currently send the request with key name, both do not need to mount the encryption key value mounted as a file in the pod.
RGW uses vault agent as authentication mechanism with vault server so that it gets different flavours of authentication(initially it only support token-based in which token is hardcoded into the file and that path provided in RGW config)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can RGW not be given commandline options to read secrets from a local file?
Does RGW specifically support vault, or does the secret fetching code apply to any key value store?

My concern is that there are many different key value stores other than vault, and I don't think Rook should be in the business of hard-coding sidecars for myriad different KMSes if we can implement a more robust solution that will work more broadly.

We unfortunately already have this trend implemented for OSDs, and we are seeing requests to add support for more and more KMS integrations. I don't think this is a catch-up game we should really be playing.

This PR (and this specific commit) are one example of the effort needed today. It also showcases that there may be a method we can use for the implementation that is broadly applicable rather than KMS-specific. #9325 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can RGW not be given commandline options to read secrets from a local file?
With current code, it is not possible it HTTP request using libcurl to fetch token from the vault server

Does RGW specifically support vault, or does the secret fetching code apply to any key value store?

For KMS RGW supports vault(supported in downstream as well commonly used I guess), KIMP and barbarian(not tested I guess)

My concern is that there are many different key value stores other than vault, and I don't think Rook should be in the business of hard-coding sidecars for myriad different KMSes if we can implement a more robust solution that will work more broadly.

Currently, I guess sidecar is required for vault, for other implementations RGW is taking directly with KMS server

We unfortunately already have this trend implemented for OSDs, and we are seeing requests to add support for more and more KMS integrations. I don't think this is a catch-up game we should really be playing.

Even I am not sure whether we can have generic solution since each KMS server has their own configuration

This PR (and this specific commit) are one example of the effort needed today. It also showcases that there may be a method we can use for the implementation that is broadly applicable rather than KMS-specific. #9325 (comment)

Copy link
Member

@BlaineEXE BlaineEXE Mar 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, I guess sidecar is required for vault, for other implementations RGW is taking directly with KMS server

Can the RGW be configured to communicate with the vault server directly instead of requiring the sidecar? Or is the vault agent also a requirement in non-containerized Ceph installs?

From reading the Ceph docs, it seems like RGW can contact the vault server directly, so I'm curious why we need the agent in Rook if RGW can just connect directly.
https://docs.ceph.com/en/latest/radosgw/vault/#token-authentication

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, I guess sidecar is required for vault, for other implementations RGW is taking directly with KMS server

Can the RGW be configured to communicate with the vault server directly instead of requiring the sidecar? Or is the vault agent also a requirement in non-containerized Ceph installs?
@BlaineEXE : Yes and this is already implemented for the Rook as well. Vault is agent is recommended and from RHCS 5.x only vault agent is supported AFAIR

From reading the Ceph docs, it seems like RGW can contact the vault server directly, so I'm curious why we need the agent in Rook if RGW can just connect directly. https://docs.ceph.com/en/latest/radosgw/vault/#token-authentication

Vault agent is used for service account-based authentication which is already supported in OSD and ceph-csi etc.

@mergify
Copy link

mergify bot commented Apr 4, 2022

This pull request has merge conflicts that must be resolved before it can be merged. @thotz please rebase it. https://rook.io/docs/rook/latest/development-flow.html#updating-your-fork

Adds support for service account authentication via vault agent for RGW
KMS, followed example from [vault docs](https://learn.hashicorp.com/tutorials/vault/agent-kubernetes?in=vault/kubernetes)

Signed-off-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
@thotz thotz force-pushed the vault-agent-sa-authentication branch from 5f71a81 to 64b97fe Compare April 29, 2022 12:31
@thotz thotz changed the title [WIP] object: service account authentication via vault agent object: service account authentication via vault agent Apr 29, 2022
@thotz thotz marked this pull request as ready for review April 29, 2022 12:33
@thotz
Copy link
Contributor Author

thotz commented Apr 29, 2022

test case for CI is missing, I will add in the next iteration

@mergify
Copy link

mergify bot commented May 18, 2022

This pull request has merge conflicts that must be resolved before it can be merged. @thotz please rebase it. https://rook.io/docs/rook/latest/development-flow.html#updating-your-fork

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Labeled by the stale bot label Jun 17, 2022
func (c *clusterConfig) generateVaultAgentConfigMap() error {
k := k8sutil.NewConfigMapKVStore(c.store.Namespace, c.context.Clientset, c.ownerInfo)
if _, err := k.GetValue(c.clusterInfo.Context, vaultAgentCM, vaultAgentConfigFile); err == nil || !kerrors.IsNotFound(err) {
logger.Infof("vault agent config map %q for object store %q already exists, not overwriting", vaultAgentCM, c.store.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the connection details are updated in the object store CR, shouldn't this configmap be updated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most likely the RGW pod needs to restart if we update connection details. Otherwise, the sidecar pod does not have the latest details. Apart from config map a lot of environment variables are set for the side car pod based on the connection details

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to restart the rgw pod if the connection details are updated. But we still need to update the configmap here, right? Or do we also need to trigger an rgw pod restart?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks now I got it, during I need to update the configmap as well, currently only creating the cm. Will update in the next version

@thotz thotz added keepalive and removed stale Labeled by the stale bot labels Jun 20, 2022
@mergify
Copy link

mergify bot commented Sep 27, 2022

This pull request has merge conflicts that must be resolved before it can be merged. @thotz please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

1 similar comment
@mergify
Copy link

mergify bot commented Mar 20, 2023

This pull request has merge conflicts that must be resolved before it can be merged. @thotz please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

@thotz
Copy link
Contributor Author

thotz commented Aug 9, 2023

Closing this PR since I am not planning to work on it and there is very little interest for the feature

@thotz thotz closed this Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants