Feature - SSO keycloak #5691
base: develop
@paulbauriegel Thanks for this contribution. Last week I started a code refactoring to simplify the OAuth provider configuration and integrate better with the social auth package. The design changes a bit with that refactoring. It would be nice if you could adapt your PR based on it. If not, we can combine them later.
@frascuchon Ok, let me have a look. I will try to understand the changes. Since you are working on the OAuth part, it would be nice to be able to use the roles from the OAuth audience so that OAuth users get access to specific workspaces based on those roles. I wanted to contribute this in a later part.
Great @paulbauriegel! I have some doubts about how to match the OAuth audience with the Argilla roles. I would love to hear your thoughts on that.
This docs folder is outdated. For 2.x docs you should use the argilla/docs
folder.
Hi @paulbauriegel, here is the docs section related to the refactor PR. It would be nice if you could take a look and give some feedback. It may also be useful for understanding the refactoring approach.
Thank you, yes I will have a look. Just to set expectations, I will only have some time later this week :-)
@frascuchon I looked through your code. It's fairly clear how to integrate a new SSO provider. Integrating self-hosted SSO providers such as Keycloak, where e.g. the authorization_url is dynamic and depends on the configuration, creates a small problem. Generally speaking, I would rather configure such settings in the oauth.yaml than via env variables, but that might be a bit more complicated now: since there is a common Provider class, there are no optional extra settings for an SSO that requires more configuration. I will open a new MR based on the new code tomorrow.
This is an example I get back from the OAuth token:

    {
      "exp": 1732293901,
      "iat": 1732293601,
      "auth_time": 1732292840,
      "jti": "0d11e667-4174-41fe-978e-45a9248d009d",
      "iss": "http://localhost:8080/realms/argilla",
      "aud": [
        "argilla",
        "account"
      ],
      "sub": "66caeedf-df61-4cc3-9ebf-c269643e454e",
      "typ": "Bearer",
      "azp": "argilla",
      "sid": "359fe83f-7263-486d-93a3-61949d15d224",
      "acr": "0",
      "allowed-origins": [
        "http://127.0.0.1:5000",
        "http://localhost:5000"
      ],
      "realm_access": {
        "roles": [
          "default-roles-argilla",
          "offline_access",
          "uma_authorization",
          "llmbot-annotations-at"
        ]
      },
      "resource_access": {
        "account": {
          "roles": [
            "manage-account",
            "manage-account-links",
            "view-profile"
          ]
        },
        "argilla": {
          "roles": [
            "argilla-access"
          ]
        }
      },
      "scope": "microprofile-jwt aud email openid profile",
      "upn": "paulat",
      "email_verified": true,
      "name": "Paul Bauriegel",
      "groups": [
        "default-roles-argilla",
        "offline_access",
        "uma_authorization",
        "llmbot-annotations-at"
      ],
      "preferred_username": "paulat",
      "given_name": "Paul",
      "family_name": "Bauriegel",
      "email": "...
    }

In this case, the llmbot-annotations-at group was added through Keycloak. My initial thought is to leverage these group roles to define Argilla roles and control access to specific Argilla workspaces. Would it make sense to integrate this logic into the new UserInfo class? In our enterprise environment, managing groups of annotators via a central SSO (e.g., Entra ID) is crucial, as we need fine-grained control over roles and workspace access for different groups. The issue might be that OpenID Connect (OIDC) does not inherently guarantee the availability of roles or groups in all implementations. What do you think, @frascuchon?
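As a rough illustration of the idea above, the following sketch derives an Argilla role and a workspace list from decoded token claims shaped like the example. It is not code from this PR or from Argilla: the function names `extract_groups` and `resolve_access`, the two mapping dictionaries, and the `llmbot-at` workspace name are all hypothetical. Because OIDC does not guarantee that role or group claims are present, every lookup tolerates missing keys and falls back to a default role.

```python
from typing import Any, Dict, List, Mapping, Tuple

# Argilla roles ordered from most to least privileged.
ROLE_PRIORITY = ["owner", "admin", "annotator"]


def extract_groups(claims: Mapping[str, Any], client_id: str = "argilla") -> List[str]:
    """Collect group/role names from the claims; missing claims yield nothing."""
    found = set()
    found.update(claims.get("groups") or [])
    found.update((claims.get("realm_access") or {}).get("roles") or [])
    client_access = (claims.get("resource_access") or {}).get(client_id) or {}
    found.update(client_access.get("roles") or [])
    return sorted(found)


def resolve_access(
    claims: Mapping[str, Any],
    role_by_group: Dict[str, str],              # hypothetical config: group -> Argilla role
    workspaces_by_group: Dict[str, List[str]],  # hypothetical config: group -> workspaces
    default_role: str = "annotator",
) -> Tuple[str, List[str]]:
    """Return (role, workspaces) for the user described by the claims."""
    groups = extract_groups(claims)
    # Keep only roles we know how to rank; pick the most privileged one.
    roles = [role_by_group[g] for g in groups if role_by_group.get(g) in ROLE_PRIORITY]
    role = min(roles, key=ROLE_PRIORITY.index) if roles else default_role
    workspaces = sorted({ws for g in groups for ws in workspaces_by_group.get(g, [])})
    return role, workspaces


if __name__ == "__main__":
    claims = {
        "realm_access": {"roles": ["offline_access", "llmbot-annotations-at"]},
        "resource_access": {"argilla": {"roles": ["argilla-access"]}},
        "groups": ["llmbot-annotations-at"],
    }
    print(resolve_access(
        claims,
        role_by_group={"llmbot-annotations-at": "annotator"},
        workspaces_by_group={"llmbot-annotations-at": ["llmbot-at"]},
    ))
```

Keeping the group-to-role and group-to-workspace mappings as plain configuration data would let them live in a file such as oauth.yaml, though where they are actually stored is a design decision this sketch does not settle.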
Great! Based on my proposal here, one thing we can do is to extend the

    class KeycloakOpenId(OpenIdConnectAuth):
        """Keycloak OpenID Connect authentication backend."""

        name = "keycloak"

        @staticmethod
        def from_oidc_endpoint(oidc_endpoint: str):
            if oidc_endpoint is None:
                raise ValueError(....)
            # Strip the trailing slash once so the derived URLs never contain "//".
            oidc_endpoint = oidc_endpoint.rstrip("/")
            KeycloakOpenId.OIDC_ENDPOINT = oidc_endpoint
            KeycloakOpenId.AUTHORIZATION_URL = f"{oidc_endpoint}/protocol/openid-connect/auth"
            KeycloakOpenId.ACCESS_TOKEN_URL = f"{oidc_endpoint}/protocol/openid-connect/token"
            return KeycloakOpenId

        def oidc_endpoint(self) -> str:
            return self.OIDC_ENDPOINT

        def get_user_details(self, response):
            data = super().get_user_details(response)
            data["role"] = ...  # compute the role based on the response content
            data["allowed_workspaces"] = ...  # similarly, identify the allowed workspaces dynamically
            return data

Then, we could extend the

    ...,
        workspaces = userinfo.allowed_workspaces or settings.oauth.allowed_workspaces
    )
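The fallback in the last snippet (per-user workspaces if the provider supplied any, otherwise the globally configured ones) can be sketched in isolation as below. `UserInfoSketch` and `workspaces_for` are hypothetical stand-ins for illustration only; they are not the real `UserInfo` class or the settings object from the refactor PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UserInfoSketch:
    """Hypothetical stand-in for the user info built from the provider response."""
    username: str
    allowed_workspaces: Optional[List[str]] = None


def workspaces_for(userinfo: UserInfoSketch, default_workspaces: List[str]) -> List[str]:
    # `or` falls back when allowed_workspaces is None *or* an empty list,
    # so a user with no provider-derived workspaces gets the configured defaults.
    return list(userinfo.allowed_workspaces or default_workspaces)


if __name__ == "__main__":
    print(workspaces_for(UserInfoSketch("paulat", ["llmbot-at"]), ["default"]))
    print(workspaces_for(UserInfoSketch("paulat"), ["default"]))
```

One consequence of using `or` is that an explicitly empty list cannot express "no workspaces at all"; if that distinction matters, an `is None` check would be needed instead.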
Introduces a new SSO option using Keycloak
Enables a different SSO provider next to HuggingFace SSO
Type of change
How Has This Been Tested
Local build of front-end & backend. Keycloak deployment as described in the docs
Checklist
How to test & use it