Add initial draft of Auth GEP 1494 #3500
Signed-off-by: Nick Young <[email protected]> Co-authored-by: Jen Gao <[email protected]>
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: youngnick The full list of commands accepted by this bot can be found here. The pull request process is described here
Thanks for the very detailed content. I left a few comments.
* A way for Chihiro the Cluster Admin to configure a default Authentication and/or Authorization config for some set of HTTPRoutes.
* Optionally, a way for Ana to have the ability to disable Authentication and/or Authorization for specific routes when needed, allowing certain routes to not be protected.
Does this imply that Ana has the ability to override the cluster-admin configuration? I assume some use cases for this functionality might involve testing, but I’m curious to hear about other potential use cases. My concern is that an app developer, who may not have a strong understanding of authentication, would have the ability to override cluster-admin (or security admin) defaults.
I think some public pages, like a status page or help page, might not need authn?
It could be more flexible if the Gateway API allowed an HTTPRoute to either override or inherit settings from the Gateway under a hierarchical control model. This flexibility would be especially useful in cases where roles overlap, such as in smaller organizations where Ana and Chihiro might be the same person.
Yes, the most likely use case here is for healthchecks, or being able to say "the whole website needs auth, except for `/public`" or something.
In my experience, cluster admins often lack a clear understanding of the deployed applications and their paths unless developers explicitly provide this information. In this context, the role of the cluster admin would be to enable default authentication globally for all routes. Developers, like Ana, can then override this default authentication at the HTTPRoute level or for specific routes within the HTTPRoute. WDYT?
Yes, this is roughly what I'm thinking here. A default, not an override, although I'm not yet sure whether we'll be using Policy Attachment to do this.
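For illustration, the "default, not an override" behaviour discussed in this thread could be sketched roughly as below. This is a minimal sketch in Go, not proposed API: `AuthMode`, its values, and `effectiveAuth` are hypothetical names, and it assumes that a route-level setting simply wins over the Gateway-level default whenever the route sets one.

```go
package main

import "fmt"

// AuthMode is a hypothetical setting used only for this sketch.
// It is not a Gateway API type.
type AuthMode string

const (
	AuthInherit  AuthMode = ""         // route says nothing: fall back to the default
	AuthRequired AuthMode = "Required" // requests must be authenticated
	AuthDisabled AuthMode = "Disabled" // route is explicitly left unprotected
)

// effectiveAuth resolves the mode for one route. The Gateway-level value
// is only a default, so any explicit route-level value takes precedence.
func effectiveAuth(gatewayDefault, route AuthMode) AuthMode {
	if route != AuthInherit {
		return route
	}
	return gatewayDefault
}

func main() {
	// Chihiro sets the default; Ana opts specific routes out.
	routes := map[string]AuthMode{
		"/orders":  AuthInherit,
		"/public":  AuthDisabled,
		"/healthz": AuthDisabled,
	}
	for path, mode := range routes {
		fmt.Printf("%s -> %s\n", path, effectiveAuth(AuthRequired, mode))
	}
}
```

An enforced (non-overridable) setting would need a different resolution rule, which this sketch deliberately leaves out.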
Signed-off-by: Nick Young <[email protected]>
68f1a60 to 6706fbd
A couple of comments around the auth mechanisms, plus suggestions for user stories.
I'm new to k8s upstream work. Do you care about style comments (e.g. defining AuthN/AuthZ twice), or would you prefer to keep this to the technical bits?
In this case, the server authenticates the client based on the client presenting a certificate that's signed by an authority that's also trusted by the server's trust chain. Some implementations also allow details about the certificate to be passed through to backend clients, to be used in authorization decisions.
TLS v1.3 is defined in [RFC-8446](https://datatracker.ietf.org/doc/html/rfc8446), with v1.2 defined in [RFC-5246](https://datatracker.ietf.org/doc/html/rfc5246).
While not necessary, I think it's worth pointing to https://datatracker.ietf.org/doc/html/rfc8996 for why TLS 1.1 and earlier are not considered.
TLS includes the possibility of having both the client and server present certificates for the other party to validate. (This is often called "mutual TLS", but is distinct from the use of that term in Service Mesh contexts, where it means something more like "mutual TLS with short-lifetime, automatically created and managed dynamic keypairs for both client and server").
In this case, the server authenticates the client based on the client presenting a certificate that's signed by an authority that's also trusted by the server's trust chain. Some implementations also allow details about the certificate to be passed through to backend clients, to be used in authorization decisions.
I think the "also" is misplaced, since it's about the server also verifying a client, not about the client's cert being trusted by the same CA as ??.
The "trust chain" seems weird as well. Generally we have trust roots on a server, and the chain is presented by the client.
Since the client side authentication process of the server is implied prior knowledge, I think this can be condensed a bit.
I suggest:
In this case, the server also authenticates the client, based on the certificate chain presented by the client. Some implementations also allow details about the certificate to be passed through to backend clients, to be used in authorization decisions
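As a rough illustration of the distinction drawn in this comment (trust roots live on the server, the chain is presented by the client), a Go server could be configured along the lines below. This is only a sketch: `newClientAuthTLSConfig` is a hypothetical helper, the fields it sets come from the standard `crypto/tls` package, and the TLS 1.2 floor is an assumption in line with the RFC 8996 pointer earlier in this review.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
)

// newClientAuthTLSConfig builds a server-side TLS config that requires
// every client to present a certificate chain that verifies against the
// given trust roots. The roots are configured on the server; the chain
// itself arrives from the client during the handshake.
func newClientAuthTLSConfig(clientRoots *x509.CertPool) *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,               // TLS 1.1 and earlier are not accepted
		ClientAuth: tls.RequireAndVerifyClientCert, // handshake fails without a valid client cert
		ClientCAs:  clientRoots,                    // trust roots used to verify the client chain
	}
}

func main() {
	// Empty pool for the sketch; a real server would load CA certificates
	// into it, for example with roots.AppendCertsFromPEM.
	roots := x509.NewCertPool()
	cfg := newClientAuthTLSConfig(roots)
	fmt.Println("client certificates required:", cfg.ClientAuth == tls.RequireAndVerifyClientCert)
}
```

After a successful handshake, the verified client certificate is available to the server, which is what would let an implementation pass certificate details on to the backend.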
In Basic HTTP Auth, a server asks for authentication as part of returning a `401` status response, and the client includes an `Authorization` header that includes a Base64-encoded username and password.
Because the password is only _encoded_ and not _encrypted_, Basic Auth is totally unsafe when used outside of an encrypted session (like a HTTPS connection).
While the "raw" passwords in Basic Auth have additional issues (they are long-lived and can be used to impersonate a human on login pages), JWT and, as far as I know, OAuth/OIDC are not generally secure against replay attacks either.
That is, apart from mTLS (which enforces confidentiality), all of these mechanisms are insecure over plaintext, because the authentication token can at least be reused on other connections to gain the same level of privileges.
I think it's best to move the note about requiring encryption for safe usage out of the Basic Auth section, and potentially add a note there that Basic Auth has the additional issue of long-lived (potentially higher-privilege) credentials being exchanged.
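To make the "encoded, not encrypted" point concrete, here is a small Go sketch showing that anyone who observes a Basic `Authorization` header can recover the credentials with a plain Base64 decode. `decodeBasic` is a hypothetical helper written for this example (the standard library's `(*http.Request).BasicAuth` does the same job for real requests), and for brevity it matches the scheme name case-sensitively.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeBasic recovers the username and password from the value of an
// Authorization header that uses the Basic scheme. No key or secret is
// needed: Base64 is an encoding, not encryption.
func decodeBasic(header string) (user, pass string, ok bool) {
	const prefix = "Basic "
	if !strings.HasPrefix(header, prefix) {
		return "", "", false
	}
	raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(header, prefix))
	if err != nil {
		return "", "", false
	}
	// The decoded form is "user:password"; split at the first colon.
	return strings.Cut(string(raw), ":")
}

func main() {
	// What a client sends.
	header := "Basic " + base64.StdEncoding.EncodeToString([]byte("Aladdin:open sesame"))
	fmt.Println("on the wire:", header)

	// What an eavesdropper on a plaintext connection can recover.
	user, pass, ok := decodeBasic(header)
	fmt.Println(user, pass, ok)
}
```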
## Auth* User Stories
* As Ana the Application Developer, I wish to be able to configure that some or all of my service exposed via Gateway API requires Authentication, and ideally to be able to make Authorization decisions about _which_ authenticated clients are allowed to access.
This might leak into the "API" phase, but I think there are two levels to this, both worth an explicit mention:
- The API of the proposed implementation provides enough flexibility to integrate with an authorization mechanism and protect resources entirely in the gateway.
- The API allows injecting information about the authentication result into requests, so that the backend application can make authorization decisions based on it.
* As Ana the Application Developer, I wish to be able to configure that some or all of my service exposed via Gateway API requires Authentication, and ideally to be able to make Authorization decisions about _which_ authenticated clients are allowed to access.
* As Chihiro the Cluster Admin, I wish to be able to configure default Authentication settings (at least), with an option to enforce Authentication settings (preferable but not required) for some set of services exposed via Gateway API inside my cluster.
* More User Stories welcomed here!
As Ana the Application Developer, I wish to be able to redirect users to a login page when they lack authentication, while unauthenticated API access gets the proper 40x response.
^ to make it clear that we need to handle "human" clients (browsers) slightly differently from API consumers due to 30x/40x conventions.
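The browser versus API split in this user story could look roughly like the Go sketch below. Everything here is an assumption made for illustration: the `/login` path, the use of the `Accept` header as the "is this a browser?" heuristic, and the helper names `wantsHTML`, `rejectUnauthenticated`, and `statusFor`. A real implementation might use a different signal entirely.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// loginURL is a hypothetical login page used only in this sketch.
const loginURL = "/login"

// wantsHTML is a rough heuristic for "this is a browser navigation".
func wantsHTML(r *http.Request) bool {
	return strings.Contains(r.Header.Get("Accept"), "text/html")
}

// rejectUnauthenticated writes the response for a request that carries
// no credentials: a redirect for browsers, a 401 for API consumers.
func rejectUnauthenticated(w http.ResponseWriter, r *http.Request) {
	if wantsHTML(r) {
		http.Redirect(w, r, loginURL, http.StatusFound)
		return
	}
	w.Header().Set("WWW-Authenticate", `Bearer realm="example"`)
	http.Error(w, "authentication required", http.StatusUnauthorized)
}

// statusFor runs the handler against an in-memory request with the given
// Accept header and returns the resulting status code.
func statusFor(accept string) int {
	req := httptest.NewRequest(http.MethodGet, "/orders", nil)
	req.Header.Set("Accept", accept)
	rec := httptest.NewRecorder()
	rejectUnauthenticated(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("text/html,application/xhtml+xml")) // 302
	fmt.Println(statusFor("application/json"))                // 401
}
```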
* Handling all possible authentication and authorization schemes. Handling a (preferably large) subset of authentication and authorization is acceptable.
## Deferred Goals
Is there any intent to also cover GRPCRoute?
I think it should be made clear whether it's a non-goal or deferred.
* A way for Chihiro the Cluster Admin to configure a default Authentication and/or Authorization config for some set of HTTPRoutes.
* Optionally, a way for Ana to have the ability to disable Authentication and/or Authorization for specific routes when needed, allowing certain routes to not be protected.
As I understand it, the current proposal only allows for "add auth to this website, except for this route". Could this be expanded to have some kind of rule-based authentication selection/bypass mechanism? I'd love to have the ability to choose a specific authentication mechanism (or bypass auth entirely) on a per-route basis, based on multiple factors (e.g. client IP, user agent).
This would make the Gateway API an amazing Layer 7 firewall, but I'm not sure if the project wants to support these kinds of capabilities; I saw this proposal was closed partly because the feature "operated too much like a WAF/firewall".
Proposed user stories:
- As Ana the Application Developer, I'm maintaining a legacy application that doesn't have any existing authentication mechanisms. I wish to enforce SSO when a user accesses the application through a browser. However, I wish to bypass authentication when the user accesses the application through the mobile app (based on the user agent), as it can't handle the SSO flow.
- As Ana the Application Developer, I wish to be able to use different authentication mechanisms for the same route based on the context of the request. I wish to use SSO when a user accesses my API through a browser, and use JWT when using my mobile app.
- As Chihiro the Cluster Admin, I wish to block undesirable User Agents or IP ranges (spiders, scrapers etc.) from accessing any services exposed by Gateway API inside my cluster.
- As Chihiro the Cluster Admin, I wish to be able to bypass AuthN when requests come in from a trusted, controlled IP range, so external services can access APIs running on my cluster without issue.
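To show what a rule-based selection mechanism like the one in these stories might involve, here is a minimal Go sketch. None of it is proposed API: `Action`, `Rule`, `selectAction`, and `exampleRules` are hypothetical names, the matching criteria are limited to a User-Agent substring and a source CIDR, and "first matching rule wins" is an assumption about ordering.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// Action is a hypothetical outcome of rule evaluation.
type Action string

const (
	ActionSSO    Action = "SSO"
	ActionJWT    Action = "JWT"
	ActionBypass Action = "Bypass"
	ActionDeny   Action = "Deny"
)

// Rule is a hypothetical match rule. An empty UserAgentContains or a
// zero-value Source means "match anything" for that criterion.
type Rule struct {
	UserAgentContains string
	Source            netip.Prefix
	Action            Action
}

// selectAction returns the action of the first rule matching the request,
// or fallback when no rule matches. An unparsable source IP never matches
// a rule that constrains the source.
func selectAction(rules []Rule, fallback Action, userAgent, srcIP string) Action {
	addr, err := netip.ParseAddr(srcIP)
	for _, r := range rules {
		if r.UserAgentContains != "" && !strings.Contains(userAgent, r.UserAgentContains) {
			continue
		}
		if r.Source.IsValid() && (err != nil || !r.Source.Contains(addr)) {
			continue
		}
		return r.Action
	}
	return fallback
}

// exampleRules mirrors the stories above with made-up values.
func exampleRules() []Rule {
	return []Rule{
		{UserAgentContains: "BadScraper", Action: ActionDeny},
		{Source: netip.MustParsePrefix("10.0.0.0/8"), Action: ActionBypass},
		{UserAgentContains: "MyMobileApp", Action: ActionJWT},
	}
}

func main() {
	rules := exampleRules()
	fmt.Println(selectAction(rules, ActionSSO, "Mozilla/5.0", "203.0.113.7"))     // SSO
	fmt.Println(selectAction(rules, ActionSSO, "MyMobileApp/2.1", "203.0.113.7")) // JWT
	fmt.Println(selectAction(rules, ActionSSO, "curl/8.0", "10.1.2.3"))           // Bypass
}
```

Even this small sketch needs rule ordering, a fallback, and two kinds of match criteria, which gives a sense of how much surface such a feature would add.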
Thanks @youngnick!
* A way to configure a Gateway Implementation to perform Authentication (at least), with optional Authorization on behalf of Ana the Application Developer.
* A way for Chihiro the Cluster Admin to configure a default Authentication and/or Authorization config for some set of HTTPRoutes.
Since we're saying this is exclusively for north-south traffic, I think I'd rather attach this config at the Gateway level. Maybe it would be safer to say something like this:
- * A way for Chihiro the Cluster Admin to configure a default Authentication and/or Authorization config for some set of HTTPRoutes.
+ * A way for Chihiro the Cluster Admin to configure a default Authentication and/or Authorization config for some set of HTTP or GRPC matching criteria.
Basic auth is defined in [RFC-7617](https://datatracker.ietf.org/doc/html/rfc7617).
#### TLS Client Certificate Authentication
May be worth referring to https://gateway-api.sigs.k8s.io/geps/gep-91/ as this is already in progress.
/kind gep
What this PR does / why we need it:
This is to get the conversation started around Auth* in Gateway API. Hopefully this part should not be too contentious yet, but reviews gratefully accepted.
Which issue(s) this PR fixes:
Updates #1494