Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS Provider for OCI Does not account for disconnected AWS partitions #832

Open
ngearhart opened this issue Dec 6, 2024 · 3 comments · May be fixed by #835
Open

AWS Provider for OCI Does not account for disconnected AWS partitions #832

ngearhart opened this issue Dec 6, 2024 · 3 comments · May be fixed by #835

Comments

@ngearhart
Copy link

ngearhart commented Dec 6, 2024

oci/auth/aws/auth.go::40

var registryPartRe = regexp.MustCompile(`([0-9+]*).dkr.ecr(?:-fips)?\.([^/.]*)\.(amazonaws\.com[.cn]*)`)

This regular expression is used to determine if flux should try to authenticate to the AWS API when pulling an OCI resource. However, this regex does not support some other AWS regions, notably disconnected AWS partitions.
Thus, flux does not detect that it is in AWS and fails due to missing username/password.

Why is this implemented as such? If the user is already forced to specify provider: aws on helmrepositories for example, why does flux still parse the URL instead of de-facto authenticating?

I recommend removing this check and assuming the OCI endpoint requires AWS authentication if the user specifies provider: aws regardless of the URL. If the user in error specifies provider: aws for a non-AWS-backed OCI repo, then flux should fail.

@ngearhart
Copy link
Author

ngearhart commented Dec 6, 2024

I understand that an alternative path to implementing this would be to expand the regex to include these disconnected partitions, but it seems like there is a bigger discussion that should happen around this.

@ngearhart
Copy link
Author

After further investigation, it looks like this happens because the regex tries to pull the region out of the URL.

@darkowlzz
Copy link
Contributor

Hi, this was discussed in the flux dev meeting recently. The auth package originated in image-reflector-controller (IRC) several years ago. IRC was the only flux component that performed authentication with cloud providers. The initial design was such that cluster admins can set autologin flags to enable authentication to certain cloud providers, which allowed ImageRepositories to connect to the cloud provider registries based on the URL. IRC parsed the URL and determined which cloud provider to authenticate with if autologin was enabled for the provider. There was no provider option per object, it was a global autologin configuration. A few years ago, when OCIRepository source was introduced in source-controller (SC), we needed the same container registry authentication that IRC had but by this time, other source APIs started using per object provider option to configure such cloud provider auth. For consistency with other sources, OCIRepository had to also support per object provider configuration. The OCI auth code from IRC was moved to this pkg repository, as it was so that it could be used by OCIRepository. In IRC v1beta2 API, per object provider field was introduced, deprecating autologin flags, but soon after the release, users requested to keep the autologin capability to allow them to migrate to per object provider configuration. In an IRC patch fix release, autologin was added back. Later when Helm OCI support was added, it followed what was done in OCIRepository for cloud provider authentication. It has been more than a year now, we still have autologin support and the auth code still is around the concept of autologin, where the provider is determined by the URL and autologin flag and the provider option just enables the autologin capability to a provider.

In order to respect the specified provider per object and not use the URL to determine the provider, we'll need to remove supporting the concept of autologin from IRC first. And then refactor OCI auth package to take the provider name and just attempt login for the specified provider, instead of taking the provider autologin configuration and determine the provider by the URL. This will be a breaking change in the oci go package. We can either continue with v0.x releases of this package introducing the breaking change, or do a v1 release of this package with all the changes. Either way, this will require refactoring the auth package and all the consumers of the package to adapt to the new API. I'm afraid this would take some time, working out the new design, testing, and making sure the changes are backwards compatible for the usage of this in ImageRepository, OCIRepository and HelmRepository/HelmChart.

In the dev meeting, everybody agreed with this change, but we don't have anyone working on this at present. I believe we can accept a temporary fix to allow the use case described in this issue by modifying the AWS URL parsing regular expression for now. A proper fix as described above would be ideal for the long term.

I hope this provides all the background for why things are this way today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment