feat: support infra deployment in the gateway namespace #5137

Merged
zirain merged 26 commits into envoyproxy:main
Apr 30, 2025

Member

@cnvergence cnvergence commented Jan 23, 2025

What type of PR is this?

feat: support infra deployment in the gateway namespace

What this PR does / why we need it:

Enable deploying Infra envoy proxies in the namespace of related Gateway resources.

  • Support GatewayNamespaceMode deploy mode in the gateway config map.
  • Adjust infra resources to reflect non-default namespace.
  • Set up authentication between the xDS server and the infra Envoy proxies based on server-side TLS (sTLS) and JWT auth, with the proper TLS credentials and an interceptor in the gRPC server.
  • Update the related helm chart to set up necessary permissions for envoy gateway ServiceAccount.

JWT tokens are validated using the Kubernetes TokenReview API.
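
The interceptor flow described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code from this PR: the `authorization: Bearer <token>` metadata key, the `tokenValidator` type, and the `authorize` helper are all assumptions made for the example. The validator callback stands in for a real TokenReview request, which would report `status.authenticated` for the submitted token.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// tokenValidator is a hypothetical stand-in for a Kubernetes TokenReview call.
// It reports whether the API server considers the token authenticated.
type tokenValidator func(token string) (bool, error)

// bearerToken extracts the token from an "authorization: Bearer <token>" entry.
// The map has the same shape as gRPC request metadata (lower-case keys,
// one or more values per key). The key name is an assumption.
func bearerToken(md map[string][]string) (string, error) {
	values := md["authorization"]
	if len(values) == 0 {
		return "", errors.New("missing authorization metadata")
	}
	const prefix = "Bearer "
	if !strings.HasPrefix(values[0], prefix) {
		return "", errors.New("authorization metadata is not a bearer token")
	}
	token := strings.TrimSpace(strings.TrimPrefix(values[0], prefix))
	if token == "" {
		return "", errors.New("empty bearer token")
	}
	return token, nil
}

// authorize is the check an interceptor could run before handing the
// request to the xDS handler: extract the token, then ask the validator.
func authorize(md map[string][]string, validate tokenValidator) error {
	token, err := bearerToken(md)
	if err != nil {
		return err
	}
	ok, err := validate(token)
	if err != nil {
		return fmt.Errorf("token review failed: %w", err)
	}
	if !ok {
		return errors.New("token was not authenticated")
	}
	return nil
}

func main() {
	// Fake validator for the demo; a real one would create a TokenReview.
	onlyGood := func(token string) (bool, error) { return token == "good", nil }

	fmt.Println(authorize(map[string][]string{"authorization": {"Bearer good"}}, onlyGood))
	fmt.Println(authorize(map[string][]string{"authorization": {"Bearer bad"}}, onlyGood))
	fmt.Println(authorize(map[string][]string{}, onlyGood))
}
```

Keeping the validator behind a function type makes the check testable without a cluster; only the callback needs to talk to the API server.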

Which issue(s) this PR fixes:

Fixes #2629

Release Notes: Yes

@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch 2 times, most recently from 7bf3489 to 91fadc0 Compare February 7, 2025 22:35

codecov bot commented Feb 7, 2025

Codecov Report

Attention: Patch coverage is 50.00000% with 11 lines in your changes missing coverage. Please review.

Project coverage is 65.15%. Comparing base (68a2713) to head (bbddde9).
Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| api/v1alpha1/envoygateway_helpers.go | 0.00% | 5 Missing ⚠️ |
| internal/gatewayapi/runner/runner.go | 33.33% | 2 Missing ⚠️ |
| internal/gatewayapi/securitypolicy.go | 33.33% | 2 Missing ⚠️ |
| internal/cmd/certgen.go | 50.00% | 1 Missing ⚠️ |
| internal/extension/registry/extension_manager.go | 0.00% | 1 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (50.00%) is below the target coverage (60.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5137      +/-   ##
==========================================
- Coverage   65.41%   65.15%   -0.26%     
==========================================
  Files         222      224       +2     
  Lines       35643    35857     +214     
==========================================
+ Hits        23315    23364      +49     
- Misses      10882    11048     +166     
+ Partials     1446     1445       -1     

@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch 3 times, most recently from 291e372 to 3d3932b Compare February 12, 2025 15:14
@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch from 3d3932b to 9cd1e1b Compare February 20, 2025 14:02
@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch from 9cd1e1b to 9ee9d48 Compare March 11, 2025 10:57
@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch 3 times, most recently from ab8e446 to 7e27132 Compare March 24, 2025 15:27
@zhaohuabing
Member

Hi @cnvergence Just checking in - any updates on this PR? Would love to see it land in 1.4.0.

@cnvergence
Member Author

Hey @zhaohuabing, I'm getting back to this after KubeCon. I need to fix the connection between the control plane and the proxies; if everything works again, I think it should be ready for 1.4.0 :)

@zhaohuabing
Member

zhaohuabing commented Apr 11, 2025

@cnvergence a quick way to solve this issue is to copy the envoy secret into each namespace that contains a Gateway resource. While not ideal, this would unblock some key use cases -- such as using EG as an Ambient Waypoint proxy, and allow us to get this PR into v1.4.0.

This is not an optimal solution, since all Gateway infras would share the same client cert. As a next step, we can introduce logic in Envoy Gateway to automatically generate a unique cert secret per Gateway. This can be handled in a follow-up PR.

Would love your input on this, @envoyproxy/gateway-maintainers

@arkodg
Contributor

arkodg commented Apr 11, 2025

-1 to this: app teams will have access to the secret and can take advantage of it.

@cnvergence
Member Author

@zhaohuabing
Hey, thanks for the hint. We have discussed a few solutions:

  1. generating a cert for every Gateway
  2. sharing the EG cert with every infra proxy
  3. moving to sTLS with JWT validation

While we can introduce more options as a follow-up, we wanted to start with option 3.
This way we only need to share the CA cert and set up JWT validation with the ServiceAccount token for the Envoy proxies.

@zhaohuabing
Member

zhaohuabing commented Apr 15, 2025

To move this forward, I agree that we can start without a client cert and use the JWT token to validate the client in the namespaced mode, but please keep mTLS for the current mode.

We will eventually need mTLS for a stronger security posture and for compliance reasons. We may end up introducing a shim that exchanges the JWT token for a client cert and then establishes the mTLS xDS connection for Envoy. This solution is similar to the approach taken by Istio's pilot agent.

If @arkodg also feels this is the right approach, we can go ahead.

@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch from b48eec8 to b73753e Compare April 15, 2025 19:32
@cnvergence
Member Author

Yes, mTLS by default was always planned; the JWT method would only be used for GatewayNamespaceMode.

@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch 4 times, most recently from 31850ce to d46e101 Compare April 22, 2025 17:25
@cnvergence cnvergence changed the title wip feat: support infra deployment in the gateway namespace feat: support infra deployment in the gateway namespace Apr 24, 2025
@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch from 49fc667 to f6e88f8 Compare April 25, 2025 11:49
@cnvergence cnvergence marked this pull request as ready for review April 25, 2025 11:49
@cnvergence cnvergence requested a review from a team as a code owner April 25, 2025 11:49
@cnvergence
Member Author

/retest

}

func validateKubeJWT(ctx context.Context, clientset *kubernetes.Clientset, token string) (bool, error) {
tokenReview := &authenticationv1.TokenReview{
Contributor

Are we validating that these tokens belong to an Envoy created by EG?

Member Author

Only by the service account. I was thinking about it; we would need the exact name of the Envoy pod.

Contributor

Can we leverage the IR keys in the snapshot cache which map to the fleet deployment name and namespace?

Member

Tracked in this issue: #5863

Member Author

I have added a basic check for the deployment in the snapshot cache. We will need to improve it and make it work with merged gateways mode.
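
A check along the lines discussed in this thread might be sketched as below. A TokenReview reports the caller as `system:serviceaccount:<namespace>:<name>` in `status.user.username`; the sketch parses that value and looks it up in a set of known proxy deployments. The `proxyKey` type, the `known` set, and the assumption that the proxy ServiceAccount shares its name and namespace with the cached deployment key are all hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// proxyKey identifies a proxy deployment by namespace and name.
// Hypothetical type standing in for a snapshot cache key.
type proxyKey struct {
	Namespace string
	Name      string
}

// parseServiceAccount splits a Kubernetes service account username of the
// form "system:serviceaccount:<namespace>:<name>".
func parseServiceAccount(username string) (proxyKey, bool) {
	const prefix = "system:serviceaccount:"
	if !strings.HasPrefix(username, prefix) {
		return proxyKey{}, false
	}
	parts := strings.Split(strings.TrimPrefix(username, prefix), ":")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return proxyKey{}, false
	}
	return proxyKey{Namespace: parts[0], Name: parts[1]}, true
}

// isKnownProxy reports whether the authenticated identity maps to a proxy
// that the control plane itself created (assumed to be listed in known).
func isKnownProxy(username string, known map[proxyKey]struct{}) bool {
	key, ok := parseServiceAccount(username)
	if !ok {
		return false
	}
	_, found := known[key]
	return found
}

func main() {
	known := map[proxyKey]struct{}{
		{Namespace: "team-a", Name: "envoy-team-a-gw"}: {},
	}
	fmt.Println(isKnownProxy("system:serviceaccount:team-a:envoy-team-a-gw", known))
	fmt.Println(isKnownProxy("system:serviceaccount:team-b:some-app", known))
}
```

A lookup like this rejects tokens from arbitrary workloads in the cluster, but as noted above it would still need extra handling for merged gateways mode.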

@zirain zirain force-pushed the feat-support-infra-different-ns branch from 079ecae to 38ec262 Compare April 26, 2025 12:47
Signed-off-by: Karol Szwaj <[email protected]>
@cnvergence cnvergence force-pushed the feat-support-infra-different-ns branch from 79afa42 to 91c297b Compare April 30, 2025 07:28
@cnvergence
Member Author

Thanks for the review. I will tackle the open issues in follow-ups.

Member

@zhaohuabing zhaohuabing left a comment

LGTM. Thanks for the patience!

@zirain zirain merged commit 0a658b8 into envoyproxy:main Apr 30, 2025
23 of 25 checks passed
@cnvergence cnvergence deleted the feat-support-infra-different-ns branch April 30, 2025 08:28
Linked issue: Configure Envoy Proxy fleet in different namespaces
4 participants