Skip to content

SWP Policy Rule - Mitigate multiple rules issue #12704

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Feb 27, 2025

Conversation

Samir-Cit
Copy link
Contributor

@Samir-Cit Samir-Cit commented Jan 7, 2025

Hello folks.

This PR is to mitigate the issue when creating and/or deleting multiple google_network_security_gateway_security_policy_rule

  • When creating multiple rules at the same time it can throw an internal error which looks like a race condition.
  • When deleting multiple rules and the policy, the rules delay a little bit to propagate, causing an issue on the policy that can't be destroyed which looks like there is a phantom rule on the policy.

Release Note Template for Downstream PRs (will be copied)

See Write release notes for guidance.

networksecurity: added wait time on `google_network_security_gateway_security_policy_rule` resource when creating and deleting to prevent race conditions

@github-actions github-actions bot requested a review from zli82016 January 7, 2025 21:13
Copy link

github-actions bot commented Jan 7, 2025

Hello! I am a robot. Tests will require approval from a repository maintainer to run.

@slevenick, a repository maintainer, has been assigned to review your changes. If you have not received review feedback within 2 business days, please leave a comment on this PR asking them to take a look.

You can help make sure that review is quick by doing a self-review and by running impacted tests locally.

@modular-magician modular-magician added awaiting-approval Pull requests that need reviewer's approval to run presubmit tests service/networksecurity-swp and removed awaiting-approval Pull requests that need reviewer's approval to run presubmit tests labels Jan 7, 2025
@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

google provider: Diff ( 3 files changed, 106 insertions(+), 139 deletions(-))
google-beta provider: Diff ( 3 files changed, 106 insertions(+), 139 deletions(-))

@zli82016
Copy link
Member

zli82016 commented Jan 8, 2025

Adding wait time when deleting makes sense to me. But I doubt if the wait time will resolve the multiple rules creation issue as the wait time is added to each rule. Do you know the reason for the creation internal error?

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 50
Passed tests: 45
Skipped tests: 4
Affected tests: 1

Click here to see the affected service packages
  • networksecurity

Action taken

Found 1 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
  • TestAccNetworkSecurityGatewaySecurityPolicyRule_multiple

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

🟢 Tests passed during RECORDING mode:
TestAccNetworkSecurityGatewaySecurityPolicyRule_multiple [Debug log]

🟢 No issues found for passed tests after REPLAYING rerun.


🟢 All tests passed!

View the build log or the debug log for each test

Copy link

@zli82016 This PR has been waiting for review for 3 weekdays. Please take a look! Use the label disable-review-reminders to disable these notifications.

Copy link
Member

@zli82016 zli82016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment

@Samir-Cit
Copy link
Contributor Author

Adding wait time when deleting makes sense to me. But I doubt if the wait time will resolve the multiple rules creation issue as the wait time is added to each rule. Do you know the reason for the creation internal error?

I was investigation this creation error but I was not able to find anything that explains. On the logs it only shows the internal error but no further inforations.

One test I did was create the policy alongside with the rules on the GCP Console:

  • When creating one (1) or two (2) rules it works (same on terraform)
  • When creating more than two (2) rules it throw this error (same on terraform

I believe that it could be some sort of a race codition during the creation!

FYI: I just tryied to create 10 simutaniously and I got the error, I'm investigating to see what else I can do on the code to prevent this creation issue

@github-actions github-actions bot requested a review from zli82016 January 10, 2025 18:13
@zli82016
Copy link
Member

zli82016 commented Jan 10, 2025

@Samir-Cit , thanks for the investigation. Do you have the github issue for the creation error?
If not, do you mind creating one and then we can forward the issue to the service team to check the internal error.

@Samir-Cit
Copy link
Contributor Author

@Samir-Cit , thanks for the investigation. Do you have the github issue for the creation error? If not, do you mind creating a Github issue and then we can forward the issue to the service team to check the internal error.

I'll create the issue and link to this PR

@Samir-Cit
Copy link
Contributor Author

@zli82016 issue opened:
hashicorp/terraform-provider-google#20892

@modular-magician modular-magician added awaiting-approval Pull requests that need reviewer's approval to run presubmit tests and removed awaiting-approval Pull requests that need reviewer's approval to run presubmit tests labels Jan 13, 2025
@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

google provider: Diff ( 3 files changed, 119 insertions(+), 139 deletions(-))
google-beta provider: Diff ( 3 files changed, 119 insertions(+), 139 deletions(-))
terraform-google-conversion: Diff ( 1 file changed, 12 insertions(+))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 46
Passed tests: 42
Skipped tests: 4
Affected tests: 0

Click here to see the affected service packages
  • networksecurity
#### Non-exercised tests

🔴 Tests were added that are skipped in VCR:

  • TestAccNetworkSecurityGatewaySecurityPolicyRule_multiple
    🔴 Errors occurred during REPLAYING mode. Please fix them to complete your PR.

View the build log

@modular-magician modular-magician added the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label Jan 13, 2025
@Samir-Cit
Copy link
Contributor Author

I just changed the function to get the wait time for the sleep function.
With this new function I was able to create at least 50 rules without getting the error.

Using this code below to create the rules:

resource "google_network_security_gateway_security_policy_rule" "rules" {
  count = 50

  name                    = "tf-test-gateway-sp-rule${count.index}-%{random_suffix}"
  location                = "us-central1"
  gateway_security_policy = google_network_security_gateway_security_policy.default.name
  enabled                 = true  
  description             = "${count.index} rule"
  priority                = count.index+1
  session_matcher         = "host() == 'update.com'"
  application_matcher     = "request.method == 'GET'"
  tls_inspection_enabled  = false
  basic_profile           = "DENY"
}

But it's just a workaround, even though it's two "random" numbers multiplying there's a chance to get a small diference result which will cause the issue.

Copy link

@Samir-Cit, this PR is waiting for action from you. If no action is taken, this PR will be closed in 28 days.

Please address any comments or change requests, or re-request review from a core reviewer if no action is required.

Image showing the re-request review button

This notification can be disabled with the disable-automatic-closure label.

@Samir-Cit
Copy link
Contributor Author

Hello @slevenick and @zli82016
Any news from the service team on this one?

@github-actions github-actions bot requested a review from slevenick February 5, 2025 17:16
@slevenick
Copy link
Contributor

I've assigned the issue to an engineer internally to speed up a response. None yet

@modular-magician modular-magician added the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label Feb 5, 2025
@modular-magician modular-magician removed the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label Feb 27, 2025
@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

google provider: Diff ( 2 files changed, 138 insertions(+))
google-beta provider: Diff ( 2 files changed, 138 insertions(+))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 54
Passed tests: 49
Skipped tests: 4
Affected tests: 1

Click here to see the affected service packages
  • networksecurity

Action taken

Found 1 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
  • TestAccNetworkSecurityGatewaySecurityPolicyRule_multiple

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

🟢 Tests passed during RECORDING mode:
TestAccNetworkSecurityGatewaySecurityPolicyRule_multiple [Debug log]

🟢 No issues found for passed tests after REPLAYING rerun.


🟢 All tests passed!

View the build log or the debug log for each test

@@ -48,6 +48,50 @@ func TestAccNetworkSecurityGatewaySecurityPolicyRule_update(t *testing.T) {
})
}

func TestAccNetworkSecurityGatewaySecurityPolicyRule_multiple(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this test is fairly flakey in our nightly test runs. It has passed once, but it often fails with the error

        Error: Error waiting for Deleting GatewaySecurityPolicy: Error code 3, message: cannot delete a GatewaySecurityPolicy with non-zero number of rules

Would you happen to know why @Samir-Cit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error happens when the execution delay or fail to delete at least one rule.
But I re-run this test a couple of times on my machine and it worked.

I'll need to investigate more to have an idea of what is causing this error.

NA2047 pushed a commit to NA2047/magic-modules that referenced this pull request Mar 13, 2025
JaylonmcShan03 pushed a commit to JaylonmcShan03/magic-modules-jmcshan that referenced this pull request Mar 25, 2025
@Samir-Cit Samir-Cit deleted the feat/swp_rule_mutex branch March 28, 2025 16:26
JaylonmcShan03 pushed a commit to JaylonmcShan03/magic-modules-jmcshan that referenced this pull request Mar 30, 2025
Dawid212 pushed a commit to Dawid212/magic-modules that referenced this pull request Apr 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants