[WIP] Require Google provider 4.0.0 #1071
Conversation
Thanks for the PR! 🚀
This is a bit fiddly, so I'm not surprised to see there are some failures, although at least the validation succeeds now. Can somebody (@bharathkkb?) paste the Cloud Build logs for the int trigger failure here? It'd be really handy if we could have… Hmm, on the subject of…
@jackwhelpton Thanks for working on this!
We will need to update https://github.com/terraform-google-modules/terraform-google-gcloud for 4.0, which is used by some modules here.
* addresses warning about multiple provider blocks
Thanks for the heads-up, I'll look at that bit next.
Looks like the same problem over there: there's an automated PR that's passing the lint checks but raising testing errors that aren't visible to us mere mortals. Any chance you could relay those? The PR in that case is terraform-google-modules/terraform-google-gcloud#108.
@jackwhelpton I have updated gcloud to allow 4.0 and will cut a release in a bit. You can use `main` to iterate on this PR.
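For anyone following along, allowing 4.x in a module like this typically comes down to widening the provider constraint in its versions.tf. A minimal sketch, with illustrative version bounds (the exact bounds the gcloud module uses may differ):

```hcl
terraform {
  required_version = ">= 0.13"

  required_providers {
    google = {
      source = "hashicorp/google"
      # Upper bound raised from "< 4.0" so that 4.x releases are accepted;
      # the lower bound here is illustrative, not the module's actual value.
      version = ">= 3.53, < 5.0"
    }
  }
}
```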
* fetches main branch by default?
…google-kubernetes-engine into feature/provider-upgrade
…terraform-google-kubernetes-engine into feature/provider-upgrade
# Conflicts:
#   examples/node_pool/main.tf
There's quite a dependency chain here; next up is https://github.com/terraform-google-modules/terraform-google-vm/blob/master/modules/compute_disk_snapshot/versions.tf. That one doesn't seem to have an automated PR, so I'll raise one... terraform-google-modules/terraform-google-vm#215, if anybody wants to help get that released.
Thanks for the continued support on this. Looks like the…
@jackwhelpton thanks for working on these. For those modules we use in examples (like bastion host), we can leave the example at 3.x and not block this PR on that. If another module is used within a module (like gcloud), then we will need to fix those first.
If the examples have dependencies that have dependencies that have ... that eventually rely on google < 4, doesn't that cause the conflicts we're still seeing above?
I forgot that in this case we have to constrain the GKE module to 4.0+ due to breaking changes, so you are right.
Great, thanks... so I guess this guy is next?
```diff
@@ -610,9 +607,10 @@ resource "google_container_node_pool" "pools" {
       for_each = local.cluster_node_metadata_config

       content {
-        node_metadata = lookup(each.value, "node_metadata", workload_metadata_config.value.node_metadata)
+        mode          = lookup(each.value, "node_metadata", workload_metadata_config.value.mode)
```
I'm not sure we want to change the input value (i.e. still look at `node_metadata`).
I need to refresh my memory on this (and find a line reference), but I think I'm still using the original input value; I've adjusted the `workload_metadata_config` object to match the names of the new properties, so it serves as an adapter between the two. At the time that seemed to make the most sense to me.
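To make the adapter idea concrete, here's a minimal sketch assuming a hypothetical locals map (`node_metadata_to_mode` and the surrounding names are illustrative, not the module's actual internals):

```hcl
locals {
  # Hypothetical translation table from the pre-4.0 node_metadata values to
  # the 4.x workload_metadata_config.mode values. SECURE has no direct 4.x
  # equivalent, so mapping it to GCE_METADATA is an assumption.
  node_metadata_to_mode = {
    "UNSPECIFIED"         = "MODE_UNSPECIFIED"
    "SECURE"              = "GCE_METADATA"
    "EXPOSE"              = "GCE_METADATA"
    "GKE_METADATA_SERVER" = "GKE_METADATA"
  }
}

# The old input keeps its name while the provider receives the new argument:
#
#   workload_metadata_config {
#     mode = local.node_metadata_to_mode[var.node_metadata]
#   }
```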
I believe this was done as a workaround for cases where the list was rebuilt. It's possible the bug has disappeared from Terraform Core, so we should be okay with not including an empty key in maps.
```hcl
  depends_on = [
    google_container_cluster.primary
  ]
}

output "instance_group_urls" {
```
I'd like to keep this output value, as it is helpful for broadly addressing the cluster. Could we simply concat all the instance groups from the different node pools?
By all means: so you'd keep the new `node_pools_` outputs but also include this?
Ah, just saw your next comment; perhaps I'll wait for you to finish the review :)

I don't think I have enough knowledge about how the `instance_group_urls` output is currently consumed: it's obviously possible to keep it as a single flattened list, but now that the property has migrated to the node-pool level within the provider, I worried about the loss of information that would result from doing that.
In my experience, it's most useful for addressing the cluster as a whole to apply networking changes. Let's leave it as-is; we can always add an additional output later if requests come in, but every output we add grows the API surface.
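A sketch of what the kept output could look like, assuming the pools are created with `for_each` as in the diff above (the module's actual expression may differ):

```hcl
output "instance_group_urls" {
  description = "List of GKE-generated instance group URLs, flattened across all node pools"
  value = flatten([
    for pool in google_container_node_pool.pools : pool.instance_group_urls
  ])
}
```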
autogen/main/outputs.tf.tmpl (outdated)

```hcl
  value = local.cluster_node_pools_versions
}

output "node_pools_instance_group_urls" {
```
Let's not add a new output unless needed (see comment below).
autogen/main/outputs.tf.tmpl (outdated)

```diff
-output "identity_namespace" {
-  description = "Workload Identity namespace"
-  value       = length(local.cluster_workload_identity_config) > 0 ? local.cluster_workload_identity_config[0].identity_namespace : null
+output "workload_pool" {
```
I don't think there's a real need to change this output name, since it's still pointing to the same value.
autogen/main/variables.tf.tmpl (outdated)

```diff
@@ -548,8 +536,8 @@ variable "database_encryption" {
   }]
 }

-variable "identity_namespace" {
-  description = "Workload Identity namespace. (Default value of `enabled` automatically sets project based namespace `[project_id].svc.id.goog`)"
+variable "workload_pool" {
```
I don't think we need to change this variable name (we can add a note in the description that it is otherwise known as `workload_pool`).
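For illustration, keeping the old variable name while feeding the renamed 4.x argument might look roughly like this. It's only a sketch: the `enabled` shorthand handling follows the variable description in the diff above, and the cluster name, location, and `project_id` wiring are assumptions:

```hcl
variable "project_id" {
  type        = string
  description = "The project ID hosting the cluster (assumed input)"
}

variable "identity_namespace" {
  type        = string
  description = "Workload Identity namespace (known as workload_pool in provider 4.x); `enabled` sets the project-based pool `[project_id].svc.id.goog`"
  default     = null
}

resource "google_container_cluster" "primary" {
  name               = "example-cluster" # hypothetical
  location           = "us-central1"     # hypothetical
  initial_node_count = 1

  # Only emit the block when the variable is set.
  dynamic "workload_identity_config" {
    for_each = var.identity_namespace == null ? [] : [var.identity_namespace]
    content {
      # Provider 4.0 renamed identity_namespace to workload_pool.
      workload_pool = workload_identity_config.value == "enabled" ? "${var.project_id}.svc.id.goog" : workload_identity_config.value
    }
  }
}
```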
I think I've addressed those comments; let me know what the build failure is now. Owing to our organizational setup, it's unfortunately pretty tricky for me to run these integration tests locally.
@jackwhelpton please find the logs attached. I didn't find anything off at a quick glance, but saw hashicorp/terraform-provider-google#10494.
Thanks for the repro on that linked ticket; it does indeed look like a provider bug. Dammit.
* this should be removed once hashicorp/terraform-provider-google#10494 is addressed
I'm trying to set up an environment where I can run the integration tests locally... I actually made some progress towards this for some earlier work I did on the Workload Identity module. I've got to a point where I can prepare the test environment using… Looking in the…, I've tried walking through https://codelabs.developers.google.com/codelabs/cft-onboarding/#7 and running in interactive mode (executing a single example), but the results are the same.
Oh boo. @bharathkkb, could you share the reason for this recent failure so I can look into it? As far as I'm aware, all I did was update to the newly published versions of a couple of dependencies. I'm assuming it's just down to the known firewall bug, which we're now discussing here: GoogleCloudPlatform/magic-modules#5526
…terraform-google-kubernetes-engine into feature/provider-upgrade
# Conflicts:
#   autogen/main/versions.tf.tmpl
#   examples/node_pool_update_variant_beta/main.tf
#   examples/node_pool_update_variant_public_beta/main.tf
#   examples/regional_private_node_pool_oauth_scopes/provider.tf
#   examples/safer_cluster/main.tf
#   examples/safer_cluster_iap_bastion/provider.tf
#   examples/simple_regional_beta/main.tf
#   examples/simple_regional_private_beta/main.tf
#   examples/simple_zonal_with_asm/main.tf
#   examples/workload_metadata_config/main.tf
#   modules/beta-private-cluster-update-variant/versions.tf
#   modules/beta-private-cluster/versions.tf
#   modules/beta-public-cluster-update-variant/versions.tf
#   modules/beta-public-cluster/versions.tf
@jackwhelpton looks like the latest error is from…
That makes sense, as SECURE has been deprecated now... fixed that (hopefully?)
Thanks for the work so far on this @jackwhelpton & @bharathkkb. We have been eagerly waiting for this module to support Google provider version 4.0.0+.
Mostly LGTM, with one comment below. Looks like we also need to rebase this, @jackwhelpton.
```diff
@@ -81,6 +81,7 @@ resource "google_compute_firewall" "master_webhooks" {
   direction     = "INGRESS"

   source_ranges = [local.cluster_endpoint_for_nodes]
+  source_tags   = [""]
```
CI seems to be failing due to this. IIRC we added this due to hashicorp/terraform-provider-google#10494. Maybe we should do `source_tags = []` as a workaround:
```
Error: Error creating Firewall: googleapi: Error 400: Invalid value for field 'resource.sourceTags[0]': ''. Must be a match of regex '(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)', invalid

  with module.example.module.gke.google_compute_firewall.master_webhooks[0],
  on ../../../firewall.tf line 63, in resource "google_compute_firewall" "master_webhooks":
  63: resource "google_compute_firewall" "master_webhooks" {
```
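A minimal sketch of that workaround, with the rest of the resource filled in with illustrative values (the name, ports, and network variables here are hypothetical; `local.cluster_endpoint_for_nodes` comes from the diff above):

```hcl
resource "google_compute_firewall" "master_webhooks" {
  name      = "gke-master-webhooks" # hypothetical
  project   = var.project_id        # assumed variable
  network   = var.network           # assumed variable
  direction = "INGRESS"

  allow {
    protocol = "tcp"
    ports    = ["8443", "9443", "15017"] # illustrative webhook ports
  }

  source_ranges = [local.cluster_endpoint_for_nodes]

  # An empty list, rather than [""], avoids the 400 "Invalid value for field
  # 'resource.sourceTags[0]'" error above while still working around
  # hashicorp/terraform-provider-google#10494.
  source_tags = []
}
```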
I thought the "correct" (?) fix for that was covered by GoogleCloudPlatform/magic-modules#5526, so we may still see the CI failing until that (or something better) is merged.
On a more personal note, I left my previous employer at the end of last year, so it may be hard for me to take this much further, as the CLA etc. was signed with that email. I'm in touch with a former coworker who I'm going to try and persuade to finish this off for me; I'll let you know how that goes.
> On a more personal note, I left my previous employer at the end of last year, so it may be hard for me to take this much further, as the CLA etc. was signed with that email. I'm in touch with a former coworker who I'm going to try and persuade to finish this off for me; I'll let you know how that goes.
Thanks, we can probably follow through if necessary as well.
Superseded by #1129