Skip to content

Add Cluster Internal Service Error code along with runcontext #5491

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 22, 2021

Conversation

tejal29
Copy link
Contributor

@tejal29 tejal29 commented Mar 4, 2021

Fixes #5485

Review Order

  1. Refactoring events to use Config interface for init #5532 - event refactor to use Config instead of variables from config
  2. Remove global single instance of skaffoldOpts in skaffold error package and use runContext Config #5537 - to use runContext and Config for error evaluation instead of singleton skafffoldOpts package variable.

Merge After #5490, #5532, #5537

In this PR, summary of changes

  1. define a status code for cluster internal system error
  2. change all deployer userError to see if the K8 cluster system error.

=== Testing Notes====

Force a Internal System Error via code.

diff --git a/pkg/skaffold/deploy/kubectl/kubectl.go b/pkg/skaffold/deploy/kubectl/kubectl.go
index c9831986d..9fc6a383a 100644
--- a/pkg/skaffold/deploy/kubectl/kubectl.go
+++ b/pkg/skaffold/deploy/kubectl/kubectl.go
@@ -84,6 +84,7 @@ func NewDeployer(cfg Config, labels map[string]string, d *latest.KubectlDeploy)
 // Deploy templates the provided manifests with a simple `find and replace` and
 // runs `kubectl apply` on those manifests
 func (k *Deployer) Deploy(ctx context.Context, out io.Writer, builds []graph.Artifact) ([]string, error) {
+       return nil, userError(fmt.Errorf("Internal Server Error: \"/api/v1/namespaces/skaffoldq6hn5/services\": the server is currently unable to handle the request has prevented the request from succeeding (post services))"))
        var (
                manifests manifest.ManifestList
                err       error

and then

LOCAL=true make
➜  getting-started git:(fix_internal_sys_err) ✗ ../../out/skaffold dev -d gcr.io/tejal-gke1
Listing files to watch...
 - skaffold-example
Generating tags...
 - skaffold-example -> gcr.io/tejal-gke1/skaffold-example:v1.21.0-39-g26f9698fd
Checking cache...
 - skaffold-example: Found Remotely
Starting test...
Tags used in deployment:
 - skaffold-example -> gcr.io/tejal-gke1/skaffold-example:v1.21.0-39-g26f9698fd@sha256:73be4a21d9fc1ac5c32a7c332cb71672a935d9bbd77d4395c0b64cfbc57eb84f
Starting deploy...
Deploy Failed. exiting dev mode because first deploy failed: Internal Server Error: "/api/v1/namespaces/skaffoldq6hn5/services": the server is currently unable to handle the request has prevented the request from succeeding (post services)). Something went wrong with your cluster "gke_tejal-gke1_us-central1-c_dump". Try again.
If this keeps happening please open an issue at https://github.com/GoogleContainerTools/skaffold/issues/new.

@tejal29 tejal29 requested a review from a team as a code owner March 4, 2021 19:28
@google-cla google-cla bot added the cla: yes label Mar 4, 2021
Copy link
Member

@briandealwis briandealwis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we should avoid passing in a RunContext and instead follow the spirit of #4598 with Config interfaces. We can pass in the RunContext as an interface{}, but the actual code should be trying to cast it to the appropriate config.

@codecov
Copy link

codecov bot commented Mar 11, 2021

Codecov Report

Merging #5491 (a653a00) into master (4e67ac7) will decrease coverage by 0.12%.
The diff coverage is 64.28%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5491      +/-   ##
==========================================
- Coverage   70.64%   70.51%   -0.13%     
==========================================
  Files         411      413       +2     
  Lines       15858    15992     +134     
==========================================
+ Hits        11203    11277      +74     
- Misses       3827     3884      +57     
- Partials      828      831       +3     
Impacted Files Coverage Δ
pkg/skaffold/deploy/error/errors.go 50.00% <51.61%> (+3.84%) ⬆️
pkg/skaffold/deploy/deploy_problems.go 82.35% <100.00%> (+9.27%) ⬆️
pkg/skaffold/deploy/helm/errors.go 89.79% <100.00%> (-1.12%) ⬇️
pkg/skaffold/deploy/kubectl/errors.go 14.63% <100.00%> (-7.59%) ⬇️
pkg/skaffold/deploy/kustomize/errors.go 100.00% <100.00%> (ø)
pkg/skaffold/errors/errors.go 95.31% <100.00%> (+4.98%) ⬆️
pkg/skaffold/errors/err_def.go 78.57% <0.00%> (-14.29%) ⬇️
cmd/skaffold/app/flags/image.go 72.97% <0.00%> (-11.03%) ⬇️
pkg/skaffold/build/scheduler.go 92.98% <0.00%> (-1.36%) ⬇️
cmd/skaffold/app/cmd/flags.go 89.02% <0.00%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4e67ac7...a653a00. Read the comment docs.

@google-cla
Copy link

google-cla bot commented Mar 11, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Mar 11, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@tejal29 tejal29 changed the base branch from fix_helm_1 to master March 11, 2021 19:03
@tejal29 tejal29 added cla: yes and removed cla: no labels Mar 11, 2021
@tejal29 tejal29 changed the base branch from master to refactor_event_init March 11, 2021 19:11
@tejal29 tejal29 force-pushed the refactor_event_init branch from 0243bb8 to db8a6b4 Compare March 11, 2021 19:12
@tejal29 tejal29 changed the base branch from refactor_event_init to only_runCtx March 11, 2021 21:22
Copy link
Member

@briandealwis briandealwis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing the "Cluster Internal Service Error code"

@tejal29 tejal29 added this to the v1.21.0 milestone Mar 12, 2021
@tejal29 tejal29 changed the title Add Cluster Internal Service Error code along with runcontext [Paused] [DNR] Add Cluster Internal Service Error code along with runcontext Mar 16, 2021
@tejal29 tejal29 modified the milestones: v1.21.0, v1.22.0 Mar 18, 2021
@tejal29 tejal29 changed the title [Paused] [DNR] Add Cluster Internal Service Error code along with runcontext Add Cluster Internal Service Error code along with runcontext Apr 14, 2021
@tejal29 tejal29 changed the base branch from only_runCtx to master April 15, 2021 07:12
@tejal29
Copy link
Contributor Author

tejal29 commented Apr 15, 2021

/cc @gsquared94 Please review!

)

var (
ClusterInternalSystemErr = regexp.MustCompile(".*Internal Server Error")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is exported only to be used in the error package. Maybe just move it there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was used in deploy and deploy/error package previously and hence exported.
37603cc#diff-a8b4a0e99c70a1b6f90aff684f68c5f25ec7650177441b92ceac2557ae3d9f35R68
and
37603cc#diff-8da1821d1e711320629c6b9807272ea867c0dd81e63186cfb87a9ee087b37465R66

Made private after refactor.

@tejal29 tejal29 merged commit 0e506fa into GoogleContainerTools:master Apr 22, 2021
@tejal29 tejal29 deleted the fix_helm_2 branch July 21, 2021 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

internal server error on helm install should be system error
3 participants