Skip to content

Debug probes #5474

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 16, 2021
Merged

Debug probes #5474

merged 6 commits into from
Mar 16, 2021

Conversation

briandealwis
Copy link
Member

Fixes: #5425

Description
Cause skaffold debug to rewrite any readiness/liveness/startup probes using HTTP-Get to set the timeout value to 10 minutes (from the default of 1s). This change allows debugging calls from such probes, and also avoids timeouts when a probe is made but the application is suspended because of an ongoing debug session.

User facing changes (remove if N/A)

  • skaffold debug alters Kubernetes HTTP-Get probes to have a 10 minute timeout to allow debugging probe calls

Follow-up Work (remove if N/A)
Consider making this configurable somehow.

@briandealwis briandealwis requested review from loosebazooka and a team as code owners March 2, 2021 04:58
@google-cla google-cla bot added the cla: yes label Mar 2, 2021
@codecov
Copy link

codecov bot commented Mar 2, 2021

Codecov Report

Merging #5474 (402914c) into master (827139d) will decrease coverage by 0.22%.
The diff coverage is 91.17%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5474      +/-   ##
==========================================
- Coverage   71.48%   71.25%   -0.23%     
==========================================
  Files         397      400       +3     
  Lines       14573    14968     +395     
==========================================
+ Hits        10418    10666     +248     
- Misses       3387     3513     +126     
- Partials      768      789      +21     
Impacted Files Coverage Δ
pkg/skaffold/debug/transform.go 90.17% <91.17%> (+0.17%) ⬆️
pkg/skaffold/test/test_factory.go 72.09% <0.00%> (-6.86%) ⬇️
pkg/skaffold/util/tar.go 50.66% <0.00%> (-5.34%) ⬇️
pkg/skaffold/runner/new.go 63.13% <0.00%> (-4.46%) ⬇️
cmd/skaffold/app/cmd/diagnose.go 33.33% <0.00%> (-1.45%) ⬇️
pkg/skaffold/docker/image.go 78.13% <0.00%> (-1.40%) ⬇️
pkg/skaffold/deploy/helm/deploy.go 75.48% <0.00%> (-0.50%) ⬇️
pkg/skaffold/event/event.go 90.78% <0.00%> (ø)
pkg/skaffold/errors/errors.go 87.23% <0.00%> (ø)
pkg/skaffold/runner/runner.go 0.00% <0.00%> (ø)
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 827139d...402914c. Read the comment docs.

@briandealwis briandealwis added the kokoro:force-run forces a kokoro re-run on a PR label Mar 8, 2021
@kokoro-team kokoro-team removed the kokoro:force-run forces a kokoro re-run on a PR label Mar 8, 2021
Copy link
Contributor

@tejal29 tejal29 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
few nits

}

changed := false
var minTimeout int32 = 10 * 60 // make it configurable?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add seconds to clarify this is in seconds as default duration is Nanosecond

Suggested change
var minTimeout int32 = 10 * 60 // make it configurable?
var minTimeoutSeconds int32 = 10 * 60 // make it configurable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change to a duration.

func rewriteProbes(metadata *metav1.ObjectMeta, podSpec *v1.PodSpec) bool {
annotation, found := metadata.Annotations[DebugConfigAnnotation]
if !found {
logrus.Warn("no debug config found")
Copy link
Contributor

@tejal29 tejal29 Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug annotation not present on a pod was not surfaced as a warning before. Is there a reason why we want to show it now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move checking the pod debug annotation at L202 and skip both if no annotation is found?

 func transformPodSpec(metadata *metav1.ObjectMeta, podSpec *v1.PodSpec, retrieveImageConfiguration configurationRetriever, debugHelpersRegistry string) bool {
	// skip annotated podspecs — allows users to customize their own image
	if _, found := metadata.Annotations[DebugConfigAnnotation]; found {
		return false
	}
        containers := rewriteContainers(metadata, podSpec, retrieveImageConfiguration, debugHelpersRegistry)
	timeouts := rewriteProbes(metadata, podSpec) // must rewriteProbes after
	return containers || timeouts
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right: this shouldn't be a warning. There's an assumption that the probes are only rewritten for containers that were transformed for debugging (I'll make that explicit).

@briandealwis
Copy link
Member Author

@tejal29 could you take a quick peek: I added support for disabling the probes and also overriding the probe timeout value too.

@tejal29 tejal29 merged commit 664a82c into GoogleContainerTools:master Mar 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Debug docs should warn about re-entrancy with probes
3 participants