Cluster Autoscaler resets unneeded since time to 0s #5618

Closed as not planned
@wreed4

Description

Which component are you using?:

cluster-autoscaler

What version of the component are you using?:

Component version: 1.21.3

What k8s version are you using (kubectl version)?:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-03-01T02:23:41Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.17-eks-48e63af", GitCommit:"47b89ea2caa1f7958bc6539d6865820c86b4bf60", GitTreeState:"clean", BuildDate:"2023-01-24T09:34:06Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

What environment is this in?:

aws, EKS

What did you expect to happen?:

The "unneeded since" duration would keep increasing until the node is either removed or becomes needed again.

What happened instead?:

The "unneeded since" duration dropped to 0 for many nodes at once (but not all), even though the logs never showed them being determined as needed. This reset the --scale-down-unneeded-time timer for those nodes.

How to reproduce it (as minimally and precisely as possible):

I'm unsure what causes this. Our cluster has around 300 nodes and a fairly high churn rate, and CA runs with mostly default settings. I also saw a Watch on replicasets close during the loop in which this happened, if that matters.

Anything else we need to know?:

I'm happy to answer more questions, but I'm unsure what else to put here. The logs are far too verbose to copy in their entirety. This is the piece of code I'm looking at, and I suspect its comment may not match what actually happens:

// Update stores nodes along with a time at which they were found to be
// unneeded. Previously existing timestamps are preserved.

Metadata

Labels

area/cluster-autoscaler
area/core-autoscaler: Denotes an issue that is related to the core autoscaler and is not specific to any provider.
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
