Patches based implementation for DRA snapshot. #8090
Conversation
Skipping CI for Draft Pull Request.
/assign towca
merged[key] = value
}

for key := range patch.Deleted {
Is this meant to address map access race conditions between:
- enumerate through the set of patches and begin copying into a new, merged map
- one or more patches in the set begin a delete operation
func (p *patch[K, V]) Delete(key K) {
p.Deleted[key] = true
delete(p.Modified, key)
}
- ensure that our source patch wasn't copied from a state in between the 1st of the above 2 statements by double-checking the same key against any corresponding existence in the Deleted map?
(If so have we considered the tradeoffs of inverting the order of operations in the Delete() method?)
I would say that Snapshot is not thread-safe in general, but that's a fair call - I'll change the order of operations in the Delete() method so that it becomes consistent with the other data-manipulation functions. If we ever need it to be truly suitable for concurrent usage (and judging by the simulations code, we don't), we'd probably need to use sync.Map to actually store the data; in the current implementation we use bare maps purely for the sake of the performance gain.
The reason why we account for Deleted keys here is to handle the following situation:
Prerequisite: the type in the example is PatchSet[int, int]
Patch#1: Modified: {1: 1, 2: 2, 3: 3}, Deleted: {}
Patch#2: Modified: {}, Deleted: {1, 2}
Patch#3: Modified: {1: 5}, Deleted: {}
The result of the AsMap() call on the PatchSet holding these 3 patches should be {1: 5, 3: 3}, because keys 1 and 2 are deleted in the second patch, but key 1 is reintroduced in Patch#3.
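As a rough Go sketch of that scenario (the newPatchSet helper is hypothetical and the field types are assumed from the snippets above, not the exact API in this PR):

ps := newPatchSet(
	&patch[int, int]{Modified: map[int]int{1: 1, 2: 2, 3: 3}, Deleted: map[int]bool{}},
	&patch[int, int]{Modified: map[int]int{}, Deleted: map[int]bool{1: true, 2: true}},
	&patch[int, int]{Modified: map[int]int{1: 5}, Deleted: map[int]bool{}},
)
// Key 2 stays deleted; key 1 is reintroduced by the last patch.
// Expected: map[int]int{1: 5, 3: 3}
result := ps.AsMap()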
If I misunderstood your comment - LMK
Yeah, we're aligned, do we have a UT case for that example scenario?
I'm not sure I get the difference between the order of operations in Delete(), we're not planning for any concurrency here for now, right?
(I do get the need for iterating over Deleted and deleting from merged here, but not sure how it relates to Jack's comment 😅)
The reason to invert the order in Delete() (actually delete first, take an accounting of that deletion second) is so that we don't have to double-check that the data is really deleted before composing the merge from the data.
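Concretely, the inverted version would look something like this sketch (same fields as the snippet above):

func (p *patch[K, V]) Delete(key K) {
	// Drop any pending modification first, then record the deletion.
	delete(p.Modified, key)
	p.Deleted[key] = true
}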
I agree with @towca's other comment that this is a great addition, and was really fun to review. I did have to try really hard to find something to criticize! 😀😂
At least at this point in time I don't see Snapshot being used concurrently anywhere - and if it ever is, I think the DRA Snapshot would be the only blocker there, so let's keep it as is at least for now, WDYT?
I agree with @towca's other comment that this is a great addition, and was really fun to review. I did have to try really hard to find something to criticize! 😀😂
I can't remember the last time I got to implement something this technical and fun, so it's coming from both sides :P
At least at this point in time I don't see Snapshot being used concurrently anywhere - and if it ever is, I think the DRA Snapshot would be the only blocker there, so let's keep it as is at least for now, WDYT?
Yup, the current version looks good to me! I was just curious about the comment.
Reviewed everything but the tests (I'll do that in a second pass). In general the PR looks very good, the only bigger comments are about the caching strategy.
Two general comments:
- The patchSet is a very elegant solution, I'm a big fan! (also yay generics, this would be a nightmare otherwise..)
- (for future reference) Breaking the PR down into smaller, meaningful commits would have seriously helped the review here. For example a structure like: change Snapshot to be a pointer (this just pollutes the diff) -> introduce patch&patchset -> add caching to patchset -> use patchset in Snapshot -> use the new dra.Snapshot in ClusterSnapshot implementations -> add benchmark.
@@ -22,18 +22,16 @@ import (
 	resourceapi "k8s.io/api/resource/v1beta1"
 )

-type snapshotClassLister Snapshot
+type snapshotClassLister struct {
Why change this (here and for the other subtypes)? IMO the previous pattern was more readable - no need for redundant methods on Snapshot.
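For context, the two patterns being compared look roughly like this (the wrapper's field is a guess for illustration; the full definition isn't shown in this excerpt):

// Previous pattern: a named type over Snapshot, so the lister methods
// convert the receiver and work on Snapshot's data directly.
type snapshotClassLister Snapshot

// New pattern: a thin wrapper struct that forwards to methods on Snapshot.
type snapshotClassLister struct {
	snapshot *Snapshot
}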
I don't have a specific preference tbh, but the first-iteration implementation was much harder in terms of interaction with patchSet, and I wanted to keep all interaction with it inside the Snapshot itself. While this introduces some duplication, it lets me think of these wrapper objects as just wrappers, without any logic apart from snapshot function calls.
If you want, I can change it back - right now it doesn't make much of a difference from my point of view.
I don't have a specific preference tbh, but first iteration implementation was way harder in terms of interaction with patchSet
Interesting, I'm really curious why?
If you want - I may change it back, right now it doesn't make a lot of difference from my point of view
I think right now it only introduces some code duplication for no clear reason. I'd prefer to revert to the previous approach unless it's less extensible for the future or something (I still don't understand the limitations here).
Interesting, I'm really curious why?
As I've mentioned previously, the first iteration was based on each operation having a counterpart that cancels its effect - essentially, if you have an Add1 function on the API surface, you should have Remove1 available at least internally. We dropped this idea because of the unoptimized behavior of the Revert operation, which was truly O(N) and much worse than what's implemented right now.
The rationale behind bringing all the logic into the Snapshot was hiding the complexity behind human-readable functions, as each operation essentially looked like this:
// Illustrative pseudocode from the first iteration (type names approximate).
func DoSomething() error {
	return applyPatch(operation{
		Apply:  func() { doSomethingImpl() },
		Revert: func() { undoSomethingImpl() },
	})
}
I wanted to hide this complexity within the Snapshot itself and not bring it into the wrapper objects, so that they would simply call internal handlers made available specifically for them - that was the idea.
@mtrqq Thanks for addressing my comments! I see the following remaining items:
We're planning to cut 1.33 CA on 2025-06-11 and this is the most important change we want to get in. Points 3. and 4. are test-specific and IMO we can address them in a follow-up PR after the release. Points 1. and 2. should be fairly quick to address so that we can merge this PR before the cut, right?
// Set marks a key-value pair as modified in the current patch.
// If the key was previously marked as deleted, the deletion mark is removed.
func (p *patch[K, V]) Set(key K, value V) {
Why the public methods on the private type here? I don't see us using this as the fulfillment of an interface. Do we want Set|Delete|Get|IsDeleted instead of set|delete|get|isDeleted? (same comment for private patchSet type)
Right now it doesn't serve any interface fulfilment; instead I'm using this notation to separate the public and private API surfaces of the structs. Lower-case methods signal that they are not supposed to be used to interact with the object by an external entity, while Set|Delete|Get|IsDeleted more explicitly tell that they are meant to be used. WDYT?
I don't want to slow down progress, but in my view, until "external entity" actually means an external entity, it's preferable to keep the methods private to express the fact that they are currently implementation details underneath the publicly facing DRA business logic.
If we anticipate reusability we can put it into a common utils directory and import it?
cc @towca for thoughts as well
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: mtrqq. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@mtrqq: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Instead of exposing a DeepCloning API, dynamicresources.Snapshot now exposes Fork/Commit/Revert methods mimicking the operations in ClusterSnapshot. Instead of storing full-scale copies of the DRA snapshot, we now store a single object with a list of patches inside, which allows very efficient Clone/Fork/Revert operations while sacrificing some performance and memory allocation on fetch requests and in-place object modifications (ResourceClaims).
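Conceptually, the patch-layering idea works roughly like the sketch below (simplified; the Modified/Deleted fields follow the patch type discussed in the review above, but the exact types and method set in the PR differ):

type patch[K comparable, V any] struct {
	Modified map[K]V
	Deleted  map[K]bool
}

type patchSet[K comparable, V any] struct {
	patches []*patch[K, V] // the last patch is the writable "current" layer
}

// Fork pushes a new empty patch; later writes only touch that layer.
func (ps *patchSet[K, V]) Fork() {
	ps.patches = append(ps.patches, &patch[K, V]{Modified: map[K]V{}, Deleted: map[K]bool{}})
}

// Revert drops the topmost patch, cheaply undoing everything since the last Fork.
func (ps *patchSet[K, V]) Revert() {
	if len(ps.patches) > 0 {
		ps.patches = ps.patches[:len(ps.patches)-1]
	}
}

// Commit folds the topmost patch into the layer below it.
func (ps *patchSet[K, V]) Commit() {
	if len(ps.patches) < 2 {
		return
	}
	top := ps.patches[len(ps.patches)-1]
	below := ps.patches[len(ps.patches)-2]
	for k, v := range top.Modified {
		below.Modified[k] = v
		delete(below.Deleted, k)
	}
	for k := range top.Deleted {
		delete(below.Modified, k)
		below.Deleted[k] = true
	}
	ps.patches = ps.patches[:len(ps.patches)-1]
}

Because Fork/Commit/Revert only push, pop, or fold a single patch, they stay cheap regardless of how many objects the snapshot tracks; the cost shifts to reads, which have to consult the patch stack.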
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
This PR improves the performance of the DRA snapshot, which directly impacts scheduling-simulation speed and the cluster-autoscaler decision-making process overall. In this PR we change its state-management approach from a deep-copy-based one to a patch-based one.
Which issue(s) this PR fixes:
Fixes #7681
Special notes for your reviewer:
This PR removes the original implementation of dynamicresources.Snapshot and replaces it with the patch-based approach. We could keep the original implementation for the sake of safety and the ability to switch the store implementation in a running cluster autoscaler, but that would require maintaining two implementations. I attempted to use the clone-based Snapshot as a baseline for the new changes, but it only resulted in complex code while yielding minimal benefits.
In the change you may find a benchmark test which uses an exaggerated scheduling scenario to compare the performance of the two implementations. What I've found is that the patch-based option is roughly 50x faster in terms of overall runtime while allocating 40x less memory on the heap, primarily because Fork/Commit/Revert operations are used A LOT in the suite.
Here's a few profiling insights in differences:
CPU Profile / Forking
CPU Profile / GC
Memory Profile / Allocated Space
Memory Profile / Allocated Objects
Grab a copy of profiling samples -> Profiles.zip
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: