BYOC: Support template build for team clusters #766
base: main
Conversation
Pull Request Overview
This PR introduces support for building templates on remote clusters by abstracting build placement and integrating a cluster edge pool.
- Added database migrations and queries to track clusters and build nodes.
- Defined a BuildPlacement interface with local and clustered implementations (see the sketch below).
- Updated API handlers and cache to propagate cluster and node IDs.
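For orientation, here is a minimal sketch of what such a placement abstraction could look like, based only on this overview and the review comments below. Everything except the GetLogs signature quoted later in the review is an assumption, not the PR's actual code.

```go
// Illustrative sketch only; the real interfaces and package name in this PR may differ.
package templatemanager

import "context"

// PlacementLogsProvider mirrors the interface quoted in the review below.
type PlacementLogsProvider interface {
	GetLogs(ctx context.Context, templateID string, buildID string, offset *int32) ([]string, error)
}

// BuildPlacement (hypothetical shape) decides where a template build runs and
// how its logs are fetched. One implementation would target the local
// orchestrator, another a node from the cluster edge pool.
type BuildPlacement interface {
	PlacementLogsProvider

	// StartBuild schedules the build on the placement's target node.
	StartBuild(ctx context.Context, templateID string, buildID string) error
}
```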
Reviewed Changes
Copilot reviewed 65 out of 65 changed files in this pull request and generated 3 comments.
File | Description
---|---
packages/db/migrations/20250611172307_clusgter_node_for_builds.sql | Adds cluster_node_id column to track build node assignment
packages/api/internal/template-manager/placement_local.go | Implements local build placement and log fetching via Loki
packages/api/internal/template-manager/template_manager.go | Central TemplateManager wiring for placement, creation, deletion
packages/api/internal/template-manager/template_manager_test.go | Tests for polling status adjusted for new client fields
Comments suppressed due to low confidence (1)
packages/api/internal/template-manager/template_manager_test.go:118
- The test initializer still uses statusClient, but PollBuildStatus no longer has that field. Update the test to construct it with the new fields (placement, templateManagerClient, etc.).
c := &PollBuildStatus{
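A hypothetical fragment of the adjustment being asked for. Only placement and templateManagerClient are named in the comment above; the mock names are invented for illustration.

```go
// Hypothetical: construct PollBuildStatus from its new fields instead of the
// removed statusClient field. Field and mock names here are assumptions.
c := &PollBuildStatus{
	placement:             placementMock,
	templateManagerClient: templateManagerClientMock,
}
```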
packages/db/migrations/20250611172307_clusgter_node_for_builds.sql (outdated review thread, resolved)
Need to implement the migrations and schema changes removed from #795.
I would mainly unify the syncing logic; it's now there twice, doing basically the same thing.
packages/db/migrations/20250624001048_cluster_for_templates.sql (outdated review thread, resolved)
We have multiple places where we need a workflow that synchronizes an item store with some pool, each doing a lot of different things. The basics, adding newly discovered items and removing items that are no longer discovered, are always needed, so we abstracted them.
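A rough sketch of the kind of abstraction being described, assuming a generic store interface. Names and signatures here are illustrative, not the actual code in the synchronization package.

```go
// Illustrative only: a generic "source vs. pool" reconciliation step.
package synchronization

// Store abstracts the source of truth and the pool being kept in sync with it.
// Method names are assumptions for this sketch.
type Store[SourceItem any, SourceKey comparable, PoolItem any] interface {
	SourceList() ([]SourceItem, error)         // items currently discovered in the source
	SourceKey(item SourceItem) SourceKey       // stable key for a source item
	PoolList() map[SourceKey]PoolItem          // items currently held in the pool
	PoolInsert(key SourceKey, item SourceItem) // add a newly discovered item
	PoolRemove(key SourceKey, item PoolItem)   // drop an item that is no longer discovered
}

type Synchronize[SourceItem any, SourceKey comparable, PoolItem any] struct {
	Store Store[SourceItem, SourceKey, PoolItem]
}

// syncOnce performs one reconciliation round: insert what is newly discovered,
// remove what has disappeared. Everything else stays in place.
func (s *Synchronize[SourceItem, SourceKey, PoolItem]) syncOnce() error {
	source, err := s.Store.SourceList()
	if err != nil {
		return err
	}

	seen := make(map[SourceKey]struct{}, len(source))
	pool := s.Store.PoolList()

	for _, item := range source {
		key := s.Store.SourceKey(item)
		seen[key] = struct{}{}
		if _, ok := pool[key]; !ok {
			s.Store.PoolInsert(key, item)
		}
	}
	for key, item := range pool {
		if _, ok := seen[key]; !ok {
			s.Store.PoolRemove(key, item)
		}
	}
	return nil
}
```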
logs = append(logs, line["message"].(string))
}
}
l, err := a.templateManager.GetLogs(ctx, buildUUID, templateID, team.ClusterID, buildInfo.ClusterNodeId, params.LogsOffset)
Suggested change:
- l, err := a.templateManager.GetLogs(ctx, buildUUID, templateID, team.ClusterID, buildInfo.ClusterNodeId, params.LogsOffset)
+ l, err := a.templateManager.GetLogs(ctx, buildUUID, templateID, team.ClusterID, buildInfo.ClusterNodeID, params.LogsOffset)
)

type PlacementLogsProvider interface {
	GetLogs(ctx context.Context, buildId string, templateId string, offset *int32) ([]string, error)
Suggested change:
- GetLogs(ctx context.Context, buildId string, templateId string, offset *int32) ([]string, error)
+ GetLogs(ctx context.Context, templateID string, buildID string, offset *int32) ([]string, error)
TemplateId string

ClusterId *uuid.UUID
ClusterNodeId *string
Suggested change:
- TemplateId string
- ClusterId *uuid.UUID
- ClusterNodeId *string
+ TemplateID string
+ ClusterID *uuid.UUID
+ ClusterNodeID *string
localClient: client,
localClientMutex: sync.RWMutex{},
localClientStatus: infogrpc.ServiceInfoStatus_OrchestratorHealthy,
is it always healthy when you add it?
}
func (tm *TemplateManager) getBuilderClient(clusterID *uuid.UUID, nodeID *string, placement bool) (*grpclient.GRPCClient, metadata.MD, PlacementLogsProvider, error) {
	if clusterID == nil || nodeID == nil {
		tm.localClientMutex.RLock()
Maybe put the RUnlock in a defer here?
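One possible reading of the suggestion, sketched as a fragment against the quoted code. The helper closure and how the value is used afterwards are assumptions, not the PR's implementation.

```go
// Sketch: scope the read lock to a small closure so defer releases it right
// after the local client is read, instead of holding it until the end of
// getBuilderClient. Field names mirror the quoted snippet; usage is assumed.
if clusterID == nil || nodeID == nil {
	client := func() *grpclient.GRPCClient {
		tm.localClientMutex.RLock()
		defer tm.localClientMutex.RUnlock()
		return tm.localClient
	}()
	_ = client // ... continue with the local placement path ...
}
```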
err = utils.UnwrapGRPCError(err)
if err != nil {
	zap.L().Error("Failed to get health status of template manager", zap.Error(err))
	tm.setLocalClientStatus(orchestratorinfo.ServiceInfoStatus_OrchestratorDraining)
draining or unhealthy?
package synchronization |
nice ❤️
return p, nil
}

func (p *Pool) syncBackground() {
It should probably be named something like start; the function doesn't (and shouldn't) know whether it's going to be run in the background or not.
Store: poolSynchronizationStore{pool: p},
}

synchronize.SyncInBackground(p.close, poolSyncInterval, poolSyncTimeout, true)
Same goes for this name: SyncInBackground actually runs it in the foreground.
return nil
}

func (s *Synchronize[SourceItem, SourceKey, PoolItem]) SyncInBackground(cancel chan struct{}, syncInterval time.Duration, syncRoundTimeout time.Duration, runInitialSync bool) {
I don't like passing the cancel channel here. I would prefer a separate method to close/stop the loop, so the cancel becomes an internal property of the Synchronize type.
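A sketch of the API shape this suggests, under the assumption that the cancel channel becomes internal state with explicit Start/Close methods. Names and fields are illustrative, and the type parameters of the real Synchronize are omitted for brevity.

```go
// Illustrative only: loop lifecycle owned by the type instead of a channel parameter.
package synchronization

import "time"

type Synchronize struct {
	done chan struct{}
}

func New() *Synchronize {
	return &Synchronize{done: make(chan struct{})}
}

// Start runs sync rounds on a fixed interval until Close is called.
func (s *Synchronize) Start(interval, roundTimeout time.Duration, runInitialSync bool) {
	if runInitialSync {
		s.syncRound(roundTimeout)
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-s.done:
			return
		case <-ticker.C:
			s.syncRound(roundTimeout)
		}
	}
}

// Close stops the loop started by Start.
func (s *Synchronize) Close() {
	close(s.done)
}

func (s *Synchronize) syncRound(timeout time.Duration) {
	// one reconciliation round bounded by timeout (elided in this sketch)
	_ = timeout
}
```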
Implements an abstraction for template managers in the API to support edge clusters.