Skip to content

Restructure Admin TaskList commands to operate on multiple types #6712

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 6, 2025

Conversation

natemort
Copy link
Member

@natemort natemort commented Mar 5, 2025

While Activity/Decision TaskLists with a given name are independent, they typically represent a common purpose and we have never operated on just a single type. Handling them entirely separately within the admin commands doubles the amount of commands needed to run to operate on the TaskList and makes it harder to understand whether a given TaskList name is used for decision and/or activity tasks. Particularly as we move TaskList partition data from dynamic config to the DB it's important that we provide good operator ergonomics to avoid errors.

What changed?
Update the Admin TaskList commands with the following:

  • The absence of the tasklisttype flag now indicates both Activity and Decision. It previously was inconsistent between defaulting to Decision or being required.
  • The describe command will output one row per type specified rather than a single row.
  • The describe command will output that a TaskList has 1 read/write partition rather than no output if there is no config.
  • The describe command will no longer output the list of pollers and instead print out the number of pollers. This functionality is still available via the non-admin version of the command, which does nothing other than printing the pollers.
  • The describe command now supports json as an output format.
  • The list command will consolidate Activity and Decision TaskLists with a given name to a single row, including separate columns for the number of decision and activity pollers.
  • The list command now sorts its output by TaskList name.
  • The list command now supports json as an output format.
  • The update-partition command now operates on multiple TaskListTypes at once rather than Decision or Activity. It performs a safety check on both before performing any updates.

Why?

  • Simplify operating on TaskLists

How did you test it?

  • Unit tests
  • Manual testing against a local Cadence instance

Potential risks

  • This is a breaking change to CLI users of the admin tasklist commands. Their usage is likely rather low, with the commands to update partition data being newly added.

Release notes

Documentation Changes

prettyPrintJSONObject(getDeps(c).Output(), response)
return nil
}

fmt.Println("Task Lists for domain " + domain + ":")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this diff, but mixing print statements with the stdout writer is :sadpanda:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, good call.

} else if c.String(FlagTaskListType) == "" {
taskListTypes = []types.TaskListType{types.TaskListTypeActivity, types.TaskListTypeDecision}
} else {
return nil, commoncli.Problem("Invalid task list type: valid types are [activity, decision]", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, commoncli.Problem("Invalid task list type: valid types are [activity, decision]", nil)
return nil, commoncli.Problem("Invalid task list type: valid types are 'activity', 'decision', or empty (both)", nil)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 283 to 284
msg := fmt.Sprintf("Successfully updated %s:%s", tl.Name, tlType)
_, err = getDeps(c).Output().Write([]byte(msg))
Copy link
Member

@Groxx Groxx Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be able to

fmt.Fprintf(getDeps(c).Output(), "Successfully updated %s:%s", tl.Name, tlType)

and I think errcheck won't complain about the fprintf err. but this is just an FYI / if-convenient-simplification, no need to change imo

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That reads better, switched to that. Goland at least complains about it, so I'll add the explicit ignore.

if len(errors) > 0 {
return commoncli.Problem("Potentially unsafe operation. Specify '--force' to proceed anyway", multierr.Combine(errors...))
}
if !hasPollers {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to silently allow changing config when specifying more than one type, but only one has pollers? kinda seems like a "warn first" scenario to me.

unless I'm misreading this?

Copy link
Member Author

@natemort natemort Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you're describing is correct.

We pretty frequently partition a tasklist for both types despite only a single type being in use. Looking at our DynamicConfig we've never once partitioned a TaskList and used the TaskListType as one of the filters. So I think it provides better continuity and doesn't really create any problems for the system.

The intention behind this check is to catch cases where we specify the incorrect name. We recently had this happen with a large TaskList that we intended to change to 10 partitions but we partitioned task_list_name instead of task-list-name which caused some really bad hotsharding during the next monthly burst.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, definitely useful either way. and makes sense if it's kinda common 👍

Comment on lines +165 to +167
slices.SortFunc(table, func(a, b TaskListRow) int {
return strings.Compare(a.Name, b.Name)
})
Copy link
Member

@Groxx Groxx Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'll want slices.SortStableFunc here, or sort by both name and type. Sort is unstable, so you might see activity/decision reorders under the same name.

e.g. I think this is possible currently:

decision  tasklist-A   3
activity  tasklist-A   3
activity  tasklist-B   3
decision  tasklist-B   3

but SortStable will ensure it's always decision/activity/decision/activity (for same name pairs).

I don't think the tests exercise this, so it's rather minor at the moment, but probably still worth doing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually changed the table format entirely, it's now this:

Name Activity Pollers Decision Pollers
tasklist-A 3 3
tasklist-B 3 3

So the name is unique and we've aggregated the counts across both types. My goal was to align it more with the Cadence UI, which lists the pollers of a TaskList and has a checkbox for whether each has a Decision/Activity poller.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I see, it's just iteratively filling the same row in both loops.

yea that's fine then 👍

Comment on lines +123 to +127
output := td.consoleOutput()
var result map[types.TaskListType]*types.DescribeTaskListResponse
err := json.Unmarshal([]byte(output), &result)
require.NoError(t, err)
assert.Equal(t, expected, result)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh. practical - I kinda like it.

},
}, nil).Times(1)
},
expectedError: "Decision",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could probably use a more descriptive expected-error, or some kind of comment. this is rather vague about what it is checking, since "Decision" could be in lots of error messages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@Groxx Groxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only minor comments / tackle whatever you agree with, otherwise looks good to me

While Activity/Decision TaskLists with a given name are independent, they typically represent a common purpose and we have never operated on just a single type. Handling them entirely separately within the admin commands doubles the amount of commands needed to run to operate on the TaskList and makes it harder to understand whether a given TaskList name is used for decision and/or activity tasks. Particularly as we move TaskList partition data from dynamic config to the DB it's important that we provide good operator ergonomics to avoid errors.

Update the Admin TaskList commands with the following:
- The absence of the `tasklisttype` flag now indicates both Activity and Decision. It previously was inconsistent between defaulting to Decision or being required.
- The describe command will output one row per type specified rather than a single row.
- The describe command will output that a TaskList has 1 read/write partition rather than no output if there is no config.
- The describe command will no longer output the list of pollers and instead print out the number of pollers. This functionality is still available via the non-admin version of the command, which does nothing other than printing the pollers.
- The describe command now supports json as an output format.
- The list command will consolidate Activity and Decision TaskLists with a given name to a single row, including separate columns for the number of decision and activity pollers.
- The list command now sorts its output by TaskList name.
- The list command now supports json as an output format.
- The update-partition command now operates on multiple TaskListTypes at once rather than Decision or Activity. It performs a safety check on both before performing any updates.
@natemort natemort merged commit 58bf63b into cadence-workflow:master Mar 6, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants