(torchx/scheduler) Fill hostnames for each replica in slurm scheduler's describe API #1080
This pull request was exported from Phabricator. Differential Revision: D76485112
…'s describe API (#1080) Summary: Additionally fill hostname, resource (cpu, memMB), image, entrypoint in `describe_squeue` for each role/replica. Differential Revision: D76485112
LGTM
    return DescribeAppResponse(
        app_id=app_id,
        roles=list(roles.values()),
        roles_statuses=list(roles_statuses.values()),
        state=app_state,
        msg=msg,
msg isn't needed?
yea, `msg` defaults to an empty string if not specified. We were just setting `msg=state`, so no real functional value was added, and `describe_sacct` didn't set `msg` either.
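A minimal sketch of why dropping the argument is behavior-preserving when the value passed equals the default. `DescribeAppResponseSketch` is a hypothetical stand-in written for illustration, not TorchX's actual `DescribeAppResponse`; the only thing it takes from the thread is that `msg` defaults to an empty string.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DescribeAppResponseSketch:
    """Hypothetical stand-in for illustration only; not the TorchX class."""

    app_id: str = "<NOT_SET>"
    state: str = "UNKNOWN"
    msg: str = ""  # assumed default, as stated in the review reply
    roles: List[str] = field(default_factory=list)


# Passing the default explicitly and omitting it produce equal objects.
explicit = DescribeAppResponseSketch(app_id="1234", state="RUNNING", msg="")
implicit = DescribeAppResponseSketch(app_id="1234", state="RUNNING")
assert explicit == implicit
```

Callers that want a human-readable status can still read `state` directly, which is the same information the old `msg=state` assignment carried.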
…'s describe API (#1080) Summary: Additionally fill hostname, resource (cpu, memMB), image, entrypoint in `describe_squeue` for each role/replica. Reviewed By: d4l3k Differential Revision: D76485112
…ion and hostnames to mesh_spec (#296)
Summary: Pull Request resolved: #296
TorchX's `status` API returns a struct that has a `replica.hostname` field. However, it is not always filled for all schedulers. pytorch/torchx#1080 makes the slurm scheduler in TorchX fill out the hostname information. This PR adds a `hostnames` field to `monarch.tools.mesh_spec.MeshSpec` and fills it with the hostnames returned by TorchX. This information will be used in PR (5/n) to implement a `TorchXAllocator`.
Reviewed By: suo
Differential Revision: D76847192
fbshipit-source-id: 4d55083009ef9dd6ed46717fd375f5a49ee86a95
Summary: Use `scontrol` to implement the describe API that fills out the hostnames for each replica.
Differential Revision: D76485112
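As a rough illustration of what deriving per-replica hostnames from `scontrol` output can involve, here is a sketch that is not the PR's code. It assumes the job record arrives as whitespace-separated `Key=Value` tokens (the shape printed by `scontrol show job`) and that the node list uses a single bracket group such as `node[01-02]`. The sample line and both helper names are made up for this example. On a real cluster, `scontrol show hostnames <nodelist>` covers the general host-list syntax.

```python
import re
from typing import Dict, List

# Made-up sample resembling one job record; not real scheduler output.
SAMPLE = "JobId=1234 JobName=trainer-0 JobState=RUNNING NodeList=node[01-02] NumCPUs=8"


def parse_job_fields(line: str) -> Dict[str, str]:
    """Split whitespace-separated Key=Value tokens into a dict."""
    fields: Dict[str, str] = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that carry no '='
            fields[key] = value
    return fields


# Matches prefix[body]suffix with exactly one bracket group.
_BRACKETED = re.compile(
    r"^(?P<prefix>[^\[\],]*)\[(?P<body>[^\[\]]+)\](?P<suffix>[^\[\],]*)$"
)


def expand_simple_nodelist(nodelist: str) -> List[str]:
    """Expand 'node[01-03,07]' style lists (single bracket group only)."""
    match = _BRACKETED.match(nodelist)
    if match is None:
        # No bracket group: treat as plain comma-separated hostnames.
        return [host for host in nodelist.split(",") if host]
    hosts: List[str] = []
    for part in match["body"].split(","):
        lo, sep, hi = part.partition("-")
        if not sep:
            hosts.append(f"{match['prefix']}{lo}{match['suffix']}")
            continue
        width = len(lo)  # keep zero padding, e.g. 01, 02
        for i in range(int(lo), int(hi) + 1):
            hosts.append(f"{match['prefix']}{i:0{width}d}{match['suffix']}")
    return hosts


fields = parse_job_fields(SAMPLE)
hostnames = expand_simple_nodelist(fields["NodeList"])
```

The simple tokenizer breaks on values that contain spaces, and the expander rejects multiple bracket groups by falling back to a comma split, so both would need hardening before being used against real scheduler output.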