Support DeepSpeed FastGen #1538

Open
@thiner

Description

Is your feature request related to a problem? Please describe.

No.

Describe the solution you'd like

DeepSpeed FastGen is an inference framework developed by Microsoft. They claim that it is two times faster than vLLM. https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-fastgen

Describe alternatives you've considered

No.

Additional context

I haven't tested FastGen; I was just drawn in by their blog post. I searched this repo and it seems no one has mentioned this framework yet, so I'd like to bring it to the community's attention.
