[Feature]: Support for Running Classification Task in Online Server #13567
Labels
- feature request (New feature or request)
- good first issue (Good for newcomers)
- help wanted (Extra attention is needed)
🚀 The feature, motivation and pitch
I would like it to be easy to stand up models for sequence classification using the vLLM online inference pattern. Classification is currently available for offline inference, but it would be nice to expose it as a server in Kubernetes, similar to how we host the OpenAI-compatible servers.
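For reference, a minimal sketch of the offline classification flow that works today (the model name is just an example, and exact output attribute names may vary by vLLM version). The request is for an online, OpenAI-compatible equivalent of this.

```python
from vllm import LLM

# Sketch of the existing offline classification flow.
# The model name is illustrative; any sequence-classification checkpoint works.
llm = LLM(model="jason9693/Qwen2.5-1.5B-apeach", task="classify")

outputs = llm.classify(["vLLM makes serving models easy."])
for output in outputs:
    # Per-class probabilities for each prompt.
    print(output.outputs.probs)
```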
Alternatives
We could train a causal LM that treats special tokens as the classification labels, then take the softmaxed logprobs for those two tokens and threshold on them. However, this requires slightly more code on the client side, as sketched below.
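A rough client-side sketch of that workaround, assuming the model is served behind the existing OpenAI-compatible completions endpoint and was fine-tuned to emit one of two label tokens. The model name, label tokens, and threshold here are all hypothetical.

```python
import math
from openai import OpenAI

# Serve a causal LM behind the existing OpenAI-compatible server and treat
# two special tokens (" POS" / " NEG" here, purely illustrative) as the labels.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.completions.create(
    model="my-finetuned-classifier",   # hypothetical model name
    prompt="Review: great product!\nLabel:",
    max_tokens=1,
    logprobs=5,                        # top logprobs for the generated token
    temperature=0.0,
)

# top_logprobs[0] is a dict mapping candidate tokens to their logprobs.
top = resp.choices[0].logprobs.top_logprobs[0]
lp_pos = top.get(" POS", float("-inf"))
lp_neg = top.get(" NEG", float("-inf"))

# Softmax over just the two label tokens, then threshold.
denom = math.exp(lp_pos) + math.exp(lp_neg)
p_pos = math.exp(lp_pos) / denom if denom > 0 else 0.5
label = "POS" if p_pos >= 0.5 else "NEG"
print(label, p_pos)
```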
Additional context
No response