
[Feature]: Support for Running Classification Task in Online Server #13567

Open · 1 task done
sam-h-bean opened this issue Feb 19, 2025 · 6 comments · May be fixed by #17032
Labels: feature request · good first issue · help wanted

Comments

@sam-h-bean

🚀 The feature, motivation and pitch

I would like it to be easy to stand up models for sequence classification using the vLLM online inference pattern. Classification is currently available for offline inference, but it would be nice to expose it as a server in Kubernetes, similar to how we host the OpenAI-compatible servers.
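For context, this is roughly the offline pattern the request asks to mirror online; a minimal sketch, assuming a classification checkpoint such as the example model below (swap in your own classifier):

```python
from vllm import LLM

# Offline sequence classification as vLLM already supports it.
# The checkpoint here is only an example classification model.
llm = LLM(model="jason9693/Qwen2.5-1.5B-apeach", task="classify")

(output,) = llm.classify(["vLLM makes serving models easy."])
print(output.outputs.probs)  # one probability per class label
```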

Alternatives

Alternatively, we could train a causal LM that treats special tokens as the classification labels, then softmax the logprobs of those two tokens and apply a threshold. However, this requires slightly more code on the client side, as sketched below.
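A sketch of that client-side workaround against the existing OpenAI-compatible completions endpoint; the model name and the " yes"/" no" label tokens are assumptions here (a real fine-tune would reserve its own special tokens):

```python
import math

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical label tokens; a real fine-tune would define its own.
LABEL_TOKENS = (" yes", " no")

resp = client.completions.create(
    model="my-finetuned-classifier",  # hypothetical model name
    prompt="Is this review positive? Review: Great product!\nAnswer:",
    max_tokens=1,
    logprobs=5,  # return top-5 logprobs for the generated token
)

top = resp.choices[0].logprobs.top_logprobs[0]  # token -> logprob
# Softmax over just the two label tokens (a missing token counts as -inf).
scores = [math.exp(top.get(tok, float("-inf"))) for tok in LABEL_TOKENS]
total = sum(scores)
p_yes = scores[0] / total if total > 0 else 0.0
print("positive" if p_yes >= 0.5 else "negative", p_yes)
```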

Additional context

No response

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot at the bottom right corner of the documentation page, which can answer many frequently asked questions.
sam-h-bean added the feature request label on Feb 19, 2025
DarkLight1337 added the help wanted and good first issue labels on Feb 20, 2025
@dipatidar

@DarkLight1337 I'd like to work on this.

@DarkLight1337
Member

Thanks for helping out!

@frieda-huang

frieda-huang commented Apr 20, 2025

Hi @dipatidar, are you still working on this? If not, I would like to pick up the issue!

@frieda-huang

Hi @DarkLight1337! I’m digging into the “sequence classification” issue. I noticed that OpenAI deprecated the /v1/classifications endpoint in December 2022. If we add our own /v1/classifications route in vLLM for online inference, we’d be diverging from the current OpenAI spec. Do you think it’s worth offering that “bonus” endpoint for ergonomics, or should we stick strictly to OpenAI’s live endpoints (completions/embeddings) for compatibility, as suggested here?

@DarkLight1337
Member

I think to keep things simple, we can just create an online version of LLM.classify. To distinguish it from the OpenAI endpoint, we can have ours located at /classify.

@frieda-huang
Copy link

> I think to keep things simple, we can just create an online version of LLM.classify. To distinguish it from the OpenAI endpoint, we can have ours located at /classify.

Gotcha. Could you assign the issue to me?
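For illustration, a minimal sketch of how the proposed /classify route might be called once implemented; the request and response shapes here are assumptions, not the shipped schema (see #17032 for the actual implementation):

```python
import requests

# Hypothetical request shape for the proposed /classify route;
# the real schema is defined by the linked PR.
resp = requests.post(
    "http://localhost:8000/classify",
    json={
        "model": "jason9693/Qwen2.5-1.5B-apeach",  # example classifier
        "input": ["vLLM makes serving models easy."],
    },
)
resp.raise_for_status()
print(resp.json())  # expected: per-class probabilities for each input
```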
