[BUG]: DecodeSpecialTokens is exposed by the StreamingTokenDecoder but is not accessible when using an ILlamaExecutor #1201
Comments
Would you be interested in putting together a PR to fix this?
LLamaSharp should expose all native functionality of the llama.cpp API; that's the core goal of the library! Which bits are missing/inaccessible?
Sure - I'll get one started. In terms of general direction, is simply adding the parameter as a property on any Executor that uses the decoder, so it can be passed through, an appropriate design? Or would it be better to add it to one of the structures passed in to construct the Executor? The problem line is LLamaSharp/LLama/LLamaStatelessExecutor.cs line 159 (at e52f5ef).
I was trying to duplicate the behavior (naively) in my own class and project, and this internal API was the problem: LLamaSharp/LLama/Native/NativeApi.cs line 345 (at e52f5ef).
Not sure that should be exposed, though.
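For concreteness, here is a rough sketch of what the executor-property direction might look like from the caller's side. The DecodeSpecialTokens property on StatelessExecutor is hypothetical and does not exist in the current API:

```csharp
// Hypothetical usage only: a DecodeSpecialTokens property on StatelessExecutor does
// not exist today; this is just one possible shape for the change being discussed.
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/model.gguf"); // placeholder path
using var weights = LLamaWeights.LoadFromFile(parameters);

var executor = new StatelessExecutor(weights, parameters)
{
    DecodeSpecialTokens = true // would be forwarded to the StreamingTokenDecoder created inside InferAsync
};
```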
It's up to you, but from a quick skim of the relevant code it looks to me like adding it as a property on the executor would be reasonable.
That's exposed through LLamaSharp/LLama/Native/SafeLLamaContextHandle.cs Lines 867 to 873 in e52f5ef
It used to be that the entire llama.cpp API was exposed through NativeApi completely unchanged (no checks, accepting raw pointers, etc.). Over time we're slowly replacing those methods with slightly safer wrappers in the SafeHandle classes.
Awesome - I agree with that from a user perspective 😄 I'll get this fired off quickly. Makes sense that I got confused if the code is in the process of being changed - since I saw the internal NativeApi method first.
Description
In PR #777, support for DecodeSpecialTokens was added. In some of my testing I thought it would be useful to toggle this feature (right now it appears to default to false), but I found that, due to the way StreamingTokenDecoder is used (at least in StatelessExecutor), I am unable to toggle it.
This seems like an implementation bug rather than a feature request, since the feature is there, just not exposed. Additionally, I cannot write my own StatelessExecutor implementation because it relies on methods that are internal to NativeApi.
The practical effect is that models like Devstral, whose GGUF metadata classifies [TOOL_CALLS] as a special token to be emitted before any tool-call-formatted output, will generate this token but never stream it out of the executor to the caller.
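For context, a minimal sketch (assuming the usual LLamaSharp types; the model path is a placeholder) of where the flag currently lives: the decoder exposes it, but the executors do not.

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/devstral.gguf"); // placeholder path
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);

// The toggle is reachable when constructing the decoder directly...
var decoder = new StreamingTokenDecoder(context)
{
    DecodeSpecialTokens = true
};

// ...but StatelessExecutor builds its own StreamingTokenDecoder internally,
// so nothing on ILLamaExecutor lets a caller set this flag.
```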
Reproduction Steps
Run a Devstral GGUF from LLamaSharp source with an appropriate tool-call prompt; compare, for example, to Ollama in raw mode to see the [TOOL_CALLS] token. A sketch follows below.
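A hypothetical reproduction sketch, assuming a local Devstral GGUF and a prompt formatted with the model's tool-calling chat template (both placeholders):

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/devstral.gguf"); // placeholder path
using var weights = LLamaWeights.LoadFromFile(parameters);
var executor = new StatelessExecutor(weights, parameters);

// Placeholder prompt: in practice, apply the model's tool-calling chat template.
var prompt = "<prompt that should trigger a tool call>";

await foreach (var text in executor.InferAsync(prompt, new InferenceParams { MaxTokens = 256 }))
{
    // The model emits [TOOL_CALLS] before the tool-call output, but because the executor's
    // internal StreamingTokenDecoder skips special tokens, it never shows up in this stream.
    Console.Write(text);
}
```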
Environment & Configuration
Known Workarounds
No response