Adding top-n-sigma sampler #489

ikawrakow · 2025-06-03T09:30:33Z

Given popular demand, adding top-n $\sigma$ sampler.

Set to off by default.

Add to sampling chain using `--sampling-chain ...n...` or `--samplers ...top-n-sigma...`
Set parameter using `--top-n-sigma value`

saood06 · 2025-06-03T09:48:28Z

Since this PR is still open, could the documentation for this and XTC be added to `examples/server/README.md` and `examples/main/README.md`?

ikawrakow · 2025-06-03T10:04:08Z

Sure, will do.

What else do people want for sampling?

DRY?

saood06 · 2025-06-03T10:23:49Z

> What else do people want for sampling?
>
> DRY?

That does seem to be more popular than the other two you just added (based on what I've seen reported in other places). Looking at the main/README.md of mainline that is the only one that is missing. (We also have TFS which was removed in mainline due to low usage and bugs).

I do personally think DRY is the best repeat penalty (of the ones that are publicly used), so I would use it if I ever encounter looping again. I wouldn't turn it on unless needed, though, since it definitely affects quality if left on when there is no looping to avoid. Fortunately, I haven't seen looping in a while (I think because newer models have this issue a lot less, if at all).

saood06 · 2025-06-03T10:38:23Z

examples/main/README.md

+### XTC Sampling
+
+-   --xtc-probability p: xtc probability (default: 0.0 => disabled)
+-   --xtc-threshold t  : xtc threshold   (default: 1.0 => disabled)
+


Maybe add something like the following:

XTC probability sets how likely the XTC sampler is to engage.
XTC threshold is the lower bound on the probability a token needs to be considered a "top choice". When the sampler engages, only the lowest-probability top choice is kept.

And maybe change `### XTC Sampling` to `### XTC Sampling (Exclude Top Choices)`, since the description above refers to the full name.

saood06 · 2025-06-03T10:41:21Z

examples/main/README.md

+### Top-n-sigma Sampling
+
+Sets all logits $L_i$ to $-\infty$ where $L_i < L_{\rm max} - n \sigma$. Here $L_{\rm max}$ is the maximum logit, $\sigma$ is the logit standard deviation, and $n$ is the top-n-sigma parameter.
+
+-   --top-n-sigma t          top-n-sigma parmeter (default: 0.0 => disabled)
+


Maybe add something letting people know that increasing top-n-sigma results in more tokens being considered, while decreasing it results in fewer tokens being considered. Not all users will be able to figure that out from the mathematical description you provided.
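To make the effect of the parameter concrete, here is a minimal Python sketch of the rule quoted above: every logit below $L_{\rm max} - n\sigma$ is set to $-\infty$. It is only an illustration, not the C++ code from this PR. The function name `top_n_sigma` is hypothetical, and using the population standard deviation over all logits is an assumption.

```python
import math

def top_n_sigma(logits, n):
    """Mask logits below L_max - n * sigma with -inf.

    Hypothetical helper for illustration; n <= 0 leaves the logits
    unchanged, mirroring the documented "0.0 => disabled" default.
    """
    if n <= 0:
        return list(logits)
    l_max = max(logits)
    mean = sum(logits) / len(logits)
    # Assumption: population standard deviation over all logits.
    sigma = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))
    cutoff = l_max - n * sigma
    return [x if x >= cutoff else -math.inf for x in logits]

if __name__ == "__main__":
    logits = [4.0, 3.0, 2.0, 0.0, -1.0]
    for n in (1.0, 2.0, 3.0):
        kept = sum(math.isfinite(x) for x in top_n_sigma(logits, n))
        print(f"n={n}: {kept} of {len(logits)} tokens kept")
```

Running it on the toy logits shows the number of surviving tokens growing as `n` grows, which is the behavior the README note would describe in words.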

Ph0rk0z · 2025-06-03T11:33:23Z

Yep, DRY is good. XTC threshold is usually .1 and below to get anything meaningful out of it. Not sure how that compares here. Super interesting how this one is going to compare to the one I stole from mainline.

saood06 · 2025-06-03T12:22:28Z

examples/main/README.md

+The function of this sampler is conrolled by `--xtc-probability` and `--xtc-threshold`. `--xtc-probability` takes values between
+0 and 1 (<=0 turns this sampler off) and defines the probability for randomly invoking the sampler. `--xtc-threshold`
+defines the token probability threshold. Tokens with probability greater than this threshold will be excluded from the sampling.
+The sampler is turned off for `threshold > 0.5`.


"conrolled" -> controlled

This isn't really accurate, as the lowest "top choice" is retained. As it is written it makes it seem like it removes all tokens with probability greater than the threshold.

Also, I think the conditions for turning it off should be stated together, instead of having the probability one at the beginning and the threshold one at the end.
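The distinction raised here (the least likely "top choice" is retained) can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation in this repository: the function name `xtc` and its `rng` argument are hypothetical, a "top choice" is taken to be a token with probability strictly greater than the threshold (following the README wording), and the function returns surviving token indices without renormalizing.

```python
import random

def xtc(probs, probability, threshold, rng=random):
    """Return the indices of tokens that survive XTC (hypothetical helper).

    probability: chance that the sampler engages for this token position.
    threshold:   tokens with probability above it count as "top choices".
    """
    everything = list(range(len(probs)))
    # Disabled cases as stated in the README text under review.
    if probability <= 0 or threshold > 0.5:
        return everything
    # Engage only with the given probability.
    if rng.random() >= probability:
        return everything
    top = [i for i, p in enumerate(probs) if p > threshold]
    # With fewer than two top choices there is nothing to exclude.
    if len(top) < 2:
        return everything
    # Keep the least likely top choice, drop the other top choices.
    survivor = min(top, key=lambda i: probs[i])
    removed = set(top) - {survivor}
    return [i for i in everything if i not in removed]

if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]
    print(xtc(probs, probability=1.0, threshold=0.1))
```

With `probability=1.0` the sampler always engages, so the example drops the two most likely tokens and keeps the third, along with everything at or below the threshold.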

ikawrakow · 2025-06-03T13:38:26Z

Why don't you make your changes on top of the PR? Or, we merge the way it is and you make a new PR with better description.

saood06 · 2025-06-03T14:04:51Z

> Or, we merge the way it is and you make a new PR with better description.

Sure. I can do that.

Kawrakow added 2 commits June 3, 2025 12:23

Adding top-n-sigma sampler

1ce02eb

Fix typos in XTC PR

63c6f15

Update README.md for main and server

f75ee6d

saood06 reviewed Jun 3, 2025


Kawrakow added 2 commits June 3, 2025 14:59

More README

dc8b4c8

More README

106e326

saood06 reviewed Jun 3, 2025


Ph0rk0z mentioned this pull request Jun 3, 2025

Feature Request: Top n-sigma sampler #440

Closed

4 tasks

ikawrakow merged commit f6d5fbd into main Jun 3, 2025
