[sprintf] Wrong trailing zeros for %g (and disregarded '#' for float specifiers) #1791

Open
stefano-zanotti-88 opened this issue May 4, 2025 · 0 comments

The standard says:

g, G: A double argument representing a floating-point number is converted in style f or e (or in style F or E in the case of a G conversion specifier), depending on the value converted and the precision. Let P equal the precision if nonzero, 6 if the precision is omitted, or 1 if the precision is zero. Then, if a conversion with style E would have an exponent of X:
if P > X ≥ −4, the conversion is with style f (or F) and precision P − (X + 1).
otherwise, the conversion is with style e (or E) and precision P − 1.
Finally, unless the # flag is used, any trailing zeros are removed from the fractional portion of the result and the decimal-point character is removed if there is no fractional portion remaining.

This means that %g must use a dynamic precision:

  • generate the number as per %f or %e, with the precision derived from P as described above (with the usual rounding and trailing-0 padding)
  • if '#' is not specified, remove all trailing 0s from the fractional part, then remove the decimal point if it is now trailing
  • if '#' is specified, trailing 0s must not be removed, just like for %f
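The rule quoted from the standard can be sketched as a small reference routine. This is only an illustration built on the host `snprintf`, not code from stb_sprintf: the helper name `format_g` and its parameters are made up for this example, and it assumes a finite input and a C library whose `%e` and `%f` are themselves conforming.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of stb_sprintf): formats a finite `v` the way
   the standard describes %g (alt == 0) or %#g (alt != 0).
   prec < 0 means "precision omitted". */
static void format_g(char *out, size_t cap, double v, int prec, int alt)
{
    char tmp[64];
    int P = (prec < 0) ? 6 : (prec == 0 ? 1 : prec);
    int X;
    char *dot, *end, *p;

    assert(isfinite(v) && cap > 0 && P <= 40); /* limits of this sketch */

    /* Step 1: X is the exponent that an %e conversion with precision P-1
       would print, so rounding is already taken into account. */
    snprintf(tmp, sizeof tmp, "%.*e", P - 1, v);
    X = atoi(strchr(tmp, 'e') + 1);

    /* Step 2: pick the style and the precision that goes with it. */
    if (P > X && X >= -4)
        snprintf(out, cap, alt ? "%#.*f" : "%.*f", P - (X + 1), v);
    else
        snprintf(out, cap, alt ? "%#.*e" : "%.*e", P - 1, v);

    if (alt)
        return; /* '#': keep trailing zeros and the decimal point */

    /* Step 3: drop trailing zeros of the fraction, then a dangling point. */
    dot = strchr(out, '.');
    if (!dot)
        return;
    end = strchr(out, 'e'); /* start of the exponent suffix, if any */
    if (!end)
        end = out + strlen(out);
    p = end;
    while (p > dot + 1 && p[-1] == '0')
        --p;
    if (p == dot + 1)
        p = dot; /* no fraction digits left: remove the point too */
    memmove(p, end, strlen(end) + 1);
}
```

With a conforming C library, `format_g(buf, sizeof buf, 1.234, -1, 1)` should leave `"1.23400"` in `buf`, and the same call with `alt` set to 0 should leave `"1.234"`.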

STB vs standard-conforming (only the '#' calls are wrong):

printf("%g" , 1.234); // "1.234" vs "1.234"
printf("%g" , 1.0);   // "1"     vs "1"
printf("%g" , 0.0);   // "0"     vs "0"
printf("%g" , 1e-9);  // "1e-09" vs "1e-09"
printf("%#g", 1.234); // "1.234" vs "1.23400"
printf("%#g", 1.0);   // "1"     vs "1.00000"
printf("%#g", 0.0);   // "0"     vs "0.00000"
printf("%#g", 1e-9);  // "1e-09" vs "1.00000e-09"

Also note that "precision" for "%g" means the number of significant digits, not the number of decimal digits as for "%f". (For ±0 the exponent X is taken to be 0, so the single leading 0 counts as the first significant digit.) STB handles that correctly; I'm only mentioning it as something to watch when fixing the current issue, so as not to introduce new problems there.
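To make the difference between significant digits and decimal digits concrete, here is a small sketch using the host C library. The helper `fmt3` is hypothetical and exists only for this illustration; it fixes the precision at 3 and switches between the two conversions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats `v` with precision 3, using either the
   'f' conversion (3 decimal digits) or the 'g' conversion (3 significant
   digits). Returns `buf` for convenience. */
static const char *fmt3(char *buf, size_t cap, char conv, double v)
{
    assert(conv == 'f' || conv == 'g');
    snprintf(buf, cap, conv == 'f' ? "%.3f" : "%.3g", v);
    return buf;
}
```

For 123.456 the 'f' variant gives `"123.456"` while the 'g' variant gives `"123"`. For 0.000123456 the 'f' variant gives `"0.000"` while the 'g' variant gives `"0.000123"`, since leading zeros are not significant digits.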

Fix:
replace lines 719-745:

stb/stb_sprintf.h

Lines 719 to 745 in f056911

         // read the double into a string
         if (stbsp__real_to_str(&sn, &l, num, &dp, fv, (pr - 1) | 0x80000000))
            fl |= STBSP__NEGATIVE;
         // clamp the precision and delete extra zeros after clamp
         n = pr;
         if (l > (stbsp__uint32)pr)
            l = pr;
         while ((l > 1) && (pr) && (sn[l - 1] == '0')) {
            --pr;
            --l;
         }
         // should we use %e
         if ((dp <= -4) || (dp > (stbsp__int32)n)) {
            if (pr > (stbsp__int32)l)
               pr = l - 1;
            else if (pr)
               --pr; // when using %e, there is one digit before the decimal
            goto doexpfromg;
         }
         // this is the insane action to get the pr to match %g semantics for %f
         if (dp > 0) {
            pr = (dp < (stbsp__int32)l) ? l - dp : 0;
         } else {
            pr = -dp + ((pr > (stbsp__int32)l) ? (stbsp__int32) l : pr);
         }

with the following:

         // read the double into a string
         if (stbsp__real_to_str(&sn, &l, num, &dp, fv, (pr - 1) | 0x80000000, (f[0] == 'G')))
            fl |= STBSP__NEGATIVE;
         if (dp == STBSP__SPECIAL) {
            goto dofloatfromg;
         }

         // clamp the precision and delete extra zeros after clamp
         if (l > 1 + (stbsp__uint32)pr)
            l = 1 + pr;

         // should we use %e
         // X = dp-1 = the exponent that an %e conversion would have
         // P = n = the precision specified by the caller
         // if P > X >= -4, use %f with P' = P - (X + 1)  [below]
         // Otherwise, use %e with P' = P - 1
         // Also, if '#' is not specified, any trailing 0 (and decimal point) are removed
         if (!((stbsp__int32)pr > dp-1 && dp-1 >= -4)) {
            if (pr)
               --pr;
            if(!(fl & STBSP__LEADING_0X)) {
               if(pr + 1 > l) {
                  pr = l - 1;
               }
               while ((pr) && (l <= 0 || sn[l - 1] == '0')) {
                  --pr;
                  --l;
               }
               if(l < 0)
                  l = 0;
            }
            goto doexpfromg;
         }
         // this is the insane action to get the pr to match %g semantics for %f
         if((fl & STBSP__LEADING_0X)) {
            pr = pr - dp;
            if(pr < 0) {
              pr = 0;
            }
         } else {
            if (dp > 0) {
               pr = (dp < (stbsp__int32)l) ? l - dp : 0;
            } else {
               pr = -dp + ((pr > (stbsp__int32)l) ? (stbsp__int32) l : pr);
            }
            while ((pr) && (l <= 0 || sn[l - 1] == '0')) {
               --pr;
               --l;
            }
            if(l < 0)
               l = 0;
         }

I suspect this could be simplified further (the original was more optimized than this), but I stopped before losing my sanity.
It might also be that some corner cases are still wrong, though all the calls mentioned above now produce the correct output.

Note that to get a complete fix, we also need to fix another loosely related bug: '#' is ignored for all float specifiers.
To fix that one, every statement of the type:

*s++ = stbsp__period;

should be guarded not by:

if (pr)

but by:

if (pr || (fl & STBSP__LEADING_0X))

The only place that affects the current bug is line 914, though this related fix should also be applied to lines 672, 770, 842, and 933.
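The effect of the guard is easiest to see in isolation. The sketch below is not stb_sprintf code: `emit_fixed` and its parameters are hypothetical, with `alt` standing in for `fl & STBSP__LEADING_0X` and `pr` for the number of fraction digits still to be written.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical emitter: writes the integer digits `ip`, then the decimal
   point, then the first `pr` digits of `fp`. The point is written when
   there are fraction digits OR when '#' was requested (alt != 0). */
static void emit_fixed(char *out, size_t cap, const char *ip,
                       const char *fp, int pr, int alt)
{
    size_t n = strlen(ip);

    assert(pr >= 0 && (size_t)pr <= strlen(fp));
    assert(n + 1 + (size_t)pr + 1 <= cap); /* digits, '.', fraction, NUL */

    memcpy(out, ip, n);
    if (pr || alt) /* the corrected guard; the old one was just `if (pr)` */
        out[n++] = '.';
    memcpy(out + n, fp, (size_t)pr);
    out[n + (size_t)pr] = '\0';
}
```

With `pr == 0`, the call produces `"1"` when `alt` is 0 and `"1."` when `alt` is set, which matches what a conforming `printf` prints for `"%.0f"` and `"%#.0f"` with the value 1.0.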

Note also that the main fix (the one for %g) also corrects another non-standard behavior: the precision should be ignored when printing nan/inf. So, we have:

printf("%.1g", NAN);    // "NaN" rather than "N"
printf("%.1g", 1/0.0);  // "Inf" rather than "I"
printf("%.1g", -1/0.0); // "-Inf" rather than "-I"
printf("%.2g", NAN);    // "NaN" rather than "Na"
printf("%.2g", 1/0.0);  // "Inf" rather than "In"
printf("%.2g", -1/0.0); // "-Inf" rather than "-In"

The shortened nan/inf strings are checked in the STB tests (which will need to be changed), and they seem like a deliberate choice. Maybe some versions of MSVC do the same?
However, the standard requires the full "nan"/"inf" in all cases (also with the proper case, as already reported).
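As a rough illustration of the behavior the standard asks for, the following sketch formats a non-finite value without consulting the precision at all. The helper `format_special` is hypothetical and is not how stb_sprintf is structured; it uses the plain "nan"/"inf" spellings, lower case for g/e/f and upper case for G/E/F.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "[-]inf" or "[-]nan" for a non-finite `v`
   (upper case when `upper` is set). There is deliberately no precision
   parameter, since the precision does not apply to these values. */
static void format_special(char *out, size_t cap, double v, int upper)
{
    const char *s;

    assert(!isfinite(v) && cap >= 5); /* room for "-inf" plus the NUL */

    if (isnan(v))
        s = upper ? "NAN" : "nan";
    else
        s = upper ? "INF" : "inf";
    snprintf(out, cap, "%s%s", signbit(v) ? "-" : "", s);
}
```

For example, `format_special(buf, sizeof buf, -INFINITY, 1)` leaves `"-INF"` in `buf`, whatever precision the caller's format string carried.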
