I know this has been discussed before, but I didn't see a specific proposal filed for it yet and I think it's important.
Unexpected integer overflow can lead to serious bugs, including bugs in Go itself. Go's bounds-checking on slices and arrays mitigates some of the harmful effects of overflow, but not all of them. For example, programs that make system calls may pass data structures into the kernel, bypassing Go's usual bounds checks. Programs that marshal data structures to be sent over the wire (such as protocol buffers) may send silently corrupted data instead of returning errors as they ought to. And programs that use unsafe to access addresses with offsets are vulnerable to exactly the same overflow bugs as in C.
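To make the failure mode concrete, here is a minimal sketch of a size computation that wraps silently under Go 1 semantics. The helper name totalSize and the choice of 32-bit arithmetic are assumptions made for illustration; they do not come from any particular library.

```go
package main

import "fmt"

// totalSize is a hypothetical helper: it computes the byte size of count
// records of recordSize bytes each, in 32-bit arithmetic as a wire format
// or kernel structure might require.
func totalSize(count, recordSize int32) int32 {
	// In Go 1 this multiplication wraps around silently on overflow.
	return count * recordSize
}

func main() {
	// 2^20 records of 2^12 bytes need 2^32 bytes, which does not fit in
	// an int32. The product wraps to 0 and no error is reported.
	fmt.Println(totalSize(1<<20, 1<<12))
}
```

A caller that passes this result to a system call or writes it into a message header would send a size of 0 without any bounds check ever firing.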
In my experience, Go programs and libraries are often written assuming "reasonable inputs" and no overflow. For such programs, it would be clearer for overflow to cause a run-time panic (similar to dividing by zero) rather than silently wrapping around. Even in the case where the unintended overflow is subsequently caught by a slice bounds check, reporting the error at the overflowing operation rather than the slice access would make the source of the bug easier to diagnose.
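As a rough illustration of the proposed behavior, the following sketch emulates a panicking addition in today's Go. The function checkedAdd is hypothetical; under the proposal the plain + operator would behave this way, and no helper would be needed.

```go
package main

import (
	"fmt"
	"math"
)

// checkedAdd is a hypothetical helper that panics instead of wrapping.
// It tests the operands before adding, so it never relies on wraparound.
func checkedAdd(a, b int64) int64 {
	if (b > 0 && a > math.MaxInt64-b) || (b < 0 && a < math.MinInt64-b) {
		panic(fmt.Sprintf("integer overflow: %d + %d", a, b))
	}
	return a + b
}

func main() {
	fmt.Println(checkedAdd(40, 2)) // in range, no panic

	// Recover only to show that the panic is raised at the overflowing
	// operation itself rather than at some later slice access.
	defer func() {
		fmt.Println("recovered:", recover())
	}()
	checkedAdd(math.MaxInt64, 1)
}
```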
The potential performance impact of this proposal is similar to bounds-checking in general, and likely lower than using arbitrary-precision ints (#19623). The checks can be omitted when the compiler can prove the result is within bounds, any new branches will be trivially predictable (they'll occupy some CPU resources in the branch-predictor but otherwise add little overhead), and in some cases the checks might be able to use bounds-check instructions or other hardware traps.
For the subset of programs and libraries that intentionally make use of wraparound, we could provide one of several alternatives:
- "comma, ok" or "comma, carry" forms (#6815, "proposal: spec: extend comma-ok expressions to + - * / arithmetic") that ignore overflow panics, analogous to how the "comma, ok" form of a type assertion ignores the panic from a mismatched type.
- Separate "integer mod 2ⁿ" types (requiring explicit conversions from ordinary integer types), perhaps named along the lines of int32wrap or int32mod.
- Implicit wrapping only for unsigned types (uint32 and friends), since they're used for bit-manipulation code more often than the signed equivalents.
Those alternatives could also be used to optimize out the overflow checks in inner-loop code when the programmer has already validated the inputs by some other means.
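For comparison, the standard library's math/bits package (Go 1.12 and later) already offers something close to a "comma, carry" form for unsigned integers, as a function rather than as syntax. The sketch below wraps bits.Add64 in a hypothetical helper, addCarry; it only illustrates the shape such an alternative could take and is not the mechanism this proposal defines.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// addCarry is a hypothetical wrapper around bits.Add64. It returns the sum
// modulo 2^64 and reports whether the addition carried out of bit 63.
func addCarry(x, y uint64) (uint64, bool) {
	s, c := bits.Add64(x, y, 0) // third argument is the carry-in
	return s, c != 0
}

func main() {
	fmt.Println(addCarry(1, 2))              // small operands, no carry
	fmt.Println(addCarry(math.MaxUint64, 1)) // wraps to 0 with a carry
}
```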
[Edit: added this section in response to comments.]
Concretely, the proposed changes to the spec are:
Integer operators
For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships:
[…]
As an exception to this rule, if the dividend x is the most negative value for the int type of x, the quotient q = x / -1 is equal to x (and r = 0).
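The exception quoted above can be observed in Go 1 today. This is a minimal sketch with a hypothetical helper, divRemNegOne, that keeps the operands non-constant so that the division happens at run time.

```go
package main

import (
	"fmt"
	"math"
)

// divRemNegOne is a hypothetical helper that divides x by -1 at run time
// and returns both the quotient and the remainder.
func divRemNegOne(x int32) (q, r int32) {
	d := int32(-1)
	return x / d, x % d
}

func main() {
	q, r := divRemNegOne(math.MinInt32)
	// Under Go 1 semantics the quotient equals the dividend and the
	// remainder is 0, as the quoted spec text says.
	fmt.Println(q == math.MinInt32, r)
}
```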
[…]
The shift operators shift the left operand by the shift count specified by the right operand. They implement arithmetic shifts if the left operand is a signed integer and logical shifts if it is an unsigned integer. The result of a logical shift is truncated to the bit width of the type: a logical shift never results in overflow. Shifts behave as if the left operand is shifted n times by 1 for a shift count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the same as x/2 but truncated towards negative infinity.
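A small sketch of the shift rules in this paragraph, using hypothetical helpers halve and doubleByte. For negative operands the arithmetic right shift and the division round differently, and a logical left shift on an unsigned type simply drops the high bit.

```go
package main

import "fmt"

// halve returns x >> 1 and x / 2 so the two roundings can be compared.
func halve(x int32) (shifted, divided int32) {
	return x >> 1, x / 2
}

// doubleByte shifts an unsigned 8-bit value left by one; the result is
// truncated to 8 bits, so the shift itself never overflows.
func doubleByte(b uint8) uint8 {
	return b << 1
}

func main() {
	fmt.Println(halve(-3))        // shift rounds down, division rounds toward zero
	fmt.Println(doubleByte(0x80)) // the high bit is discarded
}
```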
[…]
Integer overflow
If the result of any arithmetic operator or conversion to an integer type cannot be represented in the type, a run-time panic occurs.
An expression consisting of arithmetic operators and/or conversions between integer types used in an assignment or initialization of the special form
v, ok = expr
v, ok := expr
var v, ok = expr
var v, ok T1 = expr
yields an additional untyped boolean value. The value of ok is true if the results of all arithmetic operators and conversions could be represented in their respective types. Otherwise it is false and the value of v is computed as follows. No run-time panic occurs in this case.
For unsigned integer values, the operations +, -, *, and << are computed modulo 2ⁿ upon overflow, where n is the bit width of the unsigned integer's type. Loosely speaking, these unsigned integer operations discard high bits upon overflow, and programs may rely on "wrap around".
For signed integers, the operations +, -, *, and << are computed using two's complement arithmetic and truncated to the bit width of the signed integer's type upon overflow. No exception is raised as a result of overflow. A compiler may not optimize code under the assumption that overflow does not occur. For instance, it may not assume that x < x + 1 is always true.
If the dividend x of a quotient or remainder operation is the most negative value for the int type of x, evaluation of x / -1 overflows and its result upon overflow is equal to x. In contrast, evaluation of x % -1 does not overflow and yields a result of 0.
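The v, ok = expr form described in this section is not valid Go today, so the sketch below emulates it for a single multiplication with a hypothetical function, mulOK. It assumes Go 1 conversion semantics (truncation) to produce the wrapped value, and it is meant only to show what the two results would be, not how a compiler would implement the form.

```go
package main

import "fmt"

// mulOK is a hypothetical stand-in for the proposed form
//     v, ok := a * b
// on int32 operands. v is the two's complement result truncated to 32
// bits; ok reports whether the exact product fits in an int32.
func mulOK(a, b int32) (v int32, ok bool) {
	wide := int64(a) * int64(b) // an int32 product always fits in int64
	v = int32(wide)             // truncates under Go 1 semantics
	return v, int64(v) == wide
}

func main() {
	fmt.Println(mulOK(6, 7))         // fits: ok is true
	fmt.Println(mulOK(1<<16, 1<<16)) // exact product is 2^32: ok is false
}
```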
[…]
Conversions between numeric types
For the conversion of non-constant numeric values, the following rules apply:
- When converting between integer types, if the value is a signed integer, it is sign extended to implicit infinite precision; otherwise it is zero extended. If the value cannot be represented in the destination type, an overflow occurs; see the section on integer overflow. Upon overflow, the result is truncated to fit in the result type's size. For example, if v := uint16(0x10F0), then w, _ := uint32(int8(v)) results in w == 0xFFFFFFF0.
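The arithmetic in that example can be checked in Go 1 today, where conversions truncate without panicking. The helper name convert is an assumption made for illustration; the proposed w, _ := ... form itself is not valid in current Go.

```go
package main

import "fmt"

// convert is a hypothetical helper that performs the two conversions from
// the example using Go 1 semantics.
func convert(v uint16) uint32 {
	// int8(v) keeps the low byte 0xF0, which is -16 as a signed 8-bit value.
	// uint32(...) sign extends -16 and truncates it to 32 bits.
	return uint32(int8(v))
}

func main() {
	fmt.Printf("%#X\n", convert(0x10F0))
}
```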
This proposal is obviously not compatible with Go 1, but I think we should seriously consider it for Go 2.