It seems that GCC and Clang interpret addition between signed and unsigned integers differently, depending on their size. Why is this, and is the conversion consistent on all compilers and platforms?

Take this example:

#include <cstdint>
#include <iostream>

int main()
{

    std::cout <<"16 bit uint 2 - int 3 = "<<uint16_t(2)+int16_t(-3)<<std::endl;
    std::cout <<"32 bit uint 2 - int 3 = "<<uint32_t(2)+int32_t(-3)<<std::endl;
    return 0;
}

Result:

$ ./out.exe   
16 bit uint 2 - int 3 = -1
32 bit uint 2 - int 3 = 4294967295

In both cases the result is -1, but in the 32-bit case it was interpreted as an unsigned integer and wrapped around. I would have expected both to be converted in the same way.

So again, why do the compilers convert these so differently, and is this guaranteed to be consistent? I tested this with g++ 11.1.0, clang 12.0, and g++ 11.2.0 on Arch Linux and Debian, and got the same result each time.

Answers


  1. A 16-bit unsigned integer can be promoted to a 32-bit int without losing any values, because int can represent its whole range, so that’s what happens. That is not the case for the 32-bit integers.

  2. When you do uint16_t(2)+int16_t(-3), both operands have types that are smaller than int. Because of this, each operand is promoted to int; signed + signed gives a signed result, so you get -1 stored in that signed integer.

    When you do uint32_t(2)+int32_t(-3), both operands are the size of an int or larger, so no promotion happens. You now have unsigned + signed, so the signed operand is converted to unsigned, and -1 wraps around to the largest value representable in that unsigned type.

  3. So again, why do the compilers convert these so differently,

    Standard quotes for [language-lawyer]:

    [expr.arith.conv]

    Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way.
    The purpose is to yield a common type, which is also the type of the result.
    This pattern is called the usual arithmetic conversions, which are defined as follows:

    • Otherwise, the integral promotions ([conv.prom]) shall be performed on both operands.
      Then the following rules shall be applied to the promoted operands:

      • If both operands have the same type, no further conversion is needed.
      • Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
      • Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.
      • Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
      • Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

    [conv.prom]

    A prvalue of an integer type other than bool, char8_t, char16_t, char32_t, or wchar_t whose integer conversion rank ([conv.rank]) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.

    These conversions are called integral promotions.

    The std::uint16_t type may have a lower conversion rank than int, in which case it will be promoted when used as an operand. int may be able to represent all values of std::uint16_t, in which case the promotion will be to int. The common type of two int operands is int.

    The std::uint32_t type may have the same or a higher conversion rank than int, in which case it won’t be promoted. The common type of an unsigned type and a signed type of the same rank is the unsigned type.


    For an explanation why this conversion behaviour was chosen, see chapter "6.3.1.1 Booleans, characters, and integers" of "Rationale for International Standard—Programming Languages—C". I won’t quote the entire chapter here.


    is this guaranteed to be consistent?

    The result depends on the relative sizes of the integer types, which are implementation-defined.

  4. Why is this,

    C (and hence C++) has a rule that effectively says when a type smaller than int is used in an expression it is first promoted to int (the actual rule is a little more complex than that to allow for multiple distinct types of the same size).

    Section 6.3.1.1 of the Rationale for International Standard Programming Languages C claims that in early C compilers there were two versions of the promotion rule, "unsigned preserving" and "value preserving", and talks about why they chose the "value preserving" option. To summarise, they believed it would produce correct results in a greater proportion of situations*.

    It does not however explain why the concept of promotion exists in the first place. I would speculate that it existed because on many processors, including the PDP-11 for which C was originally designed, arithmetic operations only operated on words, not on units smaller than words. So it was simpler and more efficient to convert everything smaller than a word to a word at the start of an expression.

    On most platforms today int is 32 bits, so both uint16_t and int16_t are promoted to int. The arithmetic then produces a result of type int with a value of -1.

    On the other hand, uint32_t and int32_t are not smaller than int, so they retain their original size and signedness through the promotion step. The rules for operands of different types then come into play, and since the operands are the same size, the signed operand is converted to unsigned.

    The rationale does not seem to talk about this rule, which suggests it goes back to pre-standard C.

    and is the conversion consistent on all compilers and platforms?

    On an ANSI C or ISO C++ platform it depends on the size of int. With a 16-bit int both examples would give large positive values. With a 64-bit int both examples would give -1.

    On pre-standard implementations it’s possible that both expressions might return large positive numbers.

    * This belief is somewhat shattered by modern C compilers that treat integer overflow as an optimisation opportunity.
