Converting from short to unsigned short and preserving the bit pattern confusion - Debian

Engineer999
May 5, 2020
130 views
2 votes
5 Answers

I am working on a project where I need to get a range of signed 16-bit ints, negative and positive values, and send them to a function to analyse during unit tests.

For different reasons, the function only takes an array of unsigned 16-bit ints, so I need to store the signed ints in an unsigned 16-bit integer array and completely preserve the same bit pattern. I am using gcc (Debian 8.3.0-6) 8.3.0.

unsigned short arr[450];
unsigned short arrIndex = 0;

for (short i = -32768; i < (32767 - 100) ; i = i + 100 )
{
    arr[arrIndex] = i; 

    printf("short value is          : %d\n", i);
    printf("unsigned short value is : %d\n", arr[arrIndex]);
    arrIndex++;
}

Even though I’m telling printf to print signed values, I am surprised to see that the bit patterns appear to be different for the values less than zero. The first few values are below:

short value is           : -32768
unsigned short value is  : 32768

short value is           : -32668
unsigned short value is  : 32868

short value is           : -32568
unsigned short value is  : 32968

What is happening here, and how would I preserve the bit pattern for the values of i below zero?

Tags: c, data-conversion, signed, type-conversion, unsigned

Answers

- templatetypedef
- May 5, 2020 at 8:53 pm
- 0 votes
In C, if you call a variadic function and pass in an integer type narrower than int, the language automatically promotes it: to int if int can represent every value of the original type, otherwise to unsigned int. When you then print things out using the %d conversion, you’re seeing the promoted value as a result.

For example, when you call
```
printf("short value is          : %d\n", i);
```
the (negative) value of i is promoted to a signed int with the same value, which is why it prints out as negative. When you then call
```
printf("unsigned short value is : %d\n", arr[arrIndex]);
```
the (unsigned) value of arr[arrIndex] is promoted to an int with the same positive value (on this platform an int can hold every unsigned short value), which is why you see the positive value displayed.

To see the same output for both, change your printf calls so that printf converts each argument back to short before printing:
```
printf("short value is          : %hd\n", i);
printf("unsigned short value is : %hd\n", arr[arrIndex]);
```
Now you’ll see the values agreeing.

- adentinger
- May 5, 2020 at 8:55 pm
- 0 votes
The data is copied correctly, bit-by-bit as you wanted. It’s just the printing that displays it as an unsigned value, because arr is declared as an array of unsigned values.

%d prints its argument as an int, which on common platforms is 4 bytes. A short argument passed to printf is promoted to an int before being printed, which, depending on whether the argument in question is signed or not, will require sign-extension or not.

When printing i, which is a signed value, the value will be sign-extended before being printed. For example, if i is -1 (which is represented as 0xFFFF in a 2-byte signed value using two’s complement), then i will be promoted to the int value 0xFFFFFFFF (which is also -1, but represented with four bytes).

Now, if i is equal to -1, then arr[arrIndex] = i will indeed set arr[arrIndex] to 0xFFFF, copied bit-by-bit as you wanted. However, since arr[arrIndex] is unsigned, 0xFFFF there represents 65535. When the time comes to print arr[arrIndex], the value will not be sign-extended, because it is an unsigned value. 0xFFFF is therefore widened to 0x0000FFFF, which is equal to 65535, and printed as such.

We can verify this by forcing arr to be considered signed before being printed. That way, arr will be treated the same way i is treated.
```
#include <stdio.h>
int main(void) {
    /* The loop below runs 655 times, so 450 elements would overflow. */
    unsigned short arr[655];
    unsigned short arrIndex = 0;

    for (signed short i = -32768; i < (32767 - 100) ; i = i + 100 )
    {
        arr[arrIndex] = i;
        printf("short value is          : %d\n", i);
        /* View the unsigned array as signed before printing. */
        printf("unsigned short value is : %d\n", ((signed short*)arr)[arrIndex]);
        arrIndex++;
    }
}
```
Output:
```
short value is          : -32768
unsigned short value is : -32768
short value is          : -32668
unsigned short value is : -32668
short value is          : -32568
unsigned short value is : -32568
short value is          : -32468
unsigned short value is : -32468
short value is          : -32368
unsigned short value is : -32368
short value is          : -32268
unsigned short value is : -32268
short value is          : -32168
unsigned short value is : -32168
```
Or, we could directly declare arr as an array of signed values to achieve the same result:
```
#include <stdio.h>
int main(void) {
    /* The loop below runs 655 times, so 450 elements would overflow. */
    signed short arr[655];
    unsigned short arrIndex = 0;

    for (signed short i = -32768; i < (32767 - 100) ; i = i + 100 )
    {
        arr[arrIndex] = i;
        printf("short value is          : %d\n", i);
        printf("unsigned short value is : %d\n", arr[arrIndex]);
        arrIndex++;
    }
}
```

- chuxReinstateMonica
- May 5, 2020 at 9:08 pm
- 0 votes
> how would I preserve the bit pattern for the values of i below zero?

With the very common 2’s complement encoding the following is sufficient.
```
unsigned short us = (unsigned short) some_signed_short;
```
Back in the day, with ones’ complement and sign-magnitude machines, this was not sufficient, and code would use a union of short and unsigned short.

Yet because of how a negative 2’s complement value is converted to unsigned, the bit pattern is preserved for same-sized types.

> the bit patterns are actually different for those values less than zero.

The bit patterns are the same. They went through different conversion paths to get printed and so have different text output.

When printing short and unsigned short values, it is best to use the h printf length modifier.
```
//printf("short value is          : %d\n", i);
//printf("unsigned short value is : %d\n", arr[arrIndex]);
printf("short value is          : %hd\n", i);
printf("unsigned short value is : %hu\n", arr[arrIndex]);
```

- BobJarvis
- May 5, 2020 at 9:28 pm
- 0 votes
The values are being copied correctly. Let’s look at the following code:
```
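#include <stdio.h>

/* Print the same 16-bit value as signed decimal, unsigned decimal, and hex. */
void printit(const char *name, short int val)
  {
  printf("%s  %hd  %hu  0x%hX\n", name, val, val, val);
  }

int main(void)
  {
  /* These constants do not fit in a signed short; gcc stores the
     low 16 bits unchanged (implementation-defined conversion). */
  short int v1 = 0x8000;
  short int v2 = 0x8064;
  short int v3 = 0x80C8;

  printit("v1", v1);
  printit("v2", v2);
  printit("v3", v3);
  }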
```
Here I’ve created three signed short variables and set them to bit patterns. Forget “positive” and “negative” for a moment – I’m just shoving a bit pattern into those variables. In the subroutine printit those values are printed as signed decimal, unsigned decimal, and hex (to verify it’s all the same bit pattern). Now, look at the results:
```
v1  -32768  32768  0x8000
v2  -32668  32868  0x8064
v3  -32568  32968  0x80C8
```
Now you can see that I just took the values you used (-32768, -32668, and -32568) and assigned them to the variables. The only difference is that I converted them to hexadecimal first. Same bit pattern. Same results. But whenever the signed decimal interpretation of a bit pattern is negative, it is NOT the same as the unsigned decimal interpretation of that bit pattern. I suggest reading up on One’s Complement for binary numbers, and Two’s Complement representation of negative binary numbers.

Please check the for loop limits: going from -32768 to < (32767 - 100) in steps of 100 fills 655 array elements, and you have declared only 450.

Also, to print an unsigned short value, you need to use the %u format specifier (or the equivalent %hu, since short values are promoted to int when passed to printf()).

Use this example:

#include <stdio.h>

int main()
{
        short i;
        for (i = -32768; i < (32767 - 100); i += 100) {
                unsigned short j = i;
                printf("Signed  : %d\n", i);
                printf("Unsigned: %u\n", j);
        }

        return 0;
}

It will produce:

$ a.out
Signed  : -32768
Unsigned: 32768
Signed  : -32668
Unsigned: 32868
Signed  : -32568
Unsigned: 32968
Signed  : -32468
...
Signed  : -268
Unsigned: 65268
Signed  : -168
Unsigned: 65368
Signed  : -68
Unsigned: 65468
Signed  : 32
Unsigned: 32
Signed  : 132
Unsigned: 132
...
Signed  : 32432
Unsigned: 32432
Signed  : 32532
Unsigned: 32532
Signed  : 32632
Unsigned: 32632
$ _
