I’ve written a class that splits a long double
into an integer value and a binary exponent (for some precise calculations).
My problem is pretty hard to reproduce because the class usually works great but on one specific machine I’ve tested it on, it was losing a few least significant bits on each conversion. (More on that later.)
Here is the code. (It needs to stay in separate files for this bug to occur.)
SplitLD.hh:

```cpp
#include <cstdint>

// Splits `long double` into an integer and an exponent.
class SplitLD
{
public: // Everything is public to make the example easier to test.
    std::uintmax_t integer;
    int exponent;

    SplitLD(const long double number);
    operator long double() const;
};
```
SplitLD.cc:

```cpp
#include <cfloat>
#include <climits>
#include <cmath>
#include <limits>

#include "SplitLD.hh"

SplitLD::SplitLD(long double number) // For the sake of simplicity, we ignore negative numbers and various corner cases.
{
    static_assert(FLT_RADIX == 2);
    static_assert(sizeof(std::uintmax_t) * CHAR_BIT >= std::numeric_limits<long double>::digits);
    // The following two operations change the exponent to make the represented value a whole number.
    number = std::frexp(number, &exponent);
    number = std::ldexp(number, std::numeric_limits<long double>::digits);
    exponent -= std::numeric_limits<long double>::digits;
    integer = number; // cast from `long double` to `std::uintmax_t`
}

SplitLD::operator long double() const
{
    long double number = integer; // cast from `std::uintmax_t` to `long double`
    number = std::ldexp(number, exponent);
    return number;
}
```
main.cc:

```cpp
#include "SplitLD.hh"

int main()
{
    const long double x = 12345.67890123456789l; // arbitrarily chosen number for the test
    const SplitLD y = x;
    const long double z = y;
    return z == x ? 0 : 1;
}
```
If you try to run this code it will probably work fine.
However, I have one machine, on which the problem can be consistently reproduced.
The conditions that (might) trigger the error are as follows:
- The floating-point type has to be `long double`. I tried `float` and `double` and they seem to work fine.
- Both GCC and Clang behave similarly and I can reproduce the problem on both.
- If I put all the code into a single file, it starts to work, possibly because functions are inlined or evaluated during compilation.
- I encountered the error on WSL (Windows Subsystem for Linux) with Ubuntu.
- It may have something to do with hardware configuration.
I’ve tried to print the binary representation of the numbers (formatted for readability).
(I’m pretty sure that the second group is the sign, the third one is the exponent and the fourth one is the mantissa. I’m not sure what the first group is, but it’s probably just padding.)
Normally the binary values are as follows (for `y` I print only the `integer`):
x 000000000000000000000000000000000000000000000000'0'100000000001100'1100000011100110101101110011000111100010100111101011101110000010
y 1100000011100110101101110011000111100010100111101011101110000010
z 000000000000000000000000000000000000000001000000'0'100000000001100'1100000011100110101101110011000111100010100111101011101110000010
However, when the error occurs, they look like this:
x 000000000000000001111111100110001001110111101001'0'100000000001100'1100000011100110101101110011000111100010100111101011101110000010
y 1100000011100110101101110011000111100010100111101011110000000000
z 000000000000000001111111100110001001110111101001'0'100000000001100'1100000011100110101101110011000111100010100111101100000000000000
What can cause this problem?
Is the program well formed?
Is there UB somewhere, or anything that allows the compiler to do some weird optimization?
Here is a live demo. However, its utility is very limited because it works correctly.
(It includes the code that prints the binary representations, which was omitted here to keep the example short.)
Update 1:
I’ve modified the test program to print binary data after each operation, in order to determine which exact instruction causes the data loss.
It looks like the guilty instructions are specifically the assignments of `long double` to `std::uintmax_t` and of `std::uintmax_t` to `long double`.
Neither `std::frexp` nor `std::ldexp` seems to change the mantissa.
Here’s how it looks on the machine where the error occurs:
========== `long double` to `std::uintmax_t` ==========
Initial `long double`
000000000000000001111111001100101001101100000010'0'100000000001100'1100000011100110101101110011000111100010100111101011101110000010
Calling `frexp`...
000000000000000001111111001100101001101100000010'0'011111111111110'1100000011100110101101110011000111100010100111101011101110000010
Calling `ldexp`...
000000000000000001111111001100101001101100000010'0'100000000111110'1100000011100110101101110011000111100010100111101011101110000010
Converting to `std::uintmax_t`
1100000011100110101101110011000111100010100111101011110000000000
========== `std::uintmax_t` to `long double` ==========
Initial `std::uintmax_t`
1100000011100110101101110011000111100010100111101011110000000000
Converting to `long double`
000000000000000000000000000000000000000000000000'0'100000000111110'1100000011100110101101110011000111100010100111101100000000000000
Calling `ldexp`
000000000000000000000000000000000000000000000000'0'100000000001100'1100000011100110101101110011000111100010100111101100000000000000
Update 2:
It looks like the problem is connected with WSL.
The code works correctly on the same machine when it’s run on a live Linux system or on Linux in a virtual machine.
I cannot install compiler in Windows to test it.
2 Answers
Reinstalling the system on the WSL solved the problem. It might be a bug that has already been fixed.
Different precisions in `long double` with different machines/compilers.

- `12345.67890123456789l` risks different, unclear bit patterns depending on `long double` precision. It is easier to analyze issues with a hexadecimal floating constant, or maybe with a well-understood constant like `4.0L/3` with its repeated mantissa pattern.
- `integer = number;` is risky. (`integer` is not defined, yet the comment implies `uintmax_t`.)
- `long double`, on various machines/compilers, comes in various flavors: 64-bit; 80-bit with size 80 bits; 80-bit with size 128 bits due to padding; 128-bit; and others.
  - 64: `number` ranges over [-0x1F FFFF FFFF FFFF, +0x1F FFFF FFFF FFFF], something like an `int54_t`.
  - 80: `number` ranges over [-0xFFFF FFFF FFFF FFFF, +0xFFFF FFFF FFFF FFFF], something like an `int65_t`.
  - 128: Even wider.
- Saving `integer = number` is insufficient to recreate all values with 80/128-bit `long double`, as `uintmax_t` is only 64-bit.