I’ve written a class that splits a long double
into an integer value and a binary exponent (for some precise calculations).
My problem is pretty hard to reproduce because the class usually works great but on one specific machine I’ve tested it on, it was losing a few least significant bits on each conversion. (More on that later.)
Here is the code. (It needs to stay in separate files for this bug to occur.)
SplitLD.hh:

```cpp
#include <cstdint>

// Splits `long double` into an integer and an exponent.
class SplitLD
{
public: // Everything is public to make the example easier to test.
    std::uintmax_t integer;
    int exponent;

    SplitLD(const long double number);
    operator long double() const;
};
```
SplitLD.cc:

```cpp
#include <cfloat>
#include <climits>
#include <cmath>
#include <limits>

#include "SplitLD.hh"

SplitLD::SplitLD(long double number) // For the sake of simplicity, we ignore negative numbers and various corner cases.
{
    static_assert(FLT_RADIX == 2);
    static_assert(sizeof(std::uintmax_t) * CHAR_BIT >= std::numeric_limits<long double>::digits);
    // The following two operations change the exponent to make the represented value a whole number.
    number = std::frexp(number, &exponent);
    number = std::ldexp(number, std::numeric_limits<long double>::digits);
    exponent -= std::numeric_limits<long double>::digits;
    integer = number; // cast from `long double` to `std::uintmax_t`
}

SplitLD::operator long double() const
{
    long double number = integer; // cast from `std::uintmax_t` to `long double`
    number = std::ldexp(number, exponent);
    return number;
}
```
main.cc:

```cpp
#include "SplitLD.hh"

int main()
{
    const long double x = 12345.67890123456789l; // arbitrarily chosen number for the test
    const SplitLD y = x;
    const long double z = y;
    return z == x ? 0 : 1;
}
```
If you try to run this code it will probably work fine.
However, I have one machine, on which the problem can be consistently reproduced.
The conditions that (might) trigger the error are as follows:
- The floating-point type has to be `long double`. I tried `float` and `double` and they seem to work fine.
- Both GCC and Clang behave similarly and I can reproduce the problem on both.
- If I put all the code into a single file, it starts to work, possibly because functions are inlined or evaluated during compilation.
- I encountered the error on WSL (Windows Subsystem for Linux) with Ubuntu.
- It may have something to do with hardware configuration.
I’ve tried to print the binary representation of the numbers (formatted for readability).
(I’m pretty sure that the second group is the sign, the third one is the exponent and the fourth one is the mantissa. I’m not sure what the first group is, but it’s probably just padding.)
Normally the binary values are as follows (for `y` I print only the `integer`):
x 000000000000000000000000000000000000000000000000'0'100000000001100'1100000011100110101101110011000111100010100111101011101110000010
y 1100000011100110101101110011000111100010100111101011101110000010
z 000000000000000000000000000000000000000001000000'0'100000000001100'1100000011100110101101110011000111100010100111101011101110000010
However, when the error occurs, they look like this:
x 000000000000000001111111100110001001110111101001'0'100000000001100'1100000011100110101101110011000111100010100111101011101110000010
y 1100000011100110101101110011000111100010100111101011110000000000
z 000000000000000001111111100110001001110111101001'0'100000000001100'1100000011100110101101110011000111100010100111101100000000000000
What can cause this problem?
Is the program well formed?
Is there UB somewhere, or anything that allows the compiler to do some weird optimization?
Here is a live demo. However, its utility is very limited because it works correctly.
(It includes the code that prints the binary representations, which was omitted here to keep the example short.)
Update 1:
I’ve modified the test program to print binary data after each operation, in order to determine which exact instruction causes the data loss.
It looks like the guilty instructions are specifically the assignments of `long double` to `std::uintmax_t` and of `std::uintmax_t` to `long double`.
Neither `std::frexp` nor `std::ldexp` seems to change the mantissa.
Here’s how it looks on the machine where the error occurs:
========== `long double` to `std::uintmax_t` ==========
Initial `long double`
000000000000000001111111001100101001101100000010'0'100000000001100'1100000011100110101101110011000111100010100111101011101110000010
Calling `frexp`...
000000000000000001111111001100101001101100000010'0'011111111111110'1100000011100110101101110011000111100010100111101011101110000010
Calling `ldexp`...
000000000000000001111111001100101001101100000010'0'100000000111110'1100000011100110101101110011000111100010100111101011101110000010
Converting to `std::uintmax_t`
1100000011100110101101110011000111100010100111101011110000000000
========== `std::uintmax_t` to `long double` ==========
Initial `std::uintmax_t`
1100000011100110101101110011000111100010100111101011110000000000
Converting to `long double`
000000000000000000000000000000000000000000000000'0'100000000111110'1100000011100110101101110011000111100010100111101100000000000000
Calling `ldexp`
000000000000000000000000000000000000000000000000'0'100000000001100'1100000011100110101101110011000111100010100111101100000000000000
Update 2:
It looks like the problem is connected with WSL.
The code works correctly on the same machine when it’s run on a live Linux system or on Linux in a virtual machine.
I cannot install compiler in Windows to test it.
2 Answers
Reinstalling the system on the WSL solved the problem. It might be a bug that has already been fixed.
Different precisions in `long double` with different machines/compilers.

- `12345.67890123456789l` risks different, unclear bit patterns depending on `long double` precision. It is easier to analyze issues with a hexadecimal floating constant, or maybe with a well-understood constant like `4.0L/3` with its repeated mantissa pattern.
- `integer = number;` is risky. (`integer` is not defined, yet the comment implies `uintmax_t`.)
- `long double`, on various machines/compilers, comes in various flavors: 64-bit; 80-bit with size 80 bits; 80-bit with size 128 bits due to padding; 128-bit; and others.
  - 64: `number` ranges over [-0x1F FFFF FFFF FFFF, +0x1F FFFF FFFF FFFF], something like an `int54_t`.
  - 80: `number` ranges over [-0xFFFF FFFF FFFF FFFF, +0xFFFF FFFF FFFF FFFF], something like an `int65_t`.
  - 128: Even wider.
- Saving `integer = number` is insufficient to recreate all values with 80/128-bit `long double`, as `uintmax_t` is only 64-bit.