Summary
Recently I encountered a weird issue involving LTO and -ffast-math: my pow (from <cmath>) calls give inconsistent results depending on whether -flto is used.
Environment:
$ g++ --version
g++ (GCC) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ ll /lib64/libc.so.6
lrwxrwxrwx 1 root root 12 Sep 3 2019 /lib64/libc.so.6 -> libc-2.17.so
$ ll /lib64/libm.so.6
lrwxrwxrwx 1 root root 12 Sep 3 2019 /lib64/libm.so.6 -> libm-2.17.so
$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
Minimal Example
Code
fixed.hxx
#include <cstdint>
double Power10f(const int16_t power);
fixed.cxx
#include "fixed.hxx"
#include <cmath>
double Power10f(const int16_t power)
{
    return pow(10.0, (double) power);
}
test.cxx
#include <iostream>
#include <cmath>
#include <cstdlib>  // atoi
#include <iomanip>
#include <cstdint>
#include "fixed.hxx"
int main(int argc, char** argv)
{
    if (argc >= 3) {
        int64_t value = (int64_t)atoi(argv[1]);
        int16_t power = (int16_t)atoi(argv[2]);
        double x = Power10f(power);
        std::cout.precision(17);
        std::cout << std::scientific << x << std::endl;
        std::cout << std::scientific << (double)value * x << std::endl;
        return 0;
    }
    return 1;
}
Compile & Run
Compiling it with -ffast-math gives different results depending on whether -flto is used:
- With -flto, the code eventually calls __pow_finite and gives an "accurate" result:
$ g++ -O3 -DNDEBUG -ffast-math -std=c++17 -flto -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000000e+20
8.10000000000000000e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
400930: 0f bf ff movswl %di,%edi
400933: 66 0f ef c9 pxor %xmm1,%xmm1
400937: f2 0f 10 05 99 00 00 movsd 0x99(%rip),%xmm0 # 4009d8 <_IO_stdin_used+0x8>
40093e: 00
40093f: f2 0f 2a cf cvtsi2sd %edi,%xmm1
400943: e9 d8 fd ff ff jmpq 400720 <__pow_finite@plt>
400948: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40094f: 00
...
- Without -flto, the code eventually calls __exp_finite (an optimization enabled by -ffast-math, if I guess right) and gives an "inaccurate" result:
$ g++ -O3 -DNDEBUG -ffast-math -std=c++17 -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000786e+20
8.10000000000006396e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
400930: 0f bf ff movswl %di,%edi
400933: 66 0f ef c0 pxor %xmm0,%xmm0
400937: f2 0f 2a c7 cvtsi2sd %edi,%xmm0
40093b: f2 0f 59 05 95 00 00 mulsd 0x95(%rip),%xmm0 # 4009d8 <_IO_stdin_used+0x8>
400942: 00
400943: e9 88 fd ff ff jmpq 4006d0 <__exp_finite@plt>
400948: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40094f: 00
...
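The difference between the two disassemblies can be reproduced by hand, without -ffast-math. The sketch below assumes that the constant multiplied in before the jump to __exp_finite is ln 10, i.e. that the compiler rewrote pow(10.0, x) as exp(x * ln 10); the dump itself does not show the constant's value. The function names Power10ViaPow, Power10ViaExp and RelativeDifference are made up for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Direct library call, as in the original Power10f.
double Power10ViaPow(std::int16_t power)
{
    return std::pow(10.0, static_cast<double>(power));
}

// Hand-written form of the rewrite that the non-LTO disassembly suggests:
// 10^x computed as exp(x * ln 10). The product x * ln 10 is rounded before
// exp sees it, and exp magnifies that rounding error in the result.
double Power10ViaExp(std::int16_t power)
{
    const double ln10 = std::log(10.0);  // assumed to be the mulsd constant
    return std::exp(static_cast<double>(power) * ln10);
}

// Relative difference between the two ways of computing 10^power.
double RelativeDifference(std::int16_t power)
{
    const double a = Power10ViaPow(power);
    const double b = Power10ViaExp(power);
    return std::fabs(a - b) / std::fabs(a);
}
```

Printing Power10ViaPow(20) and Power10ViaExp(20) with 17 significant digits should make any discrepancy of the kind shown above visible; the exact digits depend on the libm in use.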
Question
Is this expected behavior, or is there something wrong with my code that causes it?
Update
The same result can also be observed on some other platforms (e.g. ArchLinux with g++ 12.1 and glibc 2.35).
Answers
man gcc:
-ffast-math gives the compiler permission to be inconsistent for whatever reasons it wants. Modifying even notionally unrelated code in the function could easily lead to pow returning different results thanks to different optimization strategies being chosen. And -flto changes quite a bit about how/when optimization is done, so there's a lot of room for that to happen.
If you care about numerical precision, or numeric consistency, or numerics in general, do not use -ffast-math. The transformations it performs are generally available to you as a programmer, and if you do them yourself, you can rely on their consistency.
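As one illustration of doing the work yourself, here is a minimal sketch of a power-of-ten routine that does not call libm at all. Power10Manual is a hypothetical replacement, not something from the question, and it assumes the file is compiled without -ffast-math. Every power of ten up to 1e22 is exactly representable as a double, so for exponents from 0 to 22 each multiplication is exact; outside that range rounding error accumulates, and a lookup table or the library pow may be the better choice.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical hand-written alternative to Power10f (assumption: built
// without -ffast-math, so the compiler keeps IEEE semantics).
double Power10Manual(std::int16_t power)
{
    const int n = std::abs(static_cast<int>(power));
    double result = 1.0;
    for (int i = 0; i < n; ++i) {
        result *= 10.0;  // exact while the running product stays <= 1e22
    }
    // Negative exponents: a single division of 1.0 by the positive power.
    return power < 0 ? 1.0 / result : result;
}
```

Because the result no longer depends on which libm entry point the optimizer picks, it should be the same with and without -flto. For the inputs used in the question, 81.0 * Power10Manual(20) is exactly 8.1e21.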