Summary

Recently I ran into a strange interaction between LTO and -ffast-math: my calls to pow (from <cmath>) return different results depending on whether -flto is used.

Environment:

$ g++ --version
g++ (GCC) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ ll /lib64/libc.so.6
lrwxrwxrwx 1 root root 12 Sep  3  2019 /lib64/libc.so.6 -> libc-2.17.so

$ ll /lib64/libm.so.6
lrwxrwxrwx 1 root root 12 Sep  3  2019 /lib64/libm.so.6 -> libm-2.17.so

$ cat /etc/redhat-release 
CentOS Linux release 7.5.1804 (Core) 

Minimal Example

Code

  • fixed.hxx
#include <cstdint>
double Power10f(const int16_t power);
  • fixed.cxx
#include "fixed.hxx"
#include <cmath>

double Power10f(const int16_t power)
{
    return pow(10.0, (double) power);
}
  • test.cxx
#include <iostream>
#include <cmath>
#include <cstdlib>   // atoi
#include <iomanip>
#include <cstdint>
#include "fixed.hxx"

int main(int argc, char** argv)
{
    if (argc >= 3) {
        int64_t value = (int64_t)atoi(argv[1]);
        int16_t power = (int16_t)atoi(argv[2]);
        double x = Power10f(power);
        std::cout.precision(17);
        std::cout << std::scientific << x << std::endl;
        std::cout << std::scientific << (double)value * x << std::endl;
        return 0;   
    }
    return 1;
}

Compile & Run

Compiling fixed.cxx with -ffast-math gives different results depending on whether -flto is also passed:

  • With -flto, the binary ends up calling __pow_finite and gives an "accurate" result:
$ g++ -O3 -DNDEBUG -ffast-math -std=c++17 -flto  -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG   -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000000e+20
8.10000000000000000e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
  400930:       0f bf ff                movswl %di,%edi
  400933:       66 0f ef c9             pxor   %xmm1,%xmm1
  400937:       f2 0f 10 05 99 00 00    movsd  0x99(%rip),%xmm0        # 4009d8 <_IO_stdin_used+0x8>
  40093e:       00 
  40093f:       f2 0f 2a cf             cvtsi2sd %edi,%xmm1
  400943:       e9 d8 fd ff ff          jmpq   400720 <__pow_finite@plt>
  400948:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  40094f:       00
...
  • Without -flto, the binary ends up calling __exp_finite (an optimization enabled by -ffast-math, if I guess right) and gives an "inaccurate" result:
$ g++ -O3 -DNDEBUG -ffast-math -std=c++17  -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG   -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000786e+20
8.10000000000006396e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
  400930:       0f bf ff                movswl %di,%edi
  400933:       66 0f ef c0             pxor   %xmm0,%xmm0
  400937:       f2 0f 2a c7             cvtsi2sd %edi,%xmm0
  40093b:       f2 0f 59 05 95 00 00    mulsd  0x95(%rip),%xmm0        # 4009d8 <_IO_stdin_used+0x8>
  400942:       00 
  400943:       e9 88 fd ff ff          jmpq   4006d0 <__exp_finite@plt>
  400948:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  40094f:       00
...
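
The difference between the two listings can be reproduced by hand. The sketch below writes both forms out explicitly: a direct pow call, and an exp call on the exponent multiplied by a constant. It rests on an assumption that the dump does not confirm: that the constant loaded by mulsd is ln(10). The function names are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cmath>

// Direct library call, as in the -flto build (which jumps to __pow_finite).
double Pow10Direct(int power) {
    return std::pow(10.0, static_cast<double>(power));
}

// Hand-written form of what the non-LTO build appears to do:
// multiply the exponent by a constant, then call exp.
// Assumption: the constant is ln(10); the dump does not show its value.
double Pow10ViaExp(int power) {
    const double ln10 = std::log(10.0);
    return std::exp(static_cast<double>(power) * ln10);
}

// Relative difference |a - b| / |b|, for comparing the two results.
double RelativeDifference(double a, double b) {
    assert(b != 0.0);
    return std::fabs(a - b) / std::fabs(b);
}
```

Since exp(x + e) is approximately exp(x) * (1 + e), any rounding error e in the product power * ln(10) shows up as a relative error of about e in the result. For power = 20 the product is around 46, so an error in its last bits is enough to disturb the trailing digits of the output, which is consistent with the 1.00000000000000786e+20 above.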

Question

Is the above expected behavior, or is there something wrong with my code that causes it?

Update

The same result can also be observed on other platforms (e.g. Arch Linux with g++ 12.1 and glibc 2.35).

2 Answers


  1. man gcc:

    To use the link-time optimizer, -flto and optimization options should be specified at compile time and during the final link. It is recommended that you compile all the files participating in the same link with the same options and also specify those options at link time. For example:

                  gcc -c -O2 -flto foo.c
                  gcc -c -O2 -flto bar.c
                  gcc -o myprog -flto -O2 foo.o bar.o
    
  2. -ffast-math gives the compiler permission to be inconsistent for whatever reasons it wants. Modifying even notionally unrelated code in the function could easily lead to pow returning different results thanks to different optimization strategies being chosen. And -flto changes quite a bit about how/when optimization is done, so there’s a lot of room for that to happen.

    If you care about numerical precision, or numeric consistency, or numerics in general, do not use -ffast-math. The transformations it performs are generally available to you as a programmer, and if you do them yourself, you can rely on their consistency.
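
As one possible illustration of doing the transformation yourself, here is a minimal sketch of a table-based power of ten. It assumes IEEE 754 doubles and that the file is compiled without -ffast-math. Power10Table is a hypothetical name, not part of the question's code. The powers 10^0 through 10^22 are exactly representable as doubles, so within that range no library call is involved and the result does not depend on how pow gets optimized.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical replacement for Power10f from the question.
double Power10Table(std::int16_t power) {
    // 10^0 .. 10^22 are exactly representable as IEEE 754 doubles.
    static constexpr double kPow10[] = {
        1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
        1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
    constexpr int kMaxExact = 22;
    static_assert(sizeof(kPow10) / sizeof(kPow10[0]) == kMaxExact + 1,
                  "table must cover 10^0 .. 10^22");

    const int p = power;
    if (p >= 0 && p <= kMaxExact) {
        return kPow10[p];            // exact table entry
    }
    if (p < 0 && p >= -kMaxExact) {
        const int index = -p;
        assert(index <= kMaxExact);
        return 1.0 / kPow10[index];  // one correctly rounded division
    }
    // Outside the exact range, fall back to the library.
    return std::pow(10.0, static_cast<double>(p));
}
```

With the question's inputs (81 and 20), this returns exactly 1e20 and the product is exactly 8.1e21, because both values fit in a double's 53-bit significand. Outside the range of -22 to 22 the sketch still relies on std::pow, so consistency there depends on the library and the compiler flags.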
