skip to Main Content

I’ve got an algorithm I’ve used for years to emulate a real-time clock in a non-real-time environment. It’s always worked exceptionally well until used on Ubuntu 24.04 in WSL2 in Windows 11. The logic predicts in system clock time when each next time-step should occur and repeatedly calls chrono’s now() in a loop until that point in time is reached, then continues with execution of the next frame. When set up in this environment, now()’s return value will occasionally jump a significant amount forward in time (e.g., ~25 seconds) between successive calls. Simply observing the application run reveals that its not a matter of now() taking ~25 seconds to return a value as the whole app will complete in ~5 seconds when it should have taken ~30. The implication is that calling now() has actually altered the clock’s time. To verify this, I wrapped my sample code in a bash script where I could monitor and resync WSL’s clock … now() is indeed altering the clock’s time.

The sample code has been built in this environment with both g++ and clang++, C++17 and C++20. Same effect observed in all four cases.

The sample code has been observed to work properly with no alteration to clock’s time on a machine running Ubuntu-24.04 (older h/w) and under Windows11/WSL2/Ubuntu-22.04 (identical h/w)

I’ve found countless posts discussing a Linux clock drift problem when running under WSL. Most claim to have ‘fixed’ the problem by resyncing with the h/w clock. This does not ‘fix’ the drift problem, it merely resets it. There is clearly a bug somewhere in the system, could this be related? Is now() in some WSL configurations affecting the time and is my extreme case of hammering now() in a loop simply pronouncing the problem. How do I fix this?

reproducible example code:

#include <iostream>
#include <chrono>
#include <ctime>
#include <iomanip>
#include <sstream>

void print(std::chrono::duration<long double> elapsed_seconds, 
    std::chrono::_V2::system_clock::time_point, 
    std::chrono::duration<long double> error);

int main() 
{
    std::cout << "commanded_elapsed_time, corresponding_measured_time_points, error" << std::endl; 
    using namespace std::chrono_literals;
    auto start_tp = std::chrono::system_clock::now();
    auto dt = 1.0s;
    auto elapsed = 0.0s;
    auto next_tp = start_tp + elapsed;
    std::chrono::_V2::system_clock::time_point now;
    print(elapsed, start_tp, start_tp - next_tp);
    for (int t = 1; t < 30; t++)
    {
        elapsed = dt * t;
        next_tp = start_tp + elapsed;
        while((now = std::chrono::system_clock::now()) < next_tp) {}
        print(elapsed, now, now - next_tp);
    }
    return 0;
}

void print(
    std::chrono::duration<long double> elapsed_seconds, 
    std::chrono::_V2::system_clock::time_point tp, 
    std::chrono::duration<long double> error)
{
    std::time_t currentTime = std::chrono::system_clock::to_time_t(tp);
    std::tm* localTime = std::localtime(&currentTime);
    std::stringstream ss;
    ss << std::fixed << std::setprecision(3) << std::setfill('0') << std::setw(6) << elapsed_seconds.count() << ", ";
    ss << std::put_time(localTime, "%H:%M:%S");
    auto nowInMicroseconds = std::chrono::time_point_cast<std::chrono::microseconds>(tp);
    auto epoch = nowInMicroseconds.time_since_epoch();
    auto microseconds = std::chrono::duration_cast<std::chrono::microseconds>(epoch).count() % 1000000;
    ss << '.' << std::setfill('0') << std::setw(9) << microseconds * 1000 << ", ";
    ss << std::fixed << std::setprecision(9) << error.count();
    std::cout << ss.str() << std::endl;
}

test script:

#! /bin/bash
TRUE=0
FALSE=1
using_wsl=$FALSE
echo "kernel name:........" $(uname -s)
echo "kernel release:....." $(uname -r)
echo "kernel version:....." $(uname -v)
echo "machine:............" $(uname -m)
echo "processor:.........." $(uname -p)
echo "hardware platform..." $(uname -i)
echo "operating system...." $(uname -o)
if [ "$(systemd-detect-virt)" = "wsl" ]; then
    echo "running in wsl...... yes"
    using_wsl=$TRUE
    if ! command -v hwclock >/dev/null 2>&1; then
        echo ""
        echo "When running this script within WSL, the 'hwclock'"
        echo "command is used to sync the WSL clock with the"
        echo "host system's clock. 'hwclock' is not found on"
        echo "this system. Install and rerun test."
        exit 1
    fi
else
    echo "running in wsl...... no"
    using_wsl=$FALSE
fi
echo ""
if [ $using_wsl -eq $TRUE ]; then
    echo "WSL time before syncing with host...:" $(date +"%T.%6N")
    echo "syncing WSL clock with host ..."
    sudo hwclock -s
fi
start_time=$(date +"%s.%N")
echo "test system start time..............: $(date -d "@$start_time" +"%T.%6N")"
echo "test starting ..."
echo ""
./zero_drift_dt
end_test_time=$(date +"%s.%N")
echo ""
echo "... test complete"
echo "test system finish time............................: $(date -d "@$end_test_time" +"%T.%6N")"
if [ $using_wsl -eq $TRUE ]; then
    sudo hwclock -s
    echo "WSL time after syncing with host (again)...........:" $(date +"%T.%6N")
fi
final_time=$(date +"%s.%N")
expected_duration=$(awk "BEGIN {print $end_test_time - $start_time}")
actual_duration=$(awk "BEGIN {print $final_time - $start_time}")
error_duration=$(awk "BEGIN {print $end_test_time - $final_time}")
echo "Expected approximate test duration (s).............:" $expected_duration
echo "Measured test duration (s).........................:" $actual_duration
echo "During this test, now() moved clock forward (s)....:" $error_duration

Results on problem system:

~/Projects/proving_grounds/now_error$ sudo ./test.sh
[sudo] password for blumert:
kernel name:........ Linux
kernel release:..... 5.15.153.1-microsoft-standard-WSL2
kernel version:..... #1 SMP Fri Mar 29 23:14:13 UTC 2024
machine:............ x86_64
processor:.......... x86_64
hardware platform... x86_64
operating system.... GNU/Linux
running in wsl...... yes

WSL time before syncing with host...: 13:39:30.320309
syncing WSL clock with host ...
test system start time..............: 13:39:31.001844
test starting ...

commanded_elapsed_time, corresponding_measured_time_points, error
00.000, 13:39:31.003072000, 0.000000000
01.000, 13:39:32.003072000, 0.000000070
02.000, 13:39:52.464335000, 19.461263572
03.000, 13:39:52.464537000, 18.461465685
04.000, 13:39:52.464569000, 17.461497372
05.000, 13:39:52.464573000, 16.461501714
06.000, 13:39:52.464575000, 15.461503561
07.000, 13:39:52.464577000, 14.461505319
08.000, 13:39:52.464579000, 13.461506932
09.000, 13:39:52.464580000, 12.461508617
10.000, 13:39:52.464582000, 11.461510324
11.000, 13:39:52.464584000, 10.461512480
12.000, 13:39:52.464586000, 9.461514319
13.000, 13:39:52.464588000, 8.461515990
14.000, 13:39:52.464589000, 7.461517739
15.000, 13:39:52.464591000, 6.461519320
16.000, 13:39:52.464593000, 5.461520922
17.000, 13:39:52.464594000, 4.461522511
18.000, 13:39:52.464596000, 3.461524055
19.000, 13:39:52.464597000, 2.461525592
20.000, 13:39:52.464599000, 1.461527167
21.000, 13:39:52.464600000, 0.461528720
22.000, 13:39:57.473118000, 4.470046550
23.000, 13:39:57.473178000, 3.470106269
24.000, 13:39:57.473182000, 2.470110610
25.000, 13:39:57.473184000, 1.470112218
26.000, 13:39:57.473185000, 0.470113725
27.000, 13:40:02.476388000, 4.473316514
28.000, 13:40:02.476452000, 3.473380680
29.000, 13:40:02.476457000, 2.473385528

... test complete
test system finish time............................: 13:40:02.477504
WSL time after syncing with host (again)...........: 13:39:43.002630
Expected approximate test duration (s).............: 31.4757
Measured test duration (s).........................: 12.0018
During this test, now() moved clock forward (s)....: 19.4738
  • Note that when the clock at the end of the test, it reset backward in time. If the system is left alone for awhile, the first resync will exhibit this behavior as well.
  • By its defined purpose, the app should have taken slightly more than 30 seconds to execute. As measured with respect to resynced times, it only took 12.
  • In a perfect run (as in all other environments described above), the corresponding_measured_time_points progress in exact 1 second increments, and the error is rarely more than a few 10 nanoseconds (i.e., like the first two samples).
  • Every run is a bit different in that the first big jump can seemingly happen anytime in the first 10 seconds or so, but usually occur in the first 3.

2

Answers


  1. Chosen as BEST ANSWER

    While steady_clock is probably the more appropriate clock to use for this application, it is not the major part of the described problem. The erratic behavior is a combination of using system_clock with the system clock not synchronized (with host) and with local NTP service active. Ideally, the WSL/Ubuntu-24.04 would not be syncing with the NTP service but rather would rely on the Windows11 host to do that and simply sync with the host. The fix:

    Check current status:

    $ timedatectl status
                   Local time: Mon 2024-09-23 10:13:11 CDT
               Universal time: Mon 2024-09-23 15:13:11 UTC
                     RTC time: Mon 2024-09-23 15:13:10
                    Time zone: America/Chicago (CDT, -0500)
    System clock synchronized: no
                  NTP service: active
              RTC in local TZ: no
    

    Enable system clock syncing. Add or modify the following lines in /etc/wsl.conf:

    [interop]
    enabled=true
    appendWindowsPath=true
    
    [automount]
    enabled=true
    options="metadata"
    
    [boot]
    systemd=true
    

    Exit Linux, restart WSL instance:

    > wsl.exe --shutdown
    > wsl ~
    

    Set the NTP service to inactive:

    sudo systemctl stop systemd-timesyncd
    sudo systemctl disable systemd-timesyncd
    

    Check status again:

    $ timedatectl status
                   Local time: Mon 2024-09-23 10:28:05 CDT
               Universal time: Mon 2024-09-23 15:28:05 UTC
                     RTC time: Mon 2024-09-23 15:28:06
                    Time zone: America/Chicago (CDT, -0500)
    System clock synchronized: yes
                  NTP service: inactive
              RTC in local TZ: no
    

    Results of posted test after this fix:

    kernel name:........ Linux
    kernel release:..... 5.15.153.1-microsoft-standard-WSL2
    kernel version:..... #1 SMP Fri Mar 29 23:14:13 UTC 2024
    machine:............ x86_64
    processor:.......... x86_64
    hardware platform... x86_64
    operating system.... GNU/Linux
    running in wsl...... yes
    
    WSL time before syncing with host...: 10:37:01.290106
    syncing WSL clock with host ...
    test system start time..............: 10:37:02.004335
    test starting ...
    
    commanded_elapsed_time, corresponding_measured_time_points, error
    00.000, 0:11:39.930802312, 0.000000000
    01.000, 0:11:40.930802335, 0.000000023
    02.000, 0:11:41.930802336, 0.000000024
    03.000, 0:11:42.930802314, 0.000000002
    04.000, 0:11:43.930802313, 0.000000001
    05.000, 0:11:44.930802328, 0.000000016
    06.000, 0:11:45.930802326, 0.000000014
    07.000, 0:11:46.930802323, 0.000000011
    08.000, 0:11:47.930802339, 0.000000027
    09.000, 0:11:48.930802339, 0.000000027
    10.000, 0:11:49.930802333, 0.000000021
    11.000, 0:11:50.930802318, 0.000000006
    12.000, 0:11:51.930802333, 0.000000021
    13.000, 0:11:52.930802335, 0.000000023
    14.000, 0:11:53.930802328, 0.000000016
    15.000, 0:11:54.930802315, 0.000000003
    16.000, 0:11:55.930802317, 0.000000005
    17.000, 0:11:56.930802324, 0.000000012
    18.000, 0:11:57.930802327, 0.000000015
    19.000, 0:11:58.930802327, 0.000000015
    20.000, 0:11:59.930802335, 0.000000023
    21.000, 0:12:00.930802328, 0.000000016
    22.000, 0:12:01.930802317, 0.000000005
    23.000, 0:12:02.930802343, 0.000000031
    24.000, 0:12:03.930802349, 0.000000037
    25.000, 0:12:04.930802328, 0.000000016
    26.000, 0:12:05.930804110, 0.000001798
    27.000, 0:12:06.930802334, 0.000000022
    28.000, 0:12:07.930802328, 0.000000016
    29.000, 0:12:08.930802314, 0.000000002
    
    ... test complete
    test system finish time............................: 10:37:31.009184
    WSL time after syncing with host (again)...........: 10:37:32.001678
    Expected approximate test duration (s).............: 29.0048
    Measured test duration (s).........................: 29.998
    During this test, now() moved clock forward (s)....: -0.993178
    

    Note that the last three values (after test complete) do not account for the time required to start up and shutdown the app, which is likely where the .99s difference is coming from.


  2. System clock is the wrong type to use, as it can (and does) vary. From cppreference (emphasis mine):

    Class std::chrono::system_clock represents the system-wide real time wall clock.

    It may not be monotonic: on most systems, the system time can be adjusted at any moment

    You want to use steady_clock

    Class std::chrono::steady_clock represents a monotonic clock. The time points of this clock cannot decrease as physical time moves forward and the time between ticks of this clock is constant. This clock is not related to wall clock time (for example, it can be time since last reboot), and is most suitable for measuring intervals

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search