I’ve got an algorithm I’ve used for years to emulate a real-time clock in a non-real-time environment. It’s always worked exceptionally well until used on Ubuntu 24.04 in WSL2 in Windows 11. The logic predicts in system clock time when each next time-step should occur and repeatedly calls chrono’s now() in a loop until that point in time is reached, then continues with execution of the next frame. When set up in this environment, now()’s return value will occasionally jump a significant amount forward in time (e.g., ~25 seconds) between successive calls. Simply observing the application run reveals that its not a matter of now() taking ~25 seconds to return a value as the whole app will complete in ~5 seconds when it should have taken ~30. The implication is that calling now() has actually altered the clock’s time. To verify this, I wrapped my sample code in a bash script where I could monitor and resync WSL’s clock … now() is indeed altering the clock’s time.
The sample code has been built in this environment with both g++ and clang++, C++17 and C++20. Same effect observed in all four cases.
The sample code has been observed to work properly with no alteration to clock’s time on a machine running Ubuntu-24.04 (older h/w) and under Windows11/WSL2/Ubuntu-22.04 (identical h/w)
I’ve found countless posts discussing a Linux clock drift problem when running under WSL. Most claim to have ‘fixed’ the problem by resyncing with the h/w clock. This does not ‘fix’ the drift problem, it merely resets it. There is clearly a bug somewhere in the system, could this be related? Is now() in some WSL configurations affecting the time and is my extreme case of hammering now() in a loop simply pronouncing the problem. How do I fix this?
reproducible example code:
#include <iostream>
#include <chrono>
#include <ctime>
#include <iomanip>
#include <sstream>
void print(std::chrono::duration<long double> elapsed_seconds,
std::chrono::_V2::system_clock::time_point,
std::chrono::duration<long double> error);
int main()
{
std::cout << "commanded_elapsed_time, corresponding_measured_time_points, error" << std::endl;
using namespace std::chrono_literals;
auto start_tp = std::chrono::system_clock::now();
auto dt = 1.0s;
auto elapsed = 0.0s;
auto next_tp = start_tp + elapsed;
std::chrono::_V2::system_clock::time_point now;
print(elapsed, start_tp, start_tp - next_tp);
for (int t = 1; t < 30; t++)
{
elapsed = dt * t;
next_tp = start_tp + elapsed;
while((now = std::chrono::system_clock::now()) < next_tp) {}
print(elapsed, now, now - next_tp);
}
return 0;
}
void print(
std::chrono::duration<long double> elapsed_seconds,
std::chrono::_V2::system_clock::time_point tp,
std::chrono::duration<long double> error)
{
std::time_t currentTime = std::chrono::system_clock::to_time_t(tp);
std::tm* localTime = std::localtime(¤tTime);
std::stringstream ss;
ss << std::fixed << std::setprecision(3) << std::setfill('0') << std::setw(6) << elapsed_seconds.count() << ", ";
ss << std::put_time(localTime, "%H:%M:%S");
auto nowInMicroseconds = std::chrono::time_point_cast<std::chrono::microseconds>(tp);
auto epoch = nowInMicroseconds.time_since_epoch();
auto microseconds = std::chrono::duration_cast<std::chrono::microseconds>(epoch).count() % 1000000;
ss << '.' << std::setfill('0') << std::setw(9) << microseconds * 1000 << ", ";
ss << std::fixed << std::setprecision(9) << error.count();
std::cout << ss.str() << std::endl;
}
test script:
#! /bin/bash
TRUE=0
FALSE=1
using_wsl=$FALSE
echo "kernel name:........" $(uname -s)
echo "kernel release:....." $(uname -r)
echo "kernel version:....." $(uname -v)
echo "machine:............" $(uname -m)
echo "processor:.........." $(uname -p)
echo "hardware platform..." $(uname -i)
echo "operating system...." $(uname -o)
if [ "$(systemd-detect-virt)" = "wsl" ]; then
echo "running in wsl...... yes"
using_wsl=$TRUE
if ! command -v hwclock >/dev/null 2>&1; then
echo ""
echo "When running this script within WSL, the 'hwclock'"
echo "command is used to sync the WSL clock with the"
echo "host system's clock. 'hwclock' is not found on"
echo "this system. Install and rerun test."
exit 1
fi
else
echo "running in wsl...... no"
using_wsl=$FALSE
fi
echo ""
if [ $using_wsl -eq $TRUE ]; then
echo "WSL time before syncing with host...:" $(date +"%T.%6N")
echo "syncing WSL clock with host ..."
sudo hwclock -s
fi
start_time=$(date +"%s.%N")
echo "test system start time..............: $(date -d "@$start_time" +"%T.%6N")"
echo "test starting ..."
echo ""
./zero_drift_dt
end_test_time=$(date +"%s.%N")
echo ""
echo "... test complete"
echo "test system finish time............................: $(date -d "@$end_test_time" +"%T.%6N")"
if [ $using_wsl -eq $TRUE ]; then
sudo hwclock -s
echo "WSL time after syncing with host (again)...........:" $(date +"%T.%6N")
fi
final_time=$(date +"%s.%N")
expected_duration=$(awk "BEGIN {print $end_test_time - $start_time}")
actual_duration=$(awk "BEGIN {print $final_time - $start_time}")
error_duration=$(awk "BEGIN {print $end_test_time - $final_time}")
echo "Expected approximate test duration (s).............:" $expected_duration
echo "Measured test duration (s).........................:" $actual_duration
echo "During this test, now() moved clock forward (s)....:" $error_duration
Results on problem system:
~/Projects/proving_grounds/now_error$ sudo ./test.sh
[sudo] password for blumert:
kernel name:........ Linux
kernel release:..... 5.15.153.1-microsoft-standard-WSL2
kernel version:..... #1 SMP Fri Mar 29 23:14:13 UTC 2024
machine:............ x86_64
processor:.......... x86_64
hardware platform... x86_64
operating system.... GNU/Linux
running in wsl...... yes
WSL time before syncing with host...: 13:39:30.320309
syncing WSL clock with host ...
test system start time..............: 13:39:31.001844
test starting ...
commanded_elapsed_time, corresponding_measured_time_points, error
00.000, 13:39:31.003072000, 0.000000000
01.000, 13:39:32.003072000, 0.000000070
02.000, 13:39:52.464335000, 19.461263572
03.000, 13:39:52.464537000, 18.461465685
04.000, 13:39:52.464569000, 17.461497372
05.000, 13:39:52.464573000, 16.461501714
06.000, 13:39:52.464575000, 15.461503561
07.000, 13:39:52.464577000, 14.461505319
08.000, 13:39:52.464579000, 13.461506932
09.000, 13:39:52.464580000, 12.461508617
10.000, 13:39:52.464582000, 11.461510324
11.000, 13:39:52.464584000, 10.461512480
12.000, 13:39:52.464586000, 9.461514319
13.000, 13:39:52.464588000, 8.461515990
14.000, 13:39:52.464589000, 7.461517739
15.000, 13:39:52.464591000, 6.461519320
16.000, 13:39:52.464593000, 5.461520922
17.000, 13:39:52.464594000, 4.461522511
18.000, 13:39:52.464596000, 3.461524055
19.000, 13:39:52.464597000, 2.461525592
20.000, 13:39:52.464599000, 1.461527167
21.000, 13:39:52.464600000, 0.461528720
22.000, 13:39:57.473118000, 4.470046550
23.000, 13:39:57.473178000, 3.470106269
24.000, 13:39:57.473182000, 2.470110610
25.000, 13:39:57.473184000, 1.470112218
26.000, 13:39:57.473185000, 0.470113725
27.000, 13:40:02.476388000, 4.473316514
28.000, 13:40:02.476452000, 3.473380680
29.000, 13:40:02.476457000, 2.473385528
... test complete
test system finish time............................: 13:40:02.477504
WSL time after syncing with host (again)...........: 13:39:43.002630
Expected approximate test duration (s).............: 31.4757
Measured test duration (s).........................: 12.0018
During this test, now() moved clock forward (s)....: 19.4738
- Note that when the clock at the end of the test, it reset backward in time. If the system is left alone for awhile, the first resync will exhibit this behavior as well.
- By its defined purpose, the app should have taken slightly more than 30 seconds to execute. As measured with respect to resynced times, it only took 12.
- In a perfect run (as in all other environments described above), the corresponding_measured_time_points progress in exact 1 second increments, and the error is rarely more than a few 10 nanoseconds (i.e., like the first two samples).
- Every run is a bit different in that the first big jump can seemingly happen anytime in the first 10 seconds or so, but usually occur in the first 3.
2
Answers
While steady_clock is probably the more appropriate clock to use for this application, it is not the major part of the described problem. The erratic behavior is a combination of using system_clock with the system clock not synchronized (with host) and with local NTP service active. Ideally, the WSL/Ubuntu-24.04 would not be syncing with the NTP service but rather would rely on the Windows11 host to do that and simply sync with the host. The fix:
Check current status:
Enable system clock syncing. Add or modify the following lines in /etc/wsl.conf:
Exit Linux, restart WSL instance:
Set the NTP service to inactive:
Check status again:
Results of posted test after this fix:
Note that the last three values (after test complete) do not account for the time required to start up and shutdown the app, which is likely where the .99s difference is coming from.
System clock is the wrong type to use, as it can (and does) vary. From cppreference (emphasis mine):
You want to use
steady_clock