I have run up against an interesting performance issue when comparing the exact same .NET code on Windows and macOS. I don’t understand why there’s such a significant performance difference, and I’m not sure of the best way to proceed.
The code is for a .NET (v6) console application (C# v9) that I’ve been developing on macOS using Visual Studio for Mac. It’s a turn-based game that waits for user input and only redraws the console window just prior to prompting for keyboard input. I do this using a backing store, and only update the parts of the console window that need to be redrawn (typically just a few characters). As a result, performance appeared to be good under macOS.
I then copied the code to Windows and re-compiled it in Visual Studio 2022. To my surprise, performance was quite poor – unusably so.
So I started some simple performance investigations using the Stopwatch class, beginning with the method that writes to the console window. On Windows, it was taking between 98 and 108ms to update the console window. The same code on macOS was consistently measured as taking 0ms.
Obviously, 0ms values are not useful, so to get better numbers I looked at the stopwatch ticks instead of ms, and quickly determined that these can’t be compared directly: I measured a 1000ms thread delay at about 10,134,729 ticks on Windows, but 1,018,704,390 ticks on macOS. The MSDN library says that "The timer used by the Stopwatch class depends on the system hardware and operating system" (on both Windows and macOS, Stopwatch.IsHighResolution was ‘true’). Assuming this ratio carries forward to all my other performance tests using the Stopwatch class in the same app (?), then to compare numbers between macOS and Windows I would have to divide the macOS numbers by (roughly) 100.
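A possible way to avoid the manual ratio, sketched here as an assumption rather than what the measurements above used: `Stopwatch.Frequency` reports the number of ticks per second on the current machine, so dividing raw ticks by it gives a unit that can be compared across operating systems. `Stopwatch.Elapsed.TotalMilliseconds` performs the same conversion and keeps the fractional part, which also avoids the 0ms readings. The helper name `TicksToMilliseconds` is hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

Stopwatch stopwatch = Stopwatch.StartNew();
Thread.Sleep(1000);
stopwatch.Stop();

// Ticks per second for the timer on this machine; the value differs by platform,
// which is why raw ElapsedTicks cannot be compared between Windows and macOS.
Console.WriteLine($"Frequency: {Stopwatch.Frequency} ticks/s");
Console.WriteLine($"Raw ticks: {stopwatch.ElapsedTicks}");

// Manual conversion using the frequency.
double manualMs = TicksToMilliseconds(stopwatch.ElapsedTicks, Stopwatch.Frequency);
Console.WriteLine($"Converted: {manualMs:F3} ms");

// Built-in conversion; TotalMilliseconds is a double, so sub-millisecond
// timings do not collapse to 0 the way ElapsedMilliseconds does.
Console.WriteLine($"Elapsed:   {stopwatch.Elapsed.TotalMilliseconds:F3} ms");

// Hypothetical helper (not part of the original code): raw ticks to milliseconds.
static double TicksToMilliseconds(long ticks, long frequency)
    => ticks * 1000.0 / frequency;
```

With this, the per-frame timings can be logged in milliseconds on both systems and compared directly, without relying on the approximate factor of 100.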
When I time the console window update in ticks, I get rough averages like this:
- Windows: 988,000-1,020,000 ticks
- macOS: 595,000-780,000 ticks
(Remember, divide macOS by 100 to compare with Windows. That gives roughly 5,950-7,800 Windows-equivalent ticks, i.e. very roughly a 130-170x performance difference.)
Notes:
- I’m running Windows 10 as a guest in VMware Fusion. macOS is the host. Neither the host nor the guest should be resource constrained. Update: I’ve tried running the minimal reproducible code example below on real hardware, and I get consistent results (that is, Windows is much slower than macOS).
- I’m using an 80×25 console window for testing.
- I’ve tried tweaking console window properties in Windows, with no effect. The buffer size is the same size as the console window size.
- The app programmatically sets the console output encoding to UTF8, sets the cursor to ‘not visible’, and sets TreatControlCAsInput to ‘true’. Leaving all these as their defaults makes no difference.
- I’m not using the ‘legacy console’ in Windows.
- I’ve tried publishing a release version under Windows that specifically targets Windows and my computer’s architecture. There’s no perceivable difference.
- Debug version in Windows was targeting ‘Any CPU’.
- If I turn the cursor on, I can see it ‘skidding’ down the screen (left-to-right, top-to-bottom in Windows).
This doesn’t seem like the sort of differential that I can just optimise away (and in any event, I’d like to understand it). What could account for such a significant performance difference given that the code is the same on both operating systems? Has anyone else encountered this?
The code in question is as follows (in two methods):
    private void FlushLine (int y)
    {
        ConsoleColor? lastForegroundColour = null;
        ConsoleColor? lastBackgroundColour = null;
        int lastX = -1, lastY = -1;
        for (int x = 0; x < Math.Min (this.Width, this.currentLargestWindowWidth); ++x)
        {
            // write only when the current backing store is different from the previous backing store
            if (ConsoleWindow.primary.characters[y][x] != ConsoleWindow.previous.characters[y][x]
                || ConsoleWindow.primary.foreground[y][x] != ConsoleWindow.previous.foreground[y][x]
                || ConsoleWindow.primary.background[y][x] != ConsoleWindow.previous.background[y][x])
            {
                // only change the current console foreground and/or background colour
                // if necessary because it's expensive
                if (!lastForegroundColour.HasValue || lastForegroundColour != ConsoleWindow.primary.foreground[y][x])
                {
                    Console.ForegroundColor = ConsoleWindow.primary.foreground[y][x];
                    lastForegroundColour = ConsoleWindow.primary.foreground[y][x];
                }
                if (!lastBackgroundColour.HasValue || lastBackgroundColour != ConsoleWindow.primary.background[y][x])
                {
                    Console.BackgroundColor = ConsoleWindow.primary.background[y][x];
                    lastBackgroundColour = ConsoleWindow.primary.background[y][x];
                }
                // only set the cursor position if necessary because it's expensive
                if (x != lastX + 1 || y != lastY)
                {
                    Console.SetCursorPosition(x, y);
                }
                Console.Write(ConsoleWindow.primary.characters[y][x]);
                // remember the cell just written (on every write, not only after a reposition)
                // so that a run of adjacent cells needs a single SetCursorPosition call
                lastX = x; lastY = y;
                ConsoleWindow.previous.foreground[y][x] = ConsoleWindow.primary.foreground[y][x];
                ConsoleWindow.previous.background[y][x] = ConsoleWindow.primary.background[y][x];
                ConsoleWindow.previous.characters[y][x] = ConsoleWindow.primary.characters[y][x];
            }
        }
    }
    public void FlushBuffer ()
    {
        // remember where the cursor was so it can be restored after the redraw
        int cursorX = Console.CursorLeft;
        int cursorY = Console.CursorTop;
        for (int y = 0; y < Math.Min (this.Height, this.currentLargestWindowHeight); ++y)
        {
            this.FlushLine (y);
        }
        Console.SetCursorPosition (cursorX, cursorY);
    }
A minimal reproducible example, which fills the console window with the letter ‘A’:
    using System.Diagnostics;

    Stopwatch stopwatch = new ();

    // calibration: how many stopwatch ticks does a 1000ms sleep take on this OS?
    stopwatch.Restart ();
    Thread.Sleep (1000);
    Debug.WriteLine ($"Thread Sleep 1000ms = {stopwatch.ElapsedTicks} ticks");

    while (true)
    {
        stopwatch.Restart ();
        for (int y = 0; y < Console.WindowHeight; ++y)
        {
            Console.SetCursorPosition (0, y);
            for (int x = 0; x < Console.WindowWidth; ++x)
            {
                Console.Write ('A');
            }
        }
        stopwatch.Stop ();
        Debug.WriteLine ($"{stopwatch.ElapsedTicks}");
    }
2 Answers
This appears to be due to a difference in the way the .NET runtime implementation on macOS caches values associated with the console, including the console window width and height. See this answer to my question on GitHub.
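If that explanation holds, reading `Console.WindowWidth` and `Console.WindowHeight` once per frame, rather than in every loop condition, should remove much of the extra cost on Windows. The following is a minimal sketch of the repro loop with those values held in locals. The names `width`, `height` and `frames` are mine, the loop is capped at ten frames so that it terminates, and it assumes the window is not resized while a frame is being drawn.

```csharp
using System;
using System.Diagnostics;

// Cursor positioning needs a real console window, so skip the demo if output is redirected.
if (Console.IsOutputRedirected)
{
    return;
}

const int frames = 10; // illustrative cap; the original example loops forever
Stopwatch stopwatch = new();

for (int frame = 0; frame < frames; ++frame)
{
    stopwatch.Restart();

    // Query the window size once per frame instead of once per loop iteration.
    int width = Console.WindowWidth;
    int height = Console.WindowHeight;

    for (int y = 0; y < height; ++y)
    {
        Console.SetCursorPosition(0, y);
        for (int x = 0; x < width; ++x)
        {
            Console.Write('A');
        }
    }

    stopwatch.Stop();
    // TotalMilliseconds is comparable across operating systems, unlike raw ticks.
    Debug.WriteLine($"{stopwatch.Elapsed.TotalMilliseconds:F3} ms");
}
```

Comparing the timings from this version against the original loop on Windows is a quick way to check whether the repeated property reads account for the slowdown.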
TBH I’m not an expert here, and this question might be better addressed on the dotnet runtime GitHub page, but for your minimal reproducible example there are two tricks which helped to significantly reduce the ticks per iteration:
My wild guess would be that, since Windows is a more GUI-first OS than Unix-based ones, the performance of the command line can be significantly worse.