The following test shell code test at: centOS 7 with bash shell; The code contains three phrase; phrase 1, call pwd command; phrase 2, read a big file(cat the file); phrase 3, do the same thing as phrase 1;
The phrase 3 time cost is much bigger than phrase 1(eg: 21s vs 7s)
But at the MacOS platform, the time cost of phrase 1 and phrase 3 is equal.
#!/bin/bash
#phrase 1
timeStart1=$(date +%s)
for ((ip=1;ip<=10000;ip++));
do
nc_result=$(pwd)
done
timeEnd1=$(date +%s)
timeDelta=$((timeEnd1-timeStart1))
echo $timeDelta
#phrase 2
fileName='./content.txt' #one big file,eg. a 39M file
content=`cat $fileName`
#phrase 3
timeStart2=$(date +%s)
for ((ip=1;ip<=10000;ip++));
do
nc_result=$(pwd)
done
timeEnd2=$(date +%s)
timeDelta2=$((timeEnd2-timeStart2))
echo $timeDelta2
2
Answers
$() invokes a subshell. The 3rd command starts when the second command is running.
Slawomir’s answer has a key part of the problem, but without full explanation. The answer to What does it mean ‘fork()’ will copy address space of original process? on Unix & Linux StackExchange has some good background.
A command substitution —
$(...)
— is implemented byfork()
ing off a separate copy of your shell, which a command — in this casepwd
— is executed in.Now, on most UNIXlike systems,
fork()
is extremely efficient, and doesn’t actually copy all your memory until an operation is performed that changes those memory blocks: Each copy keeps the same virtual memory ranges as the original (so its pointers remain valid), but with the MMU configured to throw an error when there’s a write to it, so the OS can silently catch that error and allocate separate physical memory for each branch.There’s still a cost to setting up the pages configured to be copied to new physical memory when they change, though! Some platforms — like Cygwin — have worse / more expensive fork implementations; some (apparently MacOS?) have faster ones; that difference is what you’re measuring here.
Two takeaways:
It’s not
pwd
that’s slow, it’s$( )
. It’d be just as slow with$(true)
or any other shell builtin, and considerably slower with any non-builtin command.Don’t use
$(pwd)
at all — there’s no reason to pay that cost to split off a child process to measure its working directory, when you could just ask the parent shell for its working directory directly by usingnc_result=$PWD
.