
The following shell script was tested on CentOS 7 with bash. It has three phases: phase 1 calls the pwd command in a loop; phase 2 reads a big file (cats it into a variable); phase 3 does the same thing as phase 1.

Phase 3 takes much longer than phase 1 (e.g. 21 s vs 7 s).

On macOS, however, phases 1 and 3 take the same amount of time.
#!/bin/bash

# phase 1
timeStart1=$(date +%s)
for ((ip=1; ip<=10000; ip++)); do
    nc_result=$(pwd)
done
timeEnd1=$(date +%s)
timeDelta=$((timeEnd1-timeStart1))
echo "$timeDelta"

# phase 2
fileName='./content.txt'   # one big file, e.g. a 39 MB file
content=$(cat "$fileName")

# phase 3
timeStart2=$(date +%s)
for ((ip=1; ip<=10000; ip++)); do
    nc_result=$(pwd)
done
timeEnd2=$(date +%s)
timeDelta2=$((timeEnd2-timeStart2))
echo "$timeDelta2"

2 Answers


  1. $() invokes a subshell. Phase 3 runs after phase 2 has loaded the file, while the shell is still holding that data in memory.

  2. Slawomir’s answer has a key part of the problem, but without a full explanation. The answer to What does it mean ‘fork()’ will copy address space of original process? on Unix & Linux StackExchange has some good background.

    A command substitution, $(...), is implemented by fork()ing off a separate copy of your shell, in which the command (in this case pwd) is executed.
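    You can see that fork directly with bash's BASHPID variable, which (unlike $$) always reports the PID of the current bash process. A minimal sketch:

    ```shell
    #!/bin/bash
    # BASHPID reflects the current process, even inside a subshell,
    # whereas $$ keeps the parent's PID. So comparing BASHPID inside
    # and outside a command substitution exposes the fork.
    parent=$BASHPID
    child=$(echo "$BASHPID")   # expanded inside the forked subshell
    echo "parent=$parent child=$child"
    ```

    The two PIDs differ, confirming that the substitution ran in a separate forked process.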

    Now, on most UNIX-like systems, fork() is extremely efficient, and doesn’t actually copy all your memory up front: each copy keeps the same virtual memory mappings as the original (so its pointers remain valid), but with the MMU configured to fault on any write to those pages, so the OS can transparently catch the fault and allocate separate physical memory for whichever side wrote. This is copy-on-write.

    There’s still a cost to marking all those pages copy-on-write, though, and that cost grows with the amount of memory the shell holds; that’s why phase 3, run after a 39 MB file has been read into a shell variable, is slower than phase 1. Some platforms, like Cygwin, have worse / more expensive fork implementations; some (apparently macOS?) handle this more cheaply; that difference is what you’re measuring here.
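    The fork cost can be isolated from the command being run. A micro-benchmark sketch (iteration count and SECONDS-based timing are my choices, not from the question):

    ```shell
    #!/bin/bash
    # The cost is the fork, not the command: $(true) forks a subshell
    # per iteration, while a bare `true` runs entirely in-process.
    t0=$SECONDS
    for ((i=1; i<=2000; i++)); do x=$(true); done   # 2000 forks
    t1=$SECONDS
    for ((i=1; i<=2000; i++)); do true; done        # no forks
    t2=$SECONDS
    echo "with fork: $((t1-t0))s, without: $((t2-t1))s"
    ```

    The fork-per-iteration loop dominates the runtime; the bare-builtin loop finishes almost instantly.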


    Two takeaways:

    • It’s not pwd that’s slow, it’s $( ). It’d be just as slow with $(true) or any other shell builtin, and considerably slower with any non-builtin command.

    • Don’t use $(pwd) at all: there’s no reason to pay the cost of forking off a child process just to ask it for its working directory, when the parent shell already tracks its own in $PWD. Use nc_result=$PWD instead.
