I’m trying to understand why the following code is so slow:
import threading
import time
import concurrent.futures
from datetime import datetime


def dump(txt):
    # Print a timestamp, the current thread id, and the message on one line.
    print(f'[{datetime.now()}] ({threading.get_ident():05}) {txt}\n', end='')


def sleep_(_):
    # Worker task: log, sleep for 0.1 s, log again.
    dump('Start')
    time.sleep(0.1)
    dump('Stop')


def main(n=10, processes=10):
    dump('before with')
    with concurrent.futures.ProcessPoolExecutor(processes) as pool:
        dump('before map')
        tmp = list(pool.map(sleep_, range(n)))
        dump('after map')
    dump('after with')


if __name__ == '__main__':
    main()
This is the result:
[2020-10-17 23:34:12.822813] (07100) before with
[2020-10-17 23:34:12.824808] (07100) before map
[2020-10-17 23:34:21.409045] (14100) Start
[2020-10-17 23:34:21.414031] (15408) Start
[2020-10-17 23:34:21.414031] (20292) Start
[2020-10-17 23:34:21.415029] (18972) Start
[2020-10-17 23:34:21.416026] (13660) Start
[2020-10-17 23:34:21.416026] (10904) Start
[2020-10-17 23:34:21.417023] (18828) Start
[2020-10-17 23:34:21.418021] (18616) Start
[2020-10-17 23:34:21.504788] (01776) Start
[2020-10-17 23:34:21.509775] (14100) Stop
[2020-10-17 23:34:21.509775] (14100) Start
[2020-10-17 23:34:21.514761] (20292) Stop
[2020-10-17 23:34:21.514761] (15408) Stop
[2020-10-17 23:34:21.515760] (18972) Stop
[2020-10-17 23:34:21.516757] (13660) Stop
[2020-10-17 23:34:21.516757] (10904) Stop
[2020-10-17 23:34:21.517754] (18828) Stop
[2020-10-17 23:34:21.518751] (18616) Stop
[2020-10-17 23:34:21.605519] (01776) Stop
[2020-10-17 23:34:21.610506] (14100) Stop
[2020-10-17 23:34:21.611503] (07100) after map
[2020-10-17 23:34:23.281562] (07100) after with
What I’m trying to understand is why it takes nearly 9 seconds to start the first process, and why it takes nearly 2 seconds to clean the processes up.
This is on a Windows system (running under the debugger).
When I run it normally, it takes roughly 0.5 s to spin up and to wind down.
Compare that to Debian under WSL on the same system: 0.04 s to spin up and 0.006 s to wind down.
Is this normal behaviour? Why does it happen, and how can I improve on it?
Thanks!
2 Answers
Windows does not have an equivalent of the fork system call to duplicate a process. This means that using a multiprocessing pool with 10 workers on Windows will create 10 new Python processes.
It doesn’t take 9 seconds to start the first process; it takes 9 seconds to start all 10 processes in the pool and then dispatch the target function to the first process.
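One way to check where the time goes is to measure pool startup on its own, with a task that does almost no work. The following is a minimal sketch, not the asker's code: the helper name time_pool_startup is hypothetical, and it assumes Python 3.7 or later, where ProcessPoolExecutor accepts an mp_context argument.

```python
import concurrent.futures
import multiprocessing
import os
import time


def time_pool_startup(start_method, workers=4):
    """Hypothetical helper: seconds spent creating a pool, running trivial tasks, and shutting down.

    The task is os.getpid, a standard-library function that returns almost
    instantly, so nearly all of the measured time is process startup and teardown.
    """
    ctx = multiprocessing.get_context(start_method)
    begin = time.perf_counter()
    with concurrent.futures.ProcessPoolExecutor(workers, mp_context=ctx) as pool:
        futures = [pool.submit(os.getpid) for _ in range(workers)]
        for future in futures:
            future.result()
    # Leaving the with block waits for the workers to exit, so teardown is included.
    return time.perf_counter() - begin


if __name__ == '__main__':
    print('default start method:', multiprocessing.get_start_method())
    # On Windows this loop only sees 'spawn'; on Linux it typically also sees 'fork'.
    for method in multiprocessing.get_all_start_methods():
        print(f'{method:>10}: {time_pool_startup(method):.3f} s')
```

Running this with and without the debugger attached should show how much of the delay comes from starting the workers rather than from the tasks themselves.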
On Linux and other Unix-like operating systems, processes are created with the fork system call, which basically creates a lazy (copy-on-write) copy of the main process. This is a fast operation.
Windows has no fork functionality, so it has to create each worker process by starting a new interpreter and executing the full Python code (except the part guarded by if __name__ == '__main__': at the end) to set up the process before it can be used. The multiprocessing docs call this the "spawn" start method.
Since Python 3.8, spawn is also the default start method on macOS.
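The re-execution described above can be observed directly. This is a small sketch under stated assumptions: it is saved to a file and run as a script, and the helper name worker_pids is hypothetical. It requests the spawn start method explicitly, so it behaves the same way on Linux as on Windows.

```python
import concurrent.futures
import multiprocessing
import os

# Module-level statement. When this file is run as a script and workers are
# created with "spawn", each worker re-imports the file under the name
# "__mp_main__", so this line prints in the parent and again in each worker
# that starts.
print(f'[pid {os.getpid()}] module body ran with __name__ = {__name__!r}')


def worker_pids(start_method='spawn', workers=2):
    """Hypothetical helper: start a pool and return the PIDs of the workers that ran a task."""
    ctx = multiprocessing.get_context(start_method)
    with concurrent.futures.ProcessPoolExecutor(workers, mp_context=ctx) as pool:
        futures = [pool.submit(os.getpid) for _ in range(workers)]
        return {future.result() for future in futures}


if __name__ == '__main__':
    # Only the parent reaches this block; in spawned workers __name__ is not '__main__'.
    print('worker pids:', worker_pids())
```

Anything expensive at module level (heavy imports, loading data) is paid once per worker under spawn, which is why keeping such work inside functions or under the guard tends to reduce startup time.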