EDIT: It turns out this odd behavior occurred only with Python in my WSL Ubuntu. Elsewhere, the sequential version does run faster than the multithreaded one.
I understand that in CPython, because of the Global Interpreter Lock (GIL), only one thread executes Python bytecode at a time, so multiple threads take turns rather than using several CPU cores in parallel, unlike multiprocessing, where several Python interpreter processes are started.
I know this makes multithreading good for I/O-bound tasks if done right. CPU-bound tasks, however, should actually be slower with multithreading. So I experimented with three code snippets, each doing some CPU-bound calculations:
- Example 1 : Runs the tasks in sequence (single thread)
- Example 2 : Runs each task in a different thread (multithreaded)
- Example 3 : Runs each task in a separate process (multiprocessing)
To my surprise, even though the tasks are CPU-bound, Example 2 (multiple threads) executes faster (about 1.5 s on average) than Example 1 (a single thread, about 2.2 s on average). Example 3 runs the fastest, as expected (about 1 s on average).
I don’t know what I am doing wrong.
Example 1 : Run tasks Sequentially
import time
import math

nums = [8, 7, 8, 5, 8]

def some_computation(n):
    # CPU-bound busy loop: count up to n**n
    counter = 0
    for i in range(int(math.pow(n, n))):
        counter += 1

if __name__ == '__main__':
    start = time.time()
    # Run every task one after another in the main thread
    for i in nums:
        some_computation(i)
    end = time.time()
    print("Total time of program execution : ", round(end - start, 4))
Example 2 : Run tasks with Multithreading
import threading
import time
import math

nums = [8, 7, 8, 5, 8]

def some_computation(n):
    # CPU-bound busy loop: count up to n**n
    counter = 0
    for i in range(int(math.pow(n, n))):
        counter += 1

if __name__ == '__main__':
    start = time.time()
    threads = []
    # Start one thread per task
    for i in nums:
        x = threading.Thread(target=some_computation, args=(i,))
        threads.append(x)
        x.start()
    # Wait for all threads to finish before stopping the timer
    for t in threads:
        t.join()
    end = time.time()
    print("Total time of program execution : ", round(end - start, 4))
Example 3 : Run tasks in parallel with multiprocessing module
from multiprocessing import Pool
import time
import math

nums = [8, 7, 8, 5, 8]

def some_computation(n):
    # CPU-bound busy loop: count up to n**n
    counter = 0
    for i in range(int(math.pow(n, n))):
        counter += 1

if __name__ == '__main__':
    start = time.time()
    # Pool of 3 worker processes shared by the 5 tasks
    pool = Pool(processes=3)
    for i in nums:
        pool.apply_async(some_computation, [i])
    # No more tasks will be submitted; wait for the workers to finish
    pool.close()
    pool.join()
    end = time.time()
    print("Total time of program execution : ", round(end - start, 4))
2 Answers
It turns out this was happening only in the Ubuntu that I had installed in Windows Subsystem for Linux (WSL). My original snippets run as expected in a native Windows or Ubuntu Python environment, i.e. sequential execution is faster than the multithreaded one, but not in WSL. Thanks @Vlad for double-checking things on your end.
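Since the result depended on the environment, it may help to print the environment next to any timing you post. The following is only a small sketch: the helper name `looks_like_wsl` is made up for illustration, and checking for "microsoft" in the kernel release string is a common heuristic for detecting WSL, not an official API.

```python
import platform


def looks_like_wsl(release):
    """Heuristic (assumption): WSL kernels report 'microsoft' in the release string."""
    return 'microsoft' in release.lower()


if __name__ == '__main__':
    release = platform.uname().release
    print("Python :", platform.python_version())
    print("Kernel :", release)
    print("WSL?   :", looks_like_wsl(release))
```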
As stated in my comment, it’s a question of what the function is actually doing.
If we make the nums list longer (i.e., there will be more concurrent threads/processes) and also adjust the way the loop range is calculated then we see this:
Output:
This indicates that the complexity of some_computation() determines how the system behaves. With this code and its adjusted parameters, threading is slower than running sequentially (as one would typically expect) and, of course, multiprocessing is significantly faster.
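A minimal sketch of this kind of three-way comparison, not the exact code behind the results above: the workload size `ITERATIONS`, the number of tasks, and the helper names `busy_loop`, `run_sequential`, `run_threaded`, `run_multiprocess` and `timed` are all assumptions chosen for illustration.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# Assumed workload: 8 tasks of 200,000 loop iterations each (illustrative values)
ITERATIONS = 200_000
TASKS = [ITERATIONS] * 8


def busy_loop(n):
    """Pure-Python CPU-bound work: count up to n and return the count."""
    counter = 0
    for _ in range(n):
        counter += 1
    return counter


def run_sequential(tasks):
    # All tasks one after another in the calling thread
    return [busy_loop(n) for n in tasks]


def run_threaded(tasks):
    # One worker thread per task
    with ThreadPoolExecutor(max_workers=len(tasks)) as executor:
        return list(executor.map(busy_loop, tasks))


def run_multiprocess(tasks):
    # Worker processes; the default pool size is chosen by the library
    with ProcessPoolExecutor() as executor:
        return list(executor.map(busy_loop, tasks))


def timed(fn, tasks):
    # Return the function's result together with its wall-clock duration
    start = time.perf_counter()
    result = fn(tasks)
    return result, time.perf_counter() - start


if __name__ == '__main__':
    for fn in (run_sequential, run_threaded, run_multiprocess):
        _, elapsed = timed(fn, TASKS)
        print(f"{fn.__name__}: {elapsed:.4f} s")
```

On a standard CPython build with the GIL, one would expect the threaded time to be close to or slightly above the sequential one. The multiprocess time should drop below both only once the work per task outweighs process start-up, so with a workload this small it may not; increasing `ITERATIONS` makes the difference easier to see.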