
EDIT: It turns out this odd behavior was happening only with Python in my WSL Ubuntu. Otherwise, the sequential version does run faster than the multithreaded one.

I understand that, in CPython, multiple threads generally just context-switch on the same CPU core rather than utilizing multiple cores, unlike multiprocessing, where several instances of the Python interpreter get started.

I know this makes multithreading good for I/O-bound tasks if done right. CPU-bound tasks, however, should actually be slower with multithreading. So I experimented with 3 code snippets, each doing some CPU-bound calculations.

  • Example 1 : Runs tasks in sequence (single thread)
  • Example 2 : Runs each task in a different thread (multithreaded)
  • Example 3 : Runs each task in a separate process (multiprocessed)

To my surprise, even though the task is CPU bound, Example 2, which uses multiple threads, executes faster (1.5 s on average) than Example 1, which uses a single thread (2.2 s on average). Example 3 runs the fastest, as expected (1 s on average).

I don’t know what I am doing wrong.

Example 1 : Run tasks Sequentially

import time 
import math

nums = [ 8, 7, 8, 5, 8]

def some_computation(n):
    counter = 0
    for i in range(int(math.pow(n,n))):
        counter += 1

if __name__ == '__main__':
    start = time.time()
    for i in nums:
        some_computation(i)
    end = time.time()
    print("Total time of program execution : ", round(end-start, 4) )

Example 2 : Run tasks with Multithreading

import threading
import time 
import math

nums = [ 8, 7, 8, 5, 8]
def some_computation(n):
    counter = 0
    for i in range(int(math.pow(n,n))):
        counter += 1
    
if __name__ == '__main__':
    start = time.time()
    threads = []
    for i in nums: 
        x = threading.Thread(target=some_computation, args=(i,))
        threads.append(x)
        x.start()
    for t in threads:
        t.join()
    end = time.time()
    print("Total time of program execution : ", round(end-start, 4) )

Example 3 : Run tasks in parallel with multiprocessing module

from multiprocessing import Pool
import time
import math

nums = [ 8, 7, 8, 5, 8]
def some_computation(n):
    counter = 0
    for i in range(int(math.pow(n,n))):
        counter += 1

if __name__ == '__main__':
    start = time.time()
    pool = Pool(processes=3)
    for i in nums:
        pool.apply_async(some_computation, [i])
    pool.close()
    pool.join()
    end = time.time()
    print("Total time of program execution : ", round(end-start, 4) )
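As an aside, the claim above that multithreading helps I/O-bound work can be checked with a minimal sketch of my own (not from the question): it uses time.sleep as a stand-in for real I/O, since sleeping releases the GIL and lets other threads run.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def io_task(_):
    # Stand-in for real I/O (a network call or disk read): time.sleep
    # releases the GIL, so other threads can run while this one waits.
    time.sleep(0.2)

def run_sequential(n=5):
    start = time.perf_counter()
    for i in range(n):
        io_task(i)
    return time.perf_counter() - start   # ~n * 0.2 s

def run_threaded(n=5):
    start = time.perf_counter()
    with ThreadPoolExecutor() as executor:
        list(executor.map(io_task, range(n)))
    return time.perf_counter() - start   # ~0.2 s, since the sleeps overlap

if __name__ == '__main__':
    print(f'sequential: {run_sequential():.2f}s  threaded: {run_threaded():.2f}s')
```

Here the threaded version finishes in roughly the time of a single sleep, because the waits overlap instead of running one after another.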


Answers


  1. Chosen as BEST ANSWER

    It turns out this was happening only in the Ubuntu I had installed under the Windows Subsystem for Linux. My original snippets run as expected in a Windows or native Ubuntu Python environment, but not in WSL, i.e., sequential execution running faster than the multithreaded one. Thanks @Vlad for double-checking things on your end.


  2. As stated in my comment, it’s a question of what the function is actually doing.

    If we make the nums list longer (i.e., there will be more concurrent threads/processes) and also adjust the way the loop range is calculated, then we see this:

    import time 
    from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
    
    nums = [8,7,8,5,8,8,5,4,8,7,7,8,8,7,8,8,8]
    
    def some_computation(n):
        counter = 0
        for _ in range(n*1_000_000):
            counter += 1
        return counter
    
    def sequential():
        for n in nums:
            some_computation(n)
    
    def threaded():
        with ThreadPoolExecutor() as executor:
            executor.map(some_computation, nums)
    
    def pooled():
        with ProcessPoolExecutor() as executor:
            executor.map(some_computation, nums)
    
    if __name__ == '__main__':
        for func in sequential, threaded, pooled:
            start = time.perf_counter()
            func()
            end = time.perf_counter()
            print(func.__name__, f'{end-start:.4f}')
    

    Output:

    sequential 4.8998
    threaded 5.1257
    pooled 0.7760
    

    This indicates that the complexity of some_computation() determines how the system behaves. With this code and its adjusted parameters, we see that threading is slower than running sequentially (as one would typically expect) and, of course, multiprocessing is significantly faster.
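    To see why adjusting the loop range matters, a quick check (just arithmetic on the numbers from the question) shows how unevenly the original int(math.pow(n, n)) loop sizes are distributed, whereas n*1_000_000 gives tasks of comparable size:

    ```python
    import math

    nums = [8, 7, 8, 5, 8]  # the list from the question
    for n in nums:
        print(n, '->', int(math.pow(n, n)), 'loop iterations')
    # n == 8 contributes 16777216 iterations per task, while n == 5
    # contributes only 3125, so nearly all of the work sits in the
    # three n == 8 tasks.
    ```

    With such lopsided tasks, the total runtime is dominated by a handful of large loops, which makes timing comparisons between the three approaches noisy.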
