skip to Main Content

I was asked to implement a thread safe dictionary in swift, I used the common approach:

 class MutableDictionary {
    var dictionary: [String : Any] = [:]
    var queue = DispatchQueue(label: "queue", attributes: .concurrent)
    
    func object(for key:  String) -> Any? {
        queue.sync {
            return dictionary[key]   
        }
    }
    
    func set(_ object: Any?, for key: String) {
        queue.async(flags: .barrier) {
            self.dictionary[key] = object
        }
    }
}

However, the following up question is:

  1. What’s the difference with using concurrent + barrier vs just using a serialQueue for setting in this case?
  2. I did a testing on playground, I wrapped the get and set with 1000 time for loop, it turns out that the behavior for both serial and concurrent queue is almost the same.
  3. Why the setting always raise an error?
    enter image description here
  4. What does concurrent queue do better in this case (for both get and set) compared to a serial queue?
  5. In iOS, when should I choose serial queue over concurrent queue? And vice-versa?

4

Answers


  1. 1、with barrier, the concurrent queue is temporary able to execute one task at a time.
    enter image description here

    However, the serialQueue can only perform one task at a time.

    2、Given you only use the queue to read/write, they obviously have the same effect. if you put another kind task to it, apparently the concurrent queue cost less and that is the real difference.

    3、Submitted to async object address will be defined when your object is a class. You will have a bad address when you give the class member a new value, So you cannot access it and the error comes.
    You can have a try by struct
    .

    4、refer to answer 1.

    5、Sometimes when you want the task to be performed faster, concurrentQueue first. If you want to execute tasks orderly, serialQueue do the better.

    Login or Signup to reply.
    1. concurrent + barrier can run multiple read at the same time. serial queue can only run one task (read/write) at a time.

    2. the results are the same, or even that the serial queue is better because you’re using only one thread to run. You can only take advantage of the concurrent + barrier implementation when read/write operations happen on multiple thread/queue. In this case, the serial queue is better because you don’t need to look and switch between queue/thread.

    3. Full source code, please?

    4. Concurrent + barrier might be better or not, as in (2), sometimes if you can ensure all operations happen in the same thread, then serial queue is better.

    5. It depends on your case, as mentioned in (2), (4). One more thing about concurrent + barrier, sometimes it isn’t a good choice as you think. Imagine:

    • You’re implementing a feature that needs to do a heavy write operation, for example, you’re reading your dict and calculating a new value, and updating the dict. You wrap all of these steps in a queue.async(flags: .barrier) block.

    • You expect these steps are running in your thread (background thread) and it doesn’t block the main queue from reading the dict to update UI. But it does block the main queue, right? Read operations from the main queue have to wait for barrier block to finish first.

    • If your application consumes a lot of CPU, you may have to wait for OS to find the thread for your update steps, which means you have to spend more time on it.

    Login or Signup to reply.
  2. As @nghiahoang explained in their answer, the advantage of using a concurrent queue is that you can perform multiple ‘get’ requests simultaneously.

    The reason you are getting a crash is because your set operation is using asynchronous dispatch along with a barrier in a tight loop. When you submit a new set request before the previous request has completed, a new thread is required. There are only a limited number of threads available. Once you have exhausted the thread pool your app crashes.

    I would suggest a couple of changes to your code to help with this.

    The first is to bind your dispatch queue to one of the global dispatch queues, as this is best practice.

    The second is to use sync in your set rather than async. This will prevent thread pool exhaustion, but more importantly it ensures that once set returns that the dictionary has actually been updated. With async, the update will happen at some unspecified future time.

    class MutableDictionary {
        var dictionary: [String : Any] = [:]
        var queue = DispatchQueue(label: "queue", attributes: .concurrent, target:.global(qos: .userInitiated))
        
        func object(for key:  String) -> Any? {
            queue.sync {
                return dictionary[key] 
            }
        }
        
        func set(_ object: Any?, for key: String) {
            queue.sync(flags: .barrier) {
                self.dictionary[key] = object
            }
        }
    }
    
    Login or Signup to reply.
  3. You asked:

    1. What’s the difference with using concurrent + barrier vs just using a serialQueue for setting in this case?

    The former allows concurrent reads and the latter does not. But both are standard patterns for thread-safety.

    FWIW, this “concurrent + barrier” approach is known as the “reader-writer” pattern. It features two key behaviors, namely that reads can happen concurrently (because it is a concurrent queue) and also that the caller does not have to wait for writes (because we invoke it with async with barrier).

    1. I did a testing on playground, I wrapped the get and set with 1000 time for loop, it turns out that the behavior for both serial and concurrent queue is almost the same.

    First, Playgrounds are not an ideal environment for testing subtle performance differences like this. Test in an app with a optimized, “release” build. (This also gives you the chance to debug with tools like the “Thread Sanitizer”.)

    Second, you will only see very modest performance differences (probably only observable if doing millions of iterations, not thousands). But the overall behavior of these two patterns should be similar, as they are doing the same thing, namely synchronizing your access to this dictionary.

    Third, your two for loops invoke this MutableDictionary from a single thread. If you are testing a thread-safe dictionary, I would suggest actually calling it from multiple threads. There is no point in introducing the thread-safety overhead if you are only accessing it from a single thread. E.g., this tests multithreaded behavior:

    let dict = MutableDictionary()
    
    DispatchQueue.concurrentPerform(iterations: 1_000) { i in
        let result = dict.object(for: "(i)")
        print(result)                          // this will always be `nil`, though, because your dictionary has no values yet
    }
    
    DispatchQueue.concurrentPerform(iterations: 1_000) { i in
        dict.set("a", for: "(i)")
    }
    

    DispatchQueue.concurrentPerform is a parallel for loop.

    1. Why the setting always raise an error?

    enter image description here

    This is introduced by Playgrounds, probably its inability to handle the thread explosion of the unbridled dispatching of 1,000 asynchronous tasks. Try it in an actual app, not a playground, and the worker thread exhaustion will not result in this crash.

    That having been said, you should probably avoid scenarios that permit thread explosion at all.

    For example, if you really want to have a thread update 1,000 values in the dictionary, all from a single thread, you might provide a method that allows you to update multiple values with a single synchronization call. E.g.,

    class MutableDictionary<Key: Hashable, Value> {
        private var dictionary: [Key: Value] = [:]
        private let queue = DispatchQueue(label: "queue", attributes: .concurrent)
    
        func object(for key: Key) -> Value? {
            queue.sync {
                dictionary[key]
            }
        }
    
        func set(_ object: Value?, for key: Key) {
            queue.async(flags: .barrier) {
                self.dictionary[key] = object
            }
        }
    
        func synchronized(block: @escaping (inout [Key: Value]) -> Void) {
            queue.async(flags: .barrier) { [self] in
                block(&dictionary)
            }
        }
    }
    
    let dictionary = MutableDictionary<Int, String>()
    
    dictionary.synchronized {
        for i in 0 ... 1_000 {
            $0[i] = "a"
        }
    }
    

    That updates the thousand dictionary values in a single synchronization. This simultaneously eliminates unnecessary synchronizations and solves the thread explosion issue.

    1. What does concurrent queue do better in this case (for both get and set) compared to a serial queue?

    Theoretically, the concurrent queue, “reader-writer” pattern, can offer better performance if you have an app that may be doing many concurrent reads simultaneously. Otherwise the overhead of the concurrent queue is unlikely not necessary/beneficial.

    In practice, I have yet to encountered real-world scenarios where the concurrent queue was observably faster.

    1. In iOS, when should I choose serial queue over concurrent queue? And vice-versa?

    I would benchmark it for your particular use-case and see if the concurrent queue yields observable performance benefits. If so, the overhead of the concurrent queue might be worth it. Otherwise, you might want to stick with the simple serial GCD queue.

    Generally, you will not see an observable difference. And where I have, I found that other approaches (e.g. NSLock or os_unfair_lock) were even faster. But if you are looking for a simple and robust synchronization mechanism, a serial dispatch queue is a good solution.


    A few other observations:

    • If creating a dictionary, I would be inclined to avoid the use of Any. We have a strongly-typed language and it is a shame to lose these benefits. E.g., you might make it a generic as shown in the example above with Key and Value.

    • The wrapped dictionary should be private. It is imprudent to expose the underlying dictionary which would undermine the thread-safe interface.

    • The synchronization mechanism (e.g., the lock or GCD queue or whatever) should be private, too.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search