I needed to write 50k+ records to IndexedDB across multiple stores (tables). I noticed that the more stores you write to, the slower the writes get. After some testing, it seems to me that the writes aren't actually executed in parallel.

MDN says:

Only specify a readwrite transaction mode when necessary. You can concurrently run multiple readonly transactions with overlapping scopes, but you can have only one readwrite transaction for an object store.

As does W3:

Multiple "readwrite" transactions can’t run at the same time if their scopes are overlapping since that would mean that they can modify each other’s data in the middle of the transaction.

But I am writing to different stores (non-overlapping scopes), so I would expect the write operations to run in parallel too?

This is what I tried. Writing 50k records to store1 takes about 3 seconds, as does writing to store2. Writing to both at the same time takes about 6 seconds. So correct me if I'm wrong, but that doesn't seem to be parallel.

CodeSandbox demo

<!doctype html>
<html lang="en">
    <head>
        <meta charset="UTF-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        <title>Document</title>
    </head>

    <script>
        let db;

        const request = indexedDB.open('idb_test', 1);

        request.onupgradeneeded = () => {
            const db = request.result;
            db.createObjectStore('store1');
            db.createObjectStore('store2');
        };

        request.onsuccess = () => {
            db = request.result;
        };

        function createMany(storeName) {
            console.time(storeName);

            const tx = db.transaction(storeName, 'readwrite');
            console.log('tx durability:', tx.durability);

            for (let i = 0; i < 50000; i++) {
                const key = crypto.randomUUID();
                const val = i.toString();

                tx.objectStore(storeName).put(val, key);
            }

            tx.oncomplete = () => {
                console.timeEnd(storeName);
            };

            tx.commit();
        }

        function store1() {
            createMany('store1');
        }

        function store2() {
            createMany('store2');
        }

        function bothStores() {
            createMany('store1');
            createMany('store2');
        }
    </script>

    <body>
        <button onclick="store1();">store1</button>
        <button onclick="store2();">store2</button>
        <button onclick="bothStores();">store1 + store2</button>
    </body>
</html>



I also tried this with Web Workers just to see if it made any difference, but got the same results.

CodeSandbox demo

<!doctype html>
<html lang="en">
    <head>
        <meta charset="UTF-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        <title>Document</title>
    </head>

    <script>
        // dev server is needed for this to work
        const worker1 = new Worker('worker.js');
        const worker2 = new Worker('worker.js');

        worker1.onmessage = (event) => {
            console.log('worker1 done');
        };
        worker2.onmessage = (event) => {
            console.log('worker2 done');
        };

        function store1() {
            worker1.postMessage('store1');
        }

        function store2() {
            worker2.postMessage('store2');
        }

        function bothStores() {
            worker1.postMessage('store1');
            worker2.postMessage('store2');
        }
    </script>

    <body>
        <button onclick="store1();">store1</button>
        <button onclick="store2();">store2</button>
        <button onclick="bothStores();">store1 + store2</button>
    </body>
</html>

worker.js:

let db;

const request = indexedDB.open('idb_test', 1);

request.onupgradeneeded = () => {
    const db = request.result;
    db.createObjectStore('store1');
    db.createObjectStore('store2');
};

request.onsuccess = () => {
    db = request.result;
};

function createMany(storeName) {
    console.time(storeName);

    const tx = db.transaction(storeName, 'readwrite');
    console.log('tx durability:', tx.durability);

    for (let i = 0; i < 50000; i++) {
        const key = crypto.randomUUID();
        const val = i.toString();

        tx.objectStore(storeName).put(val, key);
    }

    return tx;
}

self.onmessage = (event) => {
    const storeName = event.data;

    const tx = createMany(storeName);

    tx.oncomplete = () => {
        console.timeEnd(storeName);

        postMessage(storeName);
    };

    tx.commit();
};

So I tried profiling.

Chrome profiling

I think it’s pretty clear what’s happening. This is from Chrome 121. I’m pretty sure the grey blobs are the actual writes happening (it just says "Task").


Firefox looked a bit different, but still not parallel. It was also almost 2x faster than Chrome.

Firefox profiling


So am I doing something wrong or are IndexedDB writes actually not parallel?


Edit: added tx.commit(), moved console.time closer to the transaction, and tried all possible transaction durabilities; none of it made much difference.
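
For reference, durability was passed as the optional third argument when creating the transaction inside createMany, e.g.:

const tx = db.transaction(storeName, 'readwrite', { durability: 'relaxed' });
// also tried 'default' and 'strict'; the tx.durability log confirms which one was applied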

Edit: Tried a single transaction scoped to both stores (store1 and store2) that creates 50k puts in each store. Similar results.

CodeSandbox demo

<!doctype html>
<html lang="en">
    <head>
        <meta charset="UTF-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        <title>Document</title>
    </head>

    <script>
        let db;

        const request = indexedDB.open('idb_test', 1);

        request.onupgradeneeded = () => {
            const db = request.result;
            db.createObjectStore('store1');
            db.createObjectStore('store2');
        };

        request.onsuccess = () => {
            db = request.result;
        };

        function createMany() {
            console.time('timer');

            const tx = db.transaction(['store1', 'store2'], 'readwrite');
            console.log('tx durability:', tx.durability);

            for (let i = 0; i < 50000; i++) {
                const key = crypto.randomUUID();
                const val = i.toString();

                tx.objectStore('store1').put(val, key);
                tx.objectStore('store2').put(val, key);
            }

            tx.oncomplete = () => {
                console.timeEnd('timer');
            };

            tx.commit();
        }
    </script>

    <body>
        <button onclick="createMany();">store1 + store2</button>
    </body>
</html>

Edit: Tried 100k puts on 2 different machines, Chrome and Firefox, Windows and Debian; it didn't make much difference.


Edit: Tried writing to 2 separate databases, still no real concurrency.

CodeSandbox demo

<!doctype html>
<html lang="en">
    <head>
        <meta charset="UTF-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        <title>Document</title>
    </head>

    <script>
        let db1;
        let db2;

        const request1 = indexedDB.open('idb_test_1', 1);
        request1.onupgradeneeded = () => {
            const db = request1.result;
            db.createObjectStore('store1');
        };
        request1.onsuccess = () => {
            db1 = request1.result;
        };

        const request2 = indexedDB.open('idb_test_2', 1);
        request2.onupgradeneeded = () => {
            const db = request2.result;
            db.createObjectStore('store1');
        };
        request2.onsuccess = () => {
            db2 = request2.result;
        };

        function createMany(db) {
            console.time(db.name);

            const tx = db.transaction('store1', 'readwrite');

            for (let i = 0; i < 50000; i++) {
                const key = crypto.randomUUID();
                const val = i.toString();

                tx.objectStore('store1').put(val, key);
            }

            tx.oncomplete = () => {
                console.timeEnd(db.name);
            };

            tx.commit();
        }

        function fillDb1() {
            createMany(db1);
        }

        function fillDb2() {
            createMany(db2);
        }

        function bothDbs() {
            createMany(db1);
            createMany(db2);
        }
    </script>

    <body>
        <button onclick="fillDb1();">db1</button>
        <button onclick="fillDb2();">db2</button>
        <button onclick="bothDbs();">db1 + db2</button>
    </body>
</html>

2 Answers


  1. Chosen as BEST ANSWER

    I did some more digging and found this in the WebKit bug tracker:

    I think saying that the transactions are interleaved is more accurate than saying they run in parallel since they still take turns hitting the database

    and this:

    IndexedDB: Allow multiple transactions to interleave request execution

    Implement spec logic for allowing read-only transactions, and read-write transactions with non-overlapping scopes, to run concurrently. Transactions all still run in the same thread with tasks triggered via timers, so tasks and the underlying database operations are interleaved rather than truly parallelized

    Now this is for Safari, but it also aligns with the first profiling picture from Chrome that I posted in the original question. You can see the transactions are running at the same time and taking turns writing to the DB.

    I will just assume that Chrome works similarly. Chrome uses LevelDB for IndexedDB, which lists this as one of its limitations:

    Only a single process (possibly multi-threaded) can access a particular database at a time.

    Furthermore, I realized that I had been reading the 2018 W3 spec of IndexedDB; the latest version, which Josh linked, also says this:

    implementations are not required to start non-overlapping read/write transactions in parallel, or may impose limits on the number of started transactions

    That pretty much says it all.


  2. readwrite transactions with overlapping scopes result in sequential transactions instead of parallel transactions. See the spec on this one.

    readwrite transactions with non-overlapping scopes result in parallel transactions, at least according to the docs.

    Within a single transaction, the requests can run in parallel (the puts).

    Across two writable (readwrite) transactions with the same scope, the second tx does not start until the first tx ends, so the puts from the first run in parallel, but the puts in the second are all delayed until all of the puts from the first complete.

    Put another way, it does not matter when you start the second transaction in an overlapping-scope scenario. Whether you wait until the first completes or start the second just after starting the first, the second is still blocked. It sits in an abstract pending-work queue where only one transaction at a time is popped off the front, and every pending transaction waits in that queue until it reaches the front of the line.
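
    A quick way to observe that queueing is to open two readwrite transactions on the same store back to back and watch the completion order; a minimal sketch, assuming an open connection db with a store named store1:

    const tx1 = db.transaction('store1', 'readwrite');
    const tx2 = db.transaction('store1', 'readwrite');

    tx1.objectStore('store1').put('a', 'k1');
    tx2.objectStore('store1').put('b', 'k2');

    // Overlapping readwrite scopes: tx2's request is queued behind tx1,
    // so 'tx1 done' always logs before 'tx2 done'.
    tx1.oncomplete = () => console.log('tx1 done');
    tx2.oncomplete = () => console.log('tx2 done');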

    So it is a complicated answer because of the definitions:

    • puts in the same transaction are non-blocking (parallel)
    • puts in two or more transactions that have at least one object store in common are blocking (not parallel, aka serial, synchronous)
    • puts in two or more transactions that have no object stores in common are non-blocking (parallel)

    In your example, you are operating on store1 and store2, i.e. different stores, so you are in the third situation. The API documents this as non-blocking because there is no overlapping scope, which means all the puts from both transactions should be happening concurrently (in parallel, as you say).

    Keep in mind that you will find some people on the internet who get very angry about using the words concurrent, parallel, simultaneous, and non-blocking interchangeably, and someone will inevitably point out that a multithreaded application on a single core is not true parallelism. But it is really all just variations on the concept of time sharing and its subtleties (context switching, the cost of switching from one thread to another, the strategy for allocating how much time to spend on any one thread, and so on).

    But it gets more complicated. There are two conflicting goals when implementing:

    • push data to disk immediately, so that no data is lost if a crash happens, also known as immediate consistency, or
    • buffer (queue up) the data and eventually write it to disk, also known as eventual consistency.

    One design prioritizes avoiding data loss; the other prioritizes speed. Eventual consistency, provided it is done right, gives you the illusion of speed and sounds great, provided you forget about the risk of data loss.

    The browser implementations of IndexedDB vary slightly here. It is entirely possible that in certain implementations the disk writes are not truly concurrent even though the API layer is concurrent. For example, one reason would be how the browser chooses to implement time sharing: there could be one thread, multiple threads, microthreads, CPU-bound threads that depend on the number of cores, etc. You could be in a situation where the time slicing is causing blocking in the I/O layer.

    This kind of contention at a lower layer happens all the time in programming projects (a fun one I once encountered was concurrent requests with sockets in Node.js). Based on that, my guess is that you might be in one of these "single threaded but writing concurrent code" situations, which you only experience if you do enough simultaneous puts to cause some backpressure (the puts cannot exit the write queue as fast as they enter it, causing a delay for later puts). At the same time, last I checked, Chrome was using something like LevelDB, a shared key-value disk I/O library at Google that is supposedly battle-tested and high performance and is unlikely to exhibit such bad behavior.
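
    One way to look for that backpressure is to time individual requests inside a big transaction; a rough sketch, assuming an open connection db (probePutLatency is a hypothetical helper):

    function probePutLatency(storeName) {
        const tx = db.transaction(storeName, 'readwrite');
        const store = tx.objectStore(storeName);
        const start = performance.now();

        for (let i = 0; i < 50000; i++) {
            const req = store.put(i.toString(), crypto.randomUUID());
            if (i % 10000 === 0) {
                // If later puts resolve disproportionately slower, requests are
                // entering the queue faster than the backend drains them.
                req.onsuccess = () =>
                    console.log(`put ${i}: ${Math.round(performance.now() - start)}ms`);
            }
        }
    }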

    I would run a test that uses commit() to see whether preventing the auto-commit timeout delay makes both transactions complete more quickly. This is unlikely to help, I think we save almost no time by doing it, but it might be something to explore.

    I would amend the code to call console.time from within the createMany helper instead of outside of it.

    I would also log the transaction's durability type.

    Final note: to really compare apples to apples, I would test a single transaction with 100k puts against two transactions of 50k puts each on different stores (see the sketch below). You might just be watching the time grow because 50k puts take 3 seconds, so increasing the total number of puts to 100k naturally takes longer, and concurrency does nothing to speed that up; you are already at the maximum write rate, so the blocking concern is less relevant.
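
    A minimal sketch of that comparison, assuming an open connection db with both stores (fillStore is a hypothetical helper):

    function fillStore(tx, storeName, count) {
        const store = tx.objectStore(storeName);
        for (let i = 0; i < count; i++) {
            store.put(i.toString(), crypto.randomUUID());
        }
    }

    function singleTx100k() {
        console.time('single-100k');
        const tx = db.transaction('store1', 'readwrite');
        fillStore(tx, 'store1', 100000);
        tx.oncomplete = () => console.timeEnd('single-100k');
        tx.commit();
    }

    function twoTx50k() {
        console.time('two-50k');
        const tx1 = db.transaction('store1', 'readwrite');
        const tx2 = db.transaction('store2', 'readwrite');
        fillStore(tx1, 'store1', 50000);
        fillStore(tx2, 'store2', 50000);

        // If the transactions were truly parallel at the i/o layer,
        // this should take roughly half the single-transaction time.
        let done = 0;
        const finish = () => { if (++done === 2) console.timeEnd('two-50k'); };
        tx1.oncomplete = finish;
        tx2.oncomplete = finish;
        tx1.commit();
        tx2.commit();
    }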

    Here is a relevant answer on being in an I/O-bound problem instead of a CPU-bound problem: What do the terms "CPU bound" and "I/O bound" mean?. Without doing much more digging, this is my guess: concurrent transactions solve a CPU-bound issue but do nothing to solve an I/O-bound one. Without looking, I think LevelDB prefers contiguous blocks of data to avoid the classic seek issue, so I would bet it aims for one contiguous data block per disk, meaning there is no real extra parallelism happening at the I/O layer.
