skip to Main Content

I did a small benchmark test to compare Couchbase (running in Win) with Redis and MySql (EDIT: added Aerospike to test)

We are inserting 100 000 JSON "documents" into three db/stores:

  • Redis (just insert, there is nothing else)
  • Couchbase (in-memory Ephemeral buckets, JSON Index on JobId)
  • MySql (Simple table; Id (int), Data (MediumText), index on Id)
  • Aerospike (in-memory storage)

The JSON file is 67 lines, about 1800 bytes.

INSERT:

  • Couchbase: 60-100 seconds (EDIT: seems to vary quite a bit!)
  • MySql: 30 seconds
  • Redis: 8 seconds
  • Aerospike: 71 seconds

READ:
We are reading 1000 times, and we do this 10 times and look at averages.

  • Couchbase: 600-700 ms for 1000 GETs (Using KeyValue operations, not Query API. Using Query API, this takes about 1500 ms)
  • MySql: 90-100 ms for 1000 GETs
  • Redis: 50-60 ms for 1000 GETs
  • Aerospike: 750 ms for 1000 GETs

Conclusion:
Couchbase seems slowest (the INSERT times varies a lot it seems), Aerospike is also very slow. Both of these are using in-memory storage (Couchbase => Ephemeral bucket, Aerospike => storage-engine memory).

Question: Why the in-memory write and read on Couchbase so slow, even slower than using normal MySQL (on an SSD)?

CODE

Note: Using Task.WhenAll, or awaiting each call, doesn’t make a difference.

INSERT

Couchbase:

IBucket bucket = await cluster.BucketAsync("halo"); // <-- ephemeral 
IScope scope = bucket.Scope("myScope");
var collection = scope.Collection("myCollection");

// EDIT: Added this to avoid measuring lazy loading:
JObject t = JObject.FromObject(_baseJsonObject);
t["JobId"] = 0;
t["CustomerName"] = $"{firstnames[rand.Next(0, firstnames.Count - 1)]} {lastnames[rand.Next(0, lastnames.Count - 1)]}";
await collection.InsertAsync("0", t);
await collection.RemoveAsync("0");

List<Task> inserTasks = new List<Task>();
sw.Start();
foreach (JObject temp in jsonObjects) // jsonObjects is pre-created so its not a factor in the test
{
    inserTasks.Add(collection.InsertAsync(temp.GetValue("JobId").ToString(), temp));
}
await Task.WhenAll(inserTasks);
sw.Stop();
Console.WriteLine($"Adding {nbr} to Couchbase took {sw.ElapsedMilliseconds} ms");

Redis (using ServiceStack!)

sw.Restart();
using (var client = redisManager.GetClient())
{
    foreach (JObject temp in jsonObjects)
    {
        client.Set($"jobId:{temp.GetValue("JobId")}", temp.ToString());
    }
}
sw.Stop();
Console.WriteLine($"Adding {nbr} to Redis took {sw.ElapsedMilliseconds} ms");
sw.Reset();

Mysql:

MySql.Data.MySqlClient.MySqlConnection mySqlConnection = new MySql.Data.MySqlClient.MySqlConnection("Server=localhost;Database=test;port=3306;User Id=root;password=root;");
mySqlConnection.Open();
sw.Restart();
foreach (JObject temp in jsonObjects)
{
    MySql.Data.MySqlClient.MySqlCommand cmd = new MySql.Data.MySqlClient.MySqlCommand($"INSERT INTO test (id, data) VALUES ('{temp.GetValue("JobId")}', @data)", mySqlConnection);
    cmd.Parameters.AddWithValue("@data", temp.ToString());
    cmd.ExecuteNonQuery();
}
sw.Stop();
Console.WriteLine($"Adding {nbr} to MySql took {sw.ElapsedMilliseconds} ms");
sw.Reset();

READ

Couchbase:

IBucket bucket = await cluster.BucketAsync("halo");
IScope scope = bucket.Scope("myScope");
var collection = scope.Collection("myCollection");


    Stopwatch sw = Stopwatch.StartNew();
    for (int i = 0; i < 1000; i++)
    {
        string key = $"{r.Next(1, 100000)}";
        var result = await collection.GetAsync(key);
    }
    sw.Stop();
    Console.WriteLine($"Couchbase Q: {q}t{sw.ElapsedMilliseconds}");

Redis:

    Stopwatch sw = Stopwatch.StartNew();
    using (var client = redisManager.GetClient())
    {
        for (int i = 0; i < nbr; i++)
        {
            client.Get<string>($"jobId:{r.Next(1, 100000)}");
        }
    }
    sw.Stop();
    Console.WriteLine($"Redis Q: {q}t{sw.ElapsedMilliseconds}");

MySQL:

MySqlConnection mySqlConnection = new MySql.Data.MySqlClient.MySqlConnection("Server=localhost;Database=test;port=3306;User Id=root;password=root;");
mySqlConnection.Open();
            
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < nbr; i++)
{
    MySqlCommand cmd = new MySql.Data.MySqlClient.MySqlCommand($"SELECT data FROM test WHERE Id='{r.Next(1, 100000)}'", mySqlConnection);
    using MySqlDataReader rdr = cmd.ExecuteReader();

    while (rdr.Read())
    {
    }
}
sw.Stop();
Console.WriteLine($"MySql Q: {q} t{sw.ElapsedMilliseconds} ms");
sw.Reset();

Couchbase setup:

enter image description here

and

enter image description here

and Bucket Durability:

enter image description here

I only have 1 Node (no cluster), it’s local on my machine, running Ryzen 3900x 12 cores, M.2 SSD, Win10, 32 GB RAM.

If you made it this far, here is a GitHub repo with my benchmark code:
https://github.com/tedekeroth/CouchbaseTests

2

Answers


  1. I would have to run such a comparison myself to do a full investigation, but two things stand out.

    1. Your parallel execution isn’t truly fully parallel. async methods run synchronously up to the first await, so all of the code in InsertAsync/GetAsync before the first await is running sequentially as you add your tasks, not parallel.

    2. CouchbaseNetClient does some lazy connection setup in the background, and you’re paying that cost in the timed section. Depending on the environment, including SSL negotiation and such things, this can be a significant initial latency.

    You can potentially address the first issue by using Task.Run to kick off the operation, but you may need to pre-size the default Threadpool size.

    You can address the second issue by doing at least one operation on the bucket (including bucket.WaitUntilReadyAsync()) before the timed section.

    60 seconds for inserts still look abnormal. How many nodes and what Durability setting are you using?

    Login or Signup to reply.
  2. I took your CouchbaseTests, commented out the non-Couchbase bits. Fixed the query to select from the collection ( myCollection ) instead of jobcache, and removed the Metrics option. And created an index on JobId.
    create index mybucket_JobId on default:myBucket.myScope.myCollection (JobId)
    It inserts the 100,000 documents in 19 seconds and kv-fetches the documents on average 146 usec and query by JobId on average 965 usec.

    Couchbase Q: 0 187
    Couchbase Q: 1 176
    Couchbase Q: 2 143
    Couchbase Q: 3 147
    Couchbase Q: 4 140
    Couchbase Q: 5 138
    Couchbase Q: 6 136
    Couchbase Q: 7 139
    Couchbase Q: 8 125
    Couchbase Q: 9 129
    average et: 146 ms per 1000 -> 146 usec / request
    
    Couchbase Q: 0 1155
    Couchbase Q: 1 1086
    Couchbase Q: 2 1004
    Couchbase Q: 3 901
    Couchbase Q: 4 920
    Couchbase Q: 5 929
    Couchbase Q: 6 912
    Couchbase Q: 7 911
    Couchbase Q: 8 911
    Couchbase Q: 9 927
    average et: 965 ms per 1000 -> 965 usec / request. (coincidentally exactly the same as with the java api).
    

    This was on 7.0 build 3739 on a Mac Book Pro with the cbserver running locally.

    ######################################################################

    I have a small LoadDriver application for the java sdk that uses the kv api. With 4 threads, it shows an average response time of 54 micro-seconds and throughput of 73238 requests/second. It uses the travel-sample bucket on a cb server on localhost. [email protected]:mikereiche/loaddriver.git

    Run: seconds: 10, threads: 4, timeout: 40000us, threshold: 8000us requests/second: 0 (max), forced GC interval: 0ms
    count: 729873, requests/second: 72987, max: 2796us avg: 54us, aggregate rq/s: 73238

    For the query API I get the following which is 18 times slower.

    Run: seconds: 10, threads: 4, timeout: 40000us, threshold: 8000us requests/second: 0 (max), forced GC interval: 0ms
    count: 41378, requests/second: 4137, max: 12032us avg: 965us, aggregate rq/s: 4144

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search