I am doing some computing where I store the results in a Redis database before they are sent to the main database.
Currently I batch the operations into chunks of 10k items, and each chunk is processed in a separate GAE instance (single-threaded computing using NodeJS). While the computing speed is really good, the push step, which performs HSET operations, takes a long time, and that then delays the other threads (Redis is single threaded; for the record, I am using a Google Redis (Memorystore) Basic-tier instance).
What am I doing wrong? How can I make the push faster (for example in a batch) than it is now?
import { promisify } from 'util';

const key = '123';
for (const [column, value] of results) {
  await this.appendRedisHash(key, column, value);
}

// One HSET round trip per field/value pair
public async appendRedisHash(key: string, field: string, value: any) {
  const appendRedisHashAsync = promisify(this.redisClient.hset).bind(this.redisClient);
  return appendRedisHashAsync(key, field, JSON.stringify(value));
}
As you can see, I am simply pushing each item one by one with HSET. I am wondering whether I can do something like an SQL transaction and push, for example, 10k items in a single operation instead of appending to the Redis hash one field at a time.
Each chunk (10k items) has a size of ~43MB once it is saved to Redis (so in total 100k items give ~430MB). Because of the architecture design, it all has to be stored in one single Redis hash.
Current push times in milliseconds; each job runs in parallel in a separate thread:
"push": 13608
"finishedAt": "2020-05-08T22:51:26.045Z"
push": 13591,
"finishedAt": "2020-05-08T22:51:29.640Z"
"push": 15738,
"finishedAt": "2020-05-08T22:51:59.177Z"
"push": 21208,
"finishedAt": "2020-05-08T22:51:44.432Z"
"push": 13332,
"finishedAt": "2020-05-08T22:51:28.303Z"
"push": 10598,
"finishedAt": "2020-05-08T22:51:44.455Z"
"push": 27249,
"finishedAt": "2020-05-08T22:51:58.458Z"
"push": 36270,
"finishedAt": "2020-05-08T22:52:00.708Z"
"push": 25106,
"finishedAt": "2020-05-08T22:52:02.234Z"
"push": 12845,
"finishedAt": "2020-05-08T22:52:02.254Z"
Any feedback would be appreciated.
2 Answers
I tested it using HSET and HMSET over 10,000 values, and I created a simple bulk function to handle the records. From a simple-data perspective it looks fantastic; let's see how it turns out in the production environment. Oddly, the npm redis library didn't like hset being called this way, but hmset did work.
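Below is a minimal sketch of what such a bulk function could look like, assuming the node_redis v3 callback client (which flattens array arguments, so hmset can take a flat [field1, value1, field2, value2, ...] list); the method name bulkAppendRedisHash is illustrative, not part of the original code:

import { promisify } from 'util';

// Illustrative bulk variant: collect all [column, value] pairs of a chunk and
// send them to Redis in one HMSET command instead of one HSET per field.
public async bulkAppendRedisHash(key: string, results: Array<[string, any]>) {
  const hmsetAsync = promisify(this.redisClient.hmset).bind(this.redisClient);

  // Flatten into [field1, value1, field2, value2, ...]
  const args: string[] = [];
  for (const [column, value] of results) {
    args.push(column, JSON.stringify(value));
  }

  // One command, one round trip for the whole 10k-item chunk
  return hmsetAsync(key, args);
}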
To bulk append: what you are doing is calling hset multiple times, once per field/value pair, which is bad because of the round-trip latency; doing it for 10k field/value pairs means 10k round trips.
You can use hset with multiple field/value pairs so it is a single trip to Redis, e.g. hset key field1 value1 field2 value2 field3 value3.
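For reference, a rough sketch of that single-trip call from NodeJS, assuming a Redis server >= 4.0 (where HSET accepts multiple field/value pairs) and the node_redis callback client; as noted above, if the client's hset path rejects the multi-pair form, hmset accepts the same argument list:

import { promisify } from 'util';
import * as redis from 'redis';

const client = redis.createClient();
// Cast loosely so the promisified call can take a variadic field/value list
const hsetAsync = promisify(client.hset).bind(client) as (...args: any[]) => Promise<number>;

async function pushChunk(key: string, results: Array<[string, any]>) {
  // [field1, value1, field2, value2, ...] for the entire chunk
  const args = results.flatMap(([column, value]) => [column, JSON.stringify(value)]);
  // Single HSET command instead of results.length separate round trips
  return hsetAsync(key, ...args);
}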