I am doing some computing where I store the results in a Redis database before they are sent to the main database.
Currently I batch the operations into chunks of 10k items, and each chunk is processed in a separate GAE instance (single-threaded computing using NodeJS). While the computing speed is really good, the push step, which performs HSET operations, takes a long time, and that then delays the other threads (Redis is single threaded; for the record, I am using a Google Redis (Memorystore) Basic-tier instance).
What am I doing wrong? How can I make the push faster (for example in a batch) than it is now?
import { promisify } from 'util';

const key = '123';
for (const [column, value] of results) {
  await this.appendRedisHash(key, column, value);
}

// One HSET round trip per field/value pair
public async appendRedisHash(key: string, field: string, value: any) {
  const appendRedisHashAsync = promisify(this.redisClient.hset).bind(this.redisClient);
  return appendRedisHashAsync(key, field, JSON.stringify(value));
}
As you can see, I am simply pushing each item one by one with HSET. I am wondering whether I can do something like an SQL transaction and push, for example, 10k items in a single operation instead of appending to the Redis hash one field at a time.
Each chunk (10k items) has a size of ~43MB once it is saved to Redis (so in total 100k items give ~430MB). Because of the architecture design, it all has to be stored in one single Redis hash.
Current push times in milliseconds; each job runs in parallel in a separate thread:
"push": 13608
"finishedAt": "2020-05-08T22:51:26.045Z"
push": 13591,
"finishedAt": "2020-05-08T22:51:29.640Z"
"push": 15738,
"finishedAt": "2020-05-08T22:51:59.177Z"
"push": 21208,
"finishedAt": "2020-05-08T22:51:44.432Z"
"push": 13332,
"finishedAt": "2020-05-08T22:51:28.303Z"
"push": 10598,
"finishedAt": "2020-05-08T22:51:44.455Z"
"push": 27249,
"finishedAt": "2020-05-08T22:51:58.458Z"
"push": 36270,
"finishedAt": "2020-05-08T22:52:00.708Z"
"push": 25106,
"finishedAt": "2020-05-08T22:52:02.234Z"
"push": 12845,
"finishedAt": "2020-05-08T22:52:02.254Z"
Any feedback would be appreciated.
2 Answers
I tested it using HSET and HMSET over 10,000 values, and I created a simple bulk function to handle the records. From a simple-data perspective it looks fantastic; let's see how it turns out in the production environment. Oddly, the npm redis library didn't like hset being called this way, but hmset did work.
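Below is a minimal sketch of what such a bulk function could look like, assuming the node_redis v3 callback client (which flattens array arguments, so hmset can take a flat [field1, value1, field2, value2, ...] list); the method name bulkAppendRedisHash is illustrative, not part of the original code:

import { promisify } from 'util';

// Illustrative bulk variant: collect all [column, value] pairs of a chunk and
// send them to Redis in one HMSET command instead of one HSET per field.
public async bulkAppendRedisHash(key: string, results: Array<[string, any]>) {
  const hmsetAsync = promisify(this.redisClient.hmset).bind(this.redisClient);

  // Flatten into [field1, value1, field2, value2, ...]
  const args: string[] = [];
  for (const [column, value] of results) {
    args.push(column, JSON.stringify(value));
  }

  // One command, one round trip for the whole 10k-item chunk
  return hmsetAsync(key, args);
}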
To bulk append: what you are doing is calling hset multiple times, once per field/value pair, which is bad because of the round-trip latency; doing it for 10k field/value pairs means 10k round trips.
You can use hset with multiple field/value pairs so it is a single trip to Redis, e.g. hset key field1 value1 field2 value2 field3 value3.
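For reference, a rough sketch of that single-trip call from NodeJS, assuming a Redis server >= 4.0 (where HSET accepts multiple field/value pairs) and the node_redis callback client; as noted above, if the client's hset path rejects the multi-pair form, hmset accepts the same argument list:

import { promisify } from 'util';
import * as redis from 'redis';

const client = redis.createClient();
// Cast loosely so the promisified call can take a variadic field/value list
const hsetAsync = promisify(client.hset).bind(client) as (...args: any[]) => Promise<number>;

async function pushChunk(key: string, results: Array<[string, any]>) {
  // [field1, value1, field2, value2, ...] for the entire chunk
  const args = results.flatMap(([column, value]) => [column, JSON.stringify(value)]);
  // Single HSET command instead of results.length separate round trips
  return hsetAsync(key, ...args);
}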