I’m trying to use protobuf to speed up data transfers between my front end and back end.

As a POC, I tried to load a JSON file, encode it as a protobuf buffer, and save the result in a new file.

But it turns out that the new file is larger than the JSON one. Did I do something wrong?

Here are my files:

// input.proto

syntax = "proto3";

message MyData {
    repeated float a = 1;
    repeated float b = 2;
    repeated float c = 3;
    float d = 4;
    repeated float e = 5;
}
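
// index.mjs

import protobuf from 'protobufjs';
import fs from 'fs';

protobuf.load('./input.proto', (err, root) => {
    // Surface schema loading errors instead of failing later on an undefined root
    if (err)
        throw err;

    const payload = JSON.parse(fs.readFileSync('./input.json', {encoding: 'utf8'}));

    const Message = root.lookupType("MyData");

    // verify() returns an error string, or null when the payload is valid
    const errMsg = Message.verify(payload);
    if (errMsg)
        throw Error(errMsg);

    const message = Message.create(payload);
    const buffer = Message.encode(message).finish();

    // No encoding argument: it is ignored when the data is a Buffer or Uint8Array
    fs.writeFileSync('./output.pb', buffer);
});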

// input.json
{
  "a": [1, 2.4, 3, 4],
  "b": [1, 2, 3, 4],
  "c": [1, 2, 3.2, 4],
  "d": 10.321,
  "e": [1, 2, 3.7, 4]
}

(My actual JSON is much bigger than that, but it follows the same format as this one.)

And finally:

$ du -h input.json output.pb
2,0M    input.json
2,5M    output.pb

Thanks for your help!

2 Answers


  1. Chosen as BEST ANSWER

    I ended up using Node.js Buffers to reduce the size of my floats: each value is stored as a 16-bit integer scaled by 10. Here's my solution:

    syntax = "proto3";
    
    message MyData {
        bytes a = 1;
        bytes b = 2;
        bytes c = 3;
        bytes d = 4;
        bytes e = 5;
    }
    
    import protobuf from 'protobufjs';
    import fs from 'fs';

    protobuf.load('./input.proto', (err, root) => {
        // Surface schema loading errors instead of failing later on an undefined root
        if (err)
            throw err;

        const payload = JSON.parse(fs.readFileSync('./input.json', {encoding: 'utf8'}));

        const Message = root.lookupType("MyData");

        const formattedPayload = {};

        const encodeBuffer = (key) => {
            // Creating buffer, 2 bytes per element
            const buff = Buffer.alloc(2 * payload[key].length);
            payload[key].forEach((num, idx) => {
                // Writing each number as an unsigned 16-bit integer.
                // Multiplying by 10 keeps one decimal place;
                // I will have to divide by 10 when decoding.
                // writeUInt16BE expects an integer, so round explicitly.
                // Values must lie between 0 and 6553.5 to fit in 16 bits.
                buff.writeUInt16BE(Math.round(num * 10), idx * 2);
            });
            formattedPayload[key] = buff;
        }

        encodeBuffer('a');
        encodeBuffer('b');
        encodeBuffer('c');
        encodeBuffer('e');

        // d is a single value; with one decimal place, 10.321 is stored as 103 (10.3)
        const dbuffer = Buffer.alloc(2);
        dbuffer.writeUInt16BE(Math.round(payload.d * 10));
        formattedPayload.d = dbuffer;

        const errMsg = Message.verify(formattedPayload);
        if (errMsg)
            throw Error(errMsg);

        // encode() accepts a plain object, so Message.create() is not needed here
        const buffer = Message.encode(formattedPayload).finish();

        // No encoding argument: it is ignored when the data is a Buffer or Uint8Array
        fs.writeFileSync('./output.pb', buffer);
    });
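
The decoding side is not shown in this answer. As a minimal sketch of the round trip (divide by 10 when reading back), assuming the decoded bytes field arrives as a Node.js Buffer: the helper names `encodeFixedPoint` and `decodeFixedPoint` are hypothetical and not part of protobufjs.

```javascript
// Fixed-point round trip: one decimal place, 2 bytes per value.
// SCALE matches the factor of 10 used in the answer above.
const SCALE = 10;

// Hypothetical helper: pack numbers as big-endian unsigned 16-bit integers.
// Only values between 0 and 6553.5 fit; anything else throws a RangeError.
function encodeFixedPoint(values) {
  const buff = Buffer.alloc(2 * values.length);
  values.forEach((num, idx) => {
    buff.writeUInt16BE(Math.round(num * SCALE), idx * 2);
  });
  return buff;
}

// Hypothetical helper: read the integers back and undo the scaling.
function decodeFixedPoint(buff) {
  const values = [];
  for (let offset = 0; offset < buff.length; offset += 2) {
    values.push(buff.readUInt16BE(offset) / SCALE);
  }
  return values;
}

const encoded = encodeFixedPoint([1, 2.4, 3, 4]);
const decoded = decodeFixedPoint(encoded);
console.log(encoded.length, decoded);
```

This halves the 4 bytes per value of a protobuf float, at the cost of precision: anything beyond the first decimal place is lost, so 10.321 would come back as 10.3.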
    

  2. One reason could be that a float in Protocol Buffers uses the fixed 32-bit wire type (I32), so every number takes 4 bytes. In JSON (UTF-8), a single-digit number takes 3 bytes (space, digit and comma). You can also omit the space, making the JSON even more compact.
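
To make the byte counting concrete, here is a rough sketch that estimates the protobuf size of the sample payload from the question and compares it with compact JSON. It assumes proto3 packed encoding for the repeated float fields and field numbers below 16 (so each tag fits in one byte). The function names are hypothetical, and the estimate is computed by hand rather than by calling protobufjs.

```javascript
// Sample payload from the question.
const sample = {
  a: [1, 2.4, 3, 4],
  b: [1, 2, 3, 4],
  c: [1, 2, 3.2, 4],
  d: 10.321,
  e: [1, 2, 3.7, 4],
};

// Bytes needed to store a non-negative integer as a varint (7 bits per byte).
function varintSize(n) {
  let size = 1;
  while (n >= 128) {
    n = Math.floor(n / 128);
    size += 1;
  }
  return size;
}

// Packed repeated float: 1 tag byte + length varint + 4 bytes per value.
function packedFloatFieldSize(values) {
  if (values.length === 0) return 0; // empty repeated fields are not written
  const payloadBytes = 4 * values.length;
  return 1 + varintSize(payloadBytes) + payloadBytes;
}

// Singular float: 1 tag byte + 4 bytes (proto3 omits the field when it is 0).
function floatFieldSize(value) {
  return value === 0 ? 0 : 5;
}

// Estimated encoded size of a MyData message as defined in the question.
function estimateProtobufSize(data) {
  return (
    packedFloatFieldSize(data.a) +
    packedFloatFieldSize(data.b) +
    packedFloatFieldSize(data.c) +
    floatFieldSize(data.d) +
    packedFloatFieldSize(data.e)
  );
}

// Size of the same data as compact UTF-8 JSON (no whitespace).
function jsonSize(data) {
  return Buffer.byteLength(JSON.stringify(data), 'utf8');
}

console.log('protobuf (estimated):', estimateProtobufSize(sample), 'bytes');
console.log('compact JSON:', jsonSize(sample), 'bytes');
```

With short literals such as 1 or 2.4, the JSON text costs about as much as the fixed 4 bytes per float, or less. Values written with many digits would tip the comparison the other way.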
