Why is the Protobuf blob heavier than the JSON equivalent?

Zorzi
January 28, 2023
148 views
0 votes
2 Answers

I’m trying to use protobuf to speed up data transfers between my front end and back end.

As a POC, I tried to load a JSON file, turn it into a protobuf buffer, and save the result in a new file.

But it turns out that the new file is heavier than the JSON one. Did I do something wrong?

Here are my files:

// input.proto

syntax = "proto3";

message MyData {
    repeated float a = 1;
    repeated float b = 2;
    repeated float c = 3;
    float d = 4;
    repeated float e = 5;
}

// index.mjs

import protobuf from 'protobufjs';
import fs from 'fs';

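protobuf.load('./input.proto', (err, root) => {
    // Fail early if the .proto file could not be loaded or parsed
    if (err)
        throw err;

    const payload = JSON.parse(fs.readFileSync('./input.json', {encoding: 'utf8'}));

    const Message = root.lookupType("MyData");

    // verify() returns an error string for an invalid payload, otherwise null
    const errMsg = Message.verify(payload);
    if (errMsg)
        throw Error(errMsg);

    const message = Message.create(payload);
    const buffer = Message.encode(message).finish();

    // The encoded message is binary, so no encoding argument is needed
    fs.writeFileSync('./output.pb', buffer);
});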

// input.json
{
  "a": [1, 2.4, 3, 4],
  "b": [1, 2, 3, 4],
  "c": [1, 2, 3.2, 4],
  "d": 10.321,
  "e": [1, 2, 3.7, 4]
}

(My actual JSON is much bigger than that, but it follows the same format.)

And finally:

$ du -h input.json output.pb
2,0M    input.json
2,5M    output.pb

Thanks for your help!

Answers

Chosen as BEST ANSWER

I ended up using Node.js Buffers to reduce the size of my floats. Here's my solution:

syntax = "proto3";

message MyData {
    bytes a = 1;
    bytes b = 2;
    bytes c = 3;
    bytes d = 4;
    bytes e = 5;
}

import protobuf from 'protobufjs';
import fs from 'fs';

protobuf.load('./input.proto', (err, root) => {
    // Fail early if the .proto file could not be loaded or parsed
    if (err)
        throw err;

    const payload = JSON.parse(fs.readFileSync('./input.json', {encoding: 'utf8'}));

    const Message = root.lookupType("MyData");

    const formatedPayload = {};

    const encodeBuffer = (key) => {
        // Creating buffer, 2 bytes per element
        const buff = Buffer.alloc(2 * payload[key].length);
        payload[key].forEach((num, idx) => {
            // Store each number as an unsigned 16-bit integer.
            // Multiplying by 10 keeps one decimal place, so the decoder
            // has to divide by 10. Values must stay within 0 to 6553.5.
            // Math.round avoids floating-point artifacts in num * 10.
            buff.writeUInt16BE(Math.round(num * 10), idx * 2);
        });
        formatedPayload[key] = buff;
    };

    encodeBuffer('a');
    encodeBuffer('b');
    encodeBuffer('c');
    encodeBuffer('e');

    // d is a single number, stored the same way in a 2-byte buffer
    const dbuffer = Buffer.alloc(2);
    dbuffer.writeUInt16BE(Math.round(payload.d * 10));
    formatedPayload.d = dbuffer;

    const errMsg = Message.verify(formatedPayload);
    if (errMsg)
        throw Error(errMsg);

    const buffer = Message.encode(formatedPayload).finish();

    // The encoded message is binary, so no encoding argument is needed
    fs.writeFileSync('./output.pb', buffer);
});
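
The answer mentions dividing by 10 when decoding but does not show that side. Below is a minimal sketch of the round trip, using only Node.js Buffers. The helper names `encodeScaled` and `decodeScaled` and the `SCALE` constant are hypothetical, not part of the original code, and the sketch assumes the decoded `bytes` field is available as a Node.js Buffer (a plain Uint8Array would need to be wrapped with `Buffer.from` first).

```javascript
// Hypothetical helpers mirroring the scheme above: each value is stored as
// an unsigned 16-bit big-endian integer holding value * 10.
const SCALE = 10;

function encodeScaled(values) {
  const buff = Buffer.alloc(2 * values.length);
  values.forEach((num, idx) => {
    // Round to the nearest tenth before writing the integer
    buff.writeUInt16BE(Math.round(num * SCALE), idx * 2);
  });
  return buff;
}

function decodeScaled(buff) {
  const values = [];
  for (let offset = 0; offset < buff.length; offset += 2) {
    // Undo the scaling applied by encodeScaled
    values.push(buff.readUInt16BE(offset) / SCALE);
  }
  return values;
}

const roundTrip = decodeScaled(encodeScaled([1, 2.4, 3, 4]));
console.log(roundTrip); // [ 1, 2.4, 3, 4 ]
```

The scheme is lossy: anything beyond one decimal place is dropped, so `10.321` comes back as `10.3`, and negative values or values above 6553.5 do not fit in an unsigned 16-bit integer.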


- eik
- January 28, 2023 at 9:23 pm
- 0 votes
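One reason could be that a float in Protocol Buffers is encoded with the I32 wire type, so every number takes 4 bytes regardless of its value. In JSON (UTF-8), a single-digit number takes 3 bytes (space, digit and comma), and omitting the space makes JSON even more compact at 2 bytes per number. A file made up mostly of short numbers can therefore be smaller as JSON than as protobuf floats.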
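
The arithmetic can be checked without a protobuf library. This is a rough sketch, not a measurement of the real files: `estimateSizes` and `varintSize` are hypothetical helpers, and the protobuf figure assumes a packed repeated float field with a field number from 1 to 15 (1 tag byte, a varint length prefix, then 4 bytes per value).

```javascript
// Number of bytes a non-negative integer occupies as a protobuf varint
function varintSize(n) {
  let size = 1;
  while (n >= 128) {
    n = Math.floor(n / 128);
    size++;
  }
  return size;
}

// Estimate the size of one array as compact JSON text and as a packed
// repeated float field (assumption: field number 1 to 15, so a 1-byte tag)
function estimateSizes(values) {
  const jsonBytes = Buffer.byteLength(JSON.stringify(values), 'utf8');
  const payloadBytes = 4 * values.length; // 4 bytes per float (I32)
  const protoBytes = 1 + varintSize(payloadBytes) + payloadBytes;
  return { jsonBytes, protoBytes };
}

console.log(estimateSizes([1, 2, 3, 4])); // { jsonBytes: 9, protoBytes: 18 }
```

Numbers with many digits tilt the comparison the other way, since the float stays at 4 bytes while the JSON text grows with every digit.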
