
I am thinking of implementing a new project that has an import/export feature. First, I will have an array of around 45 objects. The object structure is simple, like this:

{"id": "someId", "quantity": 3}

So, to make it exportable, I will first have to turn the whole array of these objects into a single string. For this part, I think I will use JSON.stringify(). After that, I want to make the string as short as possible for users (they copy the string and paste it to share with other users, who import it to get the original array back). I know this part is not necessary, but I really want to make it as short as possible. Hence the question: how do I convert an array of objects to the shortest possible string?

Any technique such as encoding, encryption, or hashing is acceptable as long as it is reversible and yields the original data.

By "shortest possible", I mean that any solution producing a string shorter than plain stringification is welcome. I will accept the one that gives the shortest string to import.

I tried text minification but it gives almost the same result as the original text. I also tried encryption but it still gives a relatively long result.

Note: The string for import (that comes from export) can be human-readable or unreadable. It does not matter.

2 Answers


  1. Deleting the optional JSON whitespace after each colon (:)
    and comma (,) is a no-brainer. Let’s assume you have already
    minified in that way.
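
    As a quick check of that assumption: JSON.stringify called without its third (indent) argument already emits no optional whitespace. A minimal sketch, where the sample ids are made up for illustration:

    ```javascript
    // Sample data in the question's shape; the ids are made up.
    const items = [{ id: "someId", quantity: 3 }, { id: "otherId", quantity: 8 }];

    // No third argument: no spaces or newlines are emitted.
    const minified = JSON.stringify(items);

    // A third argument asks for pretty-printing, which only adds length.
    const pretty = JSON.stringify(items, null, 2);

    console.log(minified);
    console.log(minified.length, pretty.length);
    ```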


    xz compression is generally helpful.

    Perhaps you know some strings that are very likely
    to repeatedly appear in the input doc. That might include:

    • "id":
    • "quantity":

    Construct a prefix document which mentions such terms.
    Sender will compress prefix + doc,
    strip the initial unchanging bytes,
    and send the rest.
    Receiver will accept those bytes via TCP,
    prepend the unchanging bytes,
    and decompress.

    Why does this improve compression ratios?
    Lempel-Ziv and related schemes maintain a dictionary,
    and transmit integer indexes into that dictionary
    in order to indicate common words.
    A word can be fairly long, even longer than "quantity".
    The longer it is, the greater the savings.

    If sender and receiver both know a set of words
    that belong in the dictionary, beforehand,
    we can avoid sending the raw text of those words.

    Your Chrome browser
    compresses web headers
    in this way already, each time you do a Google search.

    Finally, you might want to base64 encode the compressed output.
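
    The shared-dictionary idea above can be sketched in Node.js. Note that this swaps in zlib's preset `dictionary` option for the xz prefix-stripping trick; the principle (both sides know the common strings beforehand) is the same. The dictionary contents and the names exportItems and importItems are assumptions made for illustration, not part of any existing API.

    ```javascript
    const zlib = require("zlib");

    // Assumption: sender and receiver both ship this exact dictionary.
    // It holds the fragments that repeat in every export; zlib's docs suggest
    // placing the most common strings toward the end.
    const DICTIONARY = Buffer.from('[{"id":"","quantity":1},{"id":"');

    // Hypothetical helper: array -> minified JSON -> deflate -> base64 text.
    function exportItems(items) {
      const json = Buffer.from(JSON.stringify(items), "utf8");
      const packed = zlib.deflateSync(json, { dictionary: DICTIONARY, level: 9 });
      return packed.toString("base64");
    }

    // Hypothetical helper: the exact inverse of exportItems.
    function importItems(text) {
      const packed = Buffer.from(text, "base64");
      const json = zlib.inflateSync(packed, { dictionary: DICTIONARY });
      return JSON.parse(json.toString("utf8"));
    }

    const items = [{ id: "someId", quantity: 3 }, { id: "otherId", quantity: 8 }];
    const shared = exportItems(items);
    console.log(shared.length, JSON.stringify(items).length);
    console.log(importItems(shared));
    ```

    For very small arrays, the fixed overhead of the zlib wrapper plus base64's 4/3 expansion can outweigh the gain, so it is worth comparing lengths on real data.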


    Alternatively, ignore compression and use a database instead,
    in the way that tinyurl.com has been doing for quite a long time.
    Start a serial counter at 1.
    Accept a new object, or set of objects.
    Ask the DB if you’ve seen this input before.
    If not, store it under a brand new serial ID.
    Now send the matching ID to the receiving end.
    It can query the central database when it gets a new ID,
    and it can cache such results to use in future.
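
    The steps above can be sketched as follows. This is only an illustration: the two in-memory Maps stand in for a real central database, and the names store and fetchItems are hypothetical.

    ```javascript
    // Assumption: these in-memory Maps stand in for a real central database.
    const idByPayload = new Map();
    const payloadById = new Map();
    let nextSerial = 1; // the serial counter starts at 1

    // Hypothetical helper: returns a short ID, reusing it for repeated input.
    function store(items) {
      const payload = JSON.stringify(items);
      let id = idByPayload.get(payload);
      if (id === undefined) {
        id = (nextSerial++).toString(36); // base 36 keeps the ID short
        idByPayload.set(payload, id);
        payloadById.set(id, payload);
      }
      return id;
    }

    // Hypothetical helper: the receiving end resolves an ID to the array.
    function fetchItems(id) {
      const payload = payloadById.get(id);
      if (payload === undefined) throw new Error("unknown id: " + id);
      return JSON.parse(payload);
    }

    const items = [{ id: "someId", quantity: 3 }, { id: "otherId", quantity: 8 }];
    const shortId = store(items);
    console.log(shortId);        // "1"
    console.log(store(items));   // "1" again, because the input was seen before
    console.log(fetchItems(shortId));
    ```

    The shared string is then only as long as the ID, at the cost of requiring a server that both sides can reach.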

  2. You might opt for a simple CSV export. The export string becomes, if you use the pipe separator, something like:

    id|quantity\nsomeId|3\notherId|8
    

    which is the equivalent of

    [{"id":"someId","quantity":3},{"id":"otherId","quantity":8}]
    

    This approach will remove the redundant id and quantity tags for each record and remove the unnecessary double quotes.

    The downside is that all your records must share the same data structure, but that is generally the case.
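
    A minimal sketch of that round trip, assuming ids never contain the pipe character or a newline; the names toPipeCsv and fromPipeCsv are made up for illustration:

    ```javascript
    // Hypothetical helper: header row plus one "id|quantity" row per object.
    // Assumption: ids never contain "|" or a newline.
    function toPipeCsv(items) {
      const rows = items.map((item) => item.id + "|" + item.quantity);
      return ["id|quantity", ...rows].join("\n");
    }

    // Hypothetical helper: skips the header row and rebuilds the objects.
    function fromPipeCsv(text) {
      const rows = text.split("\n").slice(1);
      return rows.map((row) => {
        const [id, quantity] = row.split("|");
        return { id, quantity: Number(quantity) };
      });
    }

    const items = [{ id: "someId", quantity: 3 }, { id: "otherId", quantity: 8 }];
    const csv = toPipeCsv(items);
    console.log(JSON.stringify(csv)); // "id|quantity\nsomeId|3\notherId|8"
    console.log(fromPipeCsv(csv));
    ```

    If both sides agree on the column order in advance, the header row could be dropped as well to save a few more characters.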
