
I recently joined a new organization, and I see that they flatten every nested JSON object for efficiency. I am wondering whether this really improves performance in terms of read/write time and memory usage. If so, by how much?

Original JSON:

{
  key1: {
           key11:val11,
           key12:val12
        }
  key2: {
           key21:val21,
           key22:val22
        }
}

Flattened JSON:

{
  key1_key11:val11,
  key1_key12:val12,
  key2_key21:val21,
  key2_key22:val22
}

As I understand it, memory usage should not change, but I am not sure about the time required in each case.
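One rough way to check this is to measure it. The sketch below assumes Python's standard `json` and `timeit` modules and a synthetic document; the `flatten` and `measure` helpers and the data shape (1000 outer keys with 10 inner keys each) are hypothetical, chosen only for illustration. Actual numbers will vary with the parser and the data.

```python
import json
import timeit

# Hypothetical synthetic data: 1000 outer keys, each holding 10 inner keys.
nested = {
    f"key{i}": {f"key{i}{j}": f"val{i}{j}" for j in range(10)}
    for i in range(1000)
}

def flatten(doc, sep="_"):
    """Flatten one level of nesting: {"a": {"b": 1}} -> {"a_b": 1}."""
    return {
        f"{outer}{sep}{inner}": value
        for outer, inner_map in doc.items()
        for inner, value in inner_map.items()
    }

flat = flatten(nested)
nested_text = json.dumps(nested)
flat_text = json.dumps(flat)

def measure(text, repeat=20):
    """Average seconds per json.loads call over `repeat` runs."""
    return timeit.timeit(lambda: json.loads(text), number=repeat) / repeat

# Compare serialized size and parse time of the two layouts.
print("nested:", len(nested_text), "chars,", measure(nested_text), "s per parse")
print("flat:  ", len(flat_text), "chars,", measure(flat_text), "s per parse")
```

Memory usage could be compared in a similar way, for example with the standard `tracemalloc` module.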

2 Answers


  1. In an ideal implementation, no input is read unnecessarily.
    You save time whenever you avoid reading input you do not need.
    On top of that, you also have a smaller file.
    However, you can also lose time if you remove syntax elements that the parser relies on.

    To answer your question:
    There are many ways to implement JSON parsing,
    and the memory and time costs depend mainly on the implementation you use to parse the JSON file.

  2. Well, it depends on how you process your data. JSON can be interpreted in a single pass with no backtracking, so reading it with an LALR parser carries no penalty that depends on the complexity of the document.

    Of course, a deeply nested document will require more memory, but parsing it will be as efficient as parsing a flattened one.

    If you are going to read very large JSON documents, at some point you will need to process the document as you read it, because you will not be able to hold the whole document in memory and will have to handle it sequentially. The bottleneck will most probably be the processing you do on the input rather than reading or parsing it. Parsing JSON is a one-pass problem that takes a fixed amount of time per character, i.e. O(n).

    By the way, the format you have posted is not actually JSON. JSON is specified in ECMA-404, which accepts only false, true and null as unquoted identifiers. You need to quote your keys with " characters (and the values you have used, too). A strict JSON parser (and I recommend using one to avoid misinterpretation) will reject both documents you posted.

    On the other hand, flattening will make your file bigger (you can simply count the characters), so you can end up with a very large file that needs a lot of processing if you want to recover the original document structure. In your document the nested keys take 28 characters, while the flattened keys take 40. This increases the reading time, which as I said above is O(n), and you will also have to process the keys to recover the structure of the nested document. IMHO flattening is a bad decision: the JSON standard was created so that structured documents could be exchanged, and you are breaking that structure on the assumption that parsing will not scale.
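The key counts quoted in the second answer, and the extra work needed to recover the nesting, can be checked with a short Python sketch. It assumes the question's two documents rewritten as valid JSON (quoted keys and values) and uses only the standard `json` module; `key_chars` and `unflatten` are hypothetical helpers written for this illustration.

```python
import json

# The question's two documents, rewritten as valid JSON (an assumption:
# the original post left keys and values unquoted).
nested_text = (
    '{"key1": {"key11": "val11", "key12": "val12"},'
    ' "key2": {"key21": "val21", "key22": "val22"}}'
)
flat_text = (
    '{"key1_key11": "val11", "key1_key12": "val12",'
    ' "key2_key21": "val21", "key2_key22": "val22"}'
)

def key_chars(node):
    """Total number of characters in all keys, at every nesting level."""
    if not isinstance(node, dict):
        return 0
    return sum(len(key) + key_chars(value) for key, value in node.items())

def unflatten(flat, sep="_"):
    """Rebuild one level of nesting by splitting each key at the first separator.

    Assumes the separator never occurs inside an outer key.
    """
    rebuilt = {}
    for key, value in flat.items():
        outer, inner = key.split(sep, 1)
        rebuilt.setdefault(outer, {})[inner] = value
    return rebuilt

nested = json.loads(nested_text)
flat = json.loads(flat_text)

print(key_chars(nested), key_chars(flat))  # 28 40
print(unflatten(flat) == nested)           # True
```

Note that `unflatten` has to touch every key again after parsing, which is the extra processing step the answer refers to.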
