Lets say I have a table T1. Initially it has ID, Identifier, C1, C2 as columns. As time goes by I will be adding new columns C3, C4 and so on. Will it be better to have the columns as ID, Identifier, Info with Info being a JSON column with C1, C2, C3 as keys and their corresponding values. What will be the read, write, update performance considering a scale of around 10^9 records.
2
Answers
If you are going to use a relational database then you should use a relational model. While many RDBMSs can now handle JSON that doesn’t mean you should hold your data as JSON (by default).
A good example of when to use JSON is if your source data is supplied as JSON. You can load it as-is into your RDBMS and then use SQL to "unpack" it from the staging table JSON into a proper relational format.
Regarding performance, there are so many variables involved it is impossible to provide a useful answer. The only way for you to get a useful answer is for you to test it on your environment with your data and with queries relevant to you
WHERE
orORDER BY
clauses, make them actual columns.LEFT JOINing
to get C3 (etc) — There are pros and cons to this ‘solution’.ALTER TABLE .. ADD COLUMN ..
"instantly". But there are caveats.If you try to do Where/Order on JSON fields for a billion rows, your computer will melt down into a slag heap.
If you want more advice on C1/C2/C3, please explain what your application is, how often columns will be added, how likely the cells will be NULL (missing, omitted, etc). We can discuss the alternatives.