skip to Main Content

I’m trying to create tables to store very large volume of data, and I wonder if I can use null TIMESTAMP to save storage?

I have a variable update_at, which will be updated to NOW() whenever I modify the associated row – most of the rows will not be modified. In this case, when creating the table, will update_at TIMESTAMPTZ save me some space compared to update_at TIMESTAMPTZ NOT NULL DEFAULT NOW()?

Given Checking if string has letter e (case insensitive) sql query using one where condition, I know that:

For fixed width fields like nullable int the storage space required is
always the same regardless of whether the value is null or not.

For variable width nullable fields the value ’NULL‘ takes zero bytes
of storage space (ignoring the bit to store whether the value is null
or not).

But how about TIMESTAMPTZ or DATE data types?
I’m not sure if they were "variable width nullable fields" or "fixed width nullable fields" – consequently, not sure if I can save some storage space using Null.

Thank you so much for your help!

2

Answers


  1. A simple test could resolve this, as

    --create tables with 1 million rows
    create table test_null as
    select id::int8, null::timestamptz
    from generate_series(1, 1000000) as id;
    
    create table test_notnull as
    select id::int8, now()::timestamptz
    from generate_series(1, 1000000) as id;
    
    --refresh table statistics
    analyze test_null;
    analyze test_notnull;
    

    Checking table size in stoge

    select 'test_null' table_name, pg_size_pretty(pg_total_relation_size('test_null')) size
    union all
    select 'test_notnull' table_name, pg_size_pretty(pg_total_relation_size('test_notnull')) size;
    
    table_name size
    test_null 35 MB
    test_notnull 42 MB

    Fiddle to test

    Login or Signup to reply.
  2. This is not the best order for the columns, because of the data types:

    CREATE TABLE author_to_profile (
     authorId INTEGER REFERENCES author(authorId),
     profile VARCHAR(128) NOT NULL,
     created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
     update_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
     is_deleted BOOLEAN NOT NULL DEFAULT false,
     UNIQUE(authorId, created_at) 
    );
    

    The same data can be stored slightly better optimized, in blocks of 8 bytes:

    CREATE TABLE author_to_profile (
     created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), -- 8 bytes
     update_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), -- 8 bytes
     authorId INTEGER REFERENCES author(authorId), -- 4 bytes
     is_deleted BOOLEAN NOT NULL DEFAULT false, -- 1 byte, + 3 empty 
     profile VARCHAR(128) NOT NULL, -- unknown length, up to 128 bytes
     UNIQUE(authorId, created_at) -- why not a primary key?
    );
    

    Is it correct that the autorId can be NULL? If not, why do you create a unique constraint instead of a primary key? It’s also rather unusual to have a table without a primary key.

    Using UpperCase in your object names like table and column names, will cause bugs forever. Also because you now have a mix of snake_case and CamelCase in your naming convention.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search