skip to Main Content

Please take a look at this query:

SELECT COUNT(*) FROM events e WHERE (e.event_data -> 'state')::int = -1;

The query works and everything is actually ok, but…

This is how the JSON object looks like when I read it from the database:

{
  "eventUID": "3ea6baf7-8772-48a5-a32b-b00901534025",
  "data": {<some-data>},
  "state": 1
}

As you can see ‘state’ is an int type.

EXPLAIN ANALYZE show me this:

Aggregate  (cost=15.08..15.09 rows=1 width=8) (actual time=0.042..0.076 rows=1 loops=1)
  ->  Seq Scan on events e  (cost=0.00..15.07 rows=1 width=0) (actual time=0.018..0.027 rows=0 loops=1)
        Filter: (((event_data -> 'state'::text))::integer = -1)
Planning Time: 0.139 ms
Execution Time: 0.198 ms

Obviously postgres casts an already int field to a text and then casts it again to an int.

So is it possible to avoid the cast so the select looks something like this (I know this is not valid select):

SELECT COUNT(*) FROM events e WHERE e.event_data -> 'state' = -1;

2

Answers


  1. It isn’t an integer and can’t be*.

    Firstly, JSON doesn’t have integers. It has "numbers", which could have a decimal part.

    Secondly, a JSON object isn’t of any fixed type anyway. The next one from the database might have a state of a string or array of objects, or perhaps not have any state at all.

    So, if you want PostgreSQL to know that you promise there will always be a 32 bit signed integer value there (or an error) then you need to tell it.

    * You might say "but I know all of them are ints". This is (a) only true until I come along and insert a string state and (b) no use to the database because it can’t read your mind.

    Login or Signup to reply.
  2. It’s the 'state' constant you used as a key that you’re extracting from the JSON, that’s cast to ::text. Combined with the -> operator on a JSON value, gets you a JSON on output:

    json -> text → json
    jsonb -> text → jsonb

    Extracts JSON object field with the given key.
    '{"a": {"b":"foo"}}'::json -> 'a' → {"b":"foo"}

    In effect, you can read your code like this:

    SELECT COUNT(*) 
    FROM events e 
    WHERE (e.event_data --no cast necessary, it's a known `json`
           -> --there are multiple operators like this
           'state'--this is an `unknown` constant, but it makes sense to interpret
                  --it as `text`, because that enables a `json->text` operator
           )--the operation inside the parentheses yields a `json`
        ::int--this only works if you're sure the `json` inside is a number 
             --and if you use ->>, not a -> in there, or go through ::text
        = -1;
    
    

    If you want a SQL integer rather than a JSON with a number primitive in it, that final ::int cast is necessary.

    postgres casts an already int field to a text and then casts it again to an int

    That is not what’s happening. Only the 'state' constant is cast (actually, assigned a type) ::text by Postgres for the -> to work (it requires a text as the right operand).


    If you’re looking to speed up this specific query, you can add an expression index:
    demo at db<>fiddle

    create index on events ((event_data ->> 'state'));
    

    For more performance and flexibility it would make sense to switch from json to jsonb and build a GIN index that would support way more types of queries on this column.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search