I am trying to learn some Postgres skills to help me build my Flask application. I care a lot about performance, but I ran into some issues.
I have the following table (`Sessions`) in Postgres:
SessionID | Other columns | Statics |
---|---|---|
BDzQDeGr | * | {"excludedOptions": {"questID": ["optID", "optID2"]}} |
Note: `Statics` is a `jsonb` column.

I wanted to execute a query that appends `optID` to the `"Statics" -> 'excludedOptions' -> 'questID'` jsonb array if the key `questID` exists, or else adds `questID` to `excludedOptions` with `["optID"]` as its value.
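To illustrate the intended behavior with the example row's value (the new option `optID3` here is made up for demonstration):

```sql
-- Illustration only: append a new option to the existing 'questID' array.
-- 'optID3' is a made-up value; the input is the example row's "Statics".
SELECT jsonb_set(
    '{"excludedOptions": {"questID": ["optID", "optID2"]}}'::jsonb,
    '{excludedOptions,questID}',
    ('{"excludedOptions": {"questID": ["optID", "optID2"]}}'::jsonb
        #> '{excludedOptions,questID}') || '"optID3"'::jsonb,
    true
);
-- returns {"excludedOptions": {"questID": ["optID", "optID2", "optID3"]}}
```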
I spent the last 4 days trying to do that with one query, and I finally did it:
```sql
WITH ex AS (
    SELECT "Statics" -> 'excludedOptions'
    FROM "Sessions"
    WHERE "SessionID" = '{data["sessionId"]}'
)
UPDATE "Sessions"
SET "Statics" = jsonb_set(
    "Statics",
    ARRAY['excludedOptions', '{data["questionId"]}'],
    array_to_json(array_remove((
        SELECT ARRAY['{data["answerId"]}'] || coalesce((
            SELECT jsonb_array_to_text_array((
                SELECT value
                FROM jsonb_each((SELECT ex.* FROM ex))
                WHERE key = '{data["questionId"]}'
            ))
        ), null)
    ), null))::jsonb,
    TRUE
)
WHERE "SessionID" = '{data["sessionId"]}'
```
Note: the SQL above is a Python f-string.

But when I measured the execution time, it was about 300–600 ms, which is a lot for this simple operation. I wanted it to be one query because somehow I believe that one query is better :).

Can you help me make this query faster? Would creating a function make it faster, or simpler?

Thank you for spending a little of your time on this.
UPDATE
Table Schema:
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description |
---|---|---|---|---|---|---|---|
ID | numeric(16,0) | | not null | | main | | |
SessionID | character varying(8) | | not null | | extended | | |
Title | text | | not null | | extended | | |
Type | text | | not null | | extended | | |
Source | text | | not null | | extended | | |
Questions | character(6)[] | | not null | | extended | | |
Skipped | character(6)[] | | | '{}'::bpchar[] | extended | | |
Answered | json | | | '{}'::json | extended | | |
UsedHints | character(6)[] | | | '{}'::bpchar[] | extended | | |
FinalResult | smallint | | | 0 | plain | | |
Statics | jsonb | | | '{"timer":{}, "excludedOptions":{}}'::json | extended | | |
Marked | character(6)[] | | | '{}'::bpchar[] | extended | | |
Time | integer | | not null | 0 | plain | | |
EXPLAIN ANALYZE:
```
Seq Scan on "Sessions"  (cost=0.00..1.00 rows=1 width=894) (actual time=0.016..0.017 rows=1 loops=1)
Planning Time: 0.037 ms
Execution Time: 0.030 ms
```
2 Answers
Your table has no indexes, so PostgreSQL must scan the whole table to find anything. That's what a "Seq Scan" is. You need to add some indexes to avoid table scans and make queries efficient.

Since you're checking `WHERE "SessionID" = ...`, an index on `SessionID` would allow Postgres to find all matching rows without having to scan the whole table. See Use The Index, Luke for more about SQL performance tuning and indexes.
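For example (a sketch; the index name is arbitrary, and making it UNIQUE assumes each session ID appears only once, which the question doesn't state):

```sql
-- Sketch: an index so lookups by "SessionID" avoid a Seq Scan.
-- Use CREATE UNIQUE INDEX instead if session IDs are guaranteed unique.
CREATE INDEX sessions_session_id_idx ON "Sessions" ("SessionID");
```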
Other notes:
- Don't use `char`, there is no benefit and many drawbacks. Use `varchar` or `text`.
- Avoid the `json` type; `jsonb` is more efficient. Especially don't mix `json` and `jsonb`, it just invites confusion.
- `numeric(16,0)` is an odd choice for a primary key. A simple `bigint` will use less space, be more efficient, and store a larger number. PostgreSQL even provides the convenient `bigserial` type for auto-incrementing primary keys.
- Avoid ambiguous column names like `time`. Use the "at" and "on" conventions for timestamps and dates. For example, `created_at` or `created_on`.
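Taken together, the notes above might look like this in DDL (a hedged sketch only, not part of the original answer; column names come from the question's schema, everything else is illustrative):

```sql
-- Sketch applying the notes above to the question's table.
-- Only a few columns are shown; the rest would carry over unchanged.
CREATE TABLE "Sessions" (
    "ID"        bigserial PRIMARY KEY,           -- instead of numeric(16,0)
    "SessionID" text NOT NULL,                   -- instead of character varying(8)
    "Questions" text[] NOT NULL,                 -- instead of character(6)[]
    "Answered"  jsonb DEFAULT '{}'::jsonb,       -- jsonb instead of json
    "Statics"   jsonb DEFAULT '{"timer": {}, "excludedOptions": {}}'::jsonb
    -- ... remaining columns as before
);
```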
The performance hints given by Schwern are absolutely valid. However, your UPDATE statement can be simplified drastically – at least if I understood your question correctly.
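The answer's original statement is not preserved in this copy; a minimal sketch matching the behavior described below, with `questID`, `newOptId`, and the session ID as literal placeholders, could be:

```sql
-- Hedged reconstruction, not the answer's verbatim code.
-- Appends "newOptId" to excludedOptions->questID, creating the key if absent.
UPDATE "Sessions"
SET "Statics" = jsonb_set(
    "Statics",
    ARRAY['excludedOptions', 'questID'],
    coalesce("Statics" -> 'excludedOptions' -> 'questID', '[]'::jsonb)
        || '"newOptId"'::jsonb,
    true  -- create the key if it is missing
)
WHERE "SessionID" = 'BDzQDeGr';
```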
If there is a key `questID` within the `excludedOptions`, then this will append the string value `"newOptId"` to that array. If there is no key `questID`, it will be created with a single-element array as its value.

There is no need for a CTE or unnesting and aggregating the array for `questID`.

`sessions.session_id` should have an index to make this fast.

Online example