I’m using PostgreSQL 14.10. I have tables users
| id | name |
and user_actions
| id | dept_id | user_id | num_tickets |
I would like to insert a row of data into user_actions
. dept_id
and num_tickets
are given as known values, but I have only the user’s name, not the id, as input. Therefore I’d like to INPUT
into user_actions
with the user_id
value being taken as the result of a SELECT
statement, something like so:
INSERT INTO user_actions(dept_id,user_id,num_tickets)
VALUES('4', SELECT users.id FROM users WHERE users.name = 'Alice', '2')
This exact form is clearly wrong, because it gives the error
syntax error at or near "SELECT"
LINE 3: VALUES('1', SELECT users.id FROM users WHERE user...
^
This thread indicates it should be possible to INSERT
using the result of a SELECT
statement. However, all of the examples there use a SELECT
to obtain all of the input values, so there’s no example exactly like my case, where only one value is derived from a sub-query.
I can’t make enough sense of the official Postgres docs page for INSERT
to determine the syntax more precisely than I’ve tried. I don’t see any examples where only one of several INSERT
values is derived from a SELECT
statement, though.
When I try the following variations, I get the corresponding errors:
# No single quotes around values
INSERT INTO user_actions(dept_id,user_id,num_tickets)
VALUES(4, SELECT users.id FROM users WHERE users.name = 'Alice', 2)
syntax error at or near "SELECT"
LINE 3: VALUES(1, SELECT users.id FROM users WHERE users...
^
# No VALUES, with quotes
INSERT INTO user_actions(dept_id,user_id,num_tickets)
('4', SELECT users.id FROM users WHERE users.name = 'Alice', '2')
syntax error at or near "'1'"
LINE 3: '1', SELECT monsters.id FROM monsters WHERE monsters.nam...
^
# No VALUES, no parentheses, with quotes
INSERT INTO user_actions(dept_id,user_id,num_tickets)
'4', SELECT users.id FROM users WHERE users.name = 'Alice', '2'
syntax error at or near "'1'"
LINE 3: '1', SELECT users.id FROM users WHERE users.nam...
^
# No VALUES, no quotes
INSERT INTO user_actions(dept_id,user_id,num_tickets)
(4, SELECT users.id FROM users WHERE users.name = 'Alice', 2)
syntax error at or near "1"
LINE 3: (1, SELECT users.id FROM users WHERE users.name...
^
# No VALUES, no quotes, no parentheses
INSERT INTO user_actions(dept_id,user_id,num_tickets)
4, SELECT users.id FROM users WHERE users.name = 'Alice', 2
psycopg2.errors.SyntaxError: syntax error at or near "1"
LINE 3: (1, SELECT users.id FROM users WHERE users.name...
^
# Quotes, no parentheses
INSERT INTO user_actions(dept_id,user_id,num_tickets)
VALUES '4', SELECT users.id FROM users WHERE users.name = 'Alice', '2'
syntax error at or near "'1'"
LINE 3: VALUES '1', SELECT users.id FROM users WHERE users...
^
# No quotes, no parentheses
INSERT INTO user_actions(dept_id,user_id,num_tickets)
VALUES 4, SELECT users.id FROM users WHERE users.name = 'Alice', 2
syntax error at or near "1"
LINE 3: VALUES 1, SELECT users.id FROM users WHERE users...
This accepted answer suggests that doing a SELECT
for everything, even the known quantities, should work. However, when I try it I get an error
# SELECT everything
INSERT INTO user_actions(dept_id,user_id,num_tickets)
SELECT 4, users.id FROM users WHERE users.name = 'Alice', 2
syntax error at or near ","
LINE 3: ...4, users.id FROM users WHERE users.name = 'Alice', 2
^
I’m out of ideas. What’s the syntax?
2
Answers
The last approach is the correct way of performing a SELECT and INSERT.
Here instead of
SELECT 4, users.id FROM users WHERE users.name = 'Alice', 2
The number of fields that the SELECT query returns must match with the list of columns specified in the INSERT part of the query.
Your final query will look like this:
You can use a subselect as a scalar value, but you must put it in ().
This is a dependent subquery, it can be used anywhere a scalar value fits, for example in a SELECT.
which gives the same result as
However the query plan with a dependent subquery can be much worse than a join. This is because the dependent subquery is executed for each row where the value is needed, whereas the JOIN is executed once. So the JOIN can use optimizations that are not available for the dependent subquery. For example in the above case since both tables will have to be read entirely, the JOIN would build a hash or merge only once then use it for all rows, whereas the dependent subquery would not.
For your INSERT there are important differences between dependent subquery and INSERT SELECT as in Ashwin’s answer.
INSERT SELECT allows you to pull several columns out of the secondary table, whereas the dependent subquery only returns one. So if you need two columns or more, INSERT SELECT is vastly more practical.
If users.name is not unique, INSERT SELECT will return all the rows with that name, causing one INSERT per row. The dependent subquery is only allowed to output one value, so if multiple rows match, it will throw an error.