I have two SQL queries that output the same kind of output and have the same grouping and order :
select date_trunc('month', inserted_at)::date as date, count(id) from payment_logs where payment_logs.event_name = 'subscription_created' group by date order by date desc;
select date_trunc('month', inserted_at)::date as date, count(id) from users group by date order by date desc;
I would like to join those two results based on the calculated date field (which is the month), and have a result with 3 columns : date, count_users and count_payment_logs.
How can I achieve that? Thanks.
2
Answers
Something like this
Nothing wrong with the accepted answer, but I wanted to show an alternative and add some color. Instead of subqueries, you can also use common table expressions (CTEs) which improve readability but also have some other features as well. Here is an example using CTEs:
In my opinion the abstraction is neater and makes the code much easier to follow (and thus maintain).
Other notes:
The
order by
is expensive, and you can avoid it within each of the subqueries/CTEs since it’s being done at the end anyway. The ones in the subqueries will be clobbered by whatever you do in the main query anyway, so just omit them completely. Your results will not differ, and your query will be more efficient.In this example, you probably don’t have any missing months, but it’s possible… especially if you expand this concept to future queries. In such a case, you may want to consider a full outer join instead of an inner join (you have months that appear in the users that may not be in the payments or vice versa):
Another benefit of CTEs vs subqueries is that you can reuse them. In this example, I want to mimic the full outer join concept but with one additional twist — I have data from another table by month that I want in the query. The CTE lets me use the CTE for "payments" and "users" as many times as I want. Here I use them in the all_dates CTE and again in the main query. By creating "all_dates" I can now use left joins and avoid weird coalescing in joins (not wrong, just ugly).
The point is you can reuse the CTEs.
A final advantage is that you can declare the CTE materialized or non-materialized (default). The materialized CTE will essentially pre-process and store the results, which in certain cases may have better performance. A non-materialized on, on the other hand, will mimic a standard subquery which is nice because you can pass where clause conditions from outside the query to inside of it.