so I have two tables as shown
TABLE A
CrimeType | CrimeSubType | Neighborhood | Year |
---|---|---|---|
1 | 1 | Jeff Park | 2024 |
2 | 2 | Cong Park | 2023 |
2 | 2 | Cong Park | 2024 |
2 | 2 | Jeff Park | 2024 |
3 | 1 | Jeff Park | 2024 |
3 | 2 | Jeff Park | 2023 |
3 | 1 | Jeff Park | 2023 |
I then have a second table that defines the crime types as a human readable category
TABLE B
CrimeType | CrimeSubType | Category |
---|---|---|
1 | 1 | VIOLENT CRIME |
1 | 2 | VIOLENT CRIME |
2 | 1 | VIOLENT CRIME |
2 | 2 | PROPERTY CRIME |
3 | 1 | PROPERTY CRIME |
3 | 2 | DRUG CRIME |
I would like the output of my postgres (if possible) to display results such as the one below (assuming year for 2024)
Neighborhood | Category | Counts |
---|---|---|
Jeff Park | VIOLENT CRIME | 0 |
Cong Park | VIOLENT CRIME | 1 |
Jeff Park | PROPERTY CRIME | 3 |
Cong Park | PROPERTY CRIME | 2 |
Jeff Park | DRUG CRIME | 1 |
Cong Park | DRUG CRIME | 0 |
I was able to get it to look like the above using something like the following
SELECT
Neighborhood,
Category,
COUNT(Neighborhood)
from A
INNER JOIN B
ON (b.CrimeType =
a.crimeType
AND B.CrimeSubType = b.CrimeSubType
AND a.year = 2024
GROUP BY
Neighborhood,
Category
I looked up some answers and it said to change the INNER JOIN to a left or right join (I believe in my case it would be a right join? however I feel my case is too complex and I am missing something
SELECT
Neighborhood,
Category,
COUNT(Neighborhood)
from A
LEFT JOIN B
ON (b.CrimeType = a.crimeType
AND B.CrimeSubType = b.CrimeSubType
AND a.year = 2024
GROUP BY
Neighborhood,
Category
still does not work, it will display some nulls but not the way I intended it, what am I missing?
3
Answers
Rightfully, there should be bona fide tables containing all categories and neighborhoods. In the absence of that, we can use subqueries to find them, then left join as you were already doing:
You would need a left join if there are un-categorised combinations of crimeType+CrimeSubType, and to avoid NULLs in the output you can used coalesce as follows:
nb: a.year = 2024 should be applied as a where clause (not part of the ON conditions for table b)
Assuming (all of this should be specified in the question):
Table
B
is the list of crime types,(CrimeType, CrimeSubType)
being its PK.You want to group by
Category
, lumping multiple crime types together (exactly for each in the sample data).You want to display the full grid (Cartesian Product) of distinct values in
Neighborhood
andCategory
, along with the sum of all cases in tablefor the given year (2024 in the example)
Sort descending by
Category
andNeighborhood
(as your displayed result suggests.This should be the most efficient query:
fiddle