skip to Main Content

so I have two tables as shown

TABLE A

CrimeType CrimeSubType Neighborhood Year
1 1 Jeff Park 2024
2 2 Cong Park 2023
2 2 Cong Park 2024
2 2 Jeff Park 2024
3 1 Jeff Park 2024
3 2 Jeff Park 2023
3 1 Jeff Park 2023

I then have a second table that defines the crime types as a human readable category

TABLE B

CrimeType CrimeSubType Category
1 1 VIOLENT CRIME
1 2 VIOLENT CRIME
2 1 VIOLENT CRIME
2 2 PROPERTY CRIME
3 1 PROPERTY CRIME
3 2 DRUG CRIME

I would like the output of my postgres (if possible) to display results such as the one below (assuming year for 2024)

Neighborhood Category Counts
Jeff Park VIOLENT CRIME 0
Cong Park VIOLENT CRIME 1
Jeff Park PROPERTY CRIME 3
Cong Park PROPERTY CRIME 2
Jeff Park DRUG CRIME 1
Cong Park DRUG CRIME 0

I was able to get it to look like the above using something like the following

SELECT
  Neighborhood,
  Category,
  COUNT(Neighborhood)
from A
INNER JOIN B
   ON (b.CrimeType = 
a.crimeType
  AND B.CrimeSubType = b.CrimeSubType
  AND a.year = 2024
GROUP BY
  Neighborhood,
  Category

I looked up some answers and it said to change the INNER JOIN to a left or right join (I believe in my case it would be a right join? however I feel my case is too complex and I am missing something

SELECT
  Neighborhood,
  Category,
  COUNT(Neighborhood)
from A
LEFT JOIN B
   ON (b.CrimeType = a.crimeType
  AND B.CrimeSubType = b.CrimeSubType
  AND a.year = 2024
GROUP BY
  Neighborhood,
  Category

still does not work, it will display some nulls but not the way I intended it, what am I missing?

3

Answers


  1. Rightfully, there should be bona fide tables containing all categories and neighborhoods. In the absence of that, we can use subqueries to find them, then left join as you were already doing:

    SELECT n.Neighborhood, c.Category, COUNT(b.CrimeType) AS Counts
    FROM (SELECT DISTINCT Neighborhood FROM A) n
    CROSS JOIN (SELECT DISTINCT Category FROM B) c
    LEFT JOIN A a
        ON a.Neighborhood = n.Neighborhood AND
           a.year = 2024
    LEFT JOIN B b
        ON B b.Category = c.Category AND
             b.CrimeType = a.CrimeType AND
             b.CrimeSubType = b.CrimeSubType
    GROUP BY
        n.Neighborhood,
        c.Category
    ORDER BY
        n.Neighborhood,
        c.Category;
    
    Login or Signup to reply.
  2. You would need a left join if there are un-categorised combinations of crimeType+CrimeSubType, and to avoid NULLs in the output you can used coalesce as follows:

    SELECT
           Neighborhood
        , coalesce(b.Category,'Un-categorised') as Category
        , COUNT(*) as count_of
    FROM A
    LEFT JOIN B ON b.CrimeType = a.crimeType
        AND B.CrimeSubType = b.CrimeSubType
    WHERE a.year = 2024
    GROUP BY 
           Neighborhood
        , coalesce(Category,'Un-categorised')
    

    nb: a.year = 2024 should be applied as a where clause (not part of the ON conditions for table b)

    Login or Signup to reply.
  3. Assuming (all of this should be specified in the question):

    • Table B is the list of crime types, (CrimeType, CrimeSubType) being its PK.

    • You want to group by Category, lumping multiple crime types together (exactly for each in the sample data).

    • You want to display the full grid (Cartesian Product) of distinct values in Neighborhood and Category, along with the sum of all cases in table for the given year (2024 in the example)

    • Sort descending by Category and Neighborhood (as your displayed result suggests.

    This should be the most efficient query:

    FROM   b
    CROSS JOIN (SELECT DISTINCT neighborhood FROM a) n
    LEFT  JOIN (
       SELECT crimetype, crimesubtype, neighborhood, count(neighborhood) AS ct
       FROM   a
       WHERE  year = 2024
       GROUP  BY crimetype, crimesubtype, neighborhood
       ) a  USING (crimetype, crimesubtype, neighborhood)
    GROUP  BY b.category, n.neighborhood
    ORDER  BY b.category DESC, n.neighborhood DESC;
    

    fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search