Table tmch
contains thousands of rows here it is:
CREATE TABLE IF NOT EXISTS public.tmch (
id bigserial NOT NULL,
year integer,
week integer,
my_number integer,
device_id bigint,
CONSTRAINT tmch_pkey PRIMARY KEY (id)
)
Sample data:
1716446 2024 37 13 2
1716447 2024 37 13 2
1716448 2024 37 0 3
1716449 2024 37 11 4
1716450 2024 37 12 4
1716451 2024 37 0 6
1716452 2024 37 0 6
1716453 2024 37 0 6
1716454 2024 37 1 6
1716455 2024 37 1 6
1716456 2024 37 9 7
Here is a query to count how many times my_number
occurs for each group of (week, year, device_id)
:
select count(my_number) c, my_number, device_id, year, week from tmch
group by my_number, device_id, year, week
order by device_id asc, c desc
Result of this query – based on a bigger, different sample:
6 16 2 2024 37
4 17 2 2024 37
4 15 2 2024 37
2 13 2 2024 37
1 2 2 2024 37
8 15 3 2024 37
6 16 3 2024 37
5 14 3 2024 37
3 20 3 2024 37
3 18 3 2024 37
3 19 3 2024 37
1 8 3 2024 37
How to get only rows with the highest count of my_number
per
group of (week, year, device_id)
?
Result for the sample above:
6 16 2 2024 37 -- because my_number=16 occurs 6 times for device_id=2 y=2024 w=37
8 15 3 2024 37 -- because my_number=15 occurs 8 times for device_id=3 y=2024 w=37
I tried row_number() over (partition by ...)
but without success.
2
Answers
DISTINCT ON
does what you want, if you use your query as a subquery:In a
SELECT
query,DISTINCT ON
is applied after aggregation withGROUP BY
and aggregate function. (Even after window functions.) So you can do it all in a single query level without subquery.And since you list
my_number
in theGROUP BY
clause, it makes more sense to usecount(*)
instead ofcount(my_number)
. A bit faster, too.The only logical difference: if
my_number
can be null, you get an actual count for that group, too. Else you get a count of 0 for the null group.The same query with a more intuitive order of columns:
Or with minimal syntax:
fiddle
Just make sure that
ORDER BY
andDISTINCT ON
do not disagree. See: