I have a table people(gid int, name varchar) with data:
gid name
1 Bill
2 Will
2 Musk
2 Jack
1 Martin
1 Jorge
now I want to get a result set like below:
gid representative(varchar)
1 Bill, Martin
2 Will, Musk
I mean to list out each group with two of people’s names of that group as representative. Here the order doesn’t matter, the count matters. How to do it in one query statement, i.e. in PG?
After multiple trying for a long while, I can achieve the goal by this statement:
with indexed as(
select gid,name,row_number() over (partition by gid) as index
from people
),filtered as(
select gid, name
from indexed where index<3
)
select gid,string_agg(name,',')
from filtered
group by gid;
However, the actual table has huge number rows and more complicated schema. It consumes long time to just create the first CTE table "indexed" because of traversing the whole table.
Is there better resolution to direct database engine to select out just parts of rows for each group at first, without full traversing?
2
Answers
Do you have a Table with id as primary key?
Then you could give a double subselect a try:
I don’t have a clue what the optimizer will make out of it
One option is to use Row_Number() analytic function and STRING_AGG()
See the fiddle here.