I have a table
which has id
and group_id
foreign key.
I want to create (N=2) buckets of equal size (approximately), such that all rows with given group_id
end up in the same bucket.
For example, given:
id, group_id
1,1
2,1
3,2
4,NULL
5,1
6,2
I get:
id, group_id, bucket_id
1,1,1
2,1,1
3,2,2
4,NULL,2
5,1,1
6,2,2
Is there an easy way to achieve this that I’m missing?
2
Answers
you should be able to use the NTILE analytic function
This will do a roughly decent job for two buckets, since the rank will be the same for all like group_ids and you just set the cutoff point midway.
Of course this is brittle, if you have a case with 10 records with {group id: count} like {1: 9, 2: 1} it will still be very skewed.
fiddle: https://dbfiddle.uk/lbj5sXgo