skip to Main Content

I am using Snowflake (or MySQL is also fine) and I would like to assign a random number between 1 and 3 (both inclusive) to an ID column based on which group they are in – so that the 1s, 2s and 3s are equally distributed in each group. For example,

Roll_number , Col1
121212      ,    A
131313      ,    A
141414      ,    B
525252      ,    A
626262      ,    B
929292      ,    A
939393      ,    A
232323      ,    B
090909      ,    A
303030      ,    B
505050      ,    A
608960      ,    A

What I would like to have is the following:

Roll_number , Col1 , Random_Number
121212      ,    A ,             1 
131313      ,    A ,             2
525252      ,    A ,             3
929292      ,    A ,             3
939393      ,    A ,             1
505050      ,    A ,             1
608960      ,    A ,             3
090909      ,    A ,             2

141414      ,    B ,             1
232323      ,    B ,             2
626262      ,    B ,             3
303030      ,    B ,             1

so that the 1s, 2s and 3s are equally distributed in groups A and B. Any help is highly appreciated. Many thanks in advance.

2

Answers


  1. Use RN to generate a unique sequential number for each row within a group that you define by PARTITION BY Col1.

    I added ((rn – 1) % 3) + 1 to evenly distribute numbers 1, 2, and 3 within each group. The rn – 1 part makes the ROW_NUMBER start from 0, and % 3 ensures the values cycle through 0, 1, 2. I added 1 to make it shift to 1, 2, 3.

    CREATE TABLE your_table (
        Roll_number VARCHAR(20),
        Col1 CHAR(1)
    )
    
    
    INSERT INTO your_table (Roll_number, Col1) VALUES
    ('121212', 'A'),
    ('131313', 'A'),
    ('141414', 'B'),
    ('525252', 'A'),
    ('626262', 'B'),
    ('929292', 'A'),
    ('939393', 'A'),
    ('232323', 'B'),
    ('090909', 'A'),
    ('303030', 'B'),
    ('505050', 'A'),
    ('608960', 'A')
    
    
    Records: 12  Duplicates: 0  Warnings: 0
    
    WITH RankedData AS (
        SELECT
            Roll_number,
            Col1,
            ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY RAND()) AS rn
        FROM
            your_table
    )
    SELECT
        Roll_number,
        Col1,
        ((rn - 1) % 3) + 1 AS Random_Number
    FROM
        RankedData
    ORDER BY
        Col1, Roll_number
    
    Roll_number Col1 Random_Number
    090909 A 3
    121212 A 2
    131313 A 2
    505050 A 1
    525252 A 1
    608960 A 1
    929292 A 3
    939393 A 2
    141414 B 2
    232323 B 1
    303030 B 1
    626262 B 3

    fiddle

    Login or Signup to reply.
  2. Assuming roll_number is evenly distributed and itself can be considered "random":

    select
        Roll_number,
        Col1,
        Roll_number div 3 + 1 as Random_Number
    from mytable
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search