Azure - Identify the users with minimum votes and allocate new votes to that user

AvanishTomar
January 4, 2025
187 views
0 votes
2 Answers

I came up with one problem where I am looking for a solution as I have two tables:

Table1: Users

UserID	TotalTasks
A3	12
A4	14
A5	11

Table2: Tasks

TaskID	NewTask
1	4
2	5
3	3
4	2

Output:

TaskID	NewTask	UserID	UpdatedTasks
1	4	A5	15
2	5	A3	17
3	3	A4	17
4	2	A5	17

Here for table Tasks, first TaskID in order, I need to check which user has minimum TotalTasks from the users table and add that userID and NewTask + TotalTasks under the column "UpdatedTasks" and for the next TaskID, again need to check userID with updated minimum totalTasks and repeat the same process for each row without using loop as this I need to run in synapse sql.

I tried with recursive cte but not getting right output

Answers

As per this ask,

Recursive CTE is not supported in Azure Synapse SQL.

I have tried your scenario with Recursive CTE, and you see that it is not supported.

In the Synapse SQL, you can use loops as a workaround like below to achieve your requirement. Here, it needs a temporary table which you can drop after getting the desired result.

-- Temporary table with required columns
CREATE TABLE #temp1 (
    TaskID INT,
    NewTask INT,
    UserID NVARCHAR(50),
    UpdatedTasks INT
);

-- For the first task, get the user with min TotalTasks and insert it into the temporary table
INSERT INTO #temp1 (TaskID, NewTask, UserID, UpdatedTasks)
SELECT 
    t.TaskID,
    t.NewTask,
    u.UserID,
    u.TotalTasks + t.NewTask AS UpdatedTasks
FROM Tasks t
CROSS APPLY (
    SELECT TOP 1 UserID, TotalTasks
    FROM Users
    ORDER BY TotalTasks, UserID
) u
WHERE t.TaskID = (SELECT MIN(TaskID) FROM Tasks);

-- Declare the required values to use in the loop
DECLARE @next_TaskID INT, @next_Task INT, @curr_UserID NVARCHAR(50), @new_Tasks INT;

-- Use loops to iterate the remaining tasks. In each iteration, find the next tasks and get the user with minimum new tasks and insert into the temporary table 
WHILE EXISTS (SELECT 1 FROM Tasks WHERE TaskID NOT IN (SELECT TaskID FROM #temp1))
BEGIN
    SELECT TOP 1 
        @next_TaskID = TaskID,
        @next_Task = NewTask
    FROM Tasks
    WHERE TaskID NOT IN (SELECT TaskID FROM #temp1)
    ORDER BY TaskID;

    SELECT TOP 1 
        @curr_UserID = UserID,
        @new_Tasks = UpdatedTasks
    FROM (
        SELECT 
            u.UserID,
            COALESCE(a.UpdatedTasks, u.TotalTasks) + @next_Task AS UpdatedTasks
        FROM Users u
        LEFT JOIN #temp1 a ON u.UserID = a.UserID
    ) UpdatedUsers
    ORDER BY UpdatedTasks, UserID;

   
    INSERT INTO #temp1 (TaskID, NewTask, UserID, UpdatedTasks)
    VALUES (@next_TaskID, @next_Task, @curr_UserID, @new_Tasks);
END;

-- Use order by on TaskID and get the values from temporary table and drop it
SELECT 
    TaskID,
    NewTask,
    UserID,
    UpdatedTasks
FROM #temp1
ORDER BY TaskID;

DROP TABLE #temp1;

Output:

But as the data size increases, this approach might take more time compared to the recursive approach. So, you can try reading the tables as spark dataframe views in synapse notebook and in the spark SQL, you can use recursive CTE to get the desired data and write it back to the Synapse SQL tables.

Don’t know synapse, but if it supports basic analytic functions [ Row_Number Over() and Max() Over() ] then you could try something like here:

WITH
  Users AS
    ( Select 'A3' as UserID, 12 as TotalTasks Union All
      Select 'A4', 14 Union All
      Select 'A5', 11
    ),
  Tasks AS
    ( Select 1 as TaskID, 4 as NewTasks Union All
      Select 2, 5 Union All
      Select 3, 3 Union All
      Select 4, 2
    ), 
  grid AS 
    ( SELECT *
      FROM ( Select   TaskID, NewTasks, Row_Number() Over(Order By TaskID) as TaskOrdNum
             From     Tasks
             Order By TaskID
           ) t
      LEFT JOIN 
           ( Select UserID, TotalTasks, Row_Number() Over(Order By TotalTasks) as TotalsOrdNum
             From   Users
           ) u ON(u.TotalsOrdNum = t.TaskOrdNum)
     )
SELECT g.TaskID,g. NewTasks, 
       Coalesce(g.UserID, (Select UserID From grid Where TaskOrdNum = 1) ) as UserID,
       Coalesce(g.NewTasks + g.TotalTasks, Max(g.NewTasks + g.TotalTasks) Over()) as UpdatedTasks
FROM   grid g

taskid	newtasks	userid	updatedtasks
1	4	A5	15
2	5	A3	17
3	3	A4	17
4	2	A5	17

fiddle

Please signup or login to give your own answer.

Click here to cancel reply.

Azure – Identify the users with minimum votes and allocate new votes to that user

Answers