skip to Main Content

I have a MySQL table temp_employee as follows:

CREATE TABLE `temp_employee` (
  `employee_id` varchar(22) COLLATE utf8mb4_unicode_ci NOT NULL,
  `application_date` datetime NOT NULL,
  PRIMARY KEY (`employee_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

And data in the table as follows:

employee_id application_date
emp.10062808068 2023-03-30 07:09:31
emp.10036984442 2023-03-30 09:48:30
emp.10041499192 2023-03-30 09:58:55
emp.10012314367 2023-03-30 03:33:28

Insert query for temp_employee table:

INSERT INTO `temp_employee` (`employee_id`, `application_date`)
VALUES
    ('emp.10012314367', '2023-03-30 03:33:28'),
    ('emp.10036984442', '2023-03-30 09:48:30'),
    ('emp.10041499192', '2023-03-30 09:58:55'),
    ('emp.10062808068', '2023-03-30 07:09:31');

There is another table employee_logs as follows:

CREATE TABLE `employee_logs` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `employee_id` varchar(22) COLLATE utf8mb4_unicode_ci NOT NULL,
  `type` tinyint(1) NOT NULL,
  `status` tinyint(1) NOT NULL,
  `updated_time` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

The data in the employee_logs table as follows:

id employee_id type status updated_time
1 emp.10000000001 1 2 2021-04-30 10:19:31
2 emp.10000000002 2 2 2022-06-30 17:29:31
3 emp.10062808068 1 2 2023-03-30 11:19:20
4 emp.10000000003 2 3 2023-03-30 03:03:30
5 emp.10000000004 1 2 2023-03-30 04:04:40
6 emp.10012314367 2 1 2023-03-30 13:13:31
7 emp.10062808068 2 3 2023-03-20 17:24:10
8 emp.10036984442 1 2 2023-03-30 09:09:09
9 emp.10012314367 2 1 2023-03-20 10:10:10
10 emp.10012314367 2 2 2023-03-21 12:12:12

Insert query for employee_logs table:

INSERT INTO `employee_logs` (`id`, `employee_id`, `type`, `status`, `updated_time`)
VALUES
    (1, 'emp.10000000001', 1, 2, '2021-04-30 10:19:31'),
    (2, 'emp.10000000002', 2, 2, '2022-06-30 17:29:31'),
    (3, 'emp.10062808068', 1, 2, '2023-03-30 11:19:20'),
    (4, 'emp.10000000003', 2, 3, '2023-03-30 03:03:30'),
    (5, 'emp.10000000004', 1, 2, '2023-03-30 04:04:40'),
    (6, 'emp.10012314367', 2, 1, '2023-03-30 13:13:31'),
    (7, 'emp.10062808068', 2, 3, '2023-03-20 17:24:10'),
    (8, 'emp.10036984442', 1, 2, '2023-03-30 09:09:09'),
    (9, 'emp.10012314367', 2, 1, '2023-03-20 10:10:10'),
    (10, 'emp.10012314367', 2, 2, '2023-03-21 12:12:12');

I want Mysql query to get type, status & updated_time for each employee_id in temp_employee table which select single row from employee_logs table with condition status != 3 AND updated_time <= application_date.

Expected Result (No. of rows same as the count of temp_employee table)

employee_id type status updated_time
emp.10062808068 NULL NULL NULL
emp.10036984442 1 2 2023-03-30 09:09:09
emp.10041499192 NULL NULL NULL
emp.10012314367 2 2 2023-03-21 12:12:12

employee_logs table is huge with more than 40 million rows and can have multiple rows with the same employee_id.

Explanation on the employee_logs table which rows should picked by the condition (status <> 3 AND updated_time <= application_date), as follows:

1   emp.10000000001    1       2      2021-04-30 10:19:31 -- should not as emp_id not exists in temp table 
2   emp.10000000002    2       2      2022-06-30 17:29:31 -- should not as emp_id not exists in temp table
3   emp.10062808068    1       2      2023-03-30 11:19:20 -- should not return as updated_time > application_date 2023-03-30 07:09:31
4   emp.10000000003    2       3      2023-03-30 03:03:30 -- should not as emp_id not exists in temp table
5   emp.10000000004    1       2      2023-03-30 04:04:40 -- should not as emp_id not exists in temp table
6   emp.10012314367    2       1      2023-03-30 13:13:31 -- should not return as updated_time > application_date 2023-03-30 03:33:28
7   emp.10062808068    2       3      2023-03-20 17:24:10 -- should not return as status = 3
8   emp.10036984442    1       2      2023-03-30 09:09:09 -- should return as updated_time <= application_date && status != 3 
9   emp.10012314367    2       1      2023-03-20 10:10:10 -- should not return as updated_time < application_date 2023-03-30 03:33:28 but not latest
10  emp.10012314367    2       2      2023-03-21 12:12:12 -- should return as updated_time < application_date 2023-03-30 03:33:28

Any help in the query to achieve the desired output will be highly appreciated. TIA

3

Answers


  1. Chosen as BEST ANSWER

    The following query worked for me, the application_date column value in the temp_employee table was incorrect, after the fix the query return expected result.

    SELECT t.employee_id, oel.type, oel.status, oel.updated_time
    FROM (
        SELECT t.employee_id, MAX(el.updated_time) updated_time
        FROM temp_employee t LEFT JOIN employee_logs el
            ON t.employee_id = el.employee_id 
            AND el.status <> 3 
            AND el.updated_time <= t.application_date
        GROUP BY t.employee_id
    ) t LEFT JOIN employee_logs oel
    ON t.employee_id = oel.employee_id AND t.updated_time = oel.updated_time;
    

    SQL Fiddle link solution


  2. I assume this is a temporary required query, to check if employees have agreed to some changes published (or sth. the like)

    So, performance shouldn’t be all to critical.

    As a first step, I would use this query to determine the most recent status values per employee. Join the emp_log table on itself, with condition L.emp_id = R.emp_id and L.update_date < R.update_date – and pick the results, where the right set is null (means, there is no further record with a more recent update-state for this employee):

    SELECT 
      L.employee_id,
      L.type,
      L.status,
      L.updated_time
    FROM 
      employee_logs L
    LEFT JOIN 
      employee_logs R
    on
      L.employee_id = R.employee_id AND
      L.updated_time < R.updated_time
    WHERE
      isnull(R.updated_time)
    

    http://sqlfiddle.com/#!9/7a95da/2

    Now, you got the most recent updates – just missing the employees that have no updates -> right join them in.

    SELECT 
       ifnull(temp.employee_id, e.employee_id) as employee_id,
       temp.type,
       temp.status,
       temp.updated_time,
       e.application_date
     FROM 
    (SELECT 
          L.employee_id,
          L.type,
          L.status,
          L.updated_time
        FROM 
          employee_logs L
        LEFT JOIN 
          employee_logs R
        on
          L.employee_id = R.employee_id AND
          L.updated_time < R.updated_time
        WHERE
          isnull(R.updated_time)
        ) as temp
        RIGHT JOIN
          employees e
        ON 
          temp.employee_id = e.employee_id
    

    http://sqlfiddle.com/#!9/7a95da/10

    On the final result, you can apply your where-conditions as required. (WHERE temp.status <> 3 and temp.update_time <= e.application_date) – I left this out, cause it doesn’t make sence to me, that there are update-times BEFORE the employee even applied…

    Login or Signup to reply.
  3. You can achieve the desired output by modifying your query to use a subquery with a conditional aggregation to select the appropriate rows from the employee_logs table. Here’s the query to get the type, status, and updated_time for each employee_id in the temp_employee table based on your conditions:

    http://sqlfiddle.com/#!9/e73ea8/6

        SELECT
        te.employee_id,
        CASE WHEN el.status IS NULL THEN NULL ELSE el.type END AS type,
        CASE WHEN el.status IS NULL THEN NULL ELSE el.status END AS status,
        CASE WHEN el.status IS NULL THEN NULL ELSE el.updated_time END AS updated_time
    FROM temp_employee te
    LEFT JOIN (
        SELECT
            el.employee_id,
            MAX(CASE WHEN el.status != 3 AND el.updated_time <= te.application_date THEN el.updated_time ELSE NULL END) AS updated_time,
            MAX(CASE WHEN el.status != 3 AND el.updated_time <= te.application_date THEN el.type ELSE NULL END) AS type,
            MAX(CASE WHEN el.status != 3 AND el.updated_time <= te.application_date THEN el.status ELSE NULL END) AS status
        FROM employee_logs el
        JOIN temp_employee te ON el.employee_id = te.employee_id
        GROUP BY el.employee_id, te.application_date
    ) el ON te.employee_id = el.employee_id;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search