skip to Main Content

I am currently engaged in learning to formulate queries on the Datalemur platform.

Upon encountering a specific question, I diligently crafted a solution which successfully yielded the desired output. Regrettably, despite achieving the intended result, my solution failed to meet the acceptance criteria.

I am seeking guidance regarding the potential shortcomings within my query that may have contributed to its non-acceptance.

Problem set : https://datalemur.com/questions/odd-even-measurements

Question :

This is the same question as problem #28 in the SQL Chapter of Ace the Data Science Interview!

Assume you’re given a table with measurement values obtained from a Google sensor over multiple days with measurements taken multiple times within each day.

Write a query to calculate the sum of odd-numbered and even-numbered measurements separately for a particular day and display the results in two different columns. Refer to the Example Output below for the desired format.

Definition:

Within a day, measurements taken at 1st, 3rd, and 5th times are considered odd-numbered measurements, and measurements taken at 2nd, 4th, and 6th times are considered even-numbered measurements.
Effective April 15th, 2023, the question and solution for this question have been revised.

measurements Table:
Column Name Type
measurement_id  integer
measurement_value   decimal
measurement_time    datetime

measurements Example Input:

measurement_id  measurement_value   measurement_time
131233  1109.51 07/10/2022 09:00:00
135211  1662.74 07/10/2022 11:00:00
523542  1246.24 07/10/2022 13:15:00
143562  1124.50 07/11/2022 15:00:00
346462  1234.14 07/11/2022 16:45:00

Example Output:

measurement_day odd_sum even_sum
07/10/2022 00:00:00 2355.75 1662.74
07/11/2022 00:00:00 1124.50 1234.14

Explanation
Based on the results,

On 07/10/2022, the sum of the odd-numbered measurements is 2355.75, while the sum of the even-numbered measurements is 1662.74.
On 07/11/2022, there are only two measurements available. The sum of the odd-numbered measurements is 1124.50, and the sum of the even-numbered measurements is 1234.14.

My query

WITH cte AS (
    SELECT
        e.measurement_id,
        e.measurement_value,
        to_char(CAST(e.measurement_time AS DATE),
                'MM/DD/YYYY HH24:MI:SS') AS sdt,
        ROW_NUMBER()
        OVER(PARTITION BY to_char(CAST(e.measurement_time AS DATE),
                                  'MM/DD/YYYY HH24:MI:SS')
             ORDER BY
                 e.measurement_id
        )                                AS rnk
    FROM
        measurements e
), get_odd_data AS (
    SELECT
        sdt,
        SUM(measurement_value) AS odd_values
    FROM
        cte
    WHERE
        mod(rnk, 2) != 0
    GROUP BY
        sdt
    ORDER BY
        sdt
), get_even_data AS (
    SELECT
        sdt,
        SUM(measurement_value) AS even_values
    FROM
        cte
    WHERE
        mod(rnk, 2) = 0
    GROUP BY
        sdt
    ORDER BY
        sdt
)
SELECT
    o.sdt,
    o.odd_values,
    e.even_values
FROM
         get_odd_data o
    JOIN get_even_data e ON o.sdt = e.sdt
ORDER BY
    o.sdt;

3

Answers


  1. Chosen as BEST ANSWER

    I just removed to_char and used cast followed by date it worked

    My query :

    WITH cte AS (
        SELECT
            e.measurement_id,
            e.measurement_value,
            CAST(e.measurement_time AS DATE) AS sdt,
            ROW_NUMBER()
            OVER(PARTITION BY CAST(e.measurement_time AS DATE)
                 ORDER BY
                     e.measurement_id
            )                                AS rnk
        FROM
            measurements e
    ), get_odd_data AS (
        SELECT
            sdt,
            SUM(measurement_value) AS odd_values
        FROM
            cte
        WHERE
            mod(rnk, 2) != 0
        GROUP BY
            sdt
        ORDER BY
            sdt
    ), get_even_data AS (
        SELECT
            sdt,
            SUM(measurement_value) AS even_values
        FROM
            cte
        WHERE
            mod(rnk, 2) = 0
        GROUP BY
            sdt
        ORDER BY
            sdt
    )
    SELECT
        o.sdt,
        o.odd_values,
        e.even_values
    FROM
             get_odd_data o
        JOIN get_even_data e ON o.sdt = e.sdt
    ORDER BY
        o.sdt;
    

  2. If you "PARTITION BY measurement_time" which contains H:M:S, you are not really partitioning… at least not "within each day"… (you are partitioning "within each second") you should TRUNC() the date.

    Login or Signup to reply.
  3. Use the ROW_NUMBER analytic function to number the rows and take the modulus and then use conditional aggregation to generate the sums:

    SELECT measurement_day,
           SUM(CASE odd_even WHEN 1 THEN measurement_value END) AS odd_values,
           SUM(CASE odd_even WHEN 0 THEN measurement_value END) AS even_values
    FROM   (
      SELECT measurement_value,
             TRUNC(measurement_time) AS measurement_day,
             MOD(
               ROW_NUMBER() OVER(
                 PARTITION BY TRUNC(measurement_time)
                 ORDER BY measurement_time, measurement_id
               ),
               2
             ) AS odd_even
      FROM   measurements
    )
    GROUP BY measurement_day
    

    Note: If you want to group by days then, in Oracle, do not use CAST(value AS DATE) because a DATE data-type has both date and time components and you are not removing the time component by using CAST. Instead, use the TRUNC function which will truncate the time component back to the start of the day (midnight).

    Which, for the sample data:

    CREATE TABLE measurements (measurement_id, measurement_value, measurement_time) AS 
    SELECT 131233, 1109.51, DATE '2022-07-10' + INTERVAL '09:00:00' HOUR TO SECOND FROM DUAL UNION ALL
    SELECT 135211, 1662.74, DATE '2022-07-10' + INTERVAL '11:00:00' HOUR TO SECOND FROM DUAL UNION ALL
    SELECT 523542, 1246.24, DATE '2022-07-10' + INTERVAL '13:15:00' HOUR TO SECOND FROM DUAL UNION ALL
    SELECT 143562, 1124.50, DATE '2022-07-11' + INTERVAL '15:00:00' HOUR TO SECOND FROM DUAL UNION ALL
    SELECT 346462, 1234.14, DATE '2022-07-11' + INTERVAL '16:45:00' HOUR TO SECOND FROM DUAL;
    

    Outputs:

    MEASUREMENT_DAY ODD_VALUES EVEN_VALUES
    2022-07-11 00:00:00 1124.5 1234.14
    2022-07-10 00:00:00 2355.75 1662.74

    fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search