skip to Main Content

I have a mysql database that contains data collected from various weather sensors and the like.
All the data is easy to handle apart from the raw rain fall which is by nature incremental but can be reset to zero from time to time ether from the internal 12bit counter wrapping or a simple power failure. example of data (much simplified):

|   time     |           rain_raw   |
| 1716805915 |       37.55905511796 |
| 1716806011 |       37.55905511796 |
| 1716806107 |       37.55905511796 |
| 1716806203 |       37.55905511796 |
| 1716806251 |       37.55905511796 |<------ Need to find this row
| 1716807098 |                    0 |
| 1716807146 |                    0 |
| 1716807866 |                    0 |
| 1716822169 |                    0 |
| 1716822217 |       0.011811023622 |
| 1716822265 |       0.011811023622 |
| 1716822361 |       0.011811023622 |
| 1716822409 |       0.011811023622 |
| 1716822457 |       0.011811023622 |
| 1716822505 |       0.011811023622 |
| 1716822553 |       0.023622047244 |
| 1716822601 |       0.023622047244 |
| 1716822649 |       0.023622047244 |
| 1716822697 |       0.023622047244 |<------ Current last entry

The first column is the time (seconds from epoch) the second the amount of rain in inches. I dont want to search top down for two reasons the database is large so I feel this is inefficient and secondly there are a number of data gliches (historical data not important currently) which can trigger false positives with the forward search I have tried. I need to find the indicated row so that I can use the rain fall value as an adder to subsequent entries.

So my question, is there and efficient way to search backwards from the current last entry to the point where the data got reset. I am using Mariadb with Qt6 and C++.

Thanks for any help.

This is my current foward search:

SELECT time,rain_raw 
FROM weather_data t1 
WHERE rain_raw > (
    SELECT rain_raw 
    FROM weather_data t2 
    WHERE t1.rain_raw > t2.rain_raw 
    ORDER BY time desc 
    LIMIT 1 
);

This worked with a simple data set but was not good with the actual data set I wish to use.

2

Answers


  1. This is just out of my head, I have not created any test tables:

    SELECT time,rain_raw
    FROM weather_data t1
    WHERE t1.time < (
      SELECT time FROM weather_data t2
      WHERE t2.rain_raw = 0 ORDER BY time desc LIMIT 1
    )
    AND t2.rain_raw > 0
    ORDER BY t1.time DESC
    LIMIT 1;
    
    Login or Signup to reply.
  2. ScifiBob has already given an answer for finding the row mentioned in your question.

    This is a suggestion for an alternative approach. Instead of

    find the indicated row so that I can use the rain fall value as an adder to subsequent entries.

    You could use the LAG() window function to add the per row rainfall diffs, and then sum them up to get the cumulative totals, ignoring any diffs that are negative:

    WITH rainfall_with_prev (time, rain_raw, rain_raw_prev) AS (
        SELECT time, rain_raw, LAG(rain_raw) OVER (ORDER BY time ASC)
        FROM weather_data
    )
    SELECT time,
      SUM(
        IF(
          rain_raw_prev IS NULL,
          rain_raw,
          GREATEST(rain_raw - rain_raw_prev, 0))
      ) OVER (ORDER BY time ASC) AS cumulative_rain
    FROM rainfall_with_prev;
    

    Here’s a db<>fiddle.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search