skip to Main Content

I have a table in which I record data about marketing results from the different marketing channels (SEM, SEO, Facebvook…). When I add a new piece of data I will record the channel, the date it refers to, how many sales were made and when was this data entered (last_update). Because of the way things are set up it is possible to have to go back and edit the number of sales made at a previous date. I currently do not want to overwrite past data and insert an update as a new line instead. Here below is the DB scheme:

id (int)
channel (small_int)
date (date in unix_time format)
sales (medium_int)
last_update (date in unix_time format)

How would I be able to extract:
– only the latest_update for every channel for every day?
– the same as above but at a date in the past (ie let’s imagine I want to see January’s data not as it stands today, but as it was entered at latest on February 1st)?

2

Answers


  1. The best way to approach this problem would be using WINDOW FUNCTIONS, however MySQL does not provide support for them, so it will require a hack.

    Is the id field a foreign key or a meaningless surrogate key identifying each row? I will suppose (because you are using MySQL) that the answer is YES.

    SELECT
       id,
       channel,
       date,
       sales,
       last_update 
    FROM
       my_table a       
    JOIN
       (
          SELECT
             channel,
             date,
             MAX(last_update) 
          FROM
             my_table 
          GROUP BY
             channel,
             date
       ) b 
          ON a.channel = b.channel 
          and a.date = b.date 
          and a.last_update = b.last_update;
    

    You may create a view with this query.

    OBS: If the column id is a meaningless and useless primary key, you should create a proper primary key with date, channel and last_updated columns. Also you should probably avoid using date as a column name as it is a reserved word and the last_updated should be changed from date to date time.

    Login or Signup to reply.
  2. In some pseudo-code: Select the records where last_update = MAX(last_update). In particular,

    SELECT *
    FROM mytable mt1
    WHERE mt1.last_update = (
        SELECT MAX(mt2.last_update)
        FROM mytable mt2
        WHERE mt1.channel = mt2.channel
          AND mt2.last_update <= '2015-02-01'
        LIMIT 1
    )
    

    I assume that for each channel, there could be two records with the same last_update (i. e. the pair (channel, last_update) isn’t unique). So LIMIT 1 chooses an arbitrary one.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search