skip to Main Content

I have a table in mysql database that has the following structure and data.

City    zip    year   value 

AB, NM  87102  2012   150
AB, NM  87102  2013   175
AB, NM  87102  2014   200
DL, TX  75212  2018   100
DL, TX  75212  2019   150
DL, TX  75212  2020   175 
AT, TX  83621  2020   150

I am trying to group by city, and zip fields and calculate % change between latest two available years for the group. Note, the latest two available years may not be consecutive and not all groups may have two years of data.

Expected output:

City     zip     pct_change

AB, NM   87102   14.3
DL, TX   75212   16.6

Query:

select City, zip, max(year), calculate diff between value
from table
group by City, zip
where ....

2

Answers


  1. lag(value) over(partition by City, zip order by year)
    

    will return the value from the preceding row with the same City and zip values ordered by the year column.

    Use a CTE and row_number() to limit the result to the latest row per group.

    Now you just need some "simple" math:

    with cte as (
      select 
        City,
        zip,
        value,
        lag(value) over(partition by City, zip order by year) as lvalue,
        row_number() over(partition by City, zip order by year desc) as rn  
      from tbl
    )
    select City, zip, (value/lvalue - 1) * 100 as avg_pct_change
    from cte
    where rn = 1 and lvalue is not null
    
    Login or Signup to reply.
  2. You may use ROW_NUMBER() window function to get the values for the last two years as the following:

    SELECT City, Zip, 
          ( 
           MAX(CASE WHEN rn =1 THEN value END)/ 
           MAX(CASE WHEN rn =2 THEN value END) - 1
          ) * 100 pct_change
    FROM
    (
      SELECT *, ROW_NUMBER() OVER (PARTITION BY City, Zip ORDER BY year DESC) rn
      FROM Table_Name
    ) T
    WHERE rn <= 2 
    GROUP BY City, Zip
    HAVING MAX(CASE WHEN rn =2 THEN value END) IS NOT null
    ORDER BY Zip
    

    MAX(CASE WHEN rn =1 THEN value END) will find the value for the last year.

    MAX(CASE WHEN rn =2 THEN value END) will find the value for before the last year.

    WHERE rn <= 2 only last two years are included for each group.

    HAVING MAX(CASE WHEN rn =2 THEN value END) IS NOT null groups should have more than one row.

    See a demo.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search