skip to Main Content

I have the following table named Transactions where the timestamp column is in the Unix timestamp format with the minimum timestamp value being January 2008 and the maximum timestamp value is December 2008 (I’m only concerned with months for this problem):

+---------+--------+----------+------------+
| orderID | userID | cityName | timestamp  |
+---------+--------+----------+------------+
|       1 |      1 | city1    | 1199590481 |
|       2 |      2 | city2    | 1199590481 |
|       3 |      1 | city1    | 1199590481 |
|       4 |      2 | city2    | 1199590481 |
|       5 |      1 | city1    | 1202268881 |
|       6 |      2 | city2    | 1202268881 |
|       7 |      1 | city1    | 1202268881 |
|       8 |      2 | city2    | 1202268881 |
|       9 |      1 | city1    | 1199590481 |
|      10 |      2 | city2    | 1202268881 |
|      11 |      1 | city1    | 1205549051 |
|      12 |      2 | city2    | 1208227451 |
|      13 |      1 | city1    | 1210819451 |
|      14 |      2 | city2    | 1213497851 |
|      15 |      1 | city1    | 1216089851 |
|      16 |      2 | city1    | 1218768251 |
|      17 |      1 | city2    | 1221446651 |
|      18 |      2 | city2    | 1224038651 |
|      19 |      1 | city1    | 1226717051 |
|      20 |      2 | city2    | 1229309051 |
+---------+--------+----------+------------+
20 rows in set (0.01 sec)

Here’s the code to generate the Transactions table I have:

CREATE TABLE Transactions (orderID SMALLINT UNSIGNED,userID SMALLINT UNSIGNED,cityName VARCHAR(40),timestamp int(11),constraint pk_Transactions PRIMARY KEY (orderID));

INSERT INTO Transactions (orderID,userID,cityName,timestamp) VALUES(1,1,'city1',1199590481),(2,2,'city2',1199590481),(3,1,'city1',1199590481),(4,2,'city2',1199590481),(5,1,'city1',1202268881),(6,2,'city2',1202268881),(7,1,'city1',1202268881),(8,2,'city2',1202268881),(9,1,'city1',1200375000),(10,2,'city2',1203053400),(11,1,'city1',1205559000),(12,2,'city2',1208237400),(13,1,'city1',1210829400),(14,2,'city2',1213507800),(15,1,'city1',1216099800),(16,2,'city1',1218778200),(17,1,'city2',1221456600),         (18,2,'city2',1224048600),(19,1,'city1',1226727000),(20,2,'city2',1229319000);

My task is to count the common userIDs between each month X and month X+1, month X+2 etc, as in the first two columns of the table posted below (with the same format) to see how users are retained between each month of the year and every other later month.

Basically, If the user has ordered during the months written in the row values for the column Month 1 and column Month 2 in the below table, then that user is included in the count. If he hasn’t ordered in both Month 1 and Month 2, then the user isn’t counted.

The Month 1 and Month 2 columns in the table below should stay as they are, with only the counts being corrected. The number of rows in the table below should stay the same as well, this means including 0 count values in cases where no user has ordered in both the respective Month 1 and Month 2 row values.

In my Transactions table I only have two users, so the count should never be above 2 in any row which isn’t the case in the table below that my query generated:

+---------+---------+------------------------+
| Month 1 | Month 2 | Transacting user count |
+---------+---------+------------------------+
| 200801  | 200802  |                     25 |
| 200801  | 200803  |                      5 |
| 200801  | 200804  |                      5 |
| 200801  | 200805  |                      5 |
| 200801  | 200806  |                      5 |
| 200801  | 200807  |                      5 |
| 200801  | 200808  |                      5 |
| 200801  | 200809  |                      5 |
| 200801  | 200810  |                      5 |
| 200801  | 200811  |                      5 |
| 200801  | 200812  |                      5 |
| 200802  | 200803  |                      5 |
| 200802  | 200804  |                      5 |
| 200802  | 200805  |                      5 |
| 200802  | 200806  |                      5 |
| 200802  | 200807  |                      5 |
| 200802  | 200808  |                      5 |
| 200802  | 200809  |                      5 |
| 200802  | 200810  |                      5 |
| 200802  | 200811  |                      5 |
| 200802  | 200812  |                      5 |
| 200803  | 200804  |                      1 |
| 200803  | 200805  |                      1 |
| 200803  | 200806  |                      1 |
| 200803  | 200807  |                      1 |
| 200803  | 200808  |                      1 |
| 200803  | 200809  |                      1 |
| 200803  | 200810  |                      1 |
| 200803  | 200811  |                      1 |
| 200803  | 200812  |                      1 |
| 200804  | 200805  |                      1 |
| 200804  | 200806  |                      1 |
| 200804  | 200807  |                      1 |
| 200804  | 200808  |                      1 |
| 200804  | 200809  |                      1 |
| 200804  | 200810  |                      1 |
| 200804  | 200811  |                      1 |
| 200804  | 200812  |                      1 |
| 200805  | 200806  |                      1 |
| 200805  | 200807  |                      1 |
| 200805  | 200808  |                      1 |
| 200805  | 200809  |                      1 |
| 200805  | 200810  |                      1 |
| 200805  | 200811  |                      1 |
| 200805  | 200812  |                      1 |
| 200806  | 200807  |                      1 |
| 200806  | 200808  |                      1 |
| 200806  | 200809  |                      1 |
| 200806  | 200810  |                      1 |
| 200806  | 200811  |                      1 |
| 200806  | 200812  |                      1 |
| 200807  | 200808  |                      1 |
| 200807  | 200809  |                      1 |
| 200807  | 200810  |                      1 |
| 200807  | 200811  |                      1 |
| 200807  | 200812  |                      1 |
| 200808  | 200809  |                      1 |
| 200808  | 200810  |                      1 |
| 200808  | 200811  |                      1 |
| 200808  | 200812  |                      1 |
| 200809  | 200810  |                      1 |
| 200809  | 200811  |                      1 |
| 200809  | 200812  |                      1 |
| 200810  | 200811  |                      1 |
| 200810  | 200812  |                      1 |
| 200811  | 200812  |                      1 |
+---------+---------+------------------------+
66 rows in set (0.00 sec)

As you can see, the Transacting user count values don’t make any sense at all, especially given that I have only 2 userIDs in my Transactions table.

I’d really appreciate some help in writing a query for this. Here is the query I used to unsuccessfully generate the table above:

SELECT DISTINCT FROM_UNIXTIME(T1.timestamp, "%Y%m") 'Month 1',
FROM_UNIXTIME(T2.timestamp, "%Y%m") 'Month 2', 
count(*) 'Transacting user count'
FROM Transactions T1
JOIN Transactions T2
ON FROM_UNIXTIME(T1.timestamp, "%Y%m") < FROM_UNIXTIME(T2.timestamp, "%Y%m")
JOIN (
  SELECT count(*) FROM(
    SELECT userID
    FROM Transactions T1
    WHERE FROM_UNIXTIME(timestamp, "%Y%m") = (FROM_UNIXTIME(T1.timestamp, "%Y%m"))
    INTERSECT
    SELECT userID
    FROM Transactions T2
    WHERE FROM_UNIXTIME(timestamp, "%Y%m") = (FROM_UNIXTIME(T2.timestamp, "%Y%m"))
    ) union_table
  ) count_orders
GROUP BY FROM_UNIXTIME(T1.timestamp, "%Y%m"), FROM_UNIXTIME(T2.timestamp, "%Y%m")
ORDER BY 1, 2;

If anything is unclear then please let me know and I’ll try my best to clarify.

2

Answers


  1. This SQL should give you what you want. If it doesn’t please update your question with the result you are expecting.

    with DATA1 as (
      SELECT userid
      , FROM_UNIXTIME(timestamp, "%Y%m")  YEAR_MONTH1
      , date_format(FROM_UNIXTIME(timestamp) + interval 1 month, "%Y%m")  YEAR_MONTH2
      FROM Transactions
    )
    SELECT A.YEAR_MONTH1, A.YEAR_MONTH2, COUNT(A.USERID)
    FROM DATA1 A
    INNER JOIN DATA1 B ON A.USERID = B.USERID AND A.YEAR_MONTH2 = B.YEAR_MONTH2
    GROUP BY A.YEAR_MONTH1, A.YEAR_MONTH2
    ORDER BY A.YEAR_MONTH1, A.YEAR_MONTH2
    

    SECOND ANSWER

    This may give you what you want. If not, please update the table in your question to show the actual values you are expecting.

    with dates1 as (
      select distinct FROM_UNIXTIME(timestamp, "%Y%m") month_val
      FROM Transactions
    ),
    date_combos as (
      select a.month_val as YEAR_MONTH1
      , b.month_val as YEAR_MONTH2
      from dates1 a
      cross join dates1 b
      where  a.month_val <  b.month_val
    
    ),
    txn_month as (
      select distinct userID, FROM_UNIXTIME(timestamp, "%Y%m") order_month
      from Transactions
    ),
    usr_month1 as (
      select dc.YEAR_MONTH1, dc.YEAR_MONTH2
      ,tm1.userId, tm1.order_month
      from date_combos dc
      inner join txn_month tm1 on dc.year_month1 = tm1.order_month
      inner join txn_month tm2 on dc.year_month2 = tm2.order_month
        and tm1.userId = tm2.userId
    )
    select um1.year_month1, year_month2, count(um1.userId)
    from usr_month1 um1
    group by  um1.year_month1, year_month2
    order by  um1.year_month1, year_month2
    ;
    
    Login or Signup to reply.
  2. Try also this

    with tranmon as(
      SELECT userid
          ,FROM_UNIXTIME(timestamp, "%Y%m") mon
      FROM Transactions T1
      GROUP BY userid, FROM_UNIXTIME(timestamp, "%Y%m")
    )
    select from_mon,to_mon
       ,case when to_mon is null then 0 else count(*) end usersqty
    from(
       select t1.userid,t1.mon from_mon,t2.mon to_mon
       from tranmon t1
       left join tranmon t2
          on t2.userid=t1.userid and t2.mon>t1.mon
       group by t1.userid,t1.mon,t2.mon
    ) t
    group by from_mon,to_mon;
    

    Result (part)

    from_mon to_mon usersqty
    200801 200802 2
    200801 200803 1
    200801 200805 1
    200801 200807 1
    200801 200809 1
    200801 200811 1
    200801 200804 1
    200801 200806 1
    200801 200808 1
    200801 200810 1
    200801 200812 1
    200802 200803 1
    200802 200805 1
    200802 200807 1


    Updated2:
    New query example. All month from first to last month (YYYYMM) of table Transactions included, regardless of whether the user had orders this month

    with recursive allmonths as(
      select FROM_UNIXTIME(min(timestamp), "%Y%m") mn
          ,FROM_UNIXTIME(max(timestamp), "%Y%m") maxmn
      from Transactions
      union all
      select mn+1, maxmn from allmonths where mn<maxmn
    )
    ,allmontomon as(
      select t1.mn from_mon,t2.mn to_mon
      from allmonths t1
      left join allmonths t2 on t2.mn>t1.mn
    )
    ,tranmon as(
      SELECT userid
          ,FROM_UNIXTIME(timestamp, "%Y%m") mon
      FROM Transactions T1
      GROUP BY userid, FROM_UNIXTIME(timestamp, "%Y%m")
    )
    ,um as(
    select from_mon,to_mon
      ,case when to_mon is null then 0 else count(*) end usersqty
      ,group_concat(cast(t.userid as char)) lst
    from(
       select t1.userid,t1.mon from_mon,t2.mon to_mon
       from tranmon t1
       left join tranmon t2
          on t2.userid=t1.userid and t2.mon>t1.mon
       group by t1.userid,t1.mon,t2.mon
    ) t
      group by from_mon,to_mon
    )
    select * 
    from allmontomon c
    left join um on c.from_mon=um.from_mon and c.to_mon=um.to_mon
    order by c.from_mon,c.to_mon
    

    Example here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search