Stored procedure to get a desire result set to improve performance - MYSQL

micronyks
October 4, 2022
219 views
0 votes
2 Answers

I’m completely new in mysql and struggling to write a store procedure to get a required result set.
As you can see below are the tables I have,

I’m using node & express api to connect to mysql database. Then I make individual query to get desire result using for loop. As I deal with million of records, it becomes slow eventually making queries from node api to mysql every time. I want to improve the performance.

Basically I want to get the timeseries data for each currency for given period.

Currently what I’m doing is as below,

1) first get all currencies for which I want to get time series data

Payload to get currencies.

{
  "currencies":[ "Rupee", "Dollar"]
}

query to get relevant result

Select name, tablename from Currency_table where name IN (${currencies})

NOTE: I can’t include refernece_col as it is sensitive detail in real scenario and it can not be exposed to outside world

The Resultset that I get is as below and I store it in a variable called currency_result_set,

2) Then, I loop through each record and prepare a query to get reference_col column first for each record,

  for ( const record of currency_result_set){
        // apply some login I generate below payload
  }

*Payload to get reference_col

{
  "starttime": "2020-04-23T7:19:00Z",
  "endtime": "2020-07-23T7:19:00Z",
  "currencies": [
    {
          "name": "Rupee",
          "tablename": "rupee_table"
    },
    {
          "name": "Dollar",
          "tablename": "dollar_table"
    },   
  ]
}

get this payload and prepare below query to get reference_col for each record

query

select  reference_col, name, tablename  from rupee_table
                                    where name = "Rupee"
                                    and tablename = "rupee_table" 
Union All 

select  reference_col, name, tablename  from dollar_table
                                    where name = "Dollar"
                                    and tablename = "dollar_table"

Resultset

3) finally by using for loop again and applying some logic, I generate below queries

SELECT json_arrayagg(obj) as RESULT FROM( 

    SELECT json_object(
        "name", "Rupee",
        "value", json_arrayagg(json_array(UNIX_TIMESTAMP(datetime), C1))) AS obj 
        FROM rupee_table 
        where datetime 
        between "2020-04-23T7:19:00Z" AND "2020-07-23T7:19:00Z"
                                       
    UNION ALL 
                                
    SELECT json_arrayagg(obj) as RESULT FROM( SELECT json_object(
        "name", "Dollar",
        "value", json_arrayagg(json_array(UNIX_TIMESTAMP(datetime), C2))) AS obj 
        FROM dollar_table 
        where datetime 
        between "2020-04-23T7:19:00Z" AND "2020-07-23T7:19:00Z"
        
) x

And finally the desire result set that I get as below,

[
    {
        "name": "Rupee",
        "value": [
            [
                1627166760,           // this is timestamp of the date comes using UNIX_TIMESTAMP function
                10
            ],
            [
                1627166820,
                20
            ],
            [
                1627166880,
                30
            ],
            [
                1627166940,
                40
            ]
        ]
    },
    {
        "signalname": "Dollar",
        "datapoints": [
            [
                1627166760,
                50
            ],
            [
                1627166820,
                60
            ],
            [
                1627166880,
                70
            ],
            [
                1627166940,
                80
            ]
        ]
    }
]

and I feed this result to Highcharts and it draws the relevant lines/charts.

AS you can see, I loop through multiple times to get desire result. As I deal with million of records, preparing query again n again and sending it to mysql every time, has become very slow.
I want to know how to write a stored procedure to get desire result.

Answers

- RickJames
- October 4, 2022 at 5:06 pm
- 0 votes
0
- A MySQL Stored Procedure won’t necessarily help with performance. One case where it can help dramatically is when the MySQL server is a long way from the client and you can bundle together multiple SQL statements in the SP. In this situation, you are cutting down on the network latency of multiple roundtrips. That is, the SP, itself, is not providing any performance.
- An SP allows you to do looping. But this is usually a bad programming practice for SQL. Doing actions en masse with SQL can be an order of magnitude faster than, say, acting on one row at a time.
- If there is any enhancement in "compiling" the SP, it is minor, at least compared to the two items above.
- Having two ‘identical’ tables and having to pick between them (or UNION them) is usually a clumsy and inefficient schema design. Avoid it where practical. (In your example, had a single table with an extra column specifying the currency.)
Login or Signup to reply.

The first things, you’ll need are table and column names (I’ll use CTEs to explain it step by step):

WITH tab_col_names AS (
    SELECT name, 
           tablename, 
           concat('c', row_number() over (order by id)) as colname
      FROM currency_table
)
SELECT * FROM tab_col_names;

name	tablename	colname
Rupee	rupee_table	c1
Dollar	dollar_table	c2

Using this CTE, we can construct the SQL statement.

WITH tab_col_names AS (
    SELECT name, tablename, concat('c', row_number() over (order by id)) as colname
      FROM currency_table
)
SELECT concat(
         'SELECT json_arrayagg(obj) AS RESULT FROM (', 
           group_concat(
             concat('SELECT json_object("name", "', 
                    name, 
                    '", "value", json_arrayagg(json_array(UNIX_TIMESTAMP(dt), ',
                    colname, 
                    '))) AS obj FROM ', 
                    tablename,
                    ' where dt between "2020-04-23T7:19:00Z" AND "2020-07-23T7:19:00Z"'
             ) 
             SEPARATOR ' UNION '
           ), ') x'
         )  FROM tab_col_names

I put each argument of the inner concat on a separate line to make things more clear. For each record in tab_col_names, the inner concat constructs a string containing a SQL statement. The group_concat concatenates all those strings using the given separator ' UNION ' which gives us a single record. This is ‘enclosed’ by the outer SELECT statement using the outer concat.

The result is a long statement:

result
SELECT json_arrayagg(obj) AS RESULT FROM (SELECT json_object("name", "Rupee", "value", json_arrayagg(json_array(UNIX_TIMESTAMP(dt), c1))) AS obj FROM rupee_table where dt between "2020-04-23T7:19:00Z" AND "2020-07-23T7:19:00Z" UNION SELECT json_object("name", "Dollar", "value", json_arrayagg(json_array(UNIX_TIMESTAMP(dt), c2))) AS obj FROM dollar_table where dt between "2020-04-23T7:19:00Z" AND "2020-07-23T7:19:00Z") x

result

SELECT json_arrayagg(obj) AS RESULT FROM (SELECT json_object("name", "Rupee", "value", json_arrayagg(json_array(UNIX_TIMESTAMP(dt), c1))) AS obj FROM rupee_table where dt between "2020-04-23T7:19:00Z" AND "2020-07-23T7:19:00Z" UNION SELECT json_object("name", "Dollar", "value", json_arrayagg(json_array(UNIX_TIMESTAMP(dt), c2))) AS obj FROM dollar_table where dt between "2020-04-23T7:19:00Z" AND "2020-07-23T7:19:00Z") x

In this example, the datetime range is still hardcoded. Having a stored procedure, we can replace them by the procedure’s parameters and as parameters. Luckily, there’s no risk of SQL injection since parameters are typed, so we can inject the parameters directly:

DELIMITER //
CREATE PROCEDURE all_currencies(dt_from datetime, dt_until datetime) 
BEGIN
  DECLARE query TEXT;
  WITH tab_col_names AS (
    SELECT name, tablename, concat('c', row_number() over (order by id)) as colname
      FROM currency_table
  )
  SELECT concat('SELECT json_arrayagg(obj) AS RESULT FROM (', group_concat(concat('SELECT json_object("name", "', name, 
        '", "value", json_arrayagg(json_array(UNIX_TIMESTAMP(dt), ', colname, '))) AS obj ' 
        'FROM ', tablename,
        ' where dt between ''', dt_from, ''' AND ''', dt_until, '''') SEPARATOR ' UNION '), ') x')  
      INTO query
    FROM tab_col_names;
  SET @q := query;
  PREPARE stmt FROM @q;
  EXECUTE stmt;
  DEALLOCATE PREPARE stmt;
END
//

Having this, you can call the procedure using datetime values:

CALL all_currencies('2007-04-01 00:00:00', '2007-06-30 00:00:00');

which results in

RESULT
[{"name": "Rupee", "value": [[1177312740, 10], [1179904740, 20], [1182586740, 30]]}, {"name": "Dollar", "value": [[1177312740, 100], [1179904740, 200], [1182586740, 300]]}]

Here’s a fiddle.

Please signup or login to give your own answer.

Click here to cancel reply.

Stored procedure to get a desire result set to improve performance – MYSQL

Answers