
I have a problem with a DB migration from one web server to another.
Server 1 runs MySQL 5.6 under cPanel hosting.
Server 2 runs MariaDB 5.5 under Webmin/Virtualmin.
The PHP version is the same on both: 5.6.

I wanted to move a site from Server 1 to Server 2. I exported the DB using HeidiSQL and then imported the data on Server 2. The data imported fine, but the queries now run about 10x slower. I went over the buffer size variables and all the other "key" variables, and they are the same or larger on Server 2.
I tried changing the storage engine from MyISAM to Aria or InnoDB, but the results were the same. I also optimized the whole DB, again with no luck. Indexes are the same on both servers.

I then decided to host the DB back on the original server and just load the files from the new one. I exported the new DB (data only, using INSERT IGNORE) and imported that SQL back into Server 1. Immediately after the import, the original DB started performing slowly as well.
Unless I use the original backup from when I first moved the DB, it starts performing poorly no matter how I update it with the new data.

Example of a query that now takes 35 secs to run when it used to take 3 secs:

select  p.*, pd.ID detailID,
        s.title subject, s.displayTitle, s.memberPanCode,
        s.virtualDelivery,
        CASE WHEN (DATE_ADD(p.releaseDate, INTERVAL 2 WEEK) > NOW()) THEN 1 ELSE 0 END pNew,
        CASE WHEN(s.publicChoice=1) THEN s.memberPanCode ELSE '' END usableSubject,
        CASE WHEN(s.displayTitle=1) THEN s.ID ELSE '0' END subjectID from  sProduct p
    inner join  sProductDetail pd  ON pd.ID_sProduct=p.ID
    left join  sProductDetailWarehouse pdw  ON pdw.ID_sProductDetail=pd.ID
    left join  sProductDetailSubjectPrice pdsp  ON pdsp.ID_sProductDetail=pd.ID
    left join  sSubject s  ON (s.memberPanCode=pdsp.memberPanCode
                          and  s.shownOnSite=1)
    where  (      s.publicChoice=1
              OR  s.defaultSubject=1
              OR  s.memberPanCode=''
              OR  s.memberPanCode IS NULL
           )
      AND  (      (pd.ID > 0  AND  s.displayTitle IS NOT NULL)
              OR  (pd.ID IS NULL  AND  s.displayTitle IS NULL )
              OR  (pd.ID > 0  and  p.ID_sSupplier > 0 )
              OR  (pd.ID > 0  and  pdsp.ID IS NULL )
              OR  (pd.ID > 0 and  s.displayTitle IS NULL )
           )
      AND  (DATE_ADD(NOW(), INTERVAL 1 DAY) > p.showDate)
      AND  (      pdw.stock > 0
              OR  pd.stock > 0
              OR  (p.ID_sSupplier > 0  AND  p.ID_sSupplier <> '3')
           )
      and  p.published IN (1,2)
    GROUP BY  p.ID, s.memberPanCode
    order by  p.showDate desc
    limit  3 

Explain statement for the new, slower DB:
[image: EXPLAIN output]

Explain statement for the older, faster DB:
[image: EXPLAIN output]

Any idea what is left to check? What can I do to solve this?

Thank you for helping

3 Answers


  1. How big are the tables? InnoDB now does FULLTEXT. MyISAM is being orphaned. I agree with not using something as old as 5.5. MariaDB has 10.0, 10.1, 10.2, 10.3. MySQL has 5.6, 5.7, 8.0. And there has been a lot of optimization work done in most of those. By backtracking to 5.5, you probably lost some optimization features. Alas, I have not spotted the specific thing that is lost.

    The ORs are deadly for performance. They essentially prevent the use of indexes. I don’t see any obvious way to rearrange things — since the ORs are across multiple tables.

    Here are some composite, covering, indexes that might help:

    pd:   INDEX(ID_sProduct, ID, stock)  -- perhaps this order is best
    pdw:  INDEX(ID_sProductDetail, stock)  -- in this  order
    pdsp: INDEX(ID_sProductDetail, memberPanCode, ID)  -- in this order
    s:    INDEX(memberPanCode, shownOnSite)  -- in either order
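
    Written out as DDL, the four suggestions above might look like the following sketch. The table and column names come from the query in the question; the index names (idx_...) are hypothetical, and the statements assume none of these indexes already exists. If memberPanCode is a long string column, the index may need a prefix length.

    ```sql
    -- Hypothetical index names; table and column names are taken from the question's query.
    ALTER TABLE sProductDetail
        ADD INDEX idx_pd_product_id_stock (ID_sProduct, ID, stock);

    ALTER TABLE sProductDetailWarehouse
        ADD INDEX idx_pdw_detail_stock (ID_sProductDetail, stock);

    ALTER TABLE sProductDetailSubjectPrice
        ADD INDEX idx_pdsp_detail_code_id (ID_sProductDetail, memberPanCode, ID);

    -- Column order is not important for this one.
    ALTER TABLE sSubject
        ADD INDEX idx_s_code_shown (memberPanCode, shownOnSite);
    ```

    After adding them, re-run EXPLAIN on the slow query to see whether the optimizer actually picks the new indexes.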
    

    Also, add

    p:  INDEX(showDate, published, ID, ID_sSupplier) -- in this order
    

    and restructure the query by pulling p.* out of the main flow. Currently the bulky p.* is hauled through the joins, etc. before whittling down to only 3 rows. By restructuring, we can first find which 3 rows are needed, then fetch all the other columns:

    SELECT p2.*, etc.
           p2.releaseDate > NOW() - INTERVAL 2 WEEK  AS pNew,
           etc.
        FROM (
            SELECT  toss p.*, add p.ID, keep other columns
                FROM ...
                LEFT JOIN ...
                ORDER BY...
                LIMIT 3
             ) AS x
        JOIN sProduct AS p2  ON x.ID = p2.ID
        ORDER BY p2.showDate desc
    

    That new index is “covering” in that all the uses of p in the subquery are in the index. I observed that releaseDate could be left out and picked up with the second use of sProduct.

    I put showDate first in the index on the assumption that it does at least some filtering (p.showDate < NOW() + INTERVAL 1 DAY).

    The GROUP BY and ORDER BY combination necessitates one or two filesorts; they cannot be eliminated. What I have done is minimize their cost by making them less bulky.
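
    For concreteness, here is one way the outline above could be filled in for the query from the question. Treat it as an untested sketch: it assumes the column names shown in the question, and, like the original query, it relies on the server accepting non-aggregated columns alongside GROUP BY (that is, ONLY_FULL_GROUP_BY is not enabled).

    ```sql
    SELECT  p2.*,
            x.detailID,
            x.subject, x.displayTitle, x.memberPanCode, x.virtualDelivery,
            -- Yields 1/0 like the original CASE, but NULL when releaseDate is NULL.
            p2.releaseDate > NOW() - INTERVAL 2 WEEK  AS pNew,
            x.usableSubject,
            x.subjectID
        FROM (
            -- Inner query: same joins and filters as the original, but without p.*
            SELECT  p.ID,
                    p.showDate,
                    pd.ID AS detailID,
                    s.title AS subject, s.displayTitle, s.memberPanCode,
                    s.virtualDelivery,
                    CASE WHEN s.publicChoice = 1 THEN s.memberPanCode ELSE '' END AS usableSubject,
                    CASE WHEN s.displayTitle = 1 THEN s.ID ELSE '0' END AS subjectID
                FROM  sProduct p
                INNER JOIN  sProductDetail pd  ON pd.ID_sProduct = p.ID
                LEFT JOIN  sProductDetailWarehouse pdw  ON pdw.ID_sProductDetail = pd.ID
                LEFT JOIN  sProductDetailSubjectPrice pdsp  ON pdsp.ID_sProductDetail = pd.ID
                LEFT JOIN  sSubject s  ON s.memberPanCode = pdsp.memberPanCode
                                      AND s.shownOnSite = 1
                WHERE  (    s.publicChoice = 1
                         OR s.defaultSubject = 1
                         OR s.memberPanCode = ''
                         OR s.memberPanCode IS NULL )
                  AND  (    (pd.ID > 0 AND s.displayTitle IS NOT NULL)
                         OR (pd.ID IS NULL AND s.displayTitle IS NULL)
                         OR (pd.ID > 0 AND p.ID_sSupplier > 0)
                         OR (pd.ID > 0 AND pdsp.ID IS NULL)
                         OR (pd.ID > 0 AND s.displayTitle IS NULL) )
                  AND  p.showDate < NOW() + INTERVAL 1 DAY
                  AND  (    pdw.stock > 0
                         OR pd.stock > 0
                         OR (p.ID_sSupplier > 0 AND p.ID_sSupplier <> '3') )
                  AND  p.published IN (1, 2)
                GROUP BY  p.ID, s.memberPanCode
                ORDER BY  p.showDate DESC
                LIMIT  3
             ) AS x
        -- Outer query: fetch the bulky columns for just the 3 surviving rows.
        JOIN  sProduct AS p2  ON x.ID = p2.ID
        ORDER BY  p2.showDate DESC;
    ```

    The inner query touches only ID, showDate, ID_sSupplier and published from sProduct, which is why the index on p suggested above can cover it.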

  2. This query is pretty rank, those WHERE conditions are needlessly complicated

    Your original query (formatted)

       SELECT p.*,
              pd.ID detailID,
              s.title subject,
              s.displayTitle,
              s.memberPanCode,
              s.virtualDelivery,
              (p.releaseDate > NOW() - INTERVAL 2 WEEK) pNew, /** Booleans resolve to 1/0 in MySQL */
              CASE WHEN s.publicChoice = 1 THEN s.memberPanCode ELSE '' END usableSubject,
              CASE WHEN s.displayTitle = 1 THEN s.ID ELSE '0' END subjectID 
    
         FROM sProduct p
    
         JOIN sProductDetail pd 
           ON pd.ID_sProduct = p.ID
    
    LEFT JOIN sProductDetailWarehouse pdw 
           ON pdw.ID_sProductDetail = pd.ID
    
    LEFT JOIN sProductDetailSubjectPrice pdsp 
           ON pdsp.ID_sProductDetail = pd.ID 
    
    LEFT JOIN sSubject s 
           ON s.memberPanCode = pdsp.memberPanCode 
          AND s.shownOnSite=1
    
        WHERE (s.publicChoice=1 OR s.defaultSubject=1 OR s.memberPanCode='' OR s.memberPanCode IS NULL)  
          AND /** (
                (pd.ID > 0 AND s.displayTitle IS NOT NULL)
             OR (pd.ID IS NULL AND s.displayTitle IS NULL) 
             OR (pd.ID > 0 AND p.ID_sSupplier > 0) 
             OR (pd.ID > 0 AND pdsp.ID IS NULL) 
             OR (pd.ID > 0 AND s.displayTitle IS NULL)
              ) */ pd.ID > 0 /** see below */
          AND p.showDate < NOW() + INTERVAL 1 DAY 
          AND (pdw.stock > 0 OR pd.stock > 0 OR (p.ID_sSupplier > 0 AND p.ID_sSupplier <> '3'))  
          AND p.published IN (1,2)
    
     GROUP BY p.ID, s.memberPanCode 
     ORDER BY p.showDate DESC 
        LIMIT 3
    

    Let’s start with this condition

      AND (
             /** This with the last subcondition is just pd.ID > 0 */
             (pd.ID > 0 AND s.displayTitle IS NOT NULL)  
    
             /** This is impossible due to your INNER JOIN */
          OR (pd.ID IS NULL AND s.displayTitle IS NULL)
    
          OR (pd.ID > 0 AND p.ID_sSupplier > 0) 
          OR (pd.ID > 0 AND pdsp.ID IS NULL) 
    
             /** This with the first subcondition is just pd.ID > 0 */
          OR (pd.ID > 0 AND s.displayTitle IS NULL)
           ) 
    

    That whole condition resolves to pd.ID > 0, which is always TRUE unless you have manually added a product detail row with an ID of 0.

    I suspect (p.ID_sSupplier > 0 AND p.ID_sSupplier <> '3') can become just p.ID_sSupplier <> 3 for the same reason
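
    Both simplifications depend on the data, so it is worth checking before rewriting the WHERE clause. The two queries below are a minimal sketch of such a check, using the table and column names from the question; the result aliases are made up for illustration.

    ```sql
    -- A non-zero count means "pd.ID > 0" is not always true for this data.
    SELECT  COUNT(*) AS nonpositive_detail_ids
        FROM  sProductDetail
        WHERE  ID <= 0;

    -- Rows counted here match "ID_sSupplier <> 3" but not
    -- "ID_sSupplier > 0 AND ID_sSupplier <> 3", so a non-zero count
    -- means the shorter condition would change the result.
    SELECT  COUNT(*) AS nonpositive_supplier_ids
        FROM  sProduct
        WHERE  ID_sSupplier <= 0;
    ```

    If both counts are 0, the shorter conditions select the same rows as the originals.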

    This first condition seems super inclusive too

    WHERE (
          s.publicChoice=1 
         OR s.defaultSubject=1 
         OR s.memberPanCode='' 
         OR s.memberPanCode IS NULL
          )  
       ...
    

    Which leads me to question which rows you are actually trying to avoid with this condition?

    That GROUP BY clause is worrisome too: you select no aggregate columns, so many of your final columns will be picked arbitrarily from each group.

    What are you actually trying to achieve with this query?

    It’s worth remembering that OR conditions tend to be slower to resolve than AND conditions.

  3. The execution plans (shown in the EXPLAIN output) are different, so we can reasonably expect different performance characteristics.

    As @RickJames pointed out in a comment, there seem to be some indexes missing in the target environment.

    The question states: “Indexes are the same on both servers.”

    But the information provided leads us to the conclusion that the indexes are not the same.

    We see some indexes referenced in the output of the first EXPLAIN. And those index names are not found in the output of the second EXPLAIN. Those index names are also not found in the schema definition script.


    Q: Why are some of the indexes (reported in the first EXPLAIN) missing from the schema definition?

    Q: Was the mysqldump output file for the migration modified to remove some index definitions?

    Q: Was some tool other than mysqldump used to extract the schema definition for the migration, and were the indexes omitted?

    Q: Did some “create index” statements fail to execute in the target environment? (Possibly because of limits on sizes of columns in indexes?)


    Or maybe I have it the other way around, maybe there are indexes that were added in the target that didn’t exist in the source.
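
    One way to settle this is to list the indexes on each server and compare the two outputs. The query below is a sketch that reads information_schema.STATISTICS for the five tables named in the question; it assumes you run it while connected to the database in question on each server (DATABASE() returns the current schema). The index_columns alias is just a label.

    ```sql
    -- Run on both servers and diff the results.
    SELECT  TABLE_NAME,
            INDEX_NAME,
            NON_UNIQUE,
            -- Columns in index order; a prefix length, if any, is shown in parentheses.
            GROUP_CONCAT(
                CONCAT(COLUMN_NAME, IFNULL(CONCAT('(', SUB_PART, ')'), ''))
                ORDER BY SEQ_IN_INDEX
            ) AS index_columns
        FROM  information_schema.STATISTICS
        WHERE  TABLE_SCHEMA = DATABASE()
          AND  TABLE_NAME IN ('sProduct', 'sProductDetail', 'sProductDetailWarehouse',
                              'sProductDetailSubjectPrice', 'sSubject')
        GROUP BY  TABLE_NAME, INDEX_NAME, NON_UNIQUE
        ORDER BY  TABLE_NAME, INDEX_NAME;
    ```

    Any row that appears on only one server is an index that was lost or added during the migration.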
