
I have a SQL query, shown below.

SELECT * FROM payment
WHERE is_deleted = 0
    AND is_privacy = 0
    AND ou_code IN ('A1', 'A2', ..., 'W1', 'W2'...)
LIMIT 0, 20

The IN list has many elements, about 3,000. The payment table has 50 million rows, and ou_code, is_deleted, and is_privacy each have an index. I tried the following methods:

1. Use a temporary table: create a new temporary table, insert the ou_code list into it, create an index, and then use an inner join. But my Java application is prohibited from issuing DDL statements, so I can’t try this method.

2. So I tried to build a temporary table inline with SQL syntax, like below:

SELECT * FROM 
(VALUES ROW('A1'), ROW('A2')... ROW('W1'), ROW('W2')...) tem(`code`)
inner join 
payment
on payment.`ou_code` =  tem.`code`
WHERE is_deleted = 0
    AND is_privacy = 0
LIMIT 0, 20

But it doesn’t seem possible to create an index on this derived table, so performance did not improve; it was actually slower.

3. Use EXISTS: I tried it, but an index cannot be created on the derived table here either, and it is even slower.

SELECT *
FROM 
payment
WHERE is_deleted = 0
    AND is_privacy = 0
    AND exists (select 1 from (SELECT 'WA1' AS `code`
UNION ALL  
SELECT 'WA2'  
UNION ALL  
SELECT 'WA3') tem where tem.code = payment.ou_code)
LIMIT 0, 20

4. Full ou_code table: I have another idea, which I have not implemented yet. Can you help me analyze whether it is feasible in theory?
ou_code is a company code, so the full set of ou_code values is known, and the list in the SQL above is a subset of it.
Could I create an indexed table in advance, insert the full set of ou_code values, and then use SQL similar to the following?

SELECT payment.* FROM payment
inner join full_ou_code on payment.ou_code = full_ou_code.ou_code
WHERE is_deleted = 0
    AND is_privacy = 0
    AND full_ou_code.ou_code IN ('A1', 'A2', ..., 'W1', 'W2'...)
LIMIT 0, 20

Any ideas would be appreciated.

2 Answers


  1. First of all, I agree with @Jonas Metzler: if you don’t have DDL rights, you should talk to someone who is able to fix the DB structure.

    That is the best way to solve your issue. As a temporary workaround, however, you could try batching your IN clauses. This may help, but only a little, so don’t expect too much.

    By batching I mean something like this:

    SELECT * FROM payment
    WHERE is_deleted = 0
        AND is_privacy = 0
        AND ou_code IN ('A1', 'A2', ..., 'A500')
    LIMIT 0, 20;
    
    -- Next batch...
    SELECT * FROM payment
    WHERE is_deleted = 0
        AND is_privacy = 0
        AND ou_code IN ('A501', 'A502', ..., 'A1000')
    LIMIT 0, 20;
    

    Keep in mind this is quick and dirty pseudocode. There may be syntax errors, but it should give you an idea of what I mean.

  2. The query is:

    SELECT * FROM payment
    WHERE is_deleted = 0
        AND is_privacy = 0
        AND ou_code IN ('A1', 'A2', ..., 'W1', 'W2'...)
    LIMIT 0, 20
    

    You can optimize this with a compound index on (is_deleted, is_privacy, ou_code), but keep in mind:

    The order of columns is important. For the query you show, the column ou_code must be last in that set of columns. For an explanation, see my answer to "Does Order of Fields of Multi-Column Index in MySQL Matter".
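
    A minimal sketch of such an index, assuming MySQL and the table and column names from the question. The index name `idx_payment_deleted_privacy_ou` is made up for illustration. Creating it requires DDL privileges, and building it on a 50-million-row table can take a while, so this is probably something for the database administrator to run:

    ```sql
    -- Hypothetical index name; the columns are those from the question.
    -- The two equality columns come first, the IN() column last.
    CREATE INDEX idx_payment_deleted_privacy_ou
        ON payment (is_deleted, is_privacy, ou_code);

    -- Check which index the optimizer picks (short IN list for brevity).
    EXPLAIN
    SELECT * FROM payment
    WHERE is_deleted = 0
        AND is_privacy = 0
        AND ou_code IN ('A1', 'A2')
    LIMIT 0, 20;
    ```

    If the optimizer chooses the new index, its name should appear in the `key` column of the EXPLAIN output, typically with a `range` access type.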

    MySQL can optimize long IN() lists, but by default MySQL is configured to cap the amount of memory used during the optimization phase. You can configure this. See my answer to "MySQL 'IN' operator on large number of values".
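
    One setting that fits this description is `range_optimizer_max_mem_size` (available in MySQL 5.7.9 and later); that it is the setting meant here is an assumption, since the text above does not name it. A rough sketch of inspecting and raising it for one session follows. The 16 MiB value is purely illustrative, and depending on the MySQL version, changing it may require additional privileges:

    ```sql
    -- Assumption: range_optimizer_max_mem_size is the relevant limit (MySQL 5.7.9+).
    -- Show the current limit in bytes.
    SHOW VARIABLES LIKE 'range_optimizer_max_mem_size';

    -- Raise the limit for the current session only.
    -- 16777216 bytes (16 MiB) is an illustrative value, not a recommendation.
    SET SESSION range_optimizer_max_mem_size = 16777216;

    -- Run the long IN() query in this session, then check for warnings.
    SHOW WARNINGS;
    ```

    When the limit is exceeded, MySQL skips range optimization for that query and raises a warning, so an empty `SHOW WARNINGS` result after the query suggests the new limit was sufficient.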

    Even a single-column index on ou_code may help, but the memory configuration is still needed.

    If you don’t have the index I described, and you don’t have DDL privileges, then you’ll have to talk to your database administrator and get them to create the index for you.
