skip to Main Content

I Have a mysql(v8.x) table with 4 Billion records and I need to partition on one of the varchar column.
Partition on date column or integer column doesn’t work in this case.
This is a unique usecase.

Partitioning will be on cust_id column
cust_id contains data like below

| 2184-204CECE8FPNE_01_0102_4 |
| 2184-08NFINO3NC0E_01_0102_4 |
| 2185-B28RNFPE3NCS_01_0102_4 |
| 2185-HP3DNFNF2DNF_01_0102_4 |
| 9732-0I02NCNN30DW_01_0102_4 |
| 9732-8JFN30FJMDP3_01_0102_4 |
| 9732-B02F30CN3D25_01_0102_4 |
| 3184-N33NF4FNPF33_01_0102_4 |
| 3184-0702NDNDJD3S_01_0102_4 |

since we have hyphen in the data, how would I partition the data?
Yes, partition will be on first set of digits(before the character hyphen – before the First hyphen only)

I was trying to create a table as below snippet, i failed with it 🙁

Appreciate all the help,

Thank you

Here is the create statement I tried with, it didn’t work

CREATE TABLE cust_part (
  id varchar(127) NOT NULL,
  cust_id varchar(127) NOT NULL,
  PRIMARY KEY (id,cust_id),
  KEY idx_cust_id (cust_id)
) 
PARTITION BY RANGE COLUMNS (substring_index(cust_id,'-',1))
(
PARTITION p1000 VALUES LESS THAN ('1000'),
PARTITION p2000 VALUES LESS THAN ('2000'),
PARTITION p3000 VALUES LESS THAN ('3000'),
PARTITION p4000 VALUES LESS THAN ('4000'),
PARTITION p5000 VALUES LESS THAN ('5000'),
PARTITION p6000 VALUES LESS THAN ('6000'),
PARTITION p7000 VALUES LESS THAN ('7000'),
PARTITION p8000 VALUES LESS THAN ('8000'),
PARTITION p9000 VALUES LESS THAN ('9000'),
PARTITION p10000 VALUES LESS THAN ('10000'),
PARTITION pMax VALUES LESS THAN MAXVALUE
);

2

Answers


  1. https://dev.mysql.com/doc/refman/8.4/en/partitioning-columns-range.html says:

    • RANGE COLUMNS does not accept expressions, only names of columns.

    So you can’t use substring_index() or any other function.

    You don’t have to. You can rely on strings being ordered less than or greater than the fixed values you use in the partition defs.

    But you must keep in mind that because the column is a varchar, not an integer, you must order the ranges in alphabetical order, not numeric order.

    The following works:

    CREATE TABLE cust_part (
      id varchar(127) NOT NULL,
      cust_id varchar(127) NOT NULL,
      PRIMARY KEY (id,cust_id),
      KEY idx_cust_id (cust_id)
    ) 
    PARTITION BY RANGE COLUMNS (cust_id)
    (
    PARTITION p1000 VALUES LESS THAN ('1000-'),
    PARTITION p10000 VALUES LESS THAN ('10000-'),
    PARTITION p2000 VALUES LESS THAN ('2000-'),
    PARTITION p3000 VALUES LESS THAN ('3000-'),
    PARTITION p4000 VALUES LESS THAN ('4000-'),
    PARTITION p5000 VALUES LESS THAN ('5000-'),
    PARTITION p6000 VALUES LESS THAN ('6000-'),
    PARTITION p7000 VALUES LESS THAN ('7000-'),
    PARTITION p8000 VALUES LESS THAN ('8000-'),
    PARTITION p9000 VALUES LESS THAN ('9000-'),
    PARTITION pMax VALUES LESS THAN MAXVALUE
    );
    

    Alphabetically, '10000-' goes between '1000-' and '2000-'. You’ll have to order any further partitions in a similar way.

    Login or Signup to reply.
  2. Give this a try…

    CREATE TABLE cust_part (
      id varchar(127) NOT NULL,   -- How is this generated??
      cust_id varchar(127) NOT NULL,
      PRIMARY KEY (cust_id, id),  -- Note the order
      KEY idx_cust_id (id)
    )                             -- No partitioning
    

    If there are many rows for the typical cust_id, this will help due to "locality of reference" ("clustering").

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search