skip to Main Content

I am implementing data partitioning on mysql server(8.0).

As a "must-have requirement of partitioning" I have to add my partition key as a primary key, which is created_at in my table. eg:

alter table notifications drop primary key, add primary key(`id`, `created_at`);

In above query, id is an auto-increment primary key. I have more than 10M entries in table.
Mysql won’t allow live updates on schemas; it will lock the table when I run alter command.

At the same time, I can take a downtime for this process. Please guide the best approach to take this live.

I tried generating same amount of data on my local using "Faker" library & while running alter query on it. I observed it taking more than 2 hours for this process.

Schema

CREATE TABLE `data` (
  `seq_id` int NOT NULL AUTO_INCREMENT,
  `id` varchar(100) NOT NULL,
  `doc_id` varchar(100) NOT NULL,
  `page_no` int DEFAULT '1',
  `file_store` varchar(100) DEFAULT NULL,
  `s_id` varchar(100) NOT NULL,
  `s_f_id` varchar(100) NOT NULL,
  `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `f_store` varchar(100) DEFAULT 'GCP',
  `purged` tinyint(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (`seq_id`),
  UNIQUE KEY `id` (`id`),
  KEY `doc_id` (`doc_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

2

Answers


  1. Both changing the PK and adding PARTITIONing require a full copy and re-index of the data. This will be slow. But… Let’s talk about whether the Partitioning will provide any benefit. Please provide SHOW CREATE TABLE — both currently and after adding partitioning. See also Partition

    If, on the other hand, you decide that Partitioning is really needed, and the table is growing, then the sooner you do it, the less time it will take.

    If you will be purging some old data as you do the conversion, then I suggest doing everything at once:

    CREATE TABLE new_t (
        ... ); -- with new PK and partitioning.
    INSERT INTO new_t
        SELECT * FROM t
            WHERE date >= ...; -- to purge old data by not copying it over
    

    But even before doing that, let’s see the CREATEs, there may be more advice to fold into the copy.

    Also note that Partitioning does not allow for UNIQUE keys, nor for FOREIGN KEYs. We can discuss that further after seeing the CREATE.

    Also — Do you have a "natural" PK? That is, can you get rid of id?

    More

    While changing the PK, can you simply remove the seq_id (auto_inc) column? Or do you need it for some other reason?

    Login or Signup to reply.
  2. If you are very concerned about the lock time, then I suggest you use the master-slave replication. Basically, it is a master-slave switching time, which is very short.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search