skip to Main Content

For our new MySQL database, we are thinking of using UUID as the primary key stored as BINARY(16). The key values would be generated by the the below function in Java application, or by UUID_TO_BIN(UUID()) built-in functions in data loads etc.


UUID.randomUUID();

public static byte[] convertUUIDToBytes(UUID uuid) {
    ByteBuffer bb = ByteBuffer.wrap(new byte[16]);
    bb.putLong(uuid.getMostSignificantBits());
    bb.putLong(uuid.getLeastSignificantBits());
    return bb.array();
}

public static UUID convertBytesToUUID(byte[] bytes) {
    ByteBuffer byteBuffer = ByteBuffer.wrap(bytes);
    long high = byteBuffer.getLong();
    long low = byteBuffer.getLong();
    return new UUID(high, low);
}

SYS_GUID() in Oracle, which generates completely random numbers, is routinely used as a first-class primary or unique key, but many blogs claim using UUID Ver 4 would have severe performance issues because of the non-sequential nature of it Blog on MySQL site.

  1. Would there be any performance issue in retrieving the data with using UUID Ver 4. Our data is completely random with respect to their insertion order, and the chances of our queries accessing the first-inserted row would be the same as that of the most recently inserted row. And we do not expect the rows in our query results show up in the order of their insertion. We also would not need to query rows with the primary key value as a range.

  2. Would there be any performance issue while inserting the rows.

We do understand that a numeric would have better performance compared to UUID as a key, but UUID is an important part of our data and most queries would be using UUID as the key.

2

Answers


  1. There are performance differences and storage size differences when using UUID. Whether the differences are great enough to make it a deal-breaker for your application is something you will have to test for yourself.

    Percona has done some great blogs on this topic.

    To UUID or not to UUID? (2007)

    For auto_increment key load process took 1 hour 50 minutes giving load speed of 40305 rows/sec. For UUID process took over 12 hours and is still going. From MySQL status I can see it is loading about 200 rows/sec and the it is still slowing down a bit as key file growths. So in this little case we have about 200 times performance difference which is worth to consider

    Illustrating Primary Key models in InnoDB and their impact on disk usage (2015)

    …the image below shows us that UUID() causes significant fragmentation because it is causing pages to be split all throughout the table. This is deemed “expensive” since the ibd file now is more than 2x greater than the UUID()-optimised method, and about 3x greater than a Primary Key with AUTO_INCREMENT.

    Horizontal axis – Number of inserts x 25,000

    Vertical axis – Time Taken in seconds

    enter image description here

    I would avoid using UUID as primary key unless you are storing data in multiple MySQL instances (i.e. sharding) and you need to ensure they remain unique across all instances.

    Login or Signup to reply.
  2. Don’t bother with V6.

    MySQL 8’s UUID_TO_BIN() does what V6 does, but does it while converting from hex to binary. That is, you don’t need V6 to get the locality of reference if you store in BINARY(16) after using that function.

    MariaDB 10.7, similarly, adds a full-fledged "datatype" for UUID. It shows V1, but shuffles the bits into a 16-byte binary-like datatype.

    My discussion, from more than a decade ago: http://mysql.rjweb.org/doc.php/uuid

    (Thanks for the link to V6; I’ll add it to my blog.)

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search