skip to Main Content

We (almost) always add a general id column as a primary key while designing sql tables. If a table already contains a unique column, I wonder if we should designate that as the primary key and not use the id column or still use it while just adding a unique key index for the unique column. And what would be the implications of either of the approaches mentioned above?

We recently had a case where a table had a UUID designated as the primary key but the said column led to certain issues when mysqldump was used to dump the database. The said table also has a non-null unique column of type VARCHAR so to work around the aforementioned issue, the id column was removed and instead, the unique key column was designated as the primary key.

This made me wonder if we should always do this or only in specific cases and if it would impact the operations on that table in any way.
Some of the questions that come to mind are:

  1. Would it affect the performance? If yes, what if the said table has a lesser number of records, eg., less than a hundred?
  2. The said unique column being a string-type column may contain anything. So, would that result in any issue while joining it with another table via a primary-foreign key relationship?

Any help in this regard is highly appreciated.

2

Answers


  1. Here is a list of considerations for using an existing unique column as primary key instead of adding a special column solely for that purpose:

    • The unique column must be non-nullable.
    • The unique column should be inexpensive. (For example an integer, not a string, especially not a compound key. This is a performance concern.)
    • The value of the unique column is expected to never change throughout the lifetime of the row. (Even if your RDBMS allows you to change the value of a primary key, in practice you are strongly advised to avoid doing such acrobatics.)
    • The unique column is not likely to be refactored into something else (especially not into a non-unique column) in a future revision of the database schema.
    • The unique column should, ideally, be a sequentially increasing number, so that you can make use of a clustered index. (This is a performance concern.)
    Login or Signup to reply.
  2. 2 out of 3 tables I have written have a "natural" PRIMARY KEY. I avoid artificially adding an id to such tables.

    Here are some common patterns. (id is AUTO_INCREMENT`)

    • A uuid is very random, hence this may be optimal:

        PRIMARY KEY (id),
        UNIQUE(uuid)
      
    • For many-to-mapping table between foo and bar, this saves both space and speed, while enforcing that you don’t accidentally insert duplicates. It is implemented (in InnoDB) as two 2-column BTrees:

        PRIMARY KEY(foo_id, bar_id),  -- for mapping one way
        INDEX(bar_id, foo_id)         -- for mapping the other way
      
    • Sometimes the artificial id is beneficial, but this can let you use the PK for "locality of reference":

        PRIMARY KEY(user_id, id),  -- cluster together all the users stuff
        INDEX(id)    -- sufficient to heep AUTO_INCREMENT happy
      

    As for your specific questions:

    • The datatype(s) of the column(s) in the PK are of secondary importance. The big issue is how many rows need to be touched.
    • When testing on multiple columns, a composite (multi-column) index usually wins over two individual columns, regardless of cardinality.
    • TEXT and BLOB cannot be indexed, and prefixing (eg, INDEX(ext(50)) does not really address your concerns.
    • Foreign keys only check for the existence of the value; the type of index (primary/unique/index) does notmatter.
    • UUIDs are evil in many ways; avoid them where practical. [My opinion.]
    • PARTITIONing a table adds more cans-of-worms.
    • INSERTing a row must wait for all PRIMARY and UNIQUE keys to be checked for dups before completing. Other secondary keys are inserted in a ‘delayed’ technique (using the "Change Buffer"). That is, there is a slight benefit in minimizing the number of unique keys in a table.
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search