skip to Main Content

I need to identify each booking with an easy code that can be given to the customer and used by them.

This code is not the "real" booking identifier.
That is, a booking already has a unique id which is used for Client <> Server communication – it is a Posgres auto-incremental int.

But we also want the customer to have a convenient short code, so they could use in case like:

  1. reach out to our customer service with that code.
  2. link their booking details into their mobile app.

So instead of having some 13-character long random sequence, we use 6-character long string of numbers + capital letters.

For example: 3RATW2

My solution currently is:

  1. take a charset of possible/allowed characters.
  2. per each output character (6 times because the output should be 6 characters long):
    1. shuffle the charset with Fisher-Yates algorithm
    2. pick a random character from the result.

Using Kotlin code:

val charset = "ABCDEFGHIJKLMNOPQRSTUVWXTZ0123456789"
return (1..6)
    .map { // for each iteration :
         charset.toCharArray().shuffle() // shuffle (Fisher-Yates)
         .toString().random() // then take a random char
    }
    .joinToString("")

But since 6 is relatively low count, I’m not sure whether it’s a good enough solution in order to avoid possible collisions as the number of bookings will be amounting. Is there a way to improve it (e.g. shuffling the charset more than once, etc.)?

Or is the only real solution to always get the list of existing booking codes from our DB (Postgres), and then the code generator will have to make sure the generated code is not included in that list (if included, run the code again until found a unique code)?

2

Answers


  1. The way to avoid collisions when creating a new order would be to:

    1. Generate a code for the new order
    2. Check if there already is an order in the database with the same code (something like SELECT 1 FROM orders WHERE code = "3RATW2")
    3. Repeat until you generate a code that isn’t already taken

    I highly recommend adding a unique index on the column containing these codes so that it is absolutely impossible to create a collision in the DB.

    If you generate 6 character codes with a charset of 36 characters you have 36^6 possible codes (~2 billion), you can increase this number by generating longer codes and/or including more possible characters.

    Be aware that as you create more orders it will be more likely that you will create a collision when generating a new code and you will have to retry the generation (possibly multiple times)

    Do orders that have been already processed still need to have a code? If not then you can remove the code from old/cancelled orders so that you can re-use the same code for a new order.

    Login or Signup to reply.
  2. Use an encryption. Encrypt the numbers 0, 1, 2, 3, … and as long as you keep the same key and do not repeat a number the resulting encryption will not repeat. 0..9, A..Z gives 36 characters, though you might want to drop back to the Base32 character set to avoid ambiguities, such as 0,O or I,1.

    Base32 is 5 bits per character, so 6 characters is 30 bits. You can use a 30 bit Feistel encryption, which you can write yourself easily enough provided you do not want cryptographic security.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search