skip to Main Content

I have a scenario where I have a table users with information about the users in the system. This table currently has about 70 columns – out of these 70 columns, 4 are related to the user’s address.

Now, I need to get more information about their address, which would be stored in the additional 15 columns. I am wondering I should put these 15 columns to the table users and increase the number of columns to about 90, or if to put these new 15 columns to a new table user_addresses where would be stored all address-related information about users’ addresses. One row in the users table would have one row in the user_addresses table.

I like the second option with the 1-to-1 relationship better, as it is more organized (although there’s some extra logic). What worries me, though, is that if I want to search users based on their location, I’ll need to use JOINs (and with using autocomplete or similar functionality, it could be a pain). Anyway – in terms of the size of the users table, there are currently 1,6 million records, so that would mean the same amount of records in the user_addresses table.

Performance-wise, is it, in this case, better to place everything to one table or to two tables with 1-1 relationship? I am trying to avoid a situation building the 1-1 scenario and then migrating it all back to one table due to bad speed/performance.

2

Answers


  1. It will depend on your use case, so implement a PoC and benchmark.

    The rule of thumb is: avoid querying unnecessary data. If you need all (or most) of the 90 columns every time you query the user (unlikely), then keeping them in one table is better.

    If most of the time you only need a few columns, and the other 85 only rarely, then it is better to keep the main table small and join other 1:1 tables when required.

    In my experience, for most cases, you need the ID, the display name and the status or their equivalents. Feel free to add a few more columns if required. The rest is not required for most cases and should be in other tables.

    Example, let’s look at stackoverflow. You see my profile picture, name, reputation, badges and link to the profile under this post. You see the same in your header about your own user. These columns are what is required for the usual case. The rest should be separated.

    However, this is just a general guideline and might not improve your performance significantly. PoC and benchmark!

    Login or Signup to reply.
  2. has_one is usually a bit of a red flag for me and often not necessary. It’s great for situations where the object may have more fields based on some conditional. For example, a user may be an employee or a customer and have other fields, that makes sense.

    In your case, if all users have an address or could, then I’d make it the same table.

    If that address location is potentially reused, then that’s another issue.

    I’ve been in a company where due to the huge size of the table, it was decided to make a bunch of 1-1 relationships and avoid the migrations. That was a pain to work with.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search