skip to Main Content

I am designing the database for a project that I am working on, basically the project is a question bank.

Each question has a flag(mark), where the user can set it as true so that they mark it to get back to it later.

The number of questions is (10,000) for now, and it will increase of course. So we can say that each user can mark (10,000) questions for now.

If I create a huge table where I save the questions id and user id and the flag, the table will be huge because if we have (100,000) users, and each user can flag (10,000) questions, the the table size will be (1,000,000,000) rows, and the number will increase if we got more users or more questions.

also we have a mark filter, when the user wants to create a quiz they can pick avoid (marked questions), or they can search for only marked questions, in other words we will use the mark filter a lot.

Plus, I am sure that I don’t need to have the marks for all users in one table, as it is only important for the user himself, and I don’t care at all to see them as admin or even for Statical analysis across all users, so having all of them in one table is not going helpful.

Now after knowing that, is it wise to create a table for each user to save his own marks? Or is there a way to save the marks where it would not be slow to search for it?

I read that making a table for each user is not a good approach, but also I read that larg tables is a huge problem. And I need suggestions here

2

Answers


  1. It is definitely a bad idea to create a table for each user. You can classify the questions and create a table by type. For example, there is a definition that puts type A questions into table A, type B questions into table A, and type C. Use table B for questions

    Login or Signup to reply.
  2. The answer to your question is it depends on your case

    Here’s how I would approach solution with certain questions and assumptions

    • Do all users mark the 10k questions? How many flag are we expecting user to mark? (Probably not much)
    • What database are we using? Is the technology going to be same even when we scale it up? (Yes, Does’t really matter but this provides key insight on do we really need to optimize it now or not)
    • How many times do we need to reference the flags? (Probably when showing user question and maybe showing stat to admin)

    I would create another table for now for flags and add total_flag_count to the questions table. This is for optimization when we want to see total flags count (purely hypothetical you might not need this). Every time a user flags question we are going to increment total_flag_count and if a user removes flag we will decrement it.

    Your 1 billion row is a worst case estimate and even in this case you are probably fine with this as you can always partition it. Large tables by themselves are not a problem. Depending upon use cases and queries they might be fine.

    There are other solutions. You can use array data structure in columns and so on. You might be able to avoid using many rows using this approach but I would say that it’s overkill at this point.

    First lunch your app analyze how users are using it and then find a solution for your use case. You can always use caching to make it faster

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search