skip to Main Content

A friend and I are working on a database that stores information about cPanel hosting accounts, such as what settings, apps, and features each account is using.

Most of the fields are boolean, such as whether or not the account has any wordpress sites, any php 5.4 driven sites, any ruby on rails sites, etc…

A small number of fields are non-boolean data like disk usage in MB, hostname of the server the account resides on, and the username of the account, etc…

In my mind, it makes sense to store ALL this information in one single table.

So the table might have the following columns:

    php54 boolean,
    wordpress boolean,
    ror boolean,
    username varchar(8),
    hostname varchar(20),
    usage_mb int(9),

I figure that the primary key could be (username,hostname).

However, my friend has already set up the database with multiple tables that look like this:

Fact Table:

    id int(11),
    php54 boolean,
    wordpress boolean,
    ror boolean,
    usage_mb int(9),

User Table:

    id int(11),
    factid int(11),
    hostid int(11),
    username varchar(8)

Hostname Table:

    id int(11),
    hostname varchar(20),
    ip varchar(15),

Where each table’s primary key is “id” and the user table references the hostname table and fact table using ‘hostid’ and ‘factid’ foreign keys (respectively).

I believe my friend’s rationale behind multiple tables is to organize the data based on the type of data, despite all the data being related to one single, unique account.

My rationale is that since all the data belongs to one unique account, and therefore every single row is 1:1, does it make sense to have multiple tables?

I would think multiple tables would be sensible if a row in one table can reference multiple rows in another table… But in this case each row from each table can only be associated with one single row from any other table… so i think one table is fine.

Should this data be in multiple tables, or in one single table?

We’re both sort of noobs figuring things out as we go.

At which point does it make sense to use multiple tables?

Currently its really difficult to write an API to add the data associated with one single account to three separate tables, as all the primary keys auto increment, and other than that there isn’t any key that is unique to the account which would make it easy to update existing data.

Sorry if none of this makes sense

2

Answers


  1. In your case, I dont’t think having multiple tables with one to one relationships is the right way.

    It is not forbidden and in some cases it can be helpfull (
    Is there ever a time where using a database 1:1 relationship makes sense?), but you’ll have to deal with unecessary joins in your requests.

    Login or Signup to reply.
  2. Ignoring ids, the way you find out what your CKs (candidate keys) are and whether you should decompose is the topic of normalization to higher NFs (normal forms). This formalizes your notion of “a row in one table can reference multiple rows in another” (among others). Guessing using common sense here, there’s no particular need to decompose. Introducing ids not visible at the business level is always technically unnecessary but happens per its own practical/ergonomic reasons. Further explanation/justification is information modeling & database design textbook chapters on design, CKs, NFs & surrogates–read some. Vague notions like “same type of data” are not helpful.

    (TL;DR “At what point should I create a separate table?” is a basic question with a complex answer that requires learning some stuff.)

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search