skip to Main Content

Having the following tables

CREATE TABLE contacts (
    Id INT NOT NULL AUTO_INCREMENT,
    Name varchar(255) NOT NULL,
    PRIMARY KEY(Id)
);

CREATE TABLE contacts_cstm (
    Id INT NOT NULL AUTO_INCREMENT,
    Id_c INT NOT NULL,
    Department_c varchar(255) NOT NULL,
    PRIMARY KEY(Id)
);

And having contacts_cstm data like this

Id  Id_c    Department_c
1   1       ^pro^
2   2       ^pro^,^pro^
3   3       ^pro^,^temp^,^pro^

I want to count all the data in the contacts_cstm table. where the value of the department_c field includes ^pro^, or where there is more than one incidence of ^pro^ like the second row, but it should not participate in the account if it includes other items different from ^prop^

The expected output is 2, for registers that include ^pro^ and ^pro^,^pro^

I have created this sqlfiddle

Thanks in advance

2

Answers


  1. using replace to identify exclusions

     select count(*) - 
    (SELECT  count(*) as exclusions from contacts_cstm
    where replace(replace(department_c,'^pro^',''),',','') <> '') pros
    from contacts_cstm;
    

    https://dbfiddle.uk/5NVZIHaL

    Login or Signup to reply.
  2. Your structure does not make any sense. Storing a comma separated list of departments in Department_c is obviously a bad idea (see Is storing a delimited list in a database column really that bad?) and requires string manipulation and a full table scan to get the answers you want. To make this even worse, you have stored this serialised relationship in a separate table with a meaningless surrogate PK, when you already have Id_c (presumably FK to contacts.Id?).

    It is not totally clear from your question what you are trying to do here, but it seems likely that this data would be better represented by contacts (Id, Name), departments (Id, Name) and contact_department (contact_id, department_id). This can be queried much more efficiently as it can effectively use indices, instead of requiring full table scans.

    That said, with your current structure, you can improve on P.Salmon’s answer slightly by removing the need for the index scan for the outer count(*):

    SELECT COUNT(*)
    FROM contacts_cstm
    WHERE REPLACE(REPLACE(department_c, '^pro^', ''), ',', '') = '';
    

    This assumes that you would not have a row in contacts_cstm where department_c is an empty string. If the empty string is a possible scenario, you can add a criterion to exclude the empty strings:

    SELECT COUNT(*)
    FROM contacts_cstm
    WHERE REPLACE(REPLACE(department_c, '^pro^', ''), ',', '') = ''
    AND department_c <> '';
    

    Here’s a db<>fiddle.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search