skip to Main Content

I have a table:

Rank Title Genre
1 Guardians of the Galaxy Action,Adventure,Sci-Fi
2 Prometheus Adventure,Mystery,Sci-Fi
3 Split Horror,Thriller
4 Sing Animation,Comedy,Family
5 Suicide Squad Action,Adventure,Fantasy

Is there any function that I can count how many distinct text contain in theses columns? And how many columns contain each text?
I am using MYSQL

Like

Total distinct text
10

And

Adventure Sci-Fi Mystery Horror Thriller Animation Family Comedy Fantasy
2 3 2 1 1 1 1 1 1

2

Answers


  1. I’m updated my answer since you updated your question. In SQL tables, it’s good practice to separate values. Your Genre column contains multivalued attributes. It will be easier to count how many movies each Genre contains with that way, or create another table.

    Also, I think the total distinct test is supposed to be 5 since you’re counting the distinct titles. You might want to use the keyword DISTINCT in SQL for that.

    Login or Signup to reply.
  2. To Split comma separated string into rows concatenate square brackets ([]) around your string to make it into a JSON array. Then use JSON_TABLE to convert it into a table.

    To count distinct text use this query :

    with cte as (
      SELECT t.Rank_, j.genre
      FROM mytable t
      CROSS JOIN JSON_TABLE(
                           CONCAT('["', REPLACE(Genre, ',', '","'), '"]'),
                          '$[*]' COLUMNS (genre TEXT PATH '$')
                          ) j
    )
    select count(distinct genre) as 'Total distinct text'
    from cte
    

    To count per genre :

        with cte as (
          SELECT t.Rank_, j.genre
          FROM mytable t
          CROSS JOIN JSON_TABLE(
                               CONCAT('["', REPLACE(Genre, ',', '","'), '"]'),
                              '$[*]' COLUMNS (genre TEXT PATH '$')
                              ) j
        )
        select genre, count(1) as 'count'
        from cte
        group by genre
    

    You can use the conditional aggregation to Pivot rows into columns :

    with cte as (
      SELECT t.Rank_, j.genre
      FROM mytable t
      CROSS JOIN JSON_TABLE(
                           CONCAT('["', REPLACE(Genre, ',', '","'), '"]'),
                          '$[*]' COLUMNS (genre TEXT PATH '$')
                          ) j
    ),
    cte2 as (
           select count(case when genre = 'Action' then 1 end) as 'Action',
                  count(case when genre = 'Adventure' then 1 end) as 'Adventure',
                  count(case when genre = 'Sci-Fi' then 1 end) as 'Sci-Fi'
           from cte
           group by genre
    )
    select max(Action) AS 'Action', max(Adventure) as 'Adventure', max(`Sci-Fi`) as 'Sci-Fi'
    from cte2
    

    Demo here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search