How do I write a query that counts (totals) the number of values in a group for data spread across three tables? For each reporter and report date, I’d like a count of the number of sightings where species codes is 10 or 20 (using IN because species group has lots of codes).
REPORTER TABLE
reporter_id | reporter_num
-----------------------------
1 | 1111
2 | 2222
REPORT TABLE
report_id | reporter_id | report_date
-------------------------------------
1 | 1 | 2022-09-05
2 | 1 | 2022-09-05
3 | 1 | 2022-09-05
4 | 1 | 2022-09-16
5 | 2 | 2022-09-22
6 | 2 | 2022-09-22
SIGHTING TABLE
sighting_id | species_code
------------------------
1 | 10
2 | 55
3 | 20
4 | 35
5 | 55
6 | 20
This is essentially what I’m working with when the three tables are joined:
reporter_num | report_date | species_code
----------------------------------------
1111 | 2022-09-05 | 10
1111 | 2022-09-05 | 55
1111 | 2022-09-05 | 20
1111 | 2022-09-16 | 35
2222 | 2022-09-22 | 55
2222 | 2022-09-22 | 20
Query: for each reporter_num and report_date (one row per reporter_num and report_date), count the number of sightings where species_code is 10 or 20. Expected results:
reporter_num | report_date | my_count
----------------------------------------
1111 | 2022-09-05 | 2
1111 | 2022-09-16 | 0
2222 | 2022-09-22 | 1
A count in my query gives the total number of records for each reporter_num and report_date which isn’t what I want:
select
reporter.reporter_num, report.report_date,
count(sighting.species_code in (10, 20)) as my_count
from report
inner join reporter on report.reporter_id = report.reporter_id
inner join location on report.report_id = location.report_id
inner join method on location.location_id = method.location_id
left join sighting on method.method_id = sighting.method_id
group by
reporter.reporter_num, report.report_date;
Query Results – my_count is total number of records, which is incorrect:
reporter_num | report_date | my_count
----------------------------------------
1111 | 2022-09-05 | 3
1111 | 2022-09-16 | 1
2222 | 2022-09-22 | 2
Tried a subquery and the counts in both the outer query and subquery are incorrect:
select
reporter.reporter_num, report.report_date,
count(my_table.my_count)
from report
inner join reporter on report.reporter_id = reporter.reporter_id
inner join (
select
reporter.reporter_id, reporter.reporter_num,
report.report_date, sighting.species_code as my_count
from report
[... see joins in above query ...]
where sighting.species_code in (10, 20)
) as my_table on reporter.reporter_id = my_table.reporter_id
group by
reporter.reporter_num, report.report_date;
I feel like I’m close but missing something (or a couple somethings). Any suggestions? Many thanks.
2
Answers
I wasn't using COUNT or COUNT(IF) correctly. The IF has to be 1 if true and NULL if false; otherwise, it will add 1 even if the species code isn't in the species group. The correct code:
The above will also return report dates that don't have any species in the species group, my_count will be zero for that case, which is what I want.
To exclude report dates with zero species in the species group, use count(*) in SELECT and filter for the species codes in WHERE.
Outputs:
First creating an ad hoc, temporary table using
GROUP_CONCAT
andGROUP BY
:Outputs:
Then counting occurrences of either 10 or 20 in the
GROUP_CONCAT
‘edgs
column:In this case using
REGEXP_REPLACE
instead ofREPLACE
as is used in the credited link.Try it here: https://onecompiler.com/mysql/3ygv4cry2
Credit: https://www.tutorialspoint.com/finding-number-of-occurrences-of-a-specific-string-in-mysql