I want to display the count of hosts/servers that do not have a given package installed (example: "abc") in a Grafana dashboard with Postgres as data source.
Example table of a single server with "abc" package present which is displayed in Grafana:
Hostname | Package | version | Package_installed_on_server |
---|---|---|---|
server1 | abc | 10.0.1 | true |
server1 | abc | 10.0.2 | false |
server1 | package2 | 3.1 | true |
server1 | package3 | 4.1.1 | true |
server2 | package2 | 3.1 | true |
server2 | package3 | 4.1.1 | true |
server2 | package4 | 10.0.1 | true |
Server3 | package2 | 3.1 | true |
Server3 | package3 | 4.1.1 | true |
For example, I have 10,000 servers and in 400 servers the package "abc" is not installed. Then Grafana should display the count of servers where no version of "abc" is present.
This is what I tried. With the above example the output should be 2 where package abc is not present. But I am not getting the output as expected (or 2).
SELECT count(*)
FROM (
SELECT DISTINCT hostname
FROM scanner_table
GROUP BY hostname
HAVING COUNT(CASE WHEN package = 'abc' THEN 1 END) = 0
) AS host_without_abc;
How to solve this?
2
Answers
Simple, naive, slow, first implementation – count all hosts – count all hosts with abc package = hosts without abc package:
More mature solution: left join for the same hostname with selected package and count hostnames, where join is empty/null:
Depending on undisclosed details, one or the other query is simpler / faster.
Setup
This multicolumn index helps all below queries (if table is vacuum’ed enough), some more some less;
You should have a
hostname
table with one row per relevant host.Consider creating one if you don’t:
Queries
Fastest if table
hostname
exists. (FASTEST overall.)See:
Fastest without table
hostname
and only few rows per hostname.Similar to what you tried (which should actually work, inefficiently).
Emulating an index skip scan will be fastest without table
hostname
and many rows per hostname.Needs the index to be fast. See:
fiddle
The expression
package <> 'abc'
inORDER BY
sorts the package ‘abc’ on top. Why?If column
package
can be NULL, you need to do more for some queries.