I have the following two tables:
CREATE TABLE `ip_info` (
`start_ip` int(10) unsigned NOT NULL,
`end_ip` int(10) unsigned NOT NULL,
`country_code` varchar(3) DEFAULT NULL,
`country_name` varchar(255) DEFAULT NULL,
`continent_code` varchar(3) DEFAULT NULL,
`continent_name` varchar(255) DEFAULT NULL,
`asn` int(10) unsigned DEFAULT NULL,
`as_name` varchar(255) DEFAULT NULL,
`as_domain` varchar(255) DEFAULT NULL,
PRIMARY KEY (`start_ip`,`end_ip`),
KEY `country_code_idx` (`country_code`),
KEY `asn_idx` (`asn`)
)
CREATE TABLE `servers` (
`ipport` varchar(255) NOT NULL,
`ip` varchar(255) NOT NULL,
`port` int(11) NOT NULL,
`ip_as_int` int(10) unsigned NOT NULL,
`version` text NOT NULL,
`protocol` int(11) NOT NULL,
`online_count` int(11) NOT NULL,
`max_count` int(11) NOT NULL,
`description` text NOT NULL,
`favicon` text DEFAULT NULL,
`last_seen` int(11) NOT NULL,
`cracked` tinyint(1) DEFAULT NULL,
`joined_on` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`ip`,`port`),
KEY `ip_as_int_idx` (`ip_as_int`)
)
I try to run the following query (get servers from the US, with at least 5 people online), but it takes very long to complete (18s):
SELECT count(*) FROM
(SELECT * FROM servers AS d JOIN ip_info i ON d.ip_as_int BETWEEN i.start_ip AND i.end_ip WHERE i.country_code = "us")
AS s WHERE (s.online_count > 5) ;
Because of this, I tried explaining the query, and I get the following:
+------+-------------+-------+------+--------------------------+------------------+---------+-------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+--------------------------+------------------+---------+-------+---------+------------------------------------------------+
| 1 | SIMPLE | i | ref | PRIMARY,country_code_idx | country_code_idx | 15 | const | 761917 | Using where; Using index |
| 1 | SIMPLE | d | ALL | ip_as_int_idx | NULL | NULL | NULL | 1035230 | Range checked for each record (index map: 0x2) |
+------+-------------+-------+------+--------------------------+------------------+---------+-------+---------+------------------------------------------------+
For some reason the ip_as_int_idx isn’t used, I suppose because of this MySQL has to do a full table scan.
How should I change the query/the index so it’s faster?
2
Answers
I would rephrase the query as:
The modified query may be able to make good use of the following indexes:
Search by IP is not easy. And no index understands that the ranges are not overlapping. So, an index with (start, end) will, at best, scan half the table, hence O(N) complexity.
So, I took on the challenge and came up with https://mysql.rjweb.org/doc.php/ipranges
It does require structuring the table to only have start and not end (or vice versa). And it requires dealing with any gaps between ranges. But it prevents overlapping, allowing
ORDER BY ... LIMIT 1
to find the answer with O(1).Another thing — You must use the same format for IP addresses in both tables. And
VARCHAR
will give the wrong answer for comparisons like'12.12.12.12' BETWEEN '111.3.4.5' AND '122.12.12.12'.
INT UNSIGNED` works for IPv4, but not IPV6.