skip to Main Content

I have the following two tables:

CREATE TABLE `ip_info` (
  `start_ip` int(10) unsigned NOT NULL,
  `end_ip` int(10) unsigned NOT NULL,
  `country_code` varchar(3) DEFAULT NULL,
  `country_name` varchar(255) DEFAULT NULL,
  `continent_code` varchar(3) DEFAULT NULL,
  `continent_name` varchar(255) DEFAULT NULL,
  `asn` int(10) unsigned DEFAULT NULL,
  `as_name` varchar(255) DEFAULT NULL,
  `as_domain` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`start_ip`,`end_ip`),
  KEY `country_code_idx` (`country_code`),
  KEY `asn_idx` (`asn`)
)
CREATE TABLE `servers` (
  `ipport` varchar(255) NOT NULL,
  `ip` varchar(255) NOT NULL,
  `port` int(11) NOT NULL,
  `ip_as_int` int(10) unsigned NOT NULL,
  `version` text NOT NULL,
  `protocol` int(11) NOT NULL,
  `online_count` int(11) NOT NULL,
  `max_count` int(11) NOT NULL,
  `description` text NOT NULL,
  `favicon` text DEFAULT NULL,
  `last_seen` int(11) NOT NULL,
  `cracked` tinyint(1) DEFAULT NULL,
  `joined_on` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`ip`,`port`),
  KEY `ip_as_int_idx` (`ip_as_int`)
)

I try to run the following query (get servers from the US, with at least 5 people online), but it takes very long to complete (18s):

SELECT count(*) FROM
    (SELECT * FROM servers AS d JOIN ip_info i ON d.ip_as_int BETWEEN i.start_ip AND i.end_ip WHERE i.country_code = "us") 
AS s WHERE (s.online_count > 5) ;

Because of this, I tried explaining the query, and I get the following:

+------+-------------+-------+------+--------------------------+------------------+---------+-------+---------+------------------------------------------------+
| id   | select_type | table | type | possible_keys            | key              | key_len | ref   | rows    | Extra                                          |
+------+-------------+-------+------+--------------------------+------------------+---------+-------+---------+------------------------------------------------+
|    1 | SIMPLE      | i     | ref  | PRIMARY,country_code_idx | country_code_idx | 15      | const |  761917 | Using where; Using index                       |
|    1 | SIMPLE      | d     | ALL  | ip_as_int_idx            | NULL             | NULL    | NULL  | 1035230 | Range checked for each record (index map: 0x2) |
+------+-------------+-------+------+--------------------------+------------------+---------+-------+---------+------------------------------------------------+

For some reason the ip_as_int_idx isn’t used, I suppose because of this MySQL has to do a full table scan.

How should I change the query/the index so it’s faster?

2

Answers


  1. I would rephrase the query as:

    SELECT count(*)
    FROM servers d 
    JOIN ip_info i ON d.ip_as_int BETWEEN i.start_ip AND i.end_ip 
    WHERE i.country_code = 'us'
      and d.online_count > 5;
    

    The modified query may be able to make good use of the following indexes:

    create index ix1 on ip_info (country_code, start_ip, i.end_ip);
    
    create index ix2 on servers (ip_as_int, online_count);
    
    Login or Signup to reply.
  2. Search by IP is not easy. And no index understands that the ranges are not overlapping. So, an index with (start, end) will, at best, scan half the table, hence O(N) complexity.

    So, I took on the challenge and came up with https://mysql.rjweb.org/doc.php/ipranges

    It does require structuring the table to only have start and not end (or vice versa). And it requires dealing with any gaps between ranges. But it prevents overlapping, allowing ORDER BY ... LIMIT 1 to find the answer with O(1).

    Another thing — You must use the same format for IP addresses in both tables. And VARCHAR will give the wrong answer for comparisons like '12.12.12.12' BETWEEN '111.3.4.5' AND '122.12.12.12'. INT UNSIGNED` works for IPv4, but not IPV6.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search