I’m currently using PlanetScale for database hosting, and I’m monitoring my queries’ performance so that they don’t affect pricing too much. I see that a recently added SQL query is hitting the database pretty hard: it takes ~26ms to execute. So the question is, why? And how can I optimize the query so that it executes faster and uses fewer resources?
All I’m doing here is finding the cities closest to a given city that have population data and at least one airport entity connected to them:
SELECT DISTINCT neighbour.*,
(
6371 *
acos(cos(radians(main.latitude)) *
cos(radians(neighbour.latitude)) *
cos(radians(neighbour.longitude) -
radians(main.longitude)) +
sin(radians(main.latitude)) *
sin(radians(neighbour.latitude)))
) AS distance
FROM flaut.City neighbour
JOIN flaut.City main ON neighbour.code <> main.code
JOIN flaut.Airport airport ON neighbour.code = airport.cityCode
WHERE main.code = 'MOW'
AND neighbour.population IS NOT NULL
HAVING distance < 800
ORDER BY distance;
According to PlanetScale, this query is the heaviest on the DB engine among all the queries I have.
However, DESCRIBE doesn’t appear to indicate what’s wrong: the query doesn’t read that much from the database.
2 Answers
The problem is that you are joining the city to every other city and then performing this calculation on each pair. Only after the calculation do you filter for the nearest ones.
You need to find a way to filter BEFORE the calculation, and this can be done. Computing the distance between two lat/long points takes complicated math, but you know that if the two latitudes alone, or the two longitudes alone, differ by more than some amount, the full distance must be greater than your limit. If you filter on that before the calculation, far fewer rows need the expensive math (and if you have an index on those columns, this will be much faster).
So it would look something like this (where x is the lat/long distance you want to be shorter than):
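A minimal sketch of such a pre-filter, applied to the query from the question. It is untested against the real schema and rests on a few assumptions: City.code is unique (so main is a single row and EXISTS can stand in for the DISTINCT plus the Airport join), the 800 km limit is converted to degrees using the same 6371 km Earth radius, and the search area crosses neither a pole nor the 180° meridian.

```sql
SELECT neighbour.*,
       -- LEAST guards ACOS against arguments a hair above 1 from rounding.
       6371 * ACOS(LEAST(1.0,
           COS(RADIANS(main.latitude)) * COS(RADIANS(neighbour.latitude)) *
           COS(RADIANS(neighbour.longitude) - RADIANS(main.longitude)) +
           SIN(RADIANS(main.latitude)) * SIN(RADIANS(neighbour.latitude))
       )) AS distance
FROM flaut.City main
JOIN flaut.City neighbour
  ON neighbour.code <> main.code
  -- Cheap pre-filter 1: latitude band. 800 km is about 7.2 degrees.
  -- 6371e0 is a floating-point literal, so the division is not rounded.
 AND neighbour.latitude
       BETWEEN main.latitude - DEGREES(800 / 6371e0)
           AND main.latitude + DEGREES(800 / 6371e0)
  -- Cheap pre-filter 2: longitude band, wider the further main is
  -- from the equator (assumes main is not close to a pole).
 AND neighbour.longitude
       BETWEEN main.longitude
               - DEGREES(ASIN(SIN(800 / 6371e0) / COS(RADIANS(main.latitude))))
           AND main.longitude
               + DEGREES(ASIN(SIN(800 / 6371e0) / COS(RADIANS(main.latitude))))
WHERE main.code = 'MOW'
  AND neighbour.population IS NOT NULL
  -- Assumption: replaces JOIN + DISTINCT; one row per city with an airport.
  AND EXISTS (SELECT 1
              FROM flaut.Airport airport
              WHERE airport.cityCode = neighbour.code)
-- The exact distance check still runs, but only on rows inside the box.
HAVING distance < 800
ORDER BY distance;
```

The box is only a coarse filter: cities in its corners are further than 800 km away, which is why the HAVING clause stays. Whether the optimizer actually uses an index for the two BETWEEN ranges depends on the indexes available, so it is worth checking with EXPLAIN.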
(I assume that WHERE main.code = 'MOW' identifies one specific lat/lng.)
To find the city nearest to one specific lat/lng:
INDEX(lat, lng) and INDEX(lng, lat): better speed
All of this is discussed in Find Nearest
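As a rough sketch, the two indexes could be added to the table from the question like this. The column names latitude and longitude are taken from the query in the question; the index names are made up for illustration.

```sql
-- Hypothetical index names; columns as used in the question's query.
ALTER TABLE flaut.City
  ADD INDEX idx_city_lat_lng (latitude, longitude),
  ADD INDEX idx_city_lng_lat (longitude, latitude);

-- Check whether a range filter on the coordinates can use one of them.
EXPLAIN
SELECT code
FROM flaut.City
WHERE latitude BETWEEN 48.5 AND 63.0
  AND longitude BETWEEN 24.8 AND 50.5;
```

The literal bounds in the EXPLAIN are just an approximate 800 km box around Moscow, used so that the statement is self-contained. Having both column orders lets the optimizer pick whichever range is more selective for a given search.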