We allow our clients to map their domain to our service’s subdomain.
For instance, we provide the service at client1.ourservice.com and ask the client to create a CNAME record mapping their domain clientdomain.com to client1.ourservice.com.
We now want our web server to detect the CNAME update automatically within 10 seconds. We know this is possible because Shopify manages to do so (in even less time).
What is the best way to poll the DNS CNAME update?
We tried the following, none of which gives consistent results (detection takes anywhere between 1 second and 3 minutes):
// nodejs
dns.resolve4
dns.resolveCname
// dig
dig $domain
dig @1.1.1.1 $domain
dig @$random_dns_server $domain
2 Answers
Thank you, doing dig +trace is working.
Poll the servers that host the DNS records for the domain in question, i.e. the authoritative nameservers.
Start with a library that gives you full access to the DNS response (i.e. all sections, not just the answer). When you get a "nonexistent" response, look at its "authority" section for a SOA record; it’ll have an mname field pointing at what could be considered the "primary" nameserver for that domain. Then configure the DNS resolver library to query that specific nameserver.
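A minimal sketch of this approach in Node.js, under some assumptions: Node's built-in dns module does not expose the authority section, so instead of reading the SOA from a negative response, this version looks up the SOA record directly with resolveSoa (walking up the name one label at a time) and uses its nsname field, which is how Node reports mname. The function names, the 1-second interval and the 10-second timeout are illustrative choices, not part of any library.

```javascript
const { Resolver } = require('node:dns').promises;

// Strip a trailing dot and lowercase, so "Client1.OurService.com." compares equal.
function normalizeName(name) {
  return name.replace(/\.$/, '').toLowerCase();
}

// Names to try as the zone apex, most specific first (the bare TLD is skipped).
function candidateZones(hostname) {
  const labels = normalizeName(hostname).split('.');
  return labels.slice(0, -1).map((_, i) => labels.slice(i).join('.'));
}

// Walk up the name until some level has a SOA record, then return the IPv4
// addresses of the nameserver named in its mname field (`nsname` in Node).
async function findPrimaryNameserverIps(hostname) {
  const recursive = new Resolver(); // uses the system's configured resolvers
  for (const zone of candidateZones(hostname)) {
    let soa;
    try {
      soa = await recursive.resolveSoa(zone);
    } catch (err) {
      continue; // no SOA at this level, try the parent
    }
    return recursive.resolve4(soa.nsname);
  }
  throw new Error(`no SOA record found for ${hostname}`);
}

// Ask the given nameservers directly whether hostname is a CNAME to expectedTarget.
async function cnameMatches(hostname, expectedTarget, nameserverIps) {
  const authoritative = new Resolver();
  authoritative.setServers(nameserverIps);
  try {
    const targets = await authoritative.resolveCname(hostname);
    return targets.some((t) => normalizeName(t) === normalizeName(expectedTarget));
  } catch (err) {
    return false; // any lookup error (e.g. ENODATA, ENOTFOUND) counts as "not there yet"
  }
}

// Poll the authoritative nameserver until the CNAME shows up or the deadline passes.
async function waitForCname(hostname, expectedTarget, { intervalMs = 1000, timeoutMs = 10000 } = {}) {
  const nameserverIps = await findPrimaryNameserverIps(hostname);
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await cnameMatches(hostname, expectedTarget, nameserverIps)) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Example:
// waitForCname('clientdomain.com', 'client1.ourservice.com').then(console.log);
```

Because the CNAME query goes straight to the nameserver that hosts the zone, no caching resolver sits between the poller and the record. The initial SOA lookup still goes through a cache, but a zone's SOA rarely changes.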
The inconsistent results are expected. You weren’t querying a global database; you were querying caching proxies in all of your examples. Both positive and negative replies are cached for a certain amount of time, and each zone administrator decides what TTL to set for their records (and what negative TTL to use for the zone as a whole).
What makes it look inconsistent is that the TTL is counted from the time the first query was made; repeated queries do not extend the TTL.
For example, if you query a non-existent domain and it’s not in cache yet, the negative answer will be freshly cached for a whole 5 minutes (or whatever the negative TTL is set to), and a newly added record for that domain will not be visible for those entire 5 minutes.
On the other hand, if you query a non-existent domain and the response happens to be already cached (e.g. if you just queried it a minute ago), with its TTL having already counted down from the original 5 minutes to (say) 60 seconds, then it’ll only take 60 seconds for a change to be visible.
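The arithmetic in these two cases can be sketched as a small function. This is only an illustration of the countdown described above, assuming a single cache and a 300-second negative TTL; the function name and the timestamps (in seconds) are made up for the example.

```javascript
// How long a caching resolver keeps serving the stale "nonexistent" answer
// after the record has been added. All arguments are in seconds.
function staleSecondsAfterChange(negativeTtl, firstQueryAt, changedAt) {
  // The TTL counts down from the first query; later queries do not extend it.
  const expiresAt = firstQueryAt + negativeTtl;
  return Math.max(0, expiresAt - changedAt);
}

// Negative answer cached at the same moment the record is added: full 5 minutes.
console.log(staleSecondsAfterChange(300, 0, 0)); // 300

// Negative answer cached 4 minutes before the record is added: 60 seconds left.
console.log(staleSecondsAfterChange(300, 0, 240)); // 60

// Cached entry already expired before the change: visible immediately.
console.log(staleSecondsAfterChange(300, 0, 400)); // 0
```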
But additionally, when you’re querying a large public resolver such as 1.1.1.1 or 8.8.8.8, you’re not actually querying the same server every time – many locations (ignoring anycast) will have that IP address load-balanced across more than one physical machine, each having its own cache. So even if the first query of a nonexistent domain caused the resolver to cache the results, the second one might hit a different backend host that doesn’t have it in cache yet.