by Corey Quinn, Senior Technical Consultant at Taos
A part of what I do at Taos involves interviewing prospective consultants for our Unix/DevOps practice via a thorough technical assessment. Our technical interview spans virtually the entire breadth of topics that encompass the practice of systems administration.
One focus area that I like to spend a bit of time on is DNS. This essential service underpins almost everything else a system does; when DNS goes away, your system is likely to be very, very unhappy. Despite this, few people have a firm grasp of how DNS works under the hood. I went searching for a decent write-up that explains the name resolution process, and struggled to find anything succinct that hit the points I felt were important.
So from the top, “I’m on a Linux system. I punch taos.wpengine.com into a web browser and press enter. Strictly at a name resolution level, assuming every DNS server in the world has a cold cache, what happens?”
On Linux specifically, the browser will call getaddrinfo() to spark up the system’s internal resolver. Assuming /etc/nsswitch.conf is configured to query files first and DNS second, the system will first check /etc/hosts to see if taos.wpengine.com is defined. Assuming it isn’t, it will then check the system’s local cache to see if there is a recently cached answer to this query. Presuming not, it will then check /etc/resolv.conf to see what its local nameserver is. From there, it sends out a UDP packet on port 53 to that nameserver.
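The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how glibc implements it: the function names are made up for this example, and the functions parse text passed in as strings rather than reading the real files, so the sample contents in the usage note are assumptions.

```python
import re


def hosts_lookup_order(nsswitch_text):
    """Return the sources named on the 'hosts:' line of nsswitch.conf text."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            # Remove action items such as [NOTFOUND=return]; keep source names.
            sources = re.sub(r"\[[^\]]*\]", " ", line[len("hosts:"):])
            return sources.split()
    return []


def lookup_in_hosts(hosts_text, name):
    """Return the address for name from /etc/hosts-style text, or None."""
    wanted = name.lower()
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        # Format: address, canonical name, then optional aliases.
        if len(fields) >= 2 and wanted in (f.lower() for f in fields[1:]):
            return fields[0]
    return None


def nameservers(resolv_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for raw in resolv_text.splitlines():
        line = raw.strip()
        if line.startswith(("#", ";")):  # both are comment markers here
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

Given the text "hosts: files dns", hosts_lookup_order() returns the two sources in order; if lookup_in_hosts() then finds nothing for taos.wpengine.com, the addresses from nameservers() are where the UDP query on port 53 would go. On a real system, Python’s socket.getaddrinfo() hands the whole process to the same system resolver the browser uses.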
If we assume that this resolver is not authoritative for taos.wpengine.com, taos.com, com., or the root domain (and isn’t configured to think that it is!), and is configured to permit recursive lookups from my host, it again checks its cache. Presuming the cache is empty (as we have stated), it then quests out to the root nameserver (which it gets from its prepopulated root hints file; ‘dig’ with no arguments on a properly configured system will regenerate this), and queries for taos.wpengine.com. The root nameserver responds with a delegation response that points to the servers that are authoritative for .com. The resolver then asks the server that is authoritative for .com the same question; it now gets back the authoritative nameserver for taos.com.
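The walk from the root down to the authoritative server can be modelled with a toy, in-memory delegation tree. This is a simplified sketch: the server labels, the www.example.com. name, and the 192.0.2.80 address are all made up for illustration, and no real DNS traffic is involved.

```python
# Hypothetical servers: each one either answers a name or refers the
# resolver to the server responsible for a more specific zone.
SERVERS = {
    "root": {"referrals": {"com.": "com-server"}},
    "com-server": {"referrals": {"example.com.": "example-server"}},
    "example-server": {"answers": {"www.example.com.": "192.0.2.80"}},
}


def query(server, name):
    """Ask one server about name; return (kind, value)."""
    data = SERVERS[server]
    if name in data.get("answers", {}):
        return ("answer", data["answers"][name])
    # Prefer the longest (most specific) matching zone.
    zones = sorted(data.get("referrals", {}), key=len, reverse=True)
    for zone in zones:
        if name == zone or name.endswith("." + zone):
            return ("referral", data["referrals"][zone])
    return ("nxdomain", None)


def resolve(name, start="root", max_hops=10):
    """Follow referrals from the root until an answer (or a dead end)."""
    server = start
    path = [server]
    for _ in range(max_hops):
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "referral":
            server = value  # ask the next server the same question
            path.append(server)
            continue
        return None, path
    raise RuntimeError("too many referrals")
```

Calling resolve("www.example.com.") visits the root, then the .com server, then the zone’s own server, which mirrors the sequence of delegations described above. The hop limit is a design choice in this sketch to guard against referral loops.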
At this point, I usually like to stop the interviewee to posit an interesting question. “Let’s say that taos.com lists an authoritative nameserver of ns1.taos.com; at this point, you’ve got a circular dependency in place. Your nameserver is going to get back ‘the authoritative server for taos.com is ns1.taos.com.’ Remember, nameserver delegations are given out by name rather than by IP address. At this point, it’s going to continue to endlessly resolve taos.com to determine where ns1 lives. Obviously, this doesn’t happen; why not?”
The correct answer is that the delegation response carries more than just the name. ns1.taos.com is listed in the “authority” section of the DNS delegation response, and the “additional” section of the same response also includes the IP address of that server; this is called a glue record.
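A rough sketch of how a resolver might use glue follows. The referral is modelled as a plain dictionary rather than a parsed DNS packet, with the NS record placed in the authority section, which is where the DNS wire format carries referral nameservers. The 192.0.2.53 address and both function names are invented for this example.

```python
# A simplified referral for taos.com as the .com servers might return it.
# Records are (owner, type, data) tuples; the address is hypothetical.
REFERRAL = {
    "answer": [],
    "authority": [("taos.com.", "NS", "ns1.taos.com.")],
    "additional": [("ns1.taos.com.", "A", "192.0.2.53")],
}


def next_hop_addresses(response):
    """Map each delegated nameserver name to its glue address, if present."""
    ns_names = {
        data for (_owner, rtype, data) in response["authority"] if rtype == "NS"
    }
    return {
        owner: data
        for (owner, rtype, data) in response["additional"]
        if rtype in ("A", "AAAA") and owner in ns_names
    }


def needs_glue(zone, ns_name):
    """True when the nameserver's own name lives inside the delegated zone."""
    return ns_name == zone or ns_name.endswith("." + zone)
```

Here needs_glue("taos.com.", "ns1.taos.com.") is true, which is exactly the circular case from the interview question, and next_hop_addresses(REFERRAL) supplies the address that breaks the loop. A nameserver named outside the zone would not need glue, because it can be resolved independently.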
At this point, the nameserver queries the authoritative nameserver for taos.com, and (presumably) gets back an A record for taos.wpengine.com… and caches it.
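The caching step can be illustrated with a small time-to-live cache. DNS records carry a TTL that tells resolvers how long an answer may be reused; the class below is a minimal sketch of that idea, not the cache of any particular nameserver. The DnsCache name, its methods, and the injectable clock are assumptions made for this example.

```python
import time


class DnsCache:
    """Tiny TTL cache keyed by (name, record type)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested
        self._entries = {}

    def put(self, name, rtype, value, ttl):
        """Store value until ttl seconds from now."""
        self._entries[(name.lower(), rtype)] = (value, self._clock() + ttl)

    def get(self, name, rtype):
        """Return the cached value, or None if missing or expired."""
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: force a fresh lookup
            return None
        return value
```

With a 300-second TTL, a second lookup of the same name within five minutes is served from the cache, which is why the cold-cache assumption in the interview question matters: without it, most of the steps above would be skipped.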
If you’ve enjoyed this dive into DNS resolution, please get in touch with Taos; we’d love to spend a couple of hours on the phone with you discussing similar topics in depth!
Note that some web browsers have their own DNS cache; that is disregarded for this question, as it adds an unnecessary layer of complication to what is already a fairly complex process.
Note that the resolver call used to be gethostbyname(); it became getaddrinfo() due to the need for IPv6 support.