Summary

NXDOMAIN (Non-Existent Domain) is the standard response returned by a DNS server when the queried domain name does not exist. NXDOMAIN responses, like all other DNS queries and answers, are logged and can be reported on.

Such reports should show all the queries to a given zone that did not match an existing domain. The value proposition is that this data shows the TLD owner which unregistered domains already have an audience and may be worth registering to capture that traffic.

We will see that while this is true in theory, in practice it may bring more challenges than solutions.


Analysis

DNS Queries

In broad terms:

  • The DNS operates a network of servers, set to respond to queries on domain names.
  • The system is designed to be hierarchical and redundant, resulting in a single query potentially requiring several servers to formulate an answer, and any given answer coming from any one of a number of redundant servers.
  • These servers may be operated authoritatively by the entity responsible for the zone, or recursively by operators who offer short-term cached answers.

A typical query for www.test.com will:

  1. Check internal memory (the local cache) for knowledge of that domain; if no information is found,
  2. Check the network's default DNS (a Recursive server) for the domain; if no information is found,
  3. Analyze the domain (right to left) to determine which levels are already known and which are missing. In our example, we may know where to find .com information, but not test.com.
  4. Interrogate Authoritative servers one level at a time until the information is found. In our example, ask .com about test.com, then ask test.com about www.test.com.
  5. At any level, if the relevant Authoritative server does not find the information, it returns an NXDOMAIN.
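
As a minimal sketch of the right-to-left analysis in step 3, the following lists the zones that would be consulted for a name, from the TLD down. The function name zone_chain is an illustrative assumption and is not part of any DNS library.

```python
def zone_chain(name: str) -> list[str]:
    """Return the zones consulted for `name`, from the TLD down to the full name."""
    labels = name.rstrip(".").lower().split(".")
    # Build suffixes from right to left: com, then test.com, then www.test.com
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


print(zone_chain("www.test.com"))
# ['com', 'test.com', 'www.test.com']
```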

For example: a request for testme.test (which does not exist) returns an NXDOMAIN:

$ dig testme.test

; <<>> DiG 9.8.3-P1 <<>> testme.test
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 31136
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
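
The status shown by dig is carried in the low four bits of the flags field of the DNS message header (RFC 1035). The sketch below, using only the Python standard library, builds a minimal query and reads that field from a reply. The function names and the choice to send a plain UDP query to port 53 are assumptions made for illustration; a production tool would more likely rely on an established DNS library.

```python
import socket
import struct

# Response codes defined in RFC 1035; 3 is "Name Error", shown by dig as NXDOMAIN.
RCODE_NAMES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
               3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for the A record of `name`."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Each label is encoded as a length byte followed by the label itself.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def response_status(message: bytes) -> str:
    """Return the status name stored in the low 4 bits of the header flags."""
    _query_id, flags = struct.unpack("!HH", message[:4])
    rcode = flags & 0x000F
    return RCODE_NAMES.get(rcode, f"RCODE{rcode}")


def query_status(name: str, server: str, timeout: float = 3.0) -> str:
    """Send the query over UDP to `server` (an IPv4 address) and return the status."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        reply, _addr = sock.recvfrom(512)
    return response_status(reply)
```

Calling query_status("testme.test", server) with the address of a reachable resolver should report the same status that dig prints in its header line.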

DNS Logging

Each DNS Server (Authoritative or Recursive) is set to log queries and answers.
Depending on SLAs, this logging may carry more or less information, and may be retained for varying lengths of time.

DNS is extremely verbose, and operators tend to optimize their operations by limiting logging, both in terms of what is logged and how long it is kept. They might, for example, maintain a full log for a day and aggregated data over longer periods of time. Some operators are contractually bound to provide that information; others are not, or (like most Recursive servers) provide their service free of charge and without an SLA.

By design, not all queries in a given TLD reach the TLD operator’s DNS; most are answered directly by Recursive servers along the way. NXDOMAIN responses, however, originate from the Authoritative server: a Recursive server keeps a negative answer only for a limited time (negative caching, RFC 2308), so queries for a non-existent domain return to the Authoritative server once that cached answer expires.

As a TLD Operator you should be able to obtain your TLD DNS logs including NXDOMAINs. As the operator for 3rd or 4th level zones (anything left of mycompany.com for example), depending on your service, you may be able to access this information.


Processing Logs for NXDOMAIN

A typical New gTLD zone will log upward of 1M NXDOMAINs a day.

This large dataset needs to be processed to be interpreted; this involves:

  • Collating logs from a distributed network of installed DNS servers;
  • Extracting NXDOMAIN from general logs;
  • Identifying unique domain names;
  • Counting occurrences of each; and
  • Sorting them;

resulting in lists of 10,000+ results per day.
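
The steps above can be sketched in a few lines of Python. The log line layout used here (timestamp, client, queried name, record type, status, separated by whitespace) is a hypothetical format chosen for illustration; real DNS server logs differ by software and configuration, so the parsing step would need to be adapted.

```python
from collections import Counter
from typing import Iterable


def nxdomain_report(log_sources: Iterable[Iterable[str]]) -> list[tuple[str, int]]:
    """Collate logs, keep NXDOMAIN lines, count unique names and sort them."""
    counts = Counter()
    for source in log_sources:              # collate logs from every server
        for line in source:
            fields = line.split()
            # Hypothetical layout: timestamp client qname qtype status
            if len(fields) != 5 or fields[4] != "NXDOMAIN":
                continue                    # extract NXDOMAIN only
            qname = fields[2].rstrip(".").lower()
            counts[qname] += 1              # identify unique names and count them
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


server_a = [
    "2024-05-01T00:00:01Z 192.0.2.10 shoes.tld. A NXDOMAIN",
    "2024-05-01T00:00:02Z 192.0.2.11 www.nic.tld. A NOERROR",
    "2024-05-01T00:00:03Z 198.51.100.7 wpad.tld. A NXDOMAIN",
]
server_b = [
    "2024-05-01T00:00:04Z 203.0.113.5 Shoes.TLD. AAAA NXDOMAIN",
    "2024-05-01T00:00:05Z 198.51.100.7 wpad.tld. A NXDOMAIN",
    "2024-05-01T00:00:06Z 198.51.100.8 wpad.tld. A NXDOMAIN",
]

for name, count in nxdomain_report([server_a, server_b]):
    print(count, name)
```

Names are lowercased and stripped of the trailing dot before counting, since DNS names are case-insensitive and the same name would otherwise be counted under several spellings.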

We note that the large quantity of data one must process does imply technical overhead, which comes at a cost. Depending on traffic and setup, a provider may need to tailor and limit these processes.
Most TLD DNS operators offer access to their full logs, and most will provide processed NXDOMAIN lists, but these are generally capped: the Top 10,000 entries, or anything above 10 occurrences a day, for example.
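
As an illustration of the two kinds of cap mentioned above, the sketch below applies either a Top-N limit, a minimum daily count, or both to a mapping of names to counts. The function and parameter names are assumptions, and the thresholds are only the example values from the text.

```python
from typing import Mapping, Optional


def cap_report(counts: Mapping[str, int],
               top_n: Optional[int] = 10_000,
               min_count: Optional[int] = None) -> list[tuple[str, int]]:
    """Return the ranked NXDOMAIN list after applying the provider's caps."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    if min_count is not None:
        # "Anything above N occurrences a day": keep strictly greater counts.
        ranked = [(name, count) for name, count in ranked if count > min_count]
    if top_n is not None:
        ranked = ranked[:top_n]             # "Top N entries"
    return ranked


daily = {"wpad.tld": 5200, "shoes.tld": 14, "shoez.tld": 10, "rare.tld": 1}
print(cap_report(daily, top_n=2))
print(cap_report(daily, top_n=None, min_count=10))
```

Either cap can hide exactly the low-volume, human-generated names the report is meant to surface, which is worth keeping in mind when reading a capped list.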


Analyzing NXDOMAIN reports

We often use the analogy of having to sift large amounts of trash for a few pearls.

To us, pearls are human-generated queries for genuine names, or typos of names, which could be registered to capture traffic. Value may also be found in names associated with our TLDs (owned brands and products, or competitors), insofar as the queries are human-generated.

However, a large portion of Internet traffic comes from automated systems which, when they exhibit NXDOMAIN-generating behaviour, do so with such consistency that they bury potential pearls under a volume of useless data.

Where does automated data come from?

1. ICANN
ICANN is the single largest generator of NXDOMAIN. This is by design, as these queries are part of the monitoring of our TLD DNS infrastructure.
Ex. zz--icann-monitoring.tld and zz--icann-sla-monitoring.tld.

2. IP (Intellectual Property) Monitoring
Several intellectual property protection operators run regular checks for top brand domain names in all TLDs in order to monitor for unwanted registrations.

3. Routers
Some router brands use proxy detection or other network detection methods that may result in NXDOMAIN. As these have a very large installed base, any configuration issue results in heavy traffic.
Ex. wpad.tld

4. Browsers
With the advent of combined search/address bars, some search queries, or portions of them, may be interpreted as domain names (and vice versa).
This content could, in principle, be interesting for search studies, but that need is much better served by working directly with Search providers.
For a while, Google Chrome had a behaviour that generated NXDOMAIN for randomized strings, but this has been corrected and is disappearing as its installed base is updated.

5. Other IoT devices
The increasing use of Internet-enabled devices can generate large amounts of undesired, meaningless traffic.
In the recent past, we have seen network failures caused by such devices.
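
A first pass at separating known automated sources from candidate names could look like the sketch below. It covers only the two patterns named in the examples above, assuming the monitoring labels begin with "zz--icann-" (two ASCII hyphens) and that router traffic shows up as the label "wpad". Brand-monitoring checks, randomized browser strings and device traffic cannot be recognized from the name alone and are not handled here. The pattern table and function names are illustrative assumptions.

```python
import re

# Known automated sources, keyed by a descriptive tag (illustrative, not exhaustive).
NOISE_PATTERNS = {
    "icann-monitoring": re.compile(r"^zz--icann-[a-z0-9-]*$"),
    "proxy-discovery": re.compile(r"^wpad$"),
}


def classify(qname: str, tld: str) -> str:
    """Tag a queried name as a known noise source, a candidate, or out of zone."""
    name = qname.rstrip(".").lower()
    suffix = "." + tld.lower().strip(".")
    if not name.endswith(suffix):
        return "other-zone"
    # Inspect the label immediately to the left of the TLD.
    label = name[: -len(suffix)].split(".")[-1]
    for source, pattern in NOISE_PATTERNS.items():
        if pattern.match(label):
            return source
    return "candidate"


def candidates(report: list[tuple[str, int]], tld: str) -> list[tuple[str, int]]:
    """Keep only the entries that match no known noise pattern."""
    return [(name, count) for name, count in report
            if classify(name, tld) == "candidate"]


report = [("zz--icann-monitoring.tld", 86400), ("wpad.tld", 5200), ("shoes.tld", 14)]
print(candidates(report, "tld"))
```

What remains after this filter is still a mix of human and automated queries, so it narrows the list rather than identifying the pearls.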