An investigation into slow sudo command

In recent days I created a new virtual machine with ubuntu 20.04, but I found the terminal command with sudo took a long time, almost 3 seconds each time. When I seach sudo command slow in Google, the first two search results are both from stackflow and suggest adding an entry 127.0.0.1 hostname to /etc/hosts file. I checked /etc/hosts and found the entry 127.0.0.1 hostname exists (in fact I made a mistake here, which I will describe later). This article describes the investigation into this issue in timeline.

Investigation process

Firstly I use the strace to record system calls information of a sudo command.

1
sudo strace -r -T -o sudo_strace.log -e trace=all sudo echo hi

From the output of strace I find a suspicious point, poll read from fd=4 costs 2.915s, based on the context of this system call, it is a DNS resolution for hostname ubuntuwork.internal.xxx. There exist several questions

  1. Why the DNS resolution for hostname doesn’t use host table directly
  2. Why there is a DNS resolution for ubuntuwork.internal.xxx
  3. Why sudo command has a DNS resolution for hostname
  4. Is there a DNS cache working
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
0.000103 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 4 <0.000043>
0.000090 setsockopt(4, SOL_IP, IP_RECVERR, [1], 4) = 0 <0.000035>
0.000085 connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.53")}, 16) = 0 <0.000041>
0.000093 poll([{fd=4, events=POLLOUT}], 1, 0) = 1 ([{fd=4, revents=POLLOUT}]) <0.000034>
0.000087 sendmmsg(4, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="C\324\1 \0\1\0\0\0\0\0\1\nubuntuwork\10internal"..., iov_len=60}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=60}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\242\310\1 \0\1\0\0\0\0\0\1\nubuntuwork\10internal"..., iov_len=60}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=60}], 2, MSG_NOSIGNAL) = 2 <0.000355>
0.000546 poll([{fd=4, events=POLLIN}], 1, 5000) = 1 ([{fd=4, revents=POLLIN}]) <2.915181>
2.915528 ioctl(4, FIONREAD, [60]) = 0 <0.000168>
0.000382 recvfrom(4, "C\324\201\203\0\1\0\0\0\0\0\1\nubuntuwork\10internal"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.53")}, [28->16]) = 60 <0.000062>
0.000185 poll([{fd=4, events=POLLIN}], 1, 2083) = 1 ([{fd=4, revents=POLLIN}]) <0.000057>
0.000140 ioctl(4, FIONREAD, [60]) = 0 <0.000078>
0.000170 brk(0x5582dd4ce000) = 0x5582dd4ce000 <0.000060>
0.000142 recvfrom(4, "\242\310\201\203\0\1\0\0\0\0\0\1\nubuntuwork\10internal"..., 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.53")}, [28->16]) = 60 <0.000066>
0.000390 close(4) = 0 <0.000085>
0.000159 brk(0x5582dd4be000) = 0x5582dd4be000 <0.000061>
0.000112 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 4 <0.000041>
  1. For the first question, I re-check the hosts file and hostname, and find the mistake in hostname setting. From hostname(7) we can see valid characters for hostnames are ASCII(7) letters from a to z, the digits from 0 to 9, and the hyphen (-). Since the underscore is invalid character in hostname, the system call gethostbyname will return ubuntuwork and it is not listed in host table. The DNS resolution procedure will first query host table, if the query item is not found, system will continue to check the search domain.
1
2
3
4
5
6
7
8
9
➜  hostname
ubuntuwork

cat /etc/hosts
127.0.0.1 localhost ubuntu_work
127.0.1.1 ubuntu_work

cat /etc/hostname
ubuntu_work
  1. By default, the search field in /etc/resolv.conf is empty but it can be configured with one or more search domains. After I check the /etc/resolv.conf I find internal.xxx in the search field. That explains why there is a DNS resolution for ubuntuwork.internal.xxx. There is another question, why does the dns lookup for a no such name host cost so much time?
1
2
3
nameserver 127.0.0.53
options edns0 trust-ad
search internal.xxx
  1. For the third question, in order for sudo to know whether this rule should be applied, it needs to lookup the host it is running on. More details are explained in a stackoverflow answer sudo command trying to search for hostname.

  2. Is there a DNS cache for linux server? The answer is no in OS-level unless a caching service such as Systemd-Resolved, DNSMasq, or Nscd is installed and running. In ubuntu 20.04 systemd-resolved is enabled by default, and I check the cached items by following commands. The USR1 signal will not stop the service but tells systemd-resolved to write all the current cache entries (and some other information) to the system log. After checking the output file for cached items by searching CACHE:. ubuntuwork is not in the cached items, which explains why the sudo is slow each time.

1
2
$ sudo killall -USR1 systemd-resolved
$ sudo journalctl -u systemd-resolved > ./dns-cache.txt

Summary

After I have figured out the root cause, I find either of the following ways can workaround and make the sudo run normally.

  • Add an entry of 127.0.0.1 ubuntuwork to /etc/hosts
  • Remove the search line in /etc/resolv.conf

And actually, I should correct the hostname, such as renaming hostname from ubuntuwork to ubuntu-work and add corresponding entry to /etc/hosts.

Reference