Fix(DNS): Handle EAI_NODATA as success (empty address list) in getaddrinfo #649

Merged
djs55 merged 1 commit into moby:master from keigoi:fix-issue-509-try2 on Jul 3, 2025


@keigoi
Contributor

@keigoi keigoi commented Jun 26, 2025

This pull request addresses a long-standing issue where DNS lookups within containers, especially those running in rootless Docker-in-Docker (DIND) environments using VPNKit, would fail with an NXDOMAIN error.

This often occurred even when valid IP addresses were present in the DNS response, or for specific record types like SRV records.

Problem:

  • DNS Lookup Errors in Rootless DIND

    • Related Issue: DNS lookup error in containers in rootless DIND (VPNkit) moby#47628
    • When running containers in dind-rootless mode, DNS lookups would fail with NXDOMAIN.
    • This was particularly evident when the upstream DNS server stripped out IPv6 addresses. For example, commands like apk add in alpine containers would fail due to these DNS errors.
    • This behavior was reproduced across multiple dind images.
  • NXDOMAIN for SRV Records

The core of the issue stemmed from VPNKit treating the EAI_NODATA error code from getaddrinfo as a fatal failure. EAI_NODATA indicates that the name exists, but there are no addresses of the requested type (e.g., no AAAA records if IPv6 is filtered out, or no SRV records for a specific query).

Solution:

This pull request modifies VPNKit's DNS handling to interpret EAI_NODATA as a successful result, specifically as an empty address list. By doing so, DNS lookups can complete successfully even when an upstream DNS server returns an empty list for a particular query type.
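
As a rough illustration of the idea (not the actual patch, which lives in VPNKit's own codebase), the following Python sketch wraps getaddrinfo so that EAI_NODATA yields an empty address list while other resolver errors still propagate. The function name resolve_addresses and its lookup parameter are hypothetical and exist only for this example; EAI_NODATA is looked up defensively because not every platform defines it.

```python
import socket

# EAI_NODATA is not defined on every platform, so look it up defensively.
EAI_NODATA = getattr(socket, "EAI_NODATA", None)


def resolve_addresses(host, family=socket.AF_UNSPEC, lookup=socket.getaddrinfo):
    """Return the addresses found for `host` (hypothetical helper).

    EAI_NODATA means the name exists but has no records of the requested
    type, so it is reported as an empty list instead of a failure.
    `lookup` is injectable only so the behaviour can be exercised offline.
    """
    try:
        results = lookup(host, None, family, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        if EAI_NODATA is not None and exc.errno == EAI_NODATA:
            return []  # name exists, no addresses of this type
        raise  # e.g. EAI_NONAME: the name really does not resolve
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return [sockaddr[0] for (_fam, _type, _proto, _canon, sockaddr) in results]
```

With this shape, a caller that asks for AAAA records on an IPv4-only name gets an empty list and can answer with an empty record set rather than NXDOMAIN.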

Expected Outcome:

Notes

Acknowledgments

  • Special thanks to @joanbm, who is explicitly listed as a co-author on this commit (the full name and email address were taken from their own public repo).

  • A special thanks to Tomoya Kawaguchi (@yamoyamoto) for their invaluable help in debugging this issue. Tomoya added debug logs to narrow down the problem and confirmed that nslookup no longer returned NXDOMAIN within the Alpine images in the DIND environment.

Co-authored-by: Joan Bruguera Micó <joanbrugueram@gmail.com>
@dan0dbfe
Contributor

dan0dbfe commented Jul 2, 2025

Looks like something changed that broke the dep solver in the build stage of ocaml-ci.

I'm guessing the package upgrades are unintentional but a side effect of opam-repository updating something.

In the last PR that worked:

Solving with opam-repository commit: https://github.com/ocaml/opam-repository.git#refs/heads/master (ad2202b486885c0c13f146ead5c85646ce87e24e)

and it used cmdliner.1.0.4 for example.

However this PR's run failed. The commit used was:

Solving with opam-repository commit: https://github.com/ocaml/opam-repository.git#refs/heads/master (3e4a334c6caed0798833ef331ef815eede191f03)

and it failed to solve as, amongst other problems, alcotest 1.9.0 requires >= 1.2.0

@djs55
Collaborator

djs55 commented Jul 3, 2025

I think the CI failures are a separate issue. I did a test build locally with docker build -t test . and it was fine.

@keigoi thanks for the detailed explanation and patch. It looks safe so I'll merge it and we can try to fix the CI separately.

@djs55 djs55 merged commit 4fd14e5 into moby:master Jul 3, 2025
5 of 9 checks passed