# OpenShift DNS – Internals
## CoreDNS
Under the hood, OpenShift runs [CoreDNS](https://coredns.io/), which handles DNS resolution for pods inside the cluster.
The DNS service runs as a DaemonSet (tolerating control-plane taints and targeting nodes that match `kubernetes.io/os=linux`), so one pod runs on every Linux node in the cluster.
```shell-session
$ oc get po -n openshift-dns
NAME                READY   STATUS    RESTARTS   AGE
dns-default-4zjw4   2/2     Running   2          5d18h
dns-default-7hznz   2/2     Running   2          5d18h
dns-default-878q4   2/2     Running   2          5d18h
dns-default-rnqhg   2/2     Running   2          5d18h
dns-default-rwc2h   2/2     Running   2          5d18h
dns-default-s9bpl   2/2     Running   2          5d18h
```
Example CoreDNS config:
```text
.:5353 {
    bufsize 1232
    errors
    log . {
        class error
    }
    health {
        lameduck 20s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus 127.0.0.1:9153
    forward . /etc/resolv.conf {
        policy sequential
    }
    cache 900 {
        denial 9984 30
    }
    reload
}
hostname.bind:5353 {
    chaos
}
```
Notable options from the configuration include:
- Binds to port 5353 (the Service remaps it to port 53)
- Configures CoreDNS to serve the Kubernetes `*.cluster.local` domain (the default cluster domain) and reverse DNS for both IPv4 and IPv6. If a reverse lookup is not found in CoreDNS's database, it "falls through" to the next plugin for resolution – `forward` in this case
- The `kubernetes` plugin talks directly to the Kubernetes API and watches for changes in endpoints. Updates are essentially instant – the plugin keeps a small cache, but it is invalidated whenever endpoints change.
- Everything else is forwarded to the upstream resolvers listed in `/etc/resolv.conf`
- Caching (most relevant for forwarded queries, since cluster records carry a 5-second TTL):
  - Positive responses for up to 15 minutes (900 s)
  - Negative responses (NXDOMAIN) for up to 30 seconds, with capacity for 9984 entries
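The effect of the `cache 900 { denial 9984 30 }` stanza can be illustrated with a toy TTL cache (a sketch of the semantics, not actual CoreDNS code; the names and values here are only for illustration):

```python
import time

POSITIVE_CAP = 900   # "cache 900": positive answers kept at most 15 minutes
NEGATIVE_CAP = 30    # "denial 9984 30": NXDOMAIN kept at most 30 seconds

class TtlCache:
    """Toy model of CoreDNS's cache directive."""
    def __init__(self):
        self._store = {}

    def put(self, name, answer, ttl, negative=False, now=None):
        now = time.time() if now is None else now
        cap = NEGATIVE_CAP if negative else POSITIVE_CAP
        # The record's own TTL still wins when shorter than the cap.
        self._store[name] = (answer, now + min(ttl, cap))

    def get(self, name, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(name)
        if entry and now < entry[1]:
            return entry[0]
        return None
```

Note that a long upstream TTL (say, one hour) is clamped to the 900-second cap, while negative answers expire after at most 30 seconds regardless of the SOA minimum.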
`openshift-dns` pods are deployed with `dnsPolicy: Default`, so their `/etc/resolv.conf` is the worker node's real resolver configuration. In this case:
```text
search ocp.faber.sh
nameserver 10.0.40.1
```
## `dnsPolicy`
This setting determines how DNS queries are resolved within the pods. The available `dnsPolicy` options and their implications are as follows:
- **`Default`**: This policy causes the pod to inherit the DNS configuration of the worker node it's running on. Essentially, DNS queries from the pod are resolved in the same manner as queries originating from the node itself.
- **`None`**: When this policy is selected, the pod ignores the standard DNS configuration provided by Kubernetes. Instead, the `dnsConfig` option must be explicitly specified to define how DNS queries should be handled. This policy is typically used for custom DNS setups.
- **`ClusterFirst`**: With this policy, DNS queries from the pod are initially directed to the cluster's DNS service, typically CoreDNS. If the query does not correspond to any domain within the cluster, it is then forwarded to upstream DNS servers for resolution. This ensures that internal cluster resources are prioritized in DNS resolution – this one is **the default in practice**.
- **`ClusterFirstWithHostNet`**: This policy is used in conjunction with the `hostNetwork` setting. It directs DNS queries to the cluster's DNS service first, similar to `ClusterFirst`. However, if the DNS query does not match any cluster domains, it is forwarded to the DNS servers configured for the host network. This policy is suitable for pods that need to interact with both cluster resources and external services while using the host's network namespace.
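As an illustration of the `None` policy, a pod can pin its own resolver configuration via `dnsConfig` (the nameserver IP, search domain, and image below are placeholders, not values from this cluster):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: custom-dns
spec:
  dnsPolicy: None          # ignore the cluster-provided DNS configuration
  dnsConfig:
    nameservers:
      - 192.0.2.53         # placeholder upstream resolver
    searches:
      - example.internal   # placeholder search domain
    options:
      - name: ndots
        value: "2"
  containers:
    - name: app
      image: registry.access.redhat.com/ubi9/ubi-minimal
      command: ["sleep", "infinity"]
```

Kubernetes renders these fields verbatim into the container's `/etc/resolv.conf`.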
## How do pods resolve hostnames inside a cluster?
The default `dnsPolicy` is `ClusterFirst`, so the `/etc/resolv.conf` injected into each container points DNS resolution at the CoreDNS Service.
Inside pods, applications (more precisely, resolvers such as *glibc* or the Go resolver) use this configuration for DNS resolution (example from a pod in the `argocd` namespace):
> [!info]
> *glibc* does not perform any DNS caching on its own (however, it can be coupled with [nscd](https://linux.die.net/man/8/nscd) for caching).
```text
search argocd.svc.cluster.local svc.cluster.local cluster.local ocp.faber.sh
nameserver 172.30.0.10
options ndots:5
```
- **`search`**: The `search` line lists domain suffixes to append to a query name that is not fully qualified. Combined with `ndots:5`, any name containing fewer than 5 dots (e.g., "my-service") is first tried with each search domain appended, in order, until a lookup succeeds or the list is exhausted – this expansion is performed by the resolver in the container (e.g., *glibc*), not by Kubernetes itself.
- **`nameserver`**: This line specifies the DNS server to which the queries should be sent. The IP `172.30.0.10` corresponds to the CoreDNS service within the cluster. This is a ClusterIP, which Kubernetes uses to expose a service internally within the cluster.
- **`options ndots:5`**: This option configures the resolver to treat a query as fully-qualified (and thus not use the search list) only if it contains 5 or more dots. This is a somewhat high value, tailored to Kubernetes' DNS resolution practices, allowing short names to be searched within the specified search domains.
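The query order produced by the `search`/`ndots` combination above can be sketched as follows (a simplified model of glibc's behavior, using the search list from this pod's `/etc/resolv.conf`):

```python
NDOTS = 5
SEARCH = ["argocd.svc.cluster.local", "svc.cluster.local",
          "cluster.local", "ocp.faber.sh"]

def query_order(name):
    """Return the FQDNs a glibc-style resolver tries, in order."""
    if name.endswith("."):
        return [name]                           # already fully qualified
    if name.count(".") >= NDOTS:
        # Enough dots: try the name as-is first, then the search list.
        return [name + "."] + [f"{name}.{d}." for d in SEARCH]
    # Few dots: walk the search list first, then try the name as-is.
    return [f"{name}.{d}." for d in SEARCH] + [name + "."]
```

For `argocd-server`, the very first candidate (`argocd-server.argocd.svc.cluster.local.`) resolves; for `argocd-server.argocd.svc`, the first two candidates fail with NXDOMAIN before the third one succeeds – exactly what the `nslookup -debug` transcripts below show.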
> [!tip]- Prefer `my-service` over `my-service.namespace.svc` for in-cluster service connections.
>
> Single DNS request for Service `argocd-server`:
> ```shell-session
> $ nslookup -debug argocd-server
> Server: 172.30.0.10
> Address: 172.30.0.10#53
>
> ------------
> QUESTIONS:
> argocd-server.argocd.svc.cluster.local, type = A, class = IN
> ANSWERS:
> -> argocd-server.argocd.svc.cluster.local
> internet address = 172.30.208.93
> ttl = 5
> AUTHORITY RECORDS:
> ADDITIONAL RECORDS:
> ------------
> Name: argocd-server.argocd.svc.cluster.local
> Address: 172.30.208.93
> ------------
> QUESTIONS:
> argocd-server.argocd.svc.cluster.local, type = AAAA, class = IN
> ANSWERS:
> AUTHORITY RECORDS:
> -> cluster.local
> origin = ns.dns.cluster.local
> mail addr = hostmaster.cluster.local
> serial = 1711260333
> refresh = 7200
> retry = 1800
> expire = 86400
> minimum = 5
> ttl = 5
> ADDITIONAL RECORDS:
> ------------
> ```
>
> Extra (failed) DNS queries when using `argocd-server.argocd.svc`:
> ```shell-session
> $ nslookup -debug argocd-server.argocd.svc
> Server: 172.30.0.10
> Address: 172.30.0.10#53
>
> ------------
> QUESTIONS:
> argocd-server.argocd.svc.argocd.svc.cluster.local, type = A, class = IN
> ANSWERS:
> AUTHORITY RECORDS:
> -> cluster.local
> origin = ns.dns.cluster.local
> mail addr = hostmaster.cluster.local
> serial = 1711260333
> refresh = 7200
> retry = 1800
> expire = 86400
> minimum = 5
> ttl = 5
> ADDITIONAL RECORDS:
> ------------
> ** server can't find argocd-server.argocd.svc.argocd.svc.cluster.local: NXDOMAIN
> Server: 172.30.0.10
> Address: 172.30.0.10#53
>
> ------------
> QUESTIONS:
> argocd-server.argocd.svc.svc.cluster.local, type = A, class = IN
> ANSWERS:
> AUTHORITY RECORDS:
> -> cluster.local
> origin = ns.dns.cluster.local
> mail addr = hostmaster.cluster.local
> serial = 1711260333
> refresh = 7200
> retry = 1800
> expire = 86400
> minimum = 5
> ttl = 5
> ADDITIONAL RECORDS:
> ------------
> ** server can't find argocd-server.argocd.svc.svc.cluster.local: NXDOMAIN
> Server: 172.30.0.10
> Address: 172.30.0.10#53
>
> ------------
> QUESTIONS:
> argocd-server.argocd.svc.cluster.local, type = A, class = IN
> ANSWERS:
> -> argocd-server.argocd.svc.cluster.local
> internet address = 172.30.208.93
> ttl = 5
> AUTHORITY RECORDS:
> ADDITIONAL RECORDS:
> ------------
> Name: argocd-server.argocd.svc.cluster.local
> Address: 172.30.208.93
> ------------
> QUESTIONS:
> argocd-server.argocd.svc.cluster.local, type = AAAA, class = IN
> ANSWERS:
> AUTHORITY RECORDS:
> -> cluster.local
> origin = ns.dns.cluster.local
> mail addr = hostmaster.cluster.local
> serial = 1711260333
> refresh = 7200
> retry = 1800
> expire = 86400
> minimum = 5
> ttl = 5
> ADDITIONAL RECORDS:
> ------------
> ```
The `ClusterIP` Service that in-pod resolvers send their queries to:
```shell-session
$ oc get svc -o wide -n openshift-dns
NAME          TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                  AGE   SELECTOR
dns-default   ClusterIP   172.30.0.10   <none>        53/UDP,53/TCP,9154/TCP   26d   dns.operator.openshift.io/daemonset-dns=default
```
```shell-session
$ oc get endpoints -o wide -n openshift-dns
NAME          ENDPOINTS                                                       AGE
dns-default   10.128.0.9:5353,10.128.2.3:5353,10.129.0.7:5353 + 15 more...   26d
```
```shell-session
$ ovn-nbctl lb-list
eeb98e2d-2f6e-4764-9492-220fcea2a5d4 Service_openshif udp 172.30.0.10:53 10.128.0.9:5353,10.128.2.3:5353,10.129.0.7:5353,10.129.2.3:5353,10.130.0.7:5353,10.131.0.3:5353
$ ovn-nbctl find Load_Balancer
_uuid : eeb98e2d-2f6e-4764-9492-220fcea2a5d4
external_ids : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="openshift-dns/dns-default"}
health_check : []
ip_port_mappings : {}
name : "Service_openshift-dns/dns-default_UDP_node_router_w1.ocp.faber.sh"
options : {event="false", hairpin_snat_ip="169.254.169.5 fd69::5", neighbor_responder=none, reject="true", skip_snat="false"}
protocol : udp
selection_fields : []
vips : {"172.30.0.10:53"="10.128.0.9:5353,10.128.2.3:5353,10.129.0.7:5353,10.129.2.3:5353,10.130.0.7:5353,10.131.0.3:5353"}
```
No load-balancing strategy is configured for this Service (note the empty `selection_fields`), so OVNKubernetes distributes connections across the backends in a stateless, round-robin fashion.
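Conceptually, the VIP translation works like the toy model below (illustrative Python, not OVN code): each new connection picks the next backend, and established flows stick to their backend via connection tracking. The backend addresses are taken from the `ovn-nbctl` output above; the flow tuples are made up for the example.

```python
import itertools

# First three backends from the dns-default load balancer shown above.
BACKENDS = ["10.128.0.9:5353", "10.128.2.3:5353", "10.129.0.7:5353"]

class VipLoadBalancer:
    """Toy model of per-connection DNAT for a Service VIP."""
    def __init__(self, backends):
        self._rr = itertools.cycle(backends)
        self._conntrack = {}   # flow 5-tuple -> chosen backend

    def dnat(self, flow):
        # New flows pick the next backend; known flows keep theirs.
        if flow not in self._conntrack:
            self._conntrack[flow] = next(self._rr)
        return self._conntrack[flow]
```

The "state" here lives only in the connection-tracking table; the selection itself carries no health or weight information, matching the empty `health_check` and `selection_fields` above.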
> [!info]
>
> **OpenShift SDN** used iptables to manage the Service network in the past. **OVNKubernetes** now manages it through Open vSwitch.
>
> iptables is still used for NodePort and LoadBalancer (external IP) Services.
## Resources
- [Understanding the DNS Operator | Networking | OKD 4](https://docs.okd.io/latest/networking/dns-operator.html)
- https://rcarrata.com/openshift/dns-forwarding-openshift/
- https://rcarrata.com/openshift/dns-deep-dive-in-openshift/
- [OVN-Kubernetes architecture - OVN-Kubernetes network plugin | Networking | OpenShift Container Platform 4.14](https://docs.openshift.com/container-platform/4.14/networking/ovn_kubernetes_network_provider/ovn-kubernetes-architecture-assembly.html)