My goal is to call a service on one AKS cluster (aks1) from a pod or service on a second AKS cluster (aks2).
The clusters are in different regions and should communicate over a private network. Both clusters use the Azure CNI plugin.
So, after some reading and video watching, it seemed to me that the best option was an ExternalName service on aks2 pointing to a hostname defined in a custom Private DNS zone (ecommerce.private.eu.dev), with the two VNets peered beforehand.
This seems to be the VNet providing the address space for the AKS services:
dev-vnet 10.0.0.0/14
=======================================
dev-test1-aks v1.22.4 - 1 node
dev-test1-vnet 11.0.0.0/16
dev-test2-aks v1.22.4 - 1 node
dev-test2-vnet 11.1.0.0/16
After a lot of trials, all I can get is connectivity between the pod networks; I can never reach the service network from the other cluster.
- I don’t see any active firewall
- I’ve peered all three networks: dev-test1-vnet, dev-test2-vnet, dev-vnet (services CIDR)
- I’ve created a Private DNS zone private.eu.dev with an "ecommerce" A record (10.0.129.155) that should be resolved by the ExternalName service (a sketch of this setup follows the list)
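For reference, this is roughly how the peering and the Private DNS zone can be created; the resource group (dev-rg) and the peering/link names here are placeholders, not my real ones:

# peer dev-test2-vnet with dev-test1-vnet; a second peering is needed in the
# opposite direction, and --remote-vnet needs the full resource ID if the
# VNets live in different resource groups
az network vnet peering create --resource-group dev-rg --name test2-to-test1 \
  --vnet-name dev-test2-vnet --remote-vnet dev-test1-vnet --allow-vnet-access

# create the private zone, the A record, and link the zone to the consumer VNet
az network private-dns zone create --resource-group dev-rg --name private.eu.dev
az network private-dns record-set a add-record --resource-group dev-rg \
  --zone-name private.eu.dev --record-set-name ecommerce --ipv4-address 10.0.129.155
az network private-dns link vnet create --resource-group dev-rg --zone-name private.eu.dev \
  --name test2-link --virtual-network dev-test2-vnet --registration-enabled false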
dev-test1-aks (EU cluster):
kubectl create deployment eu-ecommerce --image=k8s.gcr.io/echoserver:1.4 --port=8080 --replicas=1
kubectl expose deployment eu-ecommerce --type=ClusterIP --port=8080 --name=eu-ecommerce
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.1/deploy/static/provider/cloud/deploy.yaml
kubectl create ingress eu-ecommerce --class=nginx --rule=eu.ecommerce/*=eu-ecommerce:8080 -o yaml --dry-run=client
This is the ingress rule I ended up applying:
❯ kubectl --context=dev-test1-aks get ingress eu-ecommerce-2 -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: eu-ecommerce-2
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: lb.private.eu.dev
    http:
      paths:
      - backend:
          service:
            name: eu-ecommerce
            port:
              number: 8080
        path: /ecommerce
        pathType: Prefix
status:
  loadBalancer:
    ingress:
    - ip: 20.xxxxx
This is one of the ExternalName services I’ve tried on dev-test2-aks:
apiVersion: v1
kind: Service
metadata:
  name: eu-services
  namespace: default
spec:
  type: ExternalName
  externalName: ecommerce.private.eu.dev
  ports:
  - port: 8080
    protocol: TCP
These are some of my tests:
# --- Test externalName
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-services:8080
: '
wget: can't connect to remote host (10.0.129.155): Connection timed out
'
# --- Test connectivity AKS1 -> eu-ecommerce service
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-ecommerce:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://10.0.129.155:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://eu-ecommerce.default.svc.cluster.local:8080
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://ecommerce.private.eu.dev:8080
# OK client_address=11.0.0.11
# --- Test connectivity AKS2 -> eu-ecommerce POD
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://11.0.0.103:8080
#> OK
# --- Test connectivity AKS2 -> eu-ecommerce service
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://ecommerce.private.eu.dev:8080
#> FAIL
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://10.0.129.155:8080
# --- Test connectivity - LB private IP
kubectl --context=dev-test1-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget --no-cache -qO- http://lb.private.eu.dev/ecommerce
#> OK
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget --no-cache -qO- http://lb.private.eu.dev/ecommerce
#> KO wget: can't connect to remote host (10.0.11.164): Connection timed out
# --- Traceroute gives no information
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- traceroute -n -m4 ecommerce.private.eu.dev
: '
* * *
3 * * *
4 * * *
'
# --- test2-aks can see the private dns zone and resolve the hostname
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- nslookup ecommerce.private.eu.dev
: ' Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
Name: ecommerce.private.eu.dev
Address 1: 10.0.129.155
'
I’ve also created inbound and outbound network rules for the AKS networks (a sketch follows the list):
- on dev-aks (10.0/16): allow all inbound from 11.1/16 and 11.0/16
- on dev-test2-aks: allow any outbound
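For reference, a sketch of the inbound rule, assuming these are NSG rules and using hypothetical NSG and resource-group names (dev-aks-nsg, dev-rg):

az network nsg rule create --resource-group dev-rg --nsg-name dev-aks-nsg \
  --name allow-from-test-vnets --priority 100 --direction Inbound --access Allow \
  --protocol '*' --source-address-prefixes 11.0.0.0/16 11.1.0.0/16 \
  --destination-port-ranges '*'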
Docs I’ve seen:
- https://learn.microsoft.com/en-us/azure/aks/private-clusters#virtual-network-peering
- https://kubernetes.io/docs/concepts/services-networking/service/#externalname
- https://learn.microsoft.com/en-us/azure/dns/private-dns-getstarted-portal#create-a-private-dns-zone
- https://learn.microsoft.com/en-us/azure/virtual-network/virtual-network-peering-overview
- https://www.youtube.com/watch?v=J4S6AxYNDtM
=======================================
Answers
Solved using an internal load balancer on the peered VNet.
The load balancer's address comes from the VNet address space, so it is routable and reachable from a peered network, and you can still use an ExternalName service or an external IP to reach it from the other cluster's services.
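A minimal sketch of such an internal load balancer for the ingress-nginx controller on dev-test1-aks; the annotation is the standard AKS one, the selector labels are the ones the ingress-nginx manifest uses, and this is an outline rather than the complete generated service:

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    # ask AKS for an internal load balancer: the frontend IP is taken
    # from the cluster VNet instead of a public prefix
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: http

The private EXTERNAL-IP it receives (kubectl -n ingress-nginx get svc ingress-nginx-controller) is the address the lb.private.eu.dev A record should point to.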
=======================================
In AKS the service CIDR is not part of your VNet address space, so it is not routed by Azure in any way; you won't be able to connect from a pod directly to a service in another cluster.
What you have to do is expose the service (or an ingress in front of it) through an internal load balancer, so it gets a private, routable IP from the VNet address space, and point your DNS record at that IP.
With this, your high-level communication scheme looks like this: (aks1)pod -> (aks2)lb -> (aks2)ingress -> (aks2)service -> (aks2)pods
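You can confirm the service CIDR really is outside the VNet address space, and then re-run the test that previously timed out once lb.private.eu.dev points at the internal load balancer's private IP (dev-rg is a placeholder resource group):

# shows the service CIDR, which is not part of dev-test1-vnet (11.0.0.0/16)
az aks show --resource-group dev-rg --name dev-test1-aks --query networkProfile.serviceCidr -o tsv

# this should now succeed from the second cluster
kubectl --context=dev-test2-aks run -it --rm --restart=Never busybox --image=gcr.io/google-containers/busybox -- wget -qO- http://lb.private.eu.dev/ecommerce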