I could write up my own how-to and deployment guide here. However, Dean has done a fantastic job of documenting the deployment of the TKG resources within Azure, so look no further if you are looking for a guide.
I deployed TKG 1.4 into Azure, and the focus of this article is the documentation surrounding Azure networking.
The official documentation states:
“For each workload cluster that you deploy later, you need to create a worker NSG named CLUSTER-NAME-node-nsg, where CLUSTER-NAME is the name of the workload cluster. This worker NSG must have the same VNET and region as its management cluster.”
Having followed the guide, created the required NSG object in Azure, and deployed test application workloads, I found I was unable to access public load-balanced IPs on the Azure Tanzu Kubernetes Grid cluster.
Using the tooling native to Azure, I was soon presented with the following errors and a possible cause.
After a hunt around, I found that when I added a very open rule directly to the worker VM cluster object, the test application workloads started functioning.
As you can see, the rule was very open and for testing purposes only; it was not a tenable solution for a production deployment.
After working logically through each of the created objects, a bit of beard pulling, and asking a few colleagues, I discovered that the association between the workload cluster subnet and the workload cluster NSG was missing.
Associating the subnet with the NSG fixed the issue and the application workloads started working.
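For anyone who prefers the command line to the portal, here is a minimal sketch of the check and the fix. It only builds the Azure CLI commands and prints them for review, it does not run anything. The resource group, VNET, subnet, and cluster names are hypothetical placeholders, not values from my deployment, and the sketch assumes the NSG sits in the same resource group as the VNET (if it does not, pass the NSG's full resource ID instead of its name).

```python
import shlex


def worker_nsg_name(cluster_name: str) -> str:
    """NSG name the TKG documentation expects for a workload cluster."""
    return f"{cluster_name}-node-nsg"


def show_subnet_nsg_cmd(resource_group: str, vnet: str, subnet: str) -> list[str]:
    """Build the az command that prints the ID of the NSG attached to a subnet.

    Empty output from this command means no NSG is associated.
    """
    return [
        "az", "network", "vnet", "subnet", "show",
        "--resource-group", resource_group,
        "--vnet-name", vnet,
        "--name", subnet,
        "--query", "networkSecurityGroup.id",
        "--output", "tsv",
    ]


def associate_subnet_nsg_cmd(
    resource_group: str, vnet: str, subnet: str, nsg: str
) -> list[str]:
    """Build the az command that associates an NSG with a subnet."""
    return [
        "az", "network", "vnet", "subnet", "update",
        "--resource-group", resource_group,
        "--vnet-name", vnet,
        "--name", subnet,
        "--network-security-group", nsg,
    ]


if __name__ == "__main__":
    # Hypothetical names, replace with your own before running anything.
    rg, vnet, subnet, cluster = "tkg-rg", "tkg-vnet", "workload-subnet", "my-workload"
    nsg = worker_nsg_name(cluster)
    print(shlex.join(show_subnet_nsg_cmd(rg, vnet, subnet)))
    print(shlex.join(associate_subnet_nsg_cmd(rg, vnet, subnet, nsg)))
```

Run the printed `show` command first: if it returns nothing, the subnet has no NSG attached and the `update` command is the one to try.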
So if you are deploying TKG in Azure and discover that you are unable to access public load-balanced IPs on an Azure Tanzu Kubernetes Grid cluster, check the association of the workload subnet to the NSG first. It might save you a few hours of troubleshooting.