More Networking Woes...

More Problems

After fixing the issues introduced by enabling the Talos ingress firewall, I quickly realized that I wasn’t completely back to normal yet: some of the deployed services had problems connecting to resources on the local network. The first thing I noticed was that this seemed to be limited to pods with an address from the Pod CIDR; pods configured with host networking seemed to work properly.
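
That distinction pointed at traffic sourced from Pod CIDR addresses specifically. For illustration, a pod running with host networking - the kind that kept working - simply uses the node’s own address instead of a Pod CIDR one; the manifest below is a hypothetical example, not one of my actual workloads.

```yaml
# Hypothetical pod, only to illustrate the distinction: with hostNetwork
# enabled, the pod shares the node's network namespace and therefore
# does not get an address from the Pod CIDR.
apiVersion: v1
kind: Pod
metadata:
  name: hostnet-test
spec:
  hostNetwork: true
  containers:
    - name: probe
      image: busybox
      command: ["sleep", "3600"]
```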

Assuming that this was somehow related to the recent changes, I revisited all the changes I made as part of the ingress firewall configuration and the unfortunate re-deployment of Calico:

  • The talhelper configuration, including switching the firewall off again
  • The Tigera and Felix configurations, disabling and re-enabling BPF (see the sketch after this list)
  • Any Global Network Policy that might be causing the problem
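
The BPF toggle mentioned above lives in Calico’s FelixConfiguration resource. The snippet below is a minimal sketch of what was flipped off and on; the resource name and value shown are the usual defaults, not necessarily my exact configuration.

```yaml
# Minimal sketch of the Felix setting that was toggled while testing.
# "default" is the conventional FelixConfiguration name; only the BPF
# flag is shown here.
apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  bpfEnabled: true   # disabled and re-enabled during troubleshooting
```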

In addition to that, I rebooted the whole cluster - luckily I am still pretty much in the setup phase - as well as my internet gateway. Nothing.

Finding the Issue

After some googling, I was about to open an issue on Calico’s GitHub, for which I wanted to provide as much information as possible. While collecting that information, I realized that the problem was limited to one subnet in particular; all the other local networks were reachable just fine. After spending some time thinking about what was special about this particular network, I decided to look at all the network-related configuration in the cluster, and found a MetalLB ipaddresspool that was clearly handing out IP addresses from that problematic network CIDR - I had set that up months ago when I wanted to expose services on that network. Luckily, none of those addresses were in use, so I could simply delete that configuration.
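
For context, a MetalLB pool of that kind looks roughly like the snippet below. The name and exact address range are assumptions for illustration; the real pool drew from the 192.168.1.0/24 network mentioned in the next paragraph.

```yaml
# Illustrative IPAddressPool of the kind that was deleted.
# Name and range are assumptions; the actual pool allocated
# addresses out of the problematic 192.168.1.0/24 network.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: local-network
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
```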

Because I configured MetalLB in such a way that it relies on Calico to do the BGP announcements, I knew that I also had to modify the bgpconfiguration for Calico. Looking at that, I noticed that it announced 192.168.1.0/24 - the problematic network - via the serviceLoadBalancerIPs property. I removed that as well, and everything started working again almost instantaneously.
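
To make that concrete, the relevant part of the Calico BGPConfiguration looked roughly like this before the fix. Everything except the offending entry is omitted, and the resource name shown is the usual default rather than something I can confirm from memory.

```yaml
# Sketch of the BGPConfiguration entry that was removed.
# serviceLoadBalancerIPs tells Calico to announce these CIDRs via BGP;
# 192.168.1.0/24 is the problematic local network.
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  serviceLoadBalancerIPs:
    - cidr: 192.168.1.0/24
```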

What’s Bothering Me

As always, I am glad I could resolve the issue by myself, as this is the best way to learn. However, in this case I don’t feel as if the problem is completely understood. The configuration I had to remove had been in place for months. It didn’t cause any issues through Calico updates, cluster reboots, or even complete restarts of network routers and switches. So it is still a mystery to me why this suddenly started causing issues.

Last modified: 2 September 2024