Containers and Orchestration - Kubernetes service endpoints no longer auto created, updated or deleted, how to fix?
Question

Kubernetes service endpoints no longer auto created, updated or deleted, how to fix?

By Uniqbit
Created on 2025-06-03 08:50:00 (edited on 2025-06-09 15:24:29) in Containers and Orchestration

For the past five days, one of our managed Kubernetes clusters has been malfunctioning, and we are unsure how to fix it. One symptom was that some workloads on the cluster had trouble verifying node certificates, which we resolved by rotating all existing nodes in our node pools. A more troubling issue remains: the cluster no longer automatically creates or updates Endpoints objects for Service objects. For example, when we deploy a new Deployment with a Service, we cannot reach it from other pods because the cluster does not create or bind an endpoint for it. Deleting a Service occasionally deletes the corresponding endpoint, but not reliably.

The cluster does not log or report errors that we can relate to this issue, and on the surface, everything appears to be functioning normally. Have you experienced a similar issue and know the best way to address it?
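To narrow down which Services are affected, one option is to compare the Services that define a selector against the Endpoints objects that actually exist. The sketch below is only an illustration: it assumes the two inputs are the parsed JSON lists produced by `kubectl get services -A -o json` and `kubectl get endpoints -A -o json`, and the function name `services_missing_endpoints` and the file names are hypothetical. Services without a selector and ExternalName Services are skipped, since Kubernetes does not create Endpoints for them automatically.

```python
import json


def services_missing_endpoints(services, endpoints):
    """Return sorted (namespace, name) pairs of selector-based Services
    that have no Endpoints object with the same namespace and name."""
    existing = {
        (item["metadata"]["namespace"], item["metadata"]["name"])
        for item in endpoints.get("items", [])
    }
    missing = []
    for svc in services.get("items", []):
        spec = svc.get("spec", {})
        # Only Services with a selector get Endpoints created automatically.
        if not spec.get("selector") or spec.get("type") == "ExternalName":
            continue
        key = (svc["metadata"]["namespace"], svc["metadata"]["name"])
        if key not in existing:
            missing.append(key)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical file names; create them beforehand with:
    #   kubectl get services -A -o json > services.json
    #   kubectl get endpoints -A -o json > endpoints.json
    with open("services.json") as f:
        services = json.load(f)
    with open("endpoints.json") as f:
        endpoints = json.load(f)
    for namespace, name in services_missing_endpoints(services, endpoints):
        print(f"{namespace}/{name}: no Endpoints object")
```

On a healthy cluster the list would normally be empty, because an Endpoints object is created even when no pods match the selector. A non-empty result would only confirm the symptom described above; it does not by itself point to the cause.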

[Edit] Corrected spelling and grammar.


1 Reply (latest reply on 2025-06-09 15:30:39 by Uniqbit)

I'm seeing that the ovhcloud-konnectivity-agent is logging failed connections:

I0603 14:13:34.506695 1 client.go:354] "Received DIAL_REQ" serverID="b121eaba-7112-443c-b6a7-686026918cc2" agentID="63a50384-0369-4b5e-9fd0-a263fadc4ea5" dialID=7498280033314238644 dialAddress="10.2.1.33:443"
I0603 14:13:34.510103 1 client.go:354] "Received DIAL_REQ" serverID="b121eaba-7112-443c-b6a7-686026918cc2" agentID="63a50384-0369-4b5e-9fd0-a263fadc4ea5" dialID=5599432136654785947 dialAddress="10.2.1.33:443"
I0603 14:13:34.512266 1 client.go:483] "received CLOSE_REQ" connectionID=0
I0603 14:13:34.512281 1 client.go:489] "Failed to find connection context for close" connectionID=0
I0603 14:13:35.397189 1 client.go:420] "error dialing backend" error="dial tcp 10.2.3.24:10250: i/o timeout" dialID=6139996188483838982 connectionID=5782 dialAddress="10.2.3.24:10250"
I0603 14:13:35.406239 1 client.go:483] "received CLOSE_REQ" connectionID=0
I0603 14:13:35.406258 1 client.go:489] "Failed to find connection context for close" connectionID=0
I0603 14:13:36.093710 1 client.go:354] "Received DIAL_REQ" serverID="b121eaba-7112-443c-b6a7-686026918cc2" agentID="63a50384-0369-4b5e-9fd0-a263fadc4ea5" dialID=8667963574853404645 dialAddress="10.2.3.101:9443"
I0603 14:13:36.114566 1 client.go:420] "error dialing backend" error="dial tcp 10.2.1.33:443: i/o timeout" dialID=6571743257240054316 connectionID=5783 dialAddress="10.2.1.33:443"
I0603 14:13:36.124114 1 client.go:483] "received CLOSE_REQ" connectionID=0
I0603 14:13:36.124130 1 client.go:489] "Failed to find connection context for close" connectionID=0
I0603 14:13:36.262956 1 client.go:420] "error dialing backend" error="dial tcp 10.2.1.33:443: i/o timeout" dialID=8826262292868951193 connectionID=5784 dialAddress="10.2.1.33:443"
I0603 14:13:36.272076 1 client.go:483] "received CLOSE_REQ" connectionID=0
I0603 14:13:36.272089 1 client.go:489] "Failed to find connection context for close" connectionID=0
I0603 14:13:36.455660 1 client.go:354] "Received DIAL_REQ" serverID="b121eaba-7112-443c-b6a7-686026918cc2" agentID="63a50384-0369-4b5e-9fd0-a263fadc4ea5" dialID=6633832615179440962 dialAddress="10.2.3.24:10250"
I0603 14:13:36.504801 1 client.go:420] "error dialing backend" error="dial tcp 10.2.1.33:443: i/o timeout" dialID=53346797863079866 connectionID=5785 dialAddress="10.2.1.33:443"
I0603 14:13:36.513935 1 client.go:483] "received CLOSE_REQ" connectionID=0
I0603 14:13:36.513953 1 client.go:489] "Failed to find connection context for close" connectionID=0
I0603 14:13:36.524846 1 client.go:420] "error dialing backend" error="dial tcp 10.2.1.33:443: i/o timeout" dialID=762598440534119604 connectionID=5786 dialAddress="10.2.1.33:443"
I0603 14:13:36.524852 1 client.go:420] "error dialing backend" error="dial tcp 10.2.1.33:443: i/o timeout" dialID=1150912048355849252 connectionID=5787 dialAddress="10.2.1.33:443"
I0603 14:13:36.533900 1 client.go:483] "received CLOSE_REQ" connectionID=0
I0603 14:13:36.533915 1 client.go:489] "Failed to find connection context for close" connectionID=0
I0603 14:13:36.534034 1 client.go:483] "received CLOSE_REQ" connectionID=0
I0603 14:13:36.534049 1 client.go:489] "Failed to find connection context for close" connectionID=0
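
In the excerpt above, dials to 10.2.1.33:443 time out five times and the dial to 10.2.3.24:10250 once (10250 is the default kubelet port). For a longer log, the same tally could be produced with a small script. This is a minimal sketch, assuming the agent log has been saved as text in the format shown above; the function name `count_dial_failures` and the regular expression are my own and not part of any konnectivity tooling.

```python
import re
import sys
from collections import Counter

# Matches the "error dialing backend" lines shown in the log excerpt.
DIAL_ERROR = re.compile(
    r'"error dialing backend" error="(?P<error>[^"]*)".*?dialAddress="(?P<addr>[^"]+)"'
)


def count_dial_failures(lines):
    """Count failed backend dials per dialAddress."""
    counts = Counter()
    for line in lines:
        match = DIAL_ERROR.search(line)
        if match:
            counts[match.group("addr")] += 1
    return counts


if __name__ == "__main__":
    # Usage: pipe the saved agent log into this script on stdin.
    for addr, n in count_dial_failures(sys.stdin).most_common():
        print(f"{addr}\t{n}")
```

Grouping by address makes it easier to see whether the timeouts are concentrated on a few targets, which could then be checked for reachability from the nodes.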