I just wanted the community to be aware of this. OVH still hasn't responded as to why it happened.
Yesterday they had a switch failure:
https://bare-metal-servers.status-ovhcloud.com/incidents/ljc3sd037jxm
One full rack (possibly two) went offline.
No worries, you would think, just migrate your failover IPs. Well, apparently, if the failure is on OVH's side, you can't. Here is what happened:
We migrated all the ranges. The good news: most of them did end up pointing to the other instance, but one got stuck in the todo status. So that one range couldn't be migrated for the full length of the failure (and longer).
OVH support didn't really answer much and was very difficult to reach at all.
When the switch finally got replaced and the failed instance became available again, the missing range was still in the todo state, but at least it was reachable again because the original instance was reachable. But things got ***worse*** from here!
Now we noticed that the IP ranges that had been migrated had difficulties communicating with the internet. The reason: both instances, the old one and the target of the migration, were each receiving part of the packets. So it appears that OVH was announcing these IPs from two switches instead of one.
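For anyone who wants to check for the same symptom, here is a rough sketch of the kind of comparison that shows it. It assumes you send numbered probe packets to the failover IP from outside and log on each instance which sequence numbers arrived (for example from a packet capture). The function name `classify_delivery` and the host labels are made up for illustration; this is not an OVH tool or API.

```python
def classify_delivery(sent, seen_by_host):
    """Classify how probes to one failover IP were delivered.

    sent:          iterable of probe sequence numbers that were sent
    seen_by_host:  dict mapping a host label to the sequence numbers it logged
    Returns (verdict, hosts_that_received_something, lost_sequence_numbers).
    """
    sent = set(sent)
    # Only count sequence numbers that belong to this probe run.
    received = {host: set(seqs) & sent for host, seqs in seen_by_host.items()}
    active = sorted(host for host, seqs in received.items() if seqs)
    lost = sorted(sent - set().union(*received.values()))

    if not active:
        verdict = "unreachable"   # nobody saw any probe
    elif len(active) == 1:
        verdict = "single"        # healthy: exactly one instance gets the traffic
    else:
        verdict = "split"         # the symptom described above
    return verdict, active, lost


# Hypothetical logs: the old instance and the migration target each saw
# only part of the probes, and one probe was lost entirely.
print(classify_delivery(range(6), {"old": [0, 2, 4], "new": [1, 3]}))
```

A "split" verdict only tells you that more than one machine is receiving traffic for the same IP. It does not prove why, so the two-switch announcement remains my guess.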
OVH didn't fix this at all for the next 10 hours. Then, silently, after a good 10 hours, the stuck range finally left the todo state and we could move all IPs back to their origin. All issues are resolved as of now.
But the lesson is: failover IPs don't actually do what they're supposed to. They can't fail over. Maybe they can if your own instance fails, but definitely not when something as simple as a ToR switch dies.
Still awaiting an explanation from OVH support.
Failover IPs do not work during failure