Bare Metal - Proxmox Server Freeze / Hard reboot
Question

Proxmox Server Freeze / Hard reboot

By
kadybat
Created on 2025-05-15 21:44:50 (edited on 2025-05-16 10:07:47) in Bare Metal

Hey everyone,

For the past 4 days I've been facing an issue with my dedicated server (running Proxmox) hosted by OVH. The system randomly freezes, and OVH ends up performing an automatic reboot. This has happened multiple times now, and I'm trying to identify the root cause.

OVH has already conducted three separate hardware investigations and reported no hardware issues.
The NVMe drives were replaced in December, as the previous ones were worn out.
OVH now claims it must be a software issue, but unfortunately there are no clear logs pointing to the root cause.


Interestingly, we had the same issue about two months ago, and what helped back then was reducing the resource limits (CPU/RAM) on the individual containers and VMs. After doing that, the server ran stable for a while – until now.

Here are my server specs:

  • Proxmox Version: 8.4.1
  • CPU: AMD Ryzen 5 3600X - 6c/12t - 3.8 GHz / 4.4 GHz
  • RAM: 64 GB ECC 2666 MHz
  • Storage: 2×500 GB NVMe SSD (Soft RAID)

Here are some logs from /var/log/syslog right before the last freeze:

May 15 14:20:57 mio-network sshd[995589]: Received disconnect from 218.92.0.249 port 31935:11: [preauth]
May 15 14:20:57 mio-network sshd[995589]: Disconnected from authenticating user root 218.92.0.249 port 31935 [preauth]
May 15 14:20:57 mio-network sshd[995589]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=218.92.0.249 user=root
...
May 15 14:21:13 mio-network sshd[996018]: Disconnected from invalid user elena 8.217.43.77 port 49102 [preauth]
May 15 14:21:15 mio-network audit[996161]: NETFILTER_CFG table=filter family=7 entries=0 op=xt_replace pid=996161 subj=unconfined comm="ebtables-restor"
-- Reboot --


As you can see:

  • There were multiple SSH brute-force attempts from random IPs (China, Russia, etc.).
  • SSH logins for invalid or root users were being attempted repeatedly.
  • Around the time of the freeze, ebtables-restore was executed, modifying netfilter rules.
  • Shortly after, the server completely froze and OVH initiated a reboot.
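To put a number on the SSH noise described above, here is a minimal Python sketch that counts pre-auth disconnects per source IP. It only knows the two sshd message shapes visible in the excerpt; the `sample` list reuses lines from that excerpt, and the function name `count_attempts` is just an illustrative choice.

```python
import re
from collections import Counter

# Matches the two sshd "Disconnected from ..." shapes seen in the excerpt.
PATTERN = re.compile(
    r"sshd\[\d+\]: Disconnected from "
    r"(?:authenticating|invalid) user \S+ "
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) port \d+"
)

def count_attempts(lines):
    """Return a Counter mapping source IP to number of pre-auth disconnects."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts

# Lines copied from the syslog excerpt in the question.
sample = [
    "May 15 14:20:57 mio-network sshd[995589]: Disconnected from authenticating user root 218.92.0.249 port 31935 [preauth]",
    "May 15 14:21:13 mio-network sshd[996018]: Disconnected from invalid user elena 8.217.43.77 port 49102 [preauth]",
    'May 15 14:21:15 mio-network audit[996161]: NETFILTER_CFG table=filter family=7 entries=0 op=xt_replace pid=996161 subj=unconfined comm="ebtables-restor"',
]

for ip, attempts in count_attempts(sample).most_common():
    print(ip, attempts)
```

Feeding it the whole log file (for example `count_attempts(open("/var/log/syslog"))`) would show which addresses generate most of the traffic, which can help when choosing ban thresholds.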


There’s no clear indication of a kernel panic, OOM killer, or disk error in the logs. Just regular cron activity, audit logs, and network rule updates.

Has anyone experienced similar behavior with ebtables-restore or Proxmox freezing with no clear cause?

Additional Info:

  • Root SSH login is still enabled (working on securing it).
  • No monitoring for hardware issues yet (will check with smartctl).
  • Using 4 LXC Containers, 2 Windows VMs

We have already performed several checks ourselves, including:

  • Filesystem check (fsck)
  • RAM tests (memtest)
  • Disk health checks (SMART)
    All of these showed no errors.


Any insights, suggestions, or tools to better trace the next incident would be highly appreciated.
Thanks in advance!


4 replies (latest reply on 2025-05-16 14:30:31 by Sich)

Well, my first thought was a memory issue, since that is the most common cause of freezes.

Did you monitor the temperature? How was the server load before the freeze? I ask because you say that reducing the resources allocated to the VMs helped with the issue.
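If nothing is recording temperature and load yet, a small logger that forces every sample to disk may at least show the state just before the next freeze. The following Python sketch is one possible approach, not a tested monitoring setup: the log path and the 10-second interval are assumptions, and it presumes the kernel exposes sensors under `/sys/class/hwmon` (values there are in millidegrees Celsius).

```python
import glob
import os
import time

LOG_PATH = "/var/log/freeze-trace.log"  # hypothetical path; any persistent location works
INTERVAL_SECONDS = 10                   # assumed sampling interval

def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)

def parse_mem_available_kib(text):
    """Return MemAvailable in KiB from /proc/meminfo content, or None if absent."""
    for line in text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None

def read_temperatures_celsius():
    """Read every hwmon temperature input and convert millidegrees to degrees."""
    temps = []
    for path in sorted(glob.glob("/sys/class/hwmon/hwmon*/temp*_input")):
        try:
            with open(path) as handle:
                temps.append(int(handle.read().strip()) / 1000.0)
        except (OSError, ValueError):
            continue  # sensor unreadable right now; skip it
    return temps

def sample_line():
    """Build one log line with timestamp, load, available memory and temperatures."""
    with open("/proc/loadavg") as handle:
        load = parse_loadavg(handle.read())
    with open("/proc/meminfo") as handle:
        mem = parse_mem_available_kib(handle.read())
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    temps = read_temperatures_celsius()
    return f"{stamp} load={load} mem_available_kib={mem} temps_c={temps}\n"

def run_forever(log_path=LOG_PATH, interval=INTERVAL_SECONDS):
    """Append one sample per interval and fsync so it survives a hard freeze."""
    with open(log_path, "a") as log:
        while True:
            log.write(sample_line())
            log.flush()
            os.fsync(log.fileno())  # push the sample to disk immediately
            time.sleep(interval)
```

Calling `run_forever()` as root on the host (from a systemd service, for example) keeps appending samples. The `os.fsync` call is the important part: without it, the last lines before a freeze may still be sitting in the page cache and never reach the disk.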

The SSH brute-force attempts should not trigger a freeze, but you can always look at something like CrowdSec to reduce the "noise".

Random freezes are really painful to fix, especially when you have nothing in the logs.

Did you try connecting over KVM during a freeze, to see whether anything is shown on the screen?

For SSH brute-force attempts, you can install Fail2Ban.

You can also improve your sshd configuration:

  • create another user account
  • use this new account for SSH logins
  • disable root login in sshd (you then need to log in first as "yourotheruserid")
  • use your SSH key to log in
  • create (if not yet done) your ed25519 key
  • use only ed25519 keys
  • disable RSA keys (less secure) in the sshd config
  • disable password login only once logging in with the ed25519 key works

This will greatly reduce the number of SSH attempts.
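As a rough way to check whether the sshd settings from the list are in place, the Python sketch below compares configuration text against a set of target values. The `RECOMMENDED` mapping is my reading of the list, not an official baseline, and the `example` text is hypothetical. The sketch does not follow `Include` lines, stops at the first `Match` block, and does not know sshd's built-in defaults, so an unset keyword is reported as well.

```python
# Target values derived from the list above (an interpretation, not a standard).
RECOMMENDED = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
    "pubkeyauthentication": "yes",
    "pubkeyacceptedalgorithms": "ssh-ed25519",
}

def effective_settings(config_text):
    """Return keyword -> value for the global part of an sshd_config text.

    sshd keeps the first value it sees for a keyword and treats keywords
    case-insensitively, so later duplicates are ignored here as well.
    """
    settings = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "match":
            break  # Match blocks are conditional; this sketch stops here
        value = parts[1].strip() if len(parts) > 1 else ""
        settings.setdefault(keyword, value)
    return settings

def audit(config_text):
    """Return {keyword: (found, recommended)} for every value that differs."""
    current = effective_settings(config_text)
    return {
        key: (current.get(key), wanted)
        for key, wanted in RECOMMENDED.items()
        if (current.get(key) or "").lower() != wanted
    }

# Hypothetical configuration content, for illustration only.
example = """
# hypothetical /etc/ssh/sshd_config content
PermitRootLogin yes
PasswordAuthentication no
PubkeyAuthentication yes
"""

for key, (found, wanted) in audit(example).items():
    print(f"{key}: found {found!r}, recommended {wanted!r}")
```

Running it on the output of `sshd -T`, which prints the effective configuration, should be more reliable than reading the file directly, because defaults and included files are already resolved there.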

 

But, as Sich explains, these SSH attempts are NOT linked to your server reboots!