2025.06.29 - Proxmox Fix
Intro
After the recent Proxmox updates, my Lenovo m920q started resetting itself. I searched around online, and the problem is diagnosed as "poor drivers for the Intel network card".
The second problem is that the logs are being cluttered with firewall entries.
Network - fix from the internet
The logs show that the machine has a problem with the network card right before the reset:
Jun 29 04:24:07 pve3 ceph-osd[1173]: 2025-06-29T04:24:07.113+0200 xxx -1 osd.0 1844 get_health_metrics reporting 1 slow ops, oldest is osd_op(client.xxx.0:xxx 4.12 4:xxx:::rbd_data.xxx.xxx:head [write 1200128~4096 in=4096b] snapc 0=[] ondisk+write+known_if_redirected+supports_pool_eio e1844)
Jun 29 04:24:07 pve3 kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
TDH <95>
TDT <7f>
next_to_use <7f>
next_to_clean <94>
buffer_info[next_to_clean]:
time_stamp <10ce578be>
next_to_watch <95>
jiffies <10ce60f00>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
Jun 29 04:24:08 pve3 ceph-osd[1173]: 2025-06-29T04:24:08.090+0200 xxx -1 osd.0 1844 heartbeat_check: no reply from 192.168.2.251:6806 osd.1 since back 2025-06-29T04:23:25.862229+0200 front 2025-06-29T04:23:25.862237+0200 (oldest deadline 2025-06-29T04:23:51.161943+0200)
Jun 29 04:24:08 pve3 ceph-osd[1173]: 2025-06-29T04:24:08.090+0200 xxx -1 osd.0 1844 heartbeat_check: no reply from 192.168.2.250:6806 osd.2 since back 2025-06-29T04:23:25.862255+0200 front 2025-06-29T04:23:25.862248+0200 (oldest deadline 2025-06-29T04:23:51.161943+0200)
Jun 29 04:24:08 pve3 ceph-osd[1173]: 2025-06-29T04:24:08.090+0200 xxx -1 osd.0 1844 get_health_metrics reporting 1 slow ops, oldest is osd_op(client.xxx.0:xxx 4.12 4:xxx:::rbd_data.xxx.xxx:head [write 1200128~4096 in=4096b] snapc 0=[] ondisk+write+known_if_redirected+supports_pool_eio e1844)
-- Boot xxx --
Jun 29 04:26:41 pve3 kernel: Linux version 6.8.12-11-pve (build@proxmox) (gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-11 (2025-05-22T09:39Z) ()
Jun 29 04:26:41 pve3 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-11-pve root=/dev/mapper/pve-root ro quiet
The problem is described in https://bugzilla.proxmox.com/show_bug.cgi?id=6273
A potential solution is https://forum.proxmox.com/threads/e1000-driver-hang.58284/page-8#post-375919
I added a post-up line for eno1 to /etc/network/interfaces.
(...)
iface eno1 inet manual
post-up /usr/bin/logger -p info -t ifup "Disabling offload for eno1" && /sbin/ethtool -K eno1 tso off gso off gro off && /usr/bin/logger -p info -t ifup "Disabled offload of eno1"
(...)
After running the command manually, the entries show up in the logs:
# journalctl -S "1min ago"
Jun 29 11:03:03 pve3 ifup[167278]: Disabling offload for eno1
Jun 29 11:03:03 pve3 ifup[167280]: Disabled offload of eno1
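The logger entries only confirm that the post-up line ran, not that the offloads are really off. A minimal sketch of an extra check, assuming the usual `ethtool -k` output format ("feature-name: on|off"); the function name offloads_still_on is made up for this note:

```shell
#!/bin/sh
# Hypothetical helper: reads `ethtool -k <iface>` output on stdin and
# prints the names of the three offloads (TSO, GSO, GRO) that are still "on".
# Prints nothing when all three are off.
offloads_still_on() {
    grep -E '^[[:space:]]*(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload):[[:space:]]*on' \
        | sed -E 's/^[[:space:]]*([a-z-]+):.*/\1/'
}
```

On the host it would be used as `ethtool -k eno1 | offloads_still_on`; empty output means tso, gso and gro are all disabled.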
Now we wait and see whether the server is more stable.
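To make the waiting a bit more measurable, the hang message from the kernel log can simply be counted. A rough sketch that only matches the "Detected Hardware Unit Hang" string seen in the log above; the function name count_unit_hangs is made up for this note:

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints how many of them
# contain the e1000e hang message. Always exits 0, even when the count is 0.
count_unit_hangs() {
    grep -c 'Detected Hardware Unit Hang' || true
}
```

Typical use would be `journalctl -k -b | count_unit_hangs` for the current boot, or `journalctl -k -b -1 | count_unit_hangs` for the previous one (for example after an unexpected reset).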
Firewall - fix from the internet
The logs contain a lot of entries like these:
Jun 29 10:03:36 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:03:46 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:03:56 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:04:06 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:04:16 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:04:26 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:04:36 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:04:46 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:04:56 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:05:06 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
Jun 29 10:05:16 pve3 pve-firewall[1155]: status update error: ipset_restore_cmdlist: ipset v7.17: Error in line 4: Element cannot be added to the set: it's already added
The problem and the solution are described in https://forum.proxmox.com/threads/proxmox-firewall-doesnt-seem-to-work-and-errors-in-log.128359/#post-561729
When configuring the firewall I apparently entered the aliases for the PVE hosts incorrectly - they had addresses of the form 192.168.2.x/24 instead of /32. Removing the mask via the GUI (Datacenter -> Firewall -> Alias) fixed the issue, and there are no more errors in the logs.
Jun 29 11:07:46 pve3 pvefw-logger[592]: received terminate request (signal)
Jun 29 11:07:46 pve3 pvefw-logger[592]: stopping pvefw logger
Jun 29 11:07:46 pve3 systemd[1]: Stopping pvefw-logger.service - Proxmox VE firewall logger...
Jun 29 11:07:47 pve3 systemd[1]: pvefw-logger.service: Deactivated successfully.
Jun 29 11:07:47 pve3 systemd[1]: Stopped pvefw-logger.service - Proxmox VE firewall logger.
Jun 29 11:07:47 pve3 systemd[1]: pvefw-logger.service: Consumed 1.316s CPU time.
Jun 29 11:07:47 pve3 systemd[1]: Starting pvefw-logger.service - Proxmox VE firewall logger...
Jun 29 11:07:47 pve3 pvefw-logger[169286]: starting pvefw logger
Jun 29 11:07:47 pve3 systemd[1]: Started pvefw-logger.service - Proxmox VE firewall logger.
Jun 29 11:08:16 pve3 pmxcfs[146658]: [status] notice: received log
Jun 29 11:08:45 pve3 pmxcfs[146658]: [status] notice: received log
Jun 29 11:08:45 pve3 pmxcfs[146658]: [status] notice: received log
Jun 29 11:11:13 pve3 pmxcfs[146658]: [status] notice: received log
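My understanding (an assumption, not confirmed anywhere above) is that several host addresses written with a /24 mask all collapse to the same network, 192.168.2.0/24, which would explain the "already added" error from ipset. A small sketch for spotting such entries: it reads a config on stdin and prints every IPv4 CIDR whose host bits are non-zero. The function names are made up for this note, and the file location /etc/pve/firewall/cluster.fw in the usage line is also an assumption about where the aliases are stored.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address (no leading zeros) to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeeds when the CIDR has host bits set, e.g. 192.168.2.251/24.
has_host_bits() {
    ip=${1%/*}
    prefix=${1#*/}
    [ "$prefix" -le 32 ] || return 1
    n=$(ip_to_int "$ip")
    hostmask=$(( (1 << (32 - prefix)) - 1 ))
    [ $(( n & hostmask )) -ne 0 ]
}

# Reads text on stdin and prints every IPv4 CIDR with non-zero host bits.
find_suspect_cidrs() {
    grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}/[0-9]{1,2}' | while read -r cidr; do
        if has_host_bits "$cidr"; then
            echo "$cidr"
        fi
    done
}
```

Example use: `find_suspect_cidrs < /etc/pve/firewall/cluster.fw`. Entries such as 192.168.2.0/24 (a real network) or x.x.x.x/32 (a single host) are not reported.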