Hello.
My setup is:
- Lenovo M920q mini PC with Proxmox installed (this doesn’t have IPMI, only vPro, which annoys me)
- Fujitsu TX1320 M3 with TrueNAS Core installed - ZFS + RAID1 (this is a low-end “enterprise grade” server, and best of all, it has IPMI).
The Proxmox PC keeps all its CTs and 1 VM on the TrueNAS using iSCSI.
The idea behind my setup was that it felt nice that the TrueNAS would handle all the storage heavy lifting - ZFS, RAID etc., while the Proxmox mini PC would be a “compute-only” node that has a naked Proxmox install with some config.
The problem with that is if the TrueNAS machine loses power or is restarted, the Proxmox CTs/VMs switch their filesystems to read-only and stop responding to requests. This is because the iSCSI connection is interrupted. When the TrueNAS is back online, Proxmox doesn’t make any attempt to restart the VMs/CTs, so they stay broken.
It’s annoying to have to VPN into the Proxmox web UI, restart everything, and wait 15 minutes until all the CTs/VMs are functioning again on the “alive” iSCSI connection.
I was wondering what my options are for removing this dependency chain?
I’m really into the idea of decommissioning the Proxmox node because I’m scared I won’t be able to change the power state of the machine over VPN if something goes wrong, since it only has vPro and not IPMI like the TrueNAS machine. By doing that, I’d consolidate the storage and the compute into the TrueNAS machine.
Options I can think of:
- Decommission the Proxmox node and move all Debian VMs/CTs to TrueNAS BSD jails. Is that even possible? Will all my Debian VMs work in BSD?
- Decommission the Proxmox node, switch TrueNAS Core to TrueNAS Scale and move CTs/VMs to TrueNAS Scale’s Linux VMs
- Keep the Proxmox node and somehow figure out how to get Proxmox to refresh the CTs/VMs on iSCSI connection loss.
- Keep the Proxmox PC, but switch it to ESXi, hoping that it handles the iSCSI failure more gracefully
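For the "refresh the CTs/VMs on iSCSI connection loss" option, here is a rough sketch of what a watchdog on the Proxmox node might look like. It is only an illustration, not a tested solution. The address, the guest IDs, and the function names are all made up, and it assumes that the iSCSI initiator logs back in by itself once the portal is reachable again, and that a plain stop/start with `pct` and `qm` is enough to revive a guest.

```python
import socket
import subprocess
import time

# All values below are hypothetical placeholders, not taken from my setup.
PORTAL_HOST = "192.168.1.10"  # assumed TrueNAS address
PORTAL_PORT = 3260            # standard iSCSI portal port
CONTAINER_IDS = [101, 102]    # assumed CT IDs
VM_IDS = [200]                # assumed VM ID


def portal_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to the iSCSI portal succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def recovered(was_up, is_up):
    """True only on the down -> up transition, i.e. storage just came back."""
    return is_up and not was_up


def restart_commands(container_ids, vm_ids):
    """Build stop/start commands for each guest using the Proxmox CLI tools."""
    commands = []
    for ctid in container_ids:
        commands.append(["pct", "stop", str(ctid)])
        commands.append(["pct", "start", str(ctid)])
    for vmid in vm_ids:
        commands.append(["qm", "stop", str(vmid)])
        commands.append(["qm", "start", str(vmid)])
    return commands


def watch(poll_seconds=30):
    """Poll the portal forever and restart guests after each outage ends."""
    was_up = portal_reachable(PORTAL_HOST, PORTAL_PORT)
    while True:
        time.sleep(poll_seconds)
        is_up = portal_reachable(PORTAL_HOST, PORTAL_PORT)
        if recovered(was_up, is_up):
            for cmd in restart_commands(CONTAINER_IDS, VM_IDS):
                # check=False: one guest failing should not stop the others
                subprocess.run(cmd, check=False)
        was_up = is_up
```

Calling `watch()` as root on the Proxmox node (for example from a systemd service) would start the loop. It only automates the manual restart; the guests still go read-only during the outage, and a hard stop risks losing whatever they had in flight.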
EDIT: I didn’t make it clear at first - TrueNAS stores more data than just VMs - documents, Linux ISOs ™, photos, Syncthing
Thanks for making it clear that an iSCSI target powering down is in fact one of the more grim scenarios; I couldn’t work out how bad a situation it was. In an enterprise environment, a SAN being down would require some type of incident report.
UPS - as you suggested - would solve most of my problems to be honest.