Benchmarking a TrueNAS Scale VM in Proxmox
Getting my head around real-world performance
Small recap from my first post: I've been watching a lot of videos of other nerds claiming that running TrueNAS Scale as a VM works fine. Some videos even told me you can run TrueNAS Scale in a VM, share its ZFS datasets over SMB with the Proxmox host, and run other VMs from those SMB shares.
First thing that came to mind seeing this: WHY??
Why would you want to run a VM from a Samba share? One could argue that having TrueNAS ZFS for replication and snapshots could be valuable, and since it's all running on a virtual network you would not be limited by the speed of your physical network adapter. Proxmox supports ZFS out of the box, and though Proxmox itself is not really a solid NAS solution, using ZFS with a bunch of SSDs in a mirror for VMs would probably be a lot faster than running those VMs from an SMB share. Or NFS, for that matter.
But then again, most of the guys in those videos were very positive about VMs on SMB shares, so I had to check.
I installed TrueNAS Scale in a VM running on an NVMe SSD. I gave it 4 cores, 24 GB of RAM and 2 virtual NICs. Both are paravirtualized VirtIO NICs. This is important because it allows speeds up to 10 Gbit/s. If you use Realtek or Intel you're essentially emulating hardware with lower speeds (up to 1 Gbit/s).
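To put those NIC speeds next to disk numbers, here is a quick back-of-the-envelope conversion. It is only a sketch: it uses decimal units and ignores all protocol overhead, so real NFS or SMB throughput will come out lower. The function name is made up for this example.

```python
def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a link speed in Gbit/s to MB/s (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8


# Labels are illustrative; the speeds are the two mentioned in the post.
for label, gbit in [("emulated 1 Gbit/s NIC", 1), ("VirtIO at 10 Gbit/s", 10)]:
    print(f"{label}: about {gbit_to_mbyte_per_s(gbit):.0f} MB/s at most")
```

In other words, a 1 Gbit/s emulated NIC tops out around 125 MB/s, which would cap any share well below what a SATA SSD can deliver.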
My TrueNAS VM has two 3.5" hard disks in a mirror and one 2.5" SSD (Samsung) as a striped ZFS dataset. All disks are connected to SATA ports. I added these disks following this guide. Neither ZFS dataset has a caching disk.
For testing I created a clean Debian 11 VM with the Gnome desktop. The VM itself runs on the same NVMe SSD that TrueNAS Scale runs from. Debian has 2 cores, 2 GB of RAM and one paravirtualized VirtIO NIC.
I wanted to run several test cases:
- HDD vs. SSD
- SMB vs NFS
- VirtIO SCSI vs VirtIO Block (and SATA)
- Raw vs qcow2
These are settings from Proxmox. Every time I wanted to benchmark a new test case I created a new virtual drive and assigned it to the Debian VM. The SMB and NFS shares are mounted by Proxmox.
For benchmarking I used the default disk management app from Gnome. It has a benchmark feature, and every time I used 100 samples of 100 MiB each. I also ran some real-world tests by copying a large file from local storage to the mounted test drive.
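If you want to script something similar to the real-world copy test, a minimal sketch could look like the one below. This is not what the Gnome tool does internally (that benchmarks the block device directly); it just times sequential file writes and reads in a directory. The function name and defaults are my own, and the read figure will be inflated by the guest's page cache unless you drop caches between the write and the read.

```python
import os
import tempfile
import time


def sequential_throughput(directory: str, sample_mib: int = 100, samples: int = 3):
    """Return (write_MiB_per_s, read_MiB_per_s) averaged over `samples` files."""
    chunk = os.urandom(1024 * 1024)  # 1 MiB of incompressible data
    write_seconds = 0.0
    read_seconds = 0.0
    for _sample in range(samples):
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            # Timed write, forced to disk with fsync so we do not only measure RAM.
            start = time.perf_counter()
            with os.fdopen(fd, "wb") as f:
                for _block in range(sample_mib):
                    f.write(chunk)
                f.flush()
                os.fsync(f.fileno())
            write_seconds += time.perf_counter() - start

            # Timed read; may be served from the page cache (see note above).
            start = time.perf_counter()
            with open(path, "rb") as f:
                while f.read(1024 * 1024):
                    pass
            read_seconds += time.perf_counter() - start
        finally:
            os.remove(path)
    total_mib = sample_mib * samples
    return total_mib / write_seconds, total_mib / read_seconds


if __name__ == "__main__":
    # Point this at the mount point of the virtual drive under test.
    write_speed, read_speed = sequential_throughput(tempfile.gettempdir(), sample_mib=10)
    print(f"write: {write_speed:.0f} MiB/s, read: {read_speed:.0f} MiB/s")
```

Run it inside the test VM against the mounted test drive and compare the numbers between setups rather than trusting them as absolute values.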
I did test most scenarios with SMB as well. While benchmarking with the Gnome disk tool went fine, I could not do any real-world tests: the VM would freeze after 10 seconds of copying and Proxmox would throw an I/O error. I have encountered this before while trying to install a VM on an SMB share, so I left these results out of the conclusion.
Another observation: when using qcow2 as the format I could run the benchmark once; the second time the transfer speeds would exceed 20 Gbit/s. I'm not sure why this happens, and it needs further investigation.
One day, when I finally understand Excel, I'll generate some nice graphs. But for now I'll stick to raw data and a conclusion.
To view the raw results, check out this Notion database.
The best-performing setup is a ZFS dataset backed by an SSD, shared over NFS, connected to a VM using VirtIO Block with a raw image. Real-world writes are around 280 MB/s and reads around 560 MB/s. Not bad at all, especially if you compare those numbers with a virtual disk running on the NVMe SSD (360 MB/s write, 420 MB/s read). What really surprised me is that reads are slower from NVMe, maybe because that SSD is also running the TrueNAS and Debian VMs? The TrueNAS SSD is only a SATA disk and it has some overhead from NFS as well, so I do not really understand why NVMe performs significantly worse in this scenario.
If you don't have an SSD-backed ZFS dataset you will still get decent read performance from an HDD-backed ZFS dataset, around 560 MB/s, but writing to it is pretty slow: around 80 MB/s.
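To make the comparison a bit more concrete, here is a small sketch that expresses the numbers above relative to the local NVMe virtual disk. The figures are the real-world ones from this post, assumed to be in MB/s; the labels and the helper function are just for illustration.

```python
# Real-world throughput from the post, assumed MB/s: (write, read)
results = {
    "NVMe local virtual disk": (360, 420),
    "SSD dataset over NFS": (280, 560),
    "HDD dataset over NFS": (80, 560),
}


def percent_change(value: float, baseline: float) -> float:
    """Difference of value versus baseline, as a percentage of the baseline."""
    return (value - baseline) / baseline * 100


base_write, base_read = results["NVMe local virtual disk"]
for name, (write, read) in results.items():
    print(
        f"{name}: write {percent_change(write, base_write):+.0f}%, "
        f"read {percent_change(read, base_read):+.0f}%"
    )
```

So the SSD share writes roughly 22% slower than the local NVMe disk but reads about 33% faster, while the HDD share loses about 78% on writes.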
There are some things to keep in mind.
If you are running a lot of VMs, and some of those VMs are accessing data from TrueNAS over NFS, you will suffer from lower throughput. VirtIO NIC performance is limited by the CPU, which becomes the bottleneck in your system. I did not test disk throughput while NFS is handling other traffic as well (maybe another day).
Though performance is really good when running from a shared ZFS dataset, it still feels wrong to me, especially if VMs will be accessing files from the same ZFS dataset over NFS. I did not even touch on access times, and those are a lot worse on shares. Random access and databases running in a VM on a share would probably perform pretty badly. I'll be benchmarking the same setup with TrueNAS Core instead of Scale. I don't expect huge differences, but it would help me decide where to store my ZFS dataset.
In the end I will be running my VMs from 2 SSDs in a ZFS mirror on Proxmox, not using shares.