PSA about mini PCs: They might not come with adequate cooling for RAM, leading to potential data corruption.
(I’m in the middle of troubleshooting/fixing overheating RAM causing memory errors, will post on /c/selfhosted when I have more conclusions).
TLDR: Bought 3 Minisforum HM90 mini PCs (for Proxmox) and equipped each with 64 GB (2x32 GB) of RAM, using a different RAM brand in each PC. All 3 give sporadic errors in Memtest86. The RAM overheats because the 2 SSDs mounted in the lid block natural airflow. With the lid off, or with an extra fan installed, there are no errors.
The errors were very sporadic: the first PC gave errors after 1-2 passes, then nothing for almost 24 hours. The second PC gave its first errors after more than 24 hours, in some cases with more than 48 hours between errors. The last PC gave hundreds of errors on the first pass.
To be fair, memtest is a synthetic test and the RAM is unlikely to see 100% utilisation in real life. On the other hand, the two adjacent SATA SSDs and the NVMe SSD are completely idle during memtest and will generate extra heat in production use.
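For anyone who wants to see how warm things get under real load rather than in memtest, here is a rough sketch that reads whatever temperature sensors the Linux kernel exposes under `/sys/class/hwmon`. It assumes a Linux host (such as Proxmox) with the relevant sensor drivers loaded. The function name `read_hwmon_temps` is made up for this example. SO-DIMMs may not expose a temperature sensor at all, so the NVMe and CPU readings are only a proxy for what the RAM sees.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {"chip/label": degrees_C} for every temp*_input file under root.

    Hypothetical helper for illustration; root is a parameter so the
    function can be pointed at a different directory.
    """
    temps = {}
    for chip in sorted(Path(root).glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else chip.name
        for sensor in sorted(chip.glob("temp*_input")):
            # Optional human-readable label, e.g. temp1_label -> "Composite"
            label_file = sensor.with_name(sensor.name.replace("_input", "_label"))
            label = label_file.read_text().strip() if label_file.exists() else sensor.name
            try:
                millideg = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor present but unreadable, skip it
            # hwmon reports temperatures in millidegrees Celsius
            temps[f"{chip_name}/{label}"] = millideg / 1000.0
    return temps


if __name__ == "__main__":
    for name, deg in read_hwmon_temps().items():
        print(f"{name}: {deg:.1f} C")
```

Running it in a loop (or from cron) while the SSDs are busy gives a rough idea of whether the lid-mounted drives push temperatures up compared to idle.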
Take this seriously, people. I’ve been there and it caused tons of issues on an older server of mine. That’s why I was very adamant about my current system having built-in error correction for its RAM.
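If you do have ECC RAM on Linux, the kernel's EDAC subsystem keeps per-memory-controller counters of corrected and uncorrected errors. A minimal sketch for reading them, assuming the EDAC driver for your platform is loaded and exposes `/sys/devices/system/edac/mc` (the helper name `read_edac_counts` is made up for this example):

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {"mc0": {"ce_count": int|None, "ue_count": int|None}, ...}.

    ce_count = corrected errors, ue_count = uncorrected errors.
    Hypothetical helper; returns an empty dict when EDAC is not available.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for key in ("ce_count", "ue_count"):
            try:
                entry[key] = int((mc / key).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found (no ECC, or driver not loaded)")
    for mc, entry in counts.items():
        print(f"{mc}: corrected={entry['ce_count']} uncorrected={entry['ue_count']}")
```

A corrected-error count that keeps climbing is the kind of early warning that non-ECC systems never give you.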
Had an NVMe drive fritz out on me in a passively cooled NUC, because of thermals, I suspect. That sucked.