X10DRG #31

Open
@varanova

I originally posted this in another thread, then realized it's probably better not to clutter that motherboard's thread with X10DRG data.

Anyway, I did some testing and the X10DRG has FOUR zones, not 2.
So instead of zones 0x00 and 0x01, you use 0x00 - 0x03. They're all "CPU zones".

To make things work correctly, you'll need to edit the set_ipmi_fan_level.sh file and make sure it handles all four zones for CPU calls. After you've tested the .sh files in the ipmi folder and everything is working correctly, update the set_fan_level function in smfc.py before running install.sh, so that it handles all four zones the same way as in testing.
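
As a rough illustration of "handling all four zones", here is a minimal shell sketch. It is not the contents of set_ipmi_fan_level.sh or smfc.py. It assumes the Supermicro raw command `0x30 0x70 0x66 0x01 <zone> <level>` quoted later in this thread, and it talks to the local BMC (add `-H/-U/-P -I lanplus` for remote use). The names `set_all_zones`, `ZONES` and `DRY_RUN` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of smfc: send one fan level to every zone.
# Assumes the X10DRG exposes zones 0x00-0x03, as reported in this issue.
ZONES="0x00 0x01 0x02 0x03"

# set_all_zones LEVEL   (LEVEL is a hex byte, e.g. 0x28 for 40%)
# Set DRY_RUN=1 to print the commands instead of sending them to the BMC.
set_all_zones() {
    case $1 in
        0x[0-9a-fA-F][0-9a-fA-F]) ;;
        *) echo "usage: set_all_zones 0xNN" >&2; return 1 ;;
    esac
    for zone in $ZONES; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "ipmitool raw 0x30 0x70 0x66 0x01 $zone $1"
        else
            ipmitool raw 0x30 0x70 0x66 0x01 "$zone" "$1" || return 1
        fi
    done
}
```

Running `DRY_RUN=1 set_all_zones 0x28` prints the four commands without touching the fans, which is a cheap way to check the zone list first. The sketch only checks the format of the level byte, not that it is at most 0x64.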

Obviously every config is different, so I wanted to share these notes for anyone else running an X10DRG mobo, as it took a few hours to figure this all out and why some of the fans weren't working!!

Thanks to @petersulyok for his work on this, it helped me immensely as I was completely lost by the ipmi documentation.

Activity

petersulyok

petersulyok commented on Nov 5, 2023

@petersulyok
Owner

Hi @varanova, thanks for sharing this! I would appreciate it if you could share a picture of the motherboard to help me understand the physical topology of the board (I checked the Supermicro site and there are many variants). I'm wondering how multiple CPU zones and fans can be laid out in this setup.

BTW, how many CPUs do you have installed on this board?

varanova

varanova commented on Nov 5, 2023

@varanova
Author

My build has 2 CPUs, so I set "cpus" to 2 in the conf file. I may be wrong, but I assumed it just averages the CPU temps? Or maybe it takes the max of the two?

The server is the one listed here: (has photos and descriptions)
https://www.supermicro.com/products/system/4u/4028/sys-4028gr-trt.cfm

It's a GPU server, but there was no GPU zone as far as I could tell. With the default fan control, when the CPU temps went up, the fans went up; CPU down, fans down. However, the fans were too aggressive for just CPU cooling, and mine still has these tiny 12k RPM fans, which are like 100x noisier than Noctua ones. That's why I was looking for some way to tone them down, since the server room is close to my office.

petersulyok

petersulyok commented on Nov 9, 2023

@petersulyok
Owner

How many fans are installed in this system?

varanova

varanova commented on Nov 10, 2023

@varanova
Author

It has 8 fans. FAN1 - FAN8. There's also no FANA/FANB in this system.

Here's an example output from ipmitool: (sudo ipmitool sensor|grep FAN)
FAN1 | 3300.000 | RPM | ok | 300.000 | 500.000 | 700.000 | 25300.000 | 25400.000 | 25500.000
FAN2 | 3400.000 | RPM | ok | 300.000 | 500.000 | 700.000 | 25300.000 | 25400.000 | 25500.000
FAN3 | 3400.000 | RPM | ok | 300.000 | 500.000 | 700.000 | 25300.000 | 25400.000 | 25500.000
FAN4 | 3300.000 | RPM | ok | 300.000 | 500.000 | 700.000 | 25300.000 | 25400.000 | 25500.000
FAN5 | 3300.000 | RPM | ok | 300.000 | 500.000 | 700.000 | 25300.000 | 25400.000 | 25500.000
FAN6 | 3300.000 | RPM | ok | 300.000 | 500.000 | 700.000 | 25300.000 | 25400.000 | 25500.000
FAN7 | 3300.000 | RPM | ok | 300.000 | 500.000 | 700.000 | 25300.000 | 25400.000 | 25500.000
FAN8 | 3300.000 | RPM | ok | 300.000 | 500.000 | 700.000 | 25300.000 | 25400.000 | 25500.000

petersulyok

petersulyok commented on Nov 15, 2023

@petersulyok
Owner

Just for clarification, I assume this is the typical fan placement/setup here:

where the fans cool everything, HDs, motherboard, CPUs, GPUs.

I think the challenge here is to find the proper temperature source for the fan control.

varanova

varanova commented on Nov 16, 2023

@varanova
Author

That's the correct placement, yes.

The ideal controller would look at all the temps (GPU/drives/CPU...)

Though it doesn't measure the GPU temps at all, so the fans don't ramp up at all for them. I don't even see a GPU temp measurement in the sensors list. I think you'd have to get the GPU temps from somewhere else (nvidia-smi for instance) in order to use that data in the fan speed calculation.
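
A minimal sketch of pulling GPU temperatures from nvidia-smi so they could feed a fan speed calculation. It assumes an NVIDIA driver with `nvidia-smi` on the PATH. The helper names `max_of_lines` and `max_gpu_temp` are hypothetical, and taking the hottest GPU is just one possible policy.

```shell
#!/bin/sh
# Hypothetical helpers: report the hottest GPU temperature in degrees Celsius.

# max_of_lines: read one number per line on stdin, print the largest.
# Exits non-zero if there was no input at all.
max_of_lines() {
    awk 'NF && ($1+0 > max || !seen) { max = $1+0; seen = 1 }
         END { if (seen) print max; else exit 1 }'
}

# max_gpu_temp: query every GPU and keep the highest reading.
max_gpu_temp() {
    nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | max_of_lines
}
```

`max_gpu_temp` prints a single integer such as the temperature of the hottest card, which could then be compared against the CPU temperature before choosing a fan level.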

duecedriver

duecedriver commented on Feb 3, 2025

@duecedriver

Once again:

As part of the provisioning script, write a tool that helps determine what each fan/zone assignment does.

Then let the user decide which sensor from the IPMI sensor list to assign as primary to that zone, and monitor the remaining sensors to ensure they don't exceed their limits, as a system-wide safety net.
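
A rough sketch of what such a probing aid could look like, assuming the `ipmitool sensor` output format shown earlier in this thread and the raw zone command quoted in it. The function names `fan_rpms` and `probe_zone` are hypothetical and this is not an smfc feature.

```shell
#!/bin/sh
# Hypothetical probing aid: see which fans react when one zone changes.

# fan_rpms: read `ipmitool sensor` output on stdin and print "NAME RPM"
# for every FAN row. Non-numeric readings are printed as 0.
fan_rpms() {
    awk -F'|' '$1 ~ /^FAN/ { gsub(/ /, "", $1); printf "%s %d\n", $1, $2 }'
}

# probe_zone ZONE: drive one zone to 100% (0x64), wait for the sensors
# to settle, then list the fan speeds. Fans that sped up belong to ZONE.
probe_zone() {
    ipmitool raw 0x30 0x70 0x66 0x01 "$1" 0x64 || return 1
    sleep 10
    ipmitool sensor | fan_rpms
}
```

The idea is to set all zones to a low level first, then call `probe_zone 0x00`, `probe_zone 0x01` and so on, comparing the RPM lists between runs.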

duecedriver

duecedriver commented on Feb 3, 2025

@duecedriver

For your board, here is what you can do until SMFC becomes more configurable... if ever...

Either follow my suggestions in another thread and replace your fans with lower CFM / amp fans that will meet your cooling demand at a 50% duty cycle, which is what the current BMC is programmed to shoot for, or take over manually.

Since you have IPMI compiled and running on a system somewhere (it doesn't even need to be on the server in question, and honestly it's easier and more convenient to run it remotely from a laptop etc.), here is a small script you can modify, or you can enter the commands individually.

The first command, setting the fans to maximum, is mandatory. It only has to be done once after the system boots, or you can set it from the IPMI/BMC web client on the host.

The 'magic' happens in the other two commands.

The last two bytes, 0x00 0x05, are what slow the zones down.

0x00 and 0x01 are the zones, or fan groupings, so in your case add two more commands for 0x02 and 0x03.

The last 0xXX is the fan speed percentage written in hex, so 0x64 would be 100%.
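
The percent-to-hex conversion is easy to get wrong by hand, so here is a small sketch using the standard `printf` utility. The function names `pct_to_hex` and `hex_to_pct` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for the speed byte of the raw command.

# pct_to_hex PERCENT: print an integer 0-100 as a two-digit hex byte.
pct_to_hex() {
    case $1 in
        ''|*[!0-9]*|0?*) return 1 ;;   # reject empty, non-digits, leading zeros
    esac
    [ "$1" -le 100 ] || return 1
    printf '0x%02x\n' "$1"
}

# hex_to_pct HEX: print a hex byte such as 0x20 as a decimal percentage.
hex_to_pct() {
    printf '%d\n' "$1"
}
```

For example, `pct_to_hex 100` prints `0x64` and `hex_to_pct 0x10` prints `16`.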

Ideally SMFC would ask you which sensor each of these zones should key off, and the min / max for that sensor. It would be easy, but he is making this out to be nuclear physics...
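
One way to read "min / max for that sensor" is a linear ramp between two temperatures. The sketch below shows that idea only; it is not how SMFC computes levels. The function name `temp_to_level` is hypothetical and the numbers in the usage note are illustrative.

```shell
#!/bin/sh
# Hypothetical mapping from a sensor temperature to a fan level (percent).

# temp_to_level TEMP MIN_TEMP MAX_TEMP MIN_LEVEL MAX_LEVEL
# Below MIN_TEMP the result is MIN_LEVEL, above MAX_TEMP it is MAX_LEVEL,
# and in between it rises linearly (integer division, rounded down).
temp_to_level() {
    t=$1 tmin=$2 tmax=$3 lmin=$4 lmax=$5
    [ "$tmax" -gt "$tmin" ] || return 1
    if [ "$t" -le "$tmin" ]; then echo "$lmin"; return 0; fi
    if [ "$t" -ge "$tmax" ]; then echo "$lmax"; return 0; fi
    echo $(( lmin + (t - tmin) * (lmax - lmin) / (tmax - tmin) ))
}
```

With a 30-70 degree window and a 20-100% range, `temp_to_level 50 30 70 20 100` prints `60`.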

Write a couple of versions of the script if you wish. One for silent at idle, for example: 0x10 (16%) is pretty quiet if your CPU and components stay cool enough. Good enough.

Have another for when the room is hotter or the system is doing more work, say 0x20 (32%).

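#!/bin/sh

# Set the fans to maximum first (mandatory, once after boot).
echo "Setting fans to maximum"
ipmitool -H 192.168.0.11 -U ADMIN -P PASSWORD -I lanplus raw 0x30 0x45 0x01 0x01

sleep 5

# Set each zone to a custom level. The last two bytes are the zone and the speed (percent in hex).
# On the X10DRG, repeat the command for zones 0x02 and 0x03.
echo "Setting fans to custom level"

ipmitool -H 192.168.0.11 -U ADMIN -P PASSWORD -I lanplus raw 0x30 0x70 0x66 0x01 0x00 0x05

ipmitool -H 192.168.0.11 -U ADMIN -P PASSWORD -I lanplus raw 0x30 0x70 0x66 0x01 0x01 0x05

echo "Waiting 10 seconds for sensors"

sleep 10

# Show the resulting fan speeds and CPU temperatures.
ipmitool -H 192.168.0.11 -U ADMIN -P PASSWORD -I lanplus sdr | grep FAN
ipmitool -H 192.168.0.11 -U ADMIN -P PASSWORD -I lanplus sdr | grep CPU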


      X10DRG · Issue #31 · petersulyok/smfc