Adding CPU core pinning and parking capability #416

Merged Dec 4, 2023 (34 commits)

HenrikHolst
Contributor

First attempt at merging this patch set, which adds CPU core pinning and parking capability. There is automatic logic that tries to detect whether we are running on a CPU with non-uniform L3 cache (say 7900X3D or 7950X3D) or on a CPU with non-uniform frequencies (say Intel Alder Lake, Raptor Lake and so on, with E-cores and P-cores).

By default the code pins all the threads of the game to the cores with the better L3 or frequency, but it can also be configured to completely park the less desired cores (akin to how the Xbox Game Bar works in Windows on the 7900X3D and 7950X3D). Parking requires the aid of a new helper utility, cpucorectl, run under polkit, since parking requires root privileges.

If the logic fails or is inadequate, it is also possible to manually set which cores to pin to or which cores to park in the config file. There is also a built-in safety check that stops the user or the logic from disabling all cores: at least 4 cores have to remain available for the game.
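For illustration only, the L3-based selection together with the safety check could look roughly like the sketch below. It is written in Python rather than the C used in gamemode-cpu.c, the names `pick_preferred_cores` and `MIN_CORES` are made up for this example, and the cache sizes are illustrative numbers passed in as a dict instead of being read from the system. The actual patch may differ in all of these details.

```python
MIN_CORES = 4  # the description above says at least 4 cores must stay available


def pick_preferred_cores(l3_kib_by_cpu):
    """Return the sorted CPUs with the largest L3 cache, or None to do nothing.

    l3_kib_by_cpu maps a logical CPU number to its L3 size in KiB
    (hypothetical input format for this sketch).
    """
    if not l3_kib_by_cpu:
        return None
    best = max(l3_kib_by_cpu.values())
    if all(size == best for size in l3_kib_by_cpu.values()):
        return None  # uniform L3: nothing to prefer
    preferred = sorted(cpu for cpu, size in l3_kib_by_cpu.items() if size == best)
    if len(preferred) < MIN_CORES:
        return None  # safety check: never leave the game with too few cores
    return preferred


# Illustrative 32-thread layout where CPUs 0-7 and 16-23 share the larger cache.
example = {cpu: (98304 if cpu % 16 < 8 else 32768) for cpu in range(32)}
print(pick_preferred_cores(example))
```

Returning None in the uniform and too-few-cores cases keeps the default behaviour (no pinning) whenever the detection is not clearly useful.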

Added common files for the cpu core parking/pinning functionality
added build info for common-cpu to meson
added the cpu core parking/pinning settings to gamemode-config.c
added cpu core parking/pinning settings to gamemode-config.h
call the cpu core parking/pinning from gamemode-context.c
Added gamemode-cpu.c which contains the functions for cpu core parking and pinning
added the cpu core parking/pinning definitions to gamemode.h
Added build info for gamemode-cpu.c to meson
added the polkit policy for the cpucorectl utility so we can call it without being root
added polkit rules so we can run the cpucorectl utility without being root
added some info about the new cpu core parking and pinning settings in gamemode.ini
Added a utility to enable and disable cpu cores (aka core parking) since this requires root privileges
added build info for the new cpucorectl utility to meson
Added detection for big.LITTLE, aka CPUs where not all cores have the same frequency, like Intel Alder Lake and newer. The current logic allows a 5% difference in the max frequency due to some reports that those CPUs don't always give back the exact same value (possibly due to boosting capability).
use defines instead of values for the park_or_pin variable
use a define instead of values for the park_or_pin variable
if cpu core parking/pinning was disabled by the logic then there would be a double free at exit
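The 5% tolerance from the big.LITTLE detection commit above can be sketched as follows. This is an assumption-laden Python illustration, not the code from the patch: `pick_fast_cores` and `FREQ_TOLERANCE` are hypothetical names, and the frequencies are example values in kHz supplied as a dict.

```python
FREQ_TOLERANCE = 0.05  # cores within 5% of the top max frequency count as "fast"


def pick_fast_cores(max_khz_by_cpu, tolerance=FREQ_TOLERANCE):
    """Return the sorted CPUs whose max frequency is within `tolerance`
    of the highest one, or None if every CPU qualifies (uniform CPU)."""
    if not max_khz_by_cpu:
        return None
    top = max(max_khz_by_cpu.values())
    cutoff = top * (1.0 - tolerance)
    fast = sorted(cpu for cpu, khz in max_khz_by_cpu.items() if khz >= cutoff)
    if len(fast) == len(max_khz_by_cpu):
        return None  # no meaningful split between fast and slow cores
    return fast


# Example values: four cores near the top frequency, two clearly slower ones.
example = {0: 5200000, 1: 5200000, 2: 5100000, 3: 5100000, 4: 3900000, 5: 3900000}
print(pick_fast_cores(example))
```

With the tolerance in place, cores that report slightly different boost limits (5.1 GHz versus 5.2 GHz here) still land in the same group, while the much slower cores do not.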

cfebs commented May 26, 2023

@HenrikHolst is there a good way to verify that the core pinning is working correctly while a game is running?
Want to give the branch a go!


HenrikHolst commented May 26, 2023

@cfebs Use ps or top to find the PID of the game once it is running, then run "taskset -ap pid-number" to see the CPU mask for the process and all its threads. Depending on how many cores your CPU has, the default mask would be something like "ffff" if the process is allowed to run on all cores, and something like "ff00" or "00ff" if it is masked out of 8 of the cores (each f here represents 4 cores).
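To make the mask arithmetic concrete, here is a small Python sketch that turns a hexadecimal affinity mask into the list of CPU numbers it allows. The helper name `cpus_from_mask` is invented for this example; bit 0 of the mask corresponds to CPU 0.

```python
def cpus_from_mask(mask_hex):
    """Decode a hex affinity mask (as printed by taskset -p) into CPU numbers."""
    mask = int(mask_hex, 16)
    # Bit N set means the thread may run on logical CPU N.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(cpus_from_mask("00ff"))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(cpus_from_mask("ff00"))  # [8, 9, 10, 11, 12, 13, 14, 15]
```

Each hex digit covers four CPUs, so "ff00" allows CPUs 8 to 15 and "00ff" allows CPUs 0 to 7.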

cfebs commented May 26, 2023

Thanks for the walk through! Here are my results with a 7950X3D and pin_cores=yes set in ~/.config/gamemode.ini

❯ pgrep supertuxkart
205593
❯ gamemoded --status=205593
gamemode is active and [205593] registered

Here's the result of taskset -ap 205593 https://gist.github.com/cfebs/d78f9fdafbaeafbc7eb750c3c5b789de so that would mean 48/61 threads are masked?

Was also just using watch -d on taskset and it was not changing over time FWIW.

If you would like any help testing other patches LMK!
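Counting masked threads by eye gets tedious, so here is a rough Python sketch that tallies them from saved `taskset -ap` output. The function name `count_restricted` is made up, the sample text in the usage note is invented (it is not the content of the gist above), and the line format is assumed to match the "pid N's current affinity mask: X" lines quoted in this thread.

```python
import re

# Matches lines such as: pid 244593's current affinity mask: ff00ff
LINE_RE = re.compile(r"pid (\d+)'s current affinity mask: ([0-9a-fA-F]+)")


def count_restricted(taskset_output, all_cpus_mask):
    """Return (restricted, total): how many threads have a mask that
    differs from all_cpus_mask, and how many threads were listed."""
    restricted = 0
    total = 0
    for match in LINE_RE.finditer(taskset_output):
        total += 1
        if int(match.group(2), 16) != all_cpus_mask:
            restricted += 1
    return restricted, total
```

For example, `count_restricted(text, 0xffffffff)` on a 32-thread CPU reports how many of the listed threads no longer have the default all-cores mask.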

@HenrikHolst

That looks correct, yes. Most likely the game launched new threads some time after launch, so it looks like we would have to store the PIDs that we watch and reapply the mask on a regular basis to catch cases like this.

cfebs commented May 26, 2023

Did another test with a steam game (if just to provide more info for other testers!). Also tried to debug which procs gamemode considers registered:

❯ ps haxo pid | sed 's/\s\+//g' | xargs -I{} gamemoded --status={} \; 2>&1 | grep '] registered'
gamemode is active and [244593] registered
gamemode is active and [244615] registered

The main proc that was taking most cpu time was different though: 244861 which is probably just a child of the main steam launcher procs.

Here's the taskset results:

  • pid 244593's current affinity mask: ff00ff
  • pid 244615's current affinity mask: ff00ff
  • 244861 (https://gist.github.com/cfebs/11510e9bc3459e6e5f690eb90656bfef) indeed looks like the actual game. Most threads are pinned, but it shows the same behavior of some threads launching later.

And here's my ps + cli args for those procs: https://gist.github.com/cfebs/50e8c6b4198cce20db713bbfad451734

> looks like we would have to store the pids that we watch and apply the mask on a regular basis to catch cases like this

Gotcha, I'll keep using this build for as long as it's stable - thank you for the branch and the quick responses.

made core pinning optionally silent, used for when the reaper thread calls us repeatedly so we don't create tons of unnecessary logs
Reapply the core pinning from the reaper thread to catch cases where the game launches threads after initial start
walk through /proc/pid/task to make sure that we set the thread affinity for every single thread in the process
@HenrikHolst

I have now added some patches that reapply the core pinning in the gamemode reaper thread (by default it runs once every 5 seconds), and I also make sure that we really hit every single thread of the process.
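As a rough picture of what "hit every single thread" means, the sketch below walks /proc/pid/task and sets the affinity of each thread ID it finds. It is a Python approximation under assumptions, not the C code from the patch: `pin_all_threads` is a hypothetical name, and the `proc_root` and `set_affinity` parameters exist only so the function can be exercised without touching real processes.

```python
import os


def pin_all_threads(pid, cpus, proc_root="/proc", set_affinity=None):
    """Apply the CPU set `cpus` to every thread of `pid`.

    Returns the number of threads that were pinned successfully.
    """
    if set_affinity is None:
        set_affinity = os.sched_setaffinity  # available on Linux only
    task_dir = os.path.join(proc_root, str(pid), "task")
    try:
        names = os.listdir(task_dir)  # one entry per thread ID
    except FileNotFoundError:
        return 0  # the process has already exited
    pinned = 0
    for name in names:
        if not name.isdigit():
            continue
        try:
            set_affinity(int(name), cpus)
            pinned += 1
        except (ProcessLookupError, PermissionError):
            pass  # thread vanished or is not ours; skip it
    return pinned
```

Calling something like this from a periodic loop (every 5 seconds, matching the reaper interval mentioned above) would also catch threads that the game starts after launch.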

cfebs commented May 27, 2023

Awesome, here's the first test with supertuxkart:

pinning_tux_1.mp4

Looks great so far!

@HenrikHolst

nice!, thanks for the confirmation!

cfebs commented May 27, 2023

Here are some Steam tests! Each using gamemoderun %command% launch option and then observing with watch -n0.5 'pgrep $EXE | xargs taskset -ap'. Games launch around 7-10s mark.

So both look like there are a few threads that stay default masked. In DS3 there's more mask variety :)

In both cases using gamemoded --status was getting: gamemode is active but [...] not registered

@HenrikHolst

It could be that gamemode for some reason couldn't find the process ID or something. One thing to try is to run "gamemodelist" to see if the process is listed as being registered.

Another is to edit /etc/xdg/systemd/user/gamemoded.service and change "ExecStart=/usr/bin/gamemoded" to "ExecStart=/usr/bin/gamemoded -l" (the -l makes gamemoded log more to the journal). Then run "systemctl --user daemon-reload && systemctl --user stop gamemoded.service" to let the change take effect, run "journalctl --user /usr/bin/gamemoded -f" in a terminal, start the game, and see what gamemoded logs.

cfebs commented May 28, 2023

Thanks for the logging tip.
I just re-installed pkg manager version of gamemode (v1.7) to test the behavior with Dark Souls 3 and:

❯ gamemodelist 
    PID    PPID USER      NI PSR COMMAND
 187873    6150 $USER     0  14 reaper
 187989  187965 $USER     0  12 python3
 187991  187989 $USER     0   4 steam.exe
 187993  187965 $USER     0  22 wineserver
 187997  187965 $USER     0   6 services.exe
 188000  187965 $USER     0  16 winedevice.exe
 188010  187965 $USER     0  17 winedevice.exe
 188043  187965 $USER     0   8 plugplay.exe
 188051  187965 $USER     0   6 svchost.exe
 188058  187965 $USER     0  21 explorer.exe
 188075  187965 $USER     0   8 rpcss.exe
 188085  187965 $USER     0  16 tabtip.exe
 188119  187965 $USER     0  23 DarkSoulsIII.ex

❯ gamemoded --status=188119
gamemode is active but [188119] not registered

❯ gamemoded --status=187965
gamemode is active but [187965] not registered

So I guess that's just how gamemode works: gamemodelist can list a pid but it's not returned as "registered" in --status. Good to know :)

Hope to see this merged! And LMK if need any more testing.

@HenrikHolst

thanks for the tests! Hopefully this will increase the fps for the games that do get registered.

@HenrikHolst

closes #412

greg2010 commented Sep 5, 2023

Took it out for a spin with Starfield, can confirm that the affinity is correctly set for all child processes. Also, got a noticeable bump in FPS in crowded areas.

@afayaz-feral any chance this could be merged? This is a major boon for asymmetric CPU owners.

afayaz-feral mentioned this pull request Sep 11, 2023
afayaz-feral merged commit 9614cec into FeralInteractive:master Dec 4, 2023
cfebs commented Dec 4, 2023

🎉

afayaz-feral added a commit that referenced this pull request Dec 4, 2023
The original PR #416 failed the format check, but this wasn't apparent
until after merging.
@HenrikHolst

> 7900X3D user here. Noticed at least 3 to 12 more fps in Akila City in Starfield! (I was stuck at 60 there but am now reaching above 70!) Some time ago I disabled one entire CCD (the one without 3D V-Cache) through the BIOS, tested the same scene in Akila City, and had 0 fps gain, yet gamemode can give more fps? How is this possible? Thank you very much.

Hard to say, but it could be that when the game is pinned, rather than the entire CCD being disabled, the rest of the system can still use the other CCD's cores, so the game gets more CPU time than it would otherwise.

@HenrikHolst

> BTW taskset -ap is always showing all fs for me and also mangohud is showing all cores are active when I use park_cores=yes in config file...

Probably better to open an issue for that than to continue here on the MR, but check the gamemode log. In order to park the cores an external utility has to be used, which means that polkit is also used, and if you are on, say, Ubuntu < 23.04 or some version of Debian, then the included polkit rules are incompatible and have to be manually remade to work.
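Independently of the gamemode log, one way to check whether any cores were actually parked is to look at the kernel's list of online CPUs in /sys/devices/system/cpu/online, which uses a range format such as "0-7,16-23". The Python helper below (the name `parse_cpu_list` is made up for this sketch) expands that format into individual CPU numbers.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-7,16-23" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        # A single number like "5" has no "-", so reuse it as the upper bound.
        cpus.extend(range(int(low), int(high or low) + 1))
    return cpus


print(parse_cpu_list("0-7,16-23"))
```

If the list read from that file while the game is running still contains every CPU, the parking step did not take effect, which would point towards the helper or polkit setup described above.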
