GPU not at 100% when all CPU cores crunch
Author | Message |
---|---|
Joined: 30 Apr 10 Posts: 4 Credit: 38,050,684 RAC: 0 |
Hi all, I know the thread title might sound familiar, but I think I have run into a different problem/bug than the ones discussed here previously, so please bear with me. You see, I have this water-cooled gaming rig with a 6-core/12-thread CPU (Intel i7-970) and a Radeon HD7970 GPU. And of course, I use it to crunch for several BOINC projects, but MilkyWay@home is (currently) the only project I run on the GPU. Now normally, the HD7970 can burn through one MilkyWay@home workunit in about 50 seconds, and the GPU is utilized at 100% the entire time. But that happens only if at least one CPU core/thread is idle. If all CPU cores/threads are crunching, the GPU slows down considerably: most of the time, GPU utilization bounces between 40 and 80%. The MilkyWay@home workunits take about twice as long to crunch, too. When I disable CPU tasks, GPU utilization almost immediately jumps back to 100%. When I enable them, the GPU falls back into that 40 to 80% range, so there is definitely a pattern there. I updated to the latest BOINC Manager and graphics drivers, but nothing helped. Is this a known problem? Oh, I almost forgot, the PC runs Windows 7 Professional 64-bit and has 12 GB RAM. |
Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0 |
It's not a problem per se, it's just the nature of the beast. There's a lot of memory I/O involved in running GPU apps, so if you have all the cores busy doing other tasks (including your own use of the machine), something has to give. The graphics card therefore sometimes has to wait around for its turn to get to main memory, as your from-the-hip experiment demonstrated. ;-) |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"It's not a problem per se, it's just the nature of the beast." AND it is especially noticeable the better/faster the GPU is. The better/faster the GPU is, the greater its need for stuff to do, and all that stuff comes from the CPU, so if you don't leave a CPU core free the GPU will bog down A LOT! Some projects are able to fit the whole workunit into GPU memory, making this a non-problem; this project can't do that, as the workunit is just too big. DistRTgen can do that, as can Collatz and Moo, but at most projects the workunit is just too big to fit all of it into GPU memory and still have enough left over to crunch with. |
Joined: 30 Apr 10 Posts: 4 Credit: 38,050,684 RAC: 0 |
To tell the truth, I found your "high I/O and memory access" explanation a bit fishy. It would be a very unlikely coincidence that 11 CPU threads crunching were fine, but 12 threads suddenly created such a bottleneck that it slowed the GPU down to half speed. So I poked around a bit more and I think I found the true source of the problem. I think the MW@H app jumps wildly between CPU threads, which (among other undesirable things) causes those GPU slowdowns. When I use Windows Task Manager to force the "milkyway_separation__modified_fit_1.22_windows_x86_64__opencl_amd_ati" process to use only one thread (Windows calls this "process affinity"), the GPU jumps to 100% even if all CPU cores are crunching at full blast. Forcing the MW@H app to use just one CPU thread of course eliminates that wild jumping. The bad thing is that the affinity setting lasts for only one workunit, so unless it is fixed in the app itself, this solution is useless. I found no bug-report thread here, so if a moderator sees this, please forward this information to the MW@H programmers. BTW, this is not the first time I have encountered a problem like this, though AFAIR it would be the first for BOINC apps. I still vividly remember how many programs crashed or ran extremely slowly on the Athlon X2 (then a bleeding-edge desktop CPU). Even Windows XP itself needed a special patch to run properly. It rarely happens with modern programs, though. The last time I needed to mess with process affinity like this was when I had random crashes in Fallout 3. It ran fine on my previous 4-thread machine, but its programmers obviously never expected that 12-thread machines would come so soon... Hmm, is it possible somehow to force process affinity in BOINC when it starts the apps, by any chance? |
Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey there Pavel. We are looking into this. There is a chance it might be something weird with the BOINC scheduler, but we won't know until we get a chance to look into it some more. Sorry if it takes a little while to fix; we have a couple of other bug fixes to release this week, and then we will focus on this. Thanks for the report, Jake W. |
Joined: 30 Apr 10 Posts: 4 Credit: 38,050,684 RAC: 0 |
Oh wow, I didn't expect somebody from the MW@H team would notice so soon. That CPU thread affinity problem is no big deal; I can easily leave 1 core idle via "local computing preferences" in BOINC Manager. My current water-cooling solution struggles a bit with the almost 400-watt heat load from the GPU and CPU anyway. I primarily designed it to be as quiet as possible, not to dissipate that much heat 24/7. So until I solve that, I can't safely run the MW@H GPU app for longer than a few hours a day anyway. |
Joined: 27 Apr 10 Posts: 35 Credit: 90,828,595 RAC: 0 |
I had problems with this a few years ago on Folding@home. The system SHOULD give the GPU the cycles it needs in order to run properly, but sometimes it doesn't if you have the CPU running at 100%. The simplest fix is to just leave one core free (this is normally recommended anyway). You aren't going to lose much work from that one core, and it will be an overall net gain in production because your GPU won't be starving to death. |
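To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch using the figures from the first post (about 50 seconds per workunit with a free thread, 12 CPU threads in total). The 100-second figure is an assumption standing in for "about twice as long", and the helper name `units_per_hour` is just for illustration.

```python
def units_per_hour(seconds_per_unit):
    """Workunits finished per hour at a given time per workunit."""
    return 3600 / seconds_per_unit


gpu_free_core = units_per_hour(50)   # GPU at 100%: about 50 s per workunit
gpu_all_busy = units_per_hour(100)   # assumed 100 s, i.e. "about twice as long"
cpu_share_lost = 1 / 12              # one idle thread out of 12

print(gpu_free_core, gpu_all_busy, round(cpu_share_lost * 100, 1))
```

Under those assumptions, freeing one thread gives up roughly 8% of the CPU threads while doubling the GPU's workunit rate.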
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"Hmm, is it possible somehow to force process affinity in BOINC when it starts the apps, by any chance?" Yes, but every new unit goes back to the default settings. So YES, you can set a currently running unit to a certain CPU core, but as soon as that unit finishes, the next unit will revert to the defaults. You can do this by going into Task Manager, right-clicking the process, and clicking "Set Affinity". I am NOT a Linux guy, so you will have to figure out the Linux equivalents if you use Linux. Supposedly there are some interesting tweaks coming in some future BOINC versions, but I don't know if that is one of them or not. I am not a member of the BOINC mailing list or a programmer, so I don't know those kinds of things. |
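For Linux users, one possible equivalent of the Task Manager affinity setting is `os.sched_setaffinity` from Python's standard library (it exists on Linux only; the `taskset` command does the same job from a shell). The sketch below is a minimal illustration, not anything BOINC provides: the helper name `pin_to_one_cpu` is made up, and you would pass it the PID of the running MilkyWay process. As on Windows, the setting applies to that one process, so the next workunit starts with the defaults again.

```python
import os


def pin_to_one_cpu(pid=0):
    """Restrict `pid` (0 means the calling process) to a single CPU.

    Picks the lowest-numbered CPU the process is already allowed to use
    and returns that CPU number. Linux only.
    """
    allowed = os.sched_getaffinity(pid)
    cpu = min(allowed)
    os.sched_setaffinity(pid, {cpu})
    return cpu
```

Changing the affinity of a process owned by another user (for example, a BOINC service account) will typically require root privileges.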
Joined: 30 Apr 10 Posts: 4 Credit: 38,050,684 RAC: 0 |
As a temporary fix, I found a little program that can set process affinity automatically: http://bitsum.com/processlasso/ In case anyone else wants to try it, you will find "Configure default CPU affinities" in its Options menu. Just type *milkyway* (including the asterisks) in the "name match" field, select only one CPU on the right, and press "Add to list". Works like a charm; my GPU now crunches at 100% even when all CPU cores are busy. |
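For anyone who would rather script this than install a tool, the same idea can be sketched in Python. This is only a rough illustration and not how Process Lasso itself works: it assumes the third-party `psutil` package is installed, and the function names, the default pattern and the 5-second polling interval are all made up for the example. Because each workunit starts a new process, the script has to keep re-checking.

```python
import fnmatch
import time


def matches(process_name, pattern="*milkyway*"):
    """Case-insensitive wildcard match, similar to a "name match" field."""
    return fnmatch.fnmatch(process_name.lower(), pattern.lower())


def pin_matching(pattern="*milkyway*", cpus=(0,)):
    """Pin every running process whose name matches `pattern` to `cpus`.

    Returns the list of PIDs whose affinity was changed.
    """
    import psutil  # third-party; imported here so matches() works without it

    pinned = []
    for proc in psutil.process_iter(["name"]):
        try:
            name = proc.info["name"] or ""
            if matches(name, pattern) and proc.cpu_affinity() != list(cpus):
                proc.cpu_affinity(list(cpus))
                pinned.append(proc.pid)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue  # process exited or belongs to another user
    return pinned


def watch(pattern="*milkyway*", cpus=(0,), interval=5.0):
    """Poll forever; each new workunit is a new process that needs pinning."""
    while True:
        pin_matching(pattern, cpus)
        time.sleep(interval)
```

Calling `watch()` from a console and leaving it running should have roughly the same effect as the rule described above, on platforms where `psutil` supports CPU affinity (Windows and Linux among them).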
©2024 Astroinformatics Group