Run Multiple WU's on Your GPU

ProDigit Joined: 13 Nov 19 Posts: 9 Credit: 32,117,570 RAC: 0
Message 70275 - Posted: 26 Dec 2020, 0:16:43 UTC Last modified: 26 Dec 2020, 0:31:45 UTC

Perhaps related: for Nvidia RTX 2000 series GPUs, anything above 2 is useless. The RTX 2080 Ti does 3 WUs at the same time as it does 2 WUs, a tiny bit faster, but it'll also consume about 10W more power (5%) for finishing 2x3 WUs in less than 5% difference in time of 3x2 WUs.

AMD GPUs are a bit better for Einstein and Milkyway, because they have more DPP. I think because Milkyway is using DPP, more than what the GPUs actually have to offer, and thus is bottlenecking the GPU by a lot! (80-140W usage out of 150-195W limits). If that's the case, you could set GPU to 0.33 and 0.25, on RTX3080 and 3090.

My main issue is that I want to use only 1 Milkyway on my GPU, and combine it with another project that doesn't use a lot of 64bit (DPP) commands. I have 2 GPUs in my system, and want to run 1 Milkyway WU per GPU. With my current setup, 2 MW WUs are running on one GPU, and none on the second.

<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <max_concurrent>2</max_concurrent>
      <gpu_usage>0.3333</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
  <project_max_concurrent>2</project_max_concurrent>
</app_config>
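One possible reading of the two-GPU problem above, as a toy model only: assume the client puts each task on the first GPU that still has `gpu_usage` worth of room. That placement rule is an assumption made for illustration, not BOINC's actual scheduler, and `place_tasks` is a hypothetical helper. Under that assumption, 0.3333 lets both Milkyway tasks share the first card, while any value above 0.5 forces one per card:

```python
def place_tasks(n_tasks, gpu_usage, n_gpus):
    """Toy first-fit model (an assumption, not BOINC's real scheduler)."""
    used = [0.0] * n_gpus   # fraction of each GPU already claimed
    counts = [0] * n_gpus   # number of tasks placed on each GPU
    for _ in range(n_tasks):
        for gpu in range(n_gpus):
            # A task fits if the claimed fraction stays at or below 1.0.
            if used[gpu] + gpu_usage <= 1.0 + 1e-9:
                used[gpu] += gpu_usage
                counts[gpu] += 1
                break
        # A task that fits on no GPU is simply left unplaced.
    return counts


print(place_tasks(2, 0.3333, 2))  # [2, 0]: both tasks share the first GPU
print(place_tasks(2, 0.6, 2))     # [1, 1]: two tasks cannot share one GPU
```

Whether the real client packs tasks this way would need checking on the host itself; the sketch only shows that the arithmetic is consistent with the behaviour described in the post.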

Mr P Hucker Joined: 5 Jul 11 Posts: 993 Credit: 378,630,446 RAC: 5,302
Message 70280 - Posted: 27 Dec 2020, 18:44:17 UTC - in response to Message 70275.

> Perhaps related: for Nvidia RTX 2000 series GPUs, anything above 2 is useless. The RTX 2080 Ti does 3 WUs at the same time as it does 2 WUs, a tiny bit faster, but it'll also consume about 10W more power (5%) for finishing 2x3 WUs in less than 5% difference in time of 3x2 WUs. AMD GPUs are a bit better for Einstein and Milkyway, because they have more DPP.

Einstein is SP.

> I think because Milkyway is using DPP, more than what the GPUs actually have to offer, and thus is bottlenecking the GPU by a lot! (80-140W usage out of 150-195W limits). If that's the case, you could set GPU to 0.33 and 0.25, on RTX3080 and 3090.

Fine if you add another task that isn't MW. But two MW will not magically create more DP units in your card. Doubling up the same type of project on one card only solves one problem - the CPU is holding up the GPU. I've ceased bothering anyway, it doesn't gain you that much in any scenario.

Ryan Munro Joined: 22 Jun 09 Posts: 16 Credit: 81,410,790 RAC: 4,660
Message 70337 - Posted: 12 Jan 2021, 10:58:33 UTC - in response to Message 70280.

Hi Guys,

So I have tried running more than one unit and I can't get much of a boost from it. For example, if I run 1 unit the completion times are around 1:40; if I run 2 they are around 3:30, so if anything they are running slower.

The GPU is an RTX 3090. I checked the power usage: on a single unit it's pulling around 190w, and for two it's around 215w. I tried up to 5 and it seems to cap at 215w no matter the number of units. The card can pull up to 350w under full load (Einstein does), so it feels like for whatever reason MW@H can't fully utilize my card even with multiple WU's.

Any ideas on how to fully use the card with multiple WU's?

My config is for two units:

<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.05</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
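To put numbers on the comparison above, here is a small sketch that turns the reported runtimes and power draw into throughput and energy per task. It assumes the times are minutes:seconds and that the wattage figures are steady averages; the helper names are made up for this example.

```python
def to_seconds(mmss):
    """Convert an 'M:SS' string into seconds (assumes minutes:seconds)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def tasks_per_hour(concurrent, runtime):
    """Tasks finished per hour when `concurrent` tasks each take `runtime`."""
    return concurrent * 3600 / to_seconds(runtime)


def joules_per_task(watts, concurrent, runtime):
    """Energy per finished task at a steady draw of `watts`."""
    return watts * to_seconds(runtime) / concurrent


print(tasks_per_hour(1, "1:40"))             # 36.0
print(round(tasks_per_hour(2, "3:30"), 2))   # 34.29
print(joules_per_task(190, 1, "1:40"))       # 19000.0
print(joules_per_task(215, 2, "3:30"))       # 22575.0
```

On these figures, two at a time finishes slightly fewer tasks per hour and spends more energy on each one, which matches the impression in the post.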

Mr P Hucker Joined: 5 Jul 11 Posts: 993 Credit: 378,630,446 RAC: 5,302
Message 70339 - Posted: 12 Jan 2021, 19:02:44 UTC - in response to Message 70337.

> Hi Guys,
> So I have tried running more than one unit and I can't get much of a boost from it. For example, if I run 1 unit the completion times are around 1:40; if I run 2 they are around 3:30, so if anything they are running slower.
> The GPU is an RTX 3090. I checked the power usage: on a single unit it's pulling around 190w, and for two it's around 215w. I tried up to 5 and it seems to cap at 215w no matter the number of units. The card can pull up to 350w under full load (Einstein does), so it feels like for whatever reason MW@H can't fully utilize my card even with multiple WU's.
> Any ideas on how to fully use the card with multiple WU's?
> My config is for two units:
> <app_config>
>   <app>
>     <name>milkyway</name>
>     <gpu_versions>
>       <gpu_usage>0.5</gpu_usage>
>       <cpu_usage>0.05</cpu_usage>
>     </gpu_versions>
>   </app>
> </app_config>

You may or may not get a boost; it depends on your CPU and GPU relative speeds. Ignore what I said in my last post, I'm now doubling up in some circumstances.

If the CPU is slow (per core, as the tasks will only use 1 core) and the GPU is fast, then doubling or more will help. If you have a good CPU and it's not doing much else, and your GPU isn't that fast, then you might not need to.

It also depends on what you're running on the GPU; some projects have apps that require more CPU than others. Einstein needs a lot, Collatz doesn't need much at all. MW needs it in chunks as it starts each of the tasks inside the bundle (currently a bundle of 4, so you'll see it pausing at 25, 50, 75%).

What I do is watch the GPU usage in something like GPU-Z or MSI Afterburner, something with a graph of usage. If it's pretty much 100% all the time, leave it alone. If it's dipping a lot, try doubling and see if it goes higher. You can also try freeing up CPU cores.
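The "watch the graph" rule of thumb above could be written down roughly like this. It is only a sketch: the 95% "busy" level and the 10% share of dips are made-up thresholds, and the samples are assumed to be GPU utilisation percentages noted down from a monitoring tool.

```python
def should_try_more_tasks(samples, busy=95.0, max_dip_share=0.10):
    """Return True if GPU usage dips below `busy` often enough to be worth
    testing an extra concurrent task. Both thresholds are hypothetical."""
    if not samples:
        raise ValueError("need at least one utilisation sample")
    dips = sum(1 for s in samples if s < busy)
    return dips / len(samples) > max_dip_share


# Pretty much pegged at 100%: leave it alone.
print(should_try_more_tasks([99, 100, 98, 100]))            # False
# Dipping a lot: try doubling and watch whether usage rises.
print(should_try_more_tasks([100, 60, 100, 40, 100, 70]))   # True
```

A True result only suggests an experiment; as the post says, the check is whether the usage actually goes higher afterwards.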

Dunx Joined: 13 Feb 11 Posts: 31 Credit: 1,403,524,537 RAC: 0
Message 70341 - Posted: 12 Jan 2021, 23:02:27 UTC

You need to build a tired old PC and add a few HD 7970 or R9 280X cards to it....

Just because Nvidia sell you a high priced item doesn't make it suitable for efficient use on this project. I bought a used Gen 1 GTX Titan Black for the high DP and extra RAM, but it is still 3x slower than a cheap R9 280X.

I currently run a Radeon VII and it spews out a result every 11 seconds, running multiple tasks on a quad core CPU.

I would say Nvidia cards would be better used for GPUGrid, but they have no work available....

dunx

Jim1348 Joined: 9 Jul 17 Posts: 100 Credit: 16,967,906 RAC: 0
Message 70342 - Posted: 13 Jan 2021, 2:40:44 UTC - in response to Message 70341.

> I would say Nvidia cards would be better used for GPUGrid, but they have no work available....

Nvidia is certainly not efficient with MW. Folding has a CUDA app now, and enough work. They are past their shortages. You can run the GPU on Folding and, by eliminating the CPU slot (enabled by default), run BOINC on the CPU if you want to.

https://foldingathome.org/start-folding/

Mr P Hucker Joined: 5 Jul 11 Posts: 993 Credit: 378,630,446 RAC: 5,302
Message 70347 - Posted: 13 Jan 2021, 18:23:12 UTC - in response to Message 70341.

> You need to build a tired old PC and add a few HD 7970 or R9 280X cards to it....
> Just because Nvidia sell you a high priced item doesn't make it suitable for efficient use on this project. I bought a used Gen 1 GTX Titan Black for the high DP and extra RAM, but it is still 3x slower than a cheap R9 280X.
> I currently run a Radeon VII and spews out a result every 11 seconds, running multiple tasks on a quad core CPU.
> I would say Nvidia cards would be better used for GPUGrid, but they have no work available....
> dunx

Yip, I always get old AMD cards for Boinc. I'm appalled at Nvidia's attitude making the DP so slow.

S984s5KN6muKjYePgfqf7F37RiXw5f... Joined: 8 May 09 Posts: 3339 Credit: 524,398,411 RAC: 1,539
Message 70351 - Posted: 14 Jan 2021, 4:30:26 UTC - in response to Message 70347. Last modified: 14 Jan 2021, 4:31:37 UTC

> > You need to build a tired old PC and add a few HD 7970 or R9 280X cards to it....
> > Just because Nvidia sell you a high priced item doesn't make it suitable for efficient use on this project. I bought a used Gen 1 GTX Titan Black for the high DP and extra RAM, but it is still 3x slower than a cheap R9 280X.
> > I currently run a Radeon VII and spews out a result every 11 seconds, running multiple tasks on a quad core CPU.
> > I would say Nvidia cards would be better used for GPUGrid, but they have no work available....
> > dunx
>
> Yip, I always get old AMD cards for Boinc. I'm appalled at Nvidia's attitude making the DP so slow.

Some projects do do better with Nvidia cards but as you said those requiring DP do better with AMD cards.

Mr P Hucker Joined: 5 Jul 11 Posts: 993 Credit: 378,630,446 RAC: 5,302
Message 70369 - Posted: 15 Jan 2021, 18:05:55 UTC - in response to Message 70351.

> Some projects do do better with Nvidia cards but as you said those requiring DP do better with AMD cards.

I'll be getting a very fast Nvidia one day for a game, I guess I'll allocate that to SP projects only. Unless I find an AMD that's as good. I need 4K at 120 frames per second.

Chooka Joined: 13 Dec 12 Posts: 101 Credit: 1,782,952,901 RAC: 0
Message 70418 - Posted: 21 Jan 2021, 7:52:45 UTC

Your 3090 will crush tasks at PrimeGrid....... although I've seen with GFN-16 tasks, it's not much quicker than a 3060Ti! Same can't be said for other PG sub projects though, where the 3090 will be king.

Mr P Hucker Joined: 5 Jul 11 Posts: 993 Credit: 378,630,446 RAC: 5,302
Message 70421 - Posted: 21 Jan 2021, 19:45:17 UTC - in response to Message 70418.

> Your 3090 will crush tasks at primegrid....... although I've seen with GFN-16 tasks, its not much quicker than a 3060Ti! Same can't be said for other PG sub projects though where the 3090 will be king.

It seems impossible to get a straight answer anywhere as to which projects use DP (and are therefore best on old AMDs) and which are SP (which are therefore best on new Nvidias). Presumably the developers must know! Mainly from my own testing with a few cards to see how fast they go, it appears that:

Einstein=SP
Milkyway=DP
Collatz=SP
Primegrid=half of each

Gibbzy1991 Joined: 14 Apr 17 Posts: 5 Credit: 361 RAC: 0
Message 70425 - Posted: 22 Jan 2021, 18:10:42 UTC - in response to Message 70421.

SRBase seems to be SP

Wrend Joined: 4 Nov 12 Posts: 96 Credit: 251,528,484 RAC: 0
Message 71100 - Posted: 5 Sep 2021, 5:16:04 UTC - in response to Message 69585. Last modified: 5 Sep 2021, 5:37:39 UTC

> Just curious to see if anyone has really studied whether there's an appreciable gain in running >1 WU/GPU? New here at MW@H, but in previous tests at other projects I never saw any significant gain.
> Regards, Jim ...

Sorry for the late reply, but yes, for my Titan Black cards with DP optimization set in the Nvidia Control Panel it makes a huge difference. Running just one task only loads up one of my GPUs about 1/6 load. If I want to fully load up my GPUs (which I usually don't actually) I would have to run 5 to 6 tasks simultaneously per GPU for a total of up to 12. At certain parts of the different tasks it seems the GPUs will sometimes briefly spike up to near 100% though.

Currently I'm running 2 tasks per GPU as they are running hot (they probably need new thermal paste) and to not drain so much power. With 2 tasks running per GPU (4 total) each card is loaded about 34%.

CPU usage is so low for this project on this computer that it is generally negligible, so I'm running 4 tasks (using 4 of 12 threads) for Einstein@Home as well currently to load up my CPU about 1/3 too.
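The "5 to 6 tasks per GPU" estimate above follows from simple proportion if load is assumed to scale linearly with the number of tasks, which is a simplification. A short sketch with a hypothetical helper name:

```python
import math


def tasks_to_fill(load_percent, tasks_running):
    """Estimate how many concurrent tasks would bring a GPU to 100% load,
    assuming load grows linearly with task count (a simplification)."""
    per_task = load_percent / tasks_running
    # The small epsilon keeps exact divisions from rounding up by one.
    return math.ceil(100.0 / per_task - 1e-9)


# About 34% load with 2 tasks running, as reported above.
print(tasks_to_fill(34, 2))  # 6
```

Brief spikes to near 100% within a task, as mentioned in the post, mean real scaling may flatten out before this number is reached.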

Keith Myers Joined: 24 Jan 11 Posts: 739 Credit: 571,418,058 RAC: 60,750
Message 71101 - Posted: 5 Sep 2021, 7:17:22 UTC - in response to Message 56960. Last modified: 5 Sep 2021, 7:19:10 UTC

deleted

Septimus Joined: 8 Nov 11 Posts: 205 Credit: 2,905,857 RAC: 0
Message 71105 - Posted: 8 Sep 2021, 15:00:21 UTC

Is there any possibility that Intel will be added to the list of supported GPUs, please?

Ryan Munro Joined: 22 Jun 09 Posts: 16 Credit: 81,410,790 RAC: 4,660
Message 71156 - Posted: 26 Sep 2021, 9:42:57 UTC

No matter how many WU's I run (2 / 4 / 8 units), my card won't use more than 210w when crunching. It's a 3090, and under WCG, running 8 of their GPU units at the same time, it will pull the full 350w. Card is a 390 btw. Any ideas?

Here is my config file:

<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>50</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.125</gpu_usage>
      <cpu_usage>0.05</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

S984s5KN6muKjYePgfqf7F37RiXw5f... Joined: 8 May 09 Posts: 3339 Credit: 524,398,411 RAC: 1,539
Message 71157 - Posted: 26 Sep 2021, 14:10:06 UTC - in response to Message 71156.

> No matter how many WU's I run my card won't use more than 210w when crunching, 2 / 4 / 8 units, its a 3090 and under WCG running 8 of their GPU units at the same time it will pull the full 350w, card is a 390btw, any ideas?
> Here is my config file:
> <app_config>
>   <app>
>     <name>milkyway</name>
>     <max_concurrent>50</max_concurrent>
>     <gpu_versions>
>       <gpu_usage>0.125</gpu_usage>
>       <cpu_usage>0.05</cpu_usage>
>     </gpu_versions>
>   </app>
> </app_config>

Changing this line

<gpu_usage>0.125</gpu_usage>

to be 0.0625 would let you run 16 units at one time. BUT the problem comes down to MW not being able to supply you with enough tasks in one day to keep it going. With their 10 minute back-off between sending tasks, your 3090 will be doing something else for those 10 minutes even more often.
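The 8 and 16 figures above come from dividing one GPU by `gpu_usage`. A rough sketch of that arithmetic, including the `max_concurrent` cap from the config; it assumes the usual reading that tasks are started while their `gpu_usage` values sum to at most 1 per GPU, and the function names are made up for this example.

```python
import math


def tasks_per_gpu(gpu_usage):
    """Tasks that fit on one GPU if the sum of gpu_usage must stay <= 1."""
    # The small epsilon guards against floating-point results just under a whole number.
    return math.floor(1.0 / gpu_usage + 1e-9)


def running_tasks(gpu_usage, n_gpus=1, max_concurrent=None):
    """Total tasks across all GPUs, capped by max_concurrent when it is set."""
    total = tasks_per_gpu(gpu_usage) * n_gpus
    return total if max_concurrent is None else min(total, max_concurrent)


print(tasks_per_gpu(0.125))                      # 8
print(tasks_per_gpu(0.0625))                     # 16
print(running_tasks(0.125, max_concurrent=50))   # 8
```

With a single card, the `max_concurrent` of 50 in the posted config never comes into play; `gpu_usage` is the limit that matters.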

Joseph Stateson Joined: 18 Nov 08 Posts: 291 Credit: 2,464,264,889 RAC: 2,477
Message 71158 - Posted: 26 Sep 2021, 14:45:23 UTC - in response to Message 71157. Last modified: 26 Sep 2021, 14:47:36 UTC

> No matter how many WU's I run my card won't use more than 210w when crunching, 2 / 4 / 8 units, its a 3090 and under WCG running 8 of their GPU units at the same time it will pull the full 350w,

With the price of graphics cards skyrocketing, I would avoid running any card at its max rating. High temps cause thermal paste to harden, making removal non-trivial. Exact OEM fan replacements can be hard to find and one must be creative on occasion. I have a few really weird fan arrangements I can post if anyone is interested.

> BUT the problem comes down to MW not being able to supply you with enough tasks in one day to keep it going, with their 10 minutes back-off between sending tasks your 3090 will be doing something else for those 10 minutes even more often.

That BOINC app I modded, 7.15.0, fixes the 10 minute wait. The latest official version is 7.16.11 and I assume the 10 minute problem still exists for that app.

Due to temperatures recently dropping here in Texas, I started up a pair of garage "racks" to start crunching on Einstein and WCG. I have a 3rd rack for Milkyway but the garage is still too hot to run that one.

S984s5KN6muKjYePgfqf7F37RiXw5f... Joined: 8 May 09 Posts: 3339 Credit: 524,398,411 RAC: 1,539
Message 71160 - Posted: 27 Sep 2021, 10:29:59 UTC - in response to Message 71158.

> > No matter how many WU's I run my card won't use more than 210w when crunching, 2 / 4 / 8 units, its a 3090 and under WCG running 8 of their GPU units at the same time it will pull the full 350w,
>
> With the price of graphic cards skyrocketing, I would avoid running any card at its max rating. High temps cause thermal paste to harden making removal non-trivial. Exact OEM fan replacement can be hard to find and one must be creative on occasions. I have a few really weird fan arrangements I can post if anyone interested.
>
> > BUT the problem comes down to MW not being able to supply you with enough tasks in one day to keep it going, with their 10 minutes back-off between sending tasks your 3090 will be doing something else for those 10 minutes even more often.
>
> That BOINC app I modded, 7.15.0 fixes the 10 minute wait. The latest official version is 7.16.11 and I assume the 10 minute problem still exists for that app.
> Due to temperatures recently dropping here in texas, I started up a pair of garage "racks" to start crunching on Einstein and WCG. I have a 3rd rack for milkyway but the garage is still too hot to run that one.

YES your app did fix it, but unfortunately not everyone uses it.

Toby Broom Joined: 13 Jun 09 Posts: 24 Credit: 173,240,368 RAC: 311,701
Message 71164 - Posted: 28 Sep 2021, 15:45:38 UTC - in response to Message 71157.

However, if you have a work buffer, should BOINC not cache a larger number of WU's so the 10 min back-off becomes less of an issue?

I also assume that it would be better for the project to create larger work units, or at least offer an option for larger ones?