Welcome to MilkyWay@home

GPU tasking with APP CONFIG


Message boards : Number crunching : GPU tasking with APP CONFIG
MatLioLeC
Joined: 2 May 18
Posts: 8
Credit: 63,038,294
RAC: 0
Message 68134 - Posted: 11 Feb 2019, 0:47:24 UTC

Hello. I was wondering if someone could help me with app_config.

I'm using BOINC right now, mostly to run MilkyWay@home. I recently purchased an R9 280X with a decent FP64 rating and I've been messing around with the configuration.

I think I found a very good configuration, but it's a bit different.

See, when I run one task at a time, it gets done in about 35-41 seconds. When I load up 4 at once, I get an average of 32 seconds per task... and when I load up 19 at a time, I get an average of 27 seconds per task!

But there's a catch: it has to do the work in blocks. Right now I'd like to set it to do 14 tasks at a time, which seems somewhat ideal, so that's 0.07 GPU fraction, and I have it at 0.1 CPU. But I want it to run 14 tasks and not take on another before all 14 are done, and once the 14 are done, load another 14 tasks.

I'm not sure what kind of line I would need to add to app_config. Please help with this little problem, thank you.

<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>30</max_concurrent>
    <fraction_done_exact/>
    <gpu_versions>
      <gpu_usage>.07</gpu_usage>
      <cpu_usage>.1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
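
For readers following along, here is a minimal Python sketch of how the two numbers in that config interact. It assumes (this is not stated in the thread) that the BOINC client keeps starting tasks on a GPU while the summed gpu_usage stays at or below 1.0, and that max_concurrent caps the total for the app. The function names are made up for illustration. Nothing in it models "wait until the whole batch of 14 is done"; it only estimates how many tasks run side by side.

```python
from fractions import Fraction


def concurrent_gpu_tasks(gpu_usage: str, max_concurrent: int, n_gpus: int = 1) -> int:
    """Estimate how many tasks of one app run at the same time.

    Assumption: the client starts tasks on each GPU while the summed
    gpu_usage stays <= 1.0, and max_concurrent caps the app-wide total.
    gpu_usage is passed as a string so Fraction parses it exactly.
    """
    per_gpu = int(1 // Fraction(gpu_usage))  # whole tasks that fit on one GPU
    return min(per_gpu * n_gpus, max_concurrent)


def cpu_budget(cpu_usage: str, tasks: int) -> Fraction:
    """Total CPU share the client budgets for the given number of GPU tasks."""
    return Fraction(cpu_usage) * tasks


tasks = concurrent_gpu_tasks("0.07", max_concurrent=30)
print(tasks)                             # 14 tasks at once on one GPU
print(float(cpu_budget("0.1", tasks)))   # 1.4 CPU cores budgeted in total
```

Under that assumption, 0.07 gives 14 tasks at a time because 14 x 0.07 = 0.98 fits and 15 x 0.07 = 1.05 does not, so the max_concurrent of 30 never becomes the limit on a single card.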
bluestang
Joined: 13 Oct 16
Posts: 106
Credit: 1,051,726,469
RAC: 241,191
Message 68140 - Posted: 12 Feb 2019, 0:40:08 UTC - in response to Message 68134.  

3 or 4 WUs at most is the most efficient for a 280X, depending on your core frequency. You also need to leave some CPU cores free to help out with these WUs.

You can't do what you're trying to do with any config file, AFAIK. I'm not sure you'd gain anything over running 3 or 4 concurrent WUs anyway.
MatLioLeC
Joined: 2 May 18
Posts: 8
Credit: 63,038,294
RAC: 0
Message 68142 - Posted: 12 Feb 2019, 5:50:28 UTC - in response to Message 68140.  

Yeah, I run 25 at a time and I get a bit of a gain, better than with 1 or 4. 4 and 3 seem to be hit and miss, and 5 is about the same. But with 25, I finish all the workloads in less than an hour and a half.

I've been looking at it a bit and I think it works fine like that. It might not need the chunks; it might just go a bit faster because it's all loaded into the GPU's RAM. Plus, the card has a stock OC at 1070 MHz, but I pushed it to 1100 MHz and boosted the power 9%. I left the memory at 1600 MHz though, and I have a ton of fans blowing on it from every direction, which keeps it at about 63 °C.

Runs pretty darn good. I think I'll get another one.
MatLioLeC
Joined: 2 May 18
Posts: 8
Credit: 63,038,294
RAC: 0
Message 68143 - Posted: 12 Feb 2019, 5:54:36 UTC - in response to Message 68142.  

BTW, at stock with the ASUS specs (1070 MHz core and 1600 MHz memory), it couldn't run at full load. It would underclock itself to about 980 MHz and hit 99% load, then jump to 1070 MHz and run at 99% load. It took a while to figure out: it wasn't getting enough power. So I boosted the power up 8% in the Radeon Crimson control center, or whatever they want to call it.

Now it runs at 99% all the time, not a single bump in its step, crunching 25 tasks at once. :) At the moment the RAM is at 1317 MB, not even half full.
MatLioLeC
Joined: 2 May 18
Posts: 8
Credit: 63,038,294
RAC: 0
Message 68144 - Posted: 12 Feb 2019, 5:54:43 UTC - in response to Message 68142.  
Last modified: 12 Feb 2019, 6:11:26 UTC

BTW, at stock with the ASUS specs (1070 MHz core and 1600 MHz memory), it couldn't run at full load. It would underclock itself to about 980 MHz and hit 99% load, then jump to 1070 MHz and run at 64% load. It took a while to figure out: it wasn't getting enough power. So I boosted the power up 8% in the Radeon Crimson control center, or whatever they want to call it.

Now it runs at 99% all the time, not a single bump in its step, crunching 25 tasks at once. :) At the moment the RAM is at 1317 MB, not even half full.
bluestang
Joined: 13 Oct 16
Posts: 106
Credit: 1,051,726,469
RAC: 241,191
Message 68149 - Posted: 12 Feb 2019, 16:37:40 UTC

Yeah, you've got to bump the Power Level percent up if the core clock is bouncing around and not staying where you want it.

1100 MHz on the core with a 280X should be getting you at least 550-600k+ PPD, even more sometimes (if you can keep it fed during the daily outages). That's what I was getting when I was running just 3 WUs. If you're not getting close to that, then 25 at a time is not as good as you think.
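
As a rough sanity check on a PPD figure like that, here is a small Python sketch that turns an effective time per task into tasks per day, and then into the credit each task would have to pay to reach a target PPD. The 27, 32 and 38 second inputs are the per-task averages quoted earlier in the thread (38 being the midpoint of 35-41); the credit per task is not given anywhere here, so the sketch solves for it instead of assuming a value. It also assumes the card is fed continuously for 24 hours, which the daily outages would break.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(effective_seconds_per_task: float) -> float:
    """Tasks finished in 24 h if the GPU never runs dry.

    'Effective' means wall-clock time divided by the number of tasks
    completed, not the runtime of a single task in a concurrent batch.
    """
    return SECONDS_PER_DAY / effective_seconds_per_task


def credit_per_task_needed(target_ppd: float, effective_seconds_per_task: float) -> float:
    """Credit each task must award to reach target_ppd at this pace."""
    return target_ppd / tasks_per_day(effective_seconds_per_task)


for secs in (27, 32, 38):
    per_day = tasks_per_day(secs)
    needed = credit_per_task_needed(550_000, secs)
    print(f"{secs} s/task -> {per_day:.0f} tasks/day, "
          f"{needed:.1f} credits/task needed for 550k PPD")
```

Comparing the "needed" figure with the credit actually granted per WU on the task list shows whether a given setup is above or below the 550k mark.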
MatLioLeC
Joined: 2 May 18
Posts: 8
Credit: 63,038,294
RAC: 0
Message 68151 - Posted: 12 Feb 2019, 23:39:40 UTC

Well, I did bump it up by 9% and it's steady all day. But yeah, I keep playing around with the number of WUs; now I'm doing ten at a time. They tend to stick around a 30-second average no matter the number, unless it's one WU, but loading up a bunch seems to trim a bit off.

I load up the tasks, disconnect from the internet, and time how long it takes. It takes about 2 hours to finish what I have on hand with 3 WUs at a time, and 1.5 hours to finish them all if I have 20-25 WUs active.
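
Put as arithmetic, that offline test can be sketched in a few lines of Python. It assumes the same number of cached tasks in both runs, which is implied above but not stated outright, and the function names are just for illustration.

```python
def speedup(baseline_hours: float, new_hours: float) -> float:
    """How many times faster the new run drains the same queue."""
    return baseline_hours / new_hours


def time_saved_fraction(baseline_hours: float, new_hours: float) -> float:
    """Share of the baseline wall-clock time that the new run saves."""
    return 1 - new_hours / baseline_hours


# 2 h with 3 WUs at a time vs 1.5 h with 20-25 WUs at a time
print(f"speedup: {speedup(2.0, 1.5):.2f}x")
print(f"time saved: {time_saved_fraction(2.0, 1.5):.0%}")
```

So the 20-25 WU runs come out about a third faster on the same queue, or a quarter less wall-clock time, if both timings hold up over repeated runs.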

Generally, I figure those cards fare better if the job is all loaded up. Each WU takes less than 100 MB of RAM on the GPU, and a similar amount of the motherboard's DDR3, but when I'm running 25 at a time, it's over half full on the GPU side. So I imagine it's an "all at your fingertips" advantage. You might see better improvements with a 3-way CrossFire setup and a QPI chipset bridge; that would save a few steps, I imagine. Or even CrossFire with 2 or more cards on a board with just 16 lanes to the CPU and a DMI bridge. Though the newer Z370 boards have more lanes, I think.

I've been thinking of getting myself a Gigabyte X79 board. I think that would work out pretty well. They have great bus configs that still beat out a lot of modern boards. Plus, I'm a Gigabyte kind of guy if I can pick.


©2021 Astroinformatics Group