Welcome to MilkyWay@home

Rarely getting any GPU work

Questions and Answers : Preferences : Rarely getting any GPU work

Wade Tregaskis

Joined: 17 Mar 10
Posts: 1
Credit: 19,785,985
RAC: 0
Message 72372 - Posted: 31 Mar 2022, 16:32:42 UTC
Last modified: 31 Mar 2022, 16:40:35 UTC

MilkyWay@home seems very unwilling to provide my computer with GPU work units (sometimes it won't provide CPU ones either). My GPUs sit idle most of the time - or, if I have Einstein enabled, it uses them instead. Einstein@home seems to have absolutely no problem giving me work.

I've searched these forums and tried all the suggested workarounds, but none of them have any effect (or don't apply, e.g. custom Windows BOINC clients; I'm using a Mac).

I do see the common problem that the client refuses to request GPU work units if it already has any ("N seconds; 0.00 devices" in the logs). But even when it does request units, and has none locally nor any to submit, it still almost always gets none, e.g.:


Thu 31 Mar 09:17:04 2022 | Milkyway@Home | [sched_op] Starting scheduler request
Thu 31 Mar 09:17:04 2022 | Milkyway@Home | Sending scheduler request: To fetch work.
Thu 31 Mar 09:17:04 2022 | Milkyway@Home | Requesting new tasks for CPU and AMD/ATI GPU
Thu 31 Mar 09:17:04 2022 | Milkyway@Home | [sched_op] CPU work request: 4697394.12 seconds; 0.00 devices
Thu 31 Mar 09:17:04 2022 | Milkyway@Home | [sched_op] AMD/ATI GPU work request: 345600.00 seconds; 1.00 devices
Thu 31 Mar 09:17:07 2022 | Milkyway@Home | Scheduler request completed: got 0 new tasks
Thu 31 Mar 09:17:07 2022 | Milkyway@Home | [sched_op] Server version 713
Thu 31 Mar 09:17:07 2022 | Milkyway@Home | No tasks sent
Thu 31 Mar 09:17:07 2022 | Milkyway@Home | Project requested delay of 91 seconds
Thu 31 Mar 09:17:07 2022 | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
Thu 31 Mar 09:17:07 2022 | Milkyway@Home | [sched_op] Reason: requested by project
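
To put a number on "almost always gets none", the scheduler replies in a saved event log can be tallied. This is only a rough sketch: `summarize_replies` is a made-up helper name, and it assumes the log contains "Scheduler request completed: got N new tasks" lines like the one above.

```shell
#!/bin/sh
# Sketch: summarise scheduler replies found in BOINC event log text on stdin.
# summarize_replies is a hypothetical helper, not part of BOINC.
summarize_replies() {
    awk '
        /Scheduler request completed: got [0-9]+ new tasks/ {
            # The task count is the field immediately after the word "got".
            for (i = 1; i < NF; i++) if ($i == "got") { n = $(i + 1); break }
            requests++          # one completed scheduler request
            tasks += n          # running total of tasks received
            if (n == 0) empty++ # requests that came back with nothing
        }
        END {
            printf "requests=%d empty=%d tasks=%d\n", requests, empty, tasks
        }
    '
}
```

The function reads from standard input, so it can be fed a saved log file, or (assuming `boinccmd` can reach the running client) the output of `boinccmd --get_messages`.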

If I leave it running for many days, I occasionally see it actually grab some work units - but usually just a dozen or two, which last all of about twenty minutes.

Sometimes it refuses to get any tasks because it claims the "job cache [is] full", which is weird because often when that happens I have fewer CPU tasks downloaded than available cores - i.e. definitely more room; CPU going idle - and no GPU tasks. e.g.:


Thu 31 Mar 09:38:06 2022 | Milkyway@Home | Not requesting tasks: don't need (CPU: ; AMD/ATI GPU: job cache full)
Thu 31 Mar 09:38:06 2022 | Milkyway@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Thu 31 Mar 09:38:06 2022 | Milkyway@Home | [sched_op] AMD/ATI GPU work request: 0.00 seconds; 0.00 devices

It's a shame because MilkyWay@home seems like an interesting project, and it seems to like my GPU (Vega64) a lot more than Einstein@home - about a hundred-fold difference in credit per GPU-second, with tasks taking a minute or so instead of several hours.
ID: 72372 · Rating: 0
Max_Pirx

Joined: 13 Dec 17
Posts: 46
Credit: 2,421,362,376
RAC: 0
Message 72407 - Posted: 1 Apr 2022, 8:27:05 UTC - in response to Message 72372.  
Last modified: 1 Apr 2022, 8:27:37 UTC

Yes, many people experience similar problems (erratic WU supply). The reason is that there were some issues with the project server recently (see https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4863 and the other related threads in the News section). As far as I can tell the issues are not yet completely resolved but there should be steadier WU supply now.

All best
ID: 72407 · Rating: 0
mikey

Joined: 8 May 09
Posts: 3326
Credit: 521,733,716
RAC: 54,501
Message 72409 - Posted: 1 Apr 2022, 12:29:42 UTC - in response to Message 72372.  

[quote]MilkyWay@home seems very unwilling to provide my computer with GPU work units (sometimes it won't provide CPU ones either). My GPUs sit idle most of the time - or, if I have Einstein enabled, it uses them instead. Einstein@home seems to have absolutely no problem giving me work.

I've searched these forums and tried all the suggested workarounds, but none of them have any effect (or don't apply, e.g. custom Windows BOINC clients; I'm using a Mac).

I do see the common problem that the client refuses to request GPU work units if it already has any ("N seconds; 0.00 devices" in the logs). But even when it does request units, and has none locally nor any to submit, it still almost always gets none, e.g.:[/quote]

Another problem MilkyWay has is a 10-minute back-off between the last GPU task being run and new ones being sent to us. A previous admin set it up, and the current admins can't seem to find the setting. The easy answer is to do what you are already doing, which is to crunch some Einstein units in between batches of MW units. If you increase your cache size you will get more GPU tasks in a batch, up to the limit, but in your case it would also mean more CPU tasks, and that could make it hard to return them on time, as BOINC does not have separate cache settings for CPU and GPU tasks at the same project.
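
For anyone who wants to try a larger cache on one machine only, here is a minimal sketch that writes a local `global_prefs_override.xml`. The function name and the day values are purely illustrative, and the BOINC data directory is an assumption you need to supply yourself, as it differs by platform and install.

```shell
#!/bin/sh
# Sketch: raise the local work buffer ("cache") via global_prefs_override.xml.
# write_cache_override is a hypothetical helper; the values passed in are examples.
#
# $1 = BOINC data directory (assumption: varies by platform and install)
# $2 = minimum days of work to keep
# $3 = additional days of work to keep
write_cache_override() {
    dir="$1"; min_days="$2"; extra_days="$3"
    file="$dir/global_prefs_override.xml"
    # This replaces the whole file, so keep a copy of any existing overrides.
    if [ -f "$file" ]; then
        cp "$file" "$file.bak"
    fi
    cat > "$file" <<EOF
<global_preferences>
   <work_buf_min_days>$min_days</work_buf_min_days>
   <work_buf_additional_days>$extra_days</work_buf_additional_days>
</global_preferences>
EOF
}
```

Something like `write_cache_override "/path/to/BOINC data" 1.0 0.5` followed by `boinccmd --read_global_prefs_override` should make the client pick up the new values. As noted above, the setting applies to CPU and GPU work alike.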
ID: 72409 · Rating: 0
Keith Myers

Joined: 24 Jan 11
Posts: 709
Credit: 545,190,470
RAC: 64,403
Message 72421 - Posted: 1 Apr 2022, 17:41:16 UTC

Ping the project with an Update automatically every 15 minutes or less. This can be scripted on both Linux and Windows. For example, on Linux the following sends an update every 300 seconds (5 minutes):

watch -n 300 ./boinccmd --project http://milkyway.cs.rpi.edu/milkyway/ update
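
On systems without `watch` (it is not part of a default macOS install, for instance), a plain shell loop can do the same job. This is a sketch under assumptions: the `BOINCCMD`, `PROJECT_URL` and `INTERVAL` variables and the `update_loop` function are illustrative names, and `boinccmd` must be able to reach and authenticate with the running client.

```shell
#!/bin/sh
# Sketch: periodically ask the BOINC client to contact the project.
# All variable and function names here are illustrative, not BOINC settings.
BOINCCMD="${BOINCCMD:-./boinccmd}"                                 # path to boinccmd (assumption)
PROJECT_URL="${PROJECT_URL:-http://milkyway.cs.rpi.edu/milkyway/}" # URL from the command above
INTERVAL="${INTERVAL:-300}"                                        # seconds between updates

# $1 = number of updates to send; 0 or omitted means loop until interrupted.
update_loop() {
    limit="${1:-0}"
    count=0
    while [ "$limit" -eq 0 ] || [ "$count" -lt "$limit" ]; do
        # Same request as the watch command; keep looping even if one fails.
        "$BOINCCMD" --project "$PROJECT_URL" update || echo "update failed" >&2
        count=$((count + 1))
        # Only wait if another update is still to come.
        if [ "$limit" -eq 0 ] || [ "$count" -lt "$limit" ]; then
            sleep "$INTERVAL"
        fi
    done
}

# update_loop        # uncomment to run until interrupted (Ctrl-C)
```

Passing a count, e.g. `update_loop 3`, is handy for a quick trial before leaving it running.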
ID: 72421 · Rating: 0
estatic707

Joined: 14 Dec 22
Posts: 3
Credit: 113,680,284
RAC: 89
Message 75342 - Posted: 25 Apr 2023, 23:47:33 UTC - in response to Message 72421.  
Last modified: 25 Apr 2023, 23:58:12 UTC

I have a similar issue. I built a whole machine just for Milkyway with a 5900X and a Radeon Pro VII (6.528 TFLOPS FP64) that finishes the stack of 300 separation GPU tasks in around 40 minutes doing 6 concurrent tasks. After that, even with Milkyway as the only enabled project, it just runs CPU tasks, even after the 10 minute runoff time that I see in "properties" of the project in BOINC Manager, which I don't understand. The only way for me to get more tasks is to hit update on the project after all 300 have been completed.

I've combed through the documentation for app_config.xml and cc_config.xml to see if there were any settings for mandating updates or something, but couldn't find anything. Is this a temporary issue with Milkyway or do you have any other advice for keeping a constant stream of GPU tasks fed to my GPU?

Best, Eric
ID: 75342 · Rating: 0
mikey

Joined: 8 May 09
Posts: 3326
Credit: 521,733,716
RAC: 54,501
Message 75344 - Posted: 26 Apr 2023, 10:31:07 UTC - in response to Message 75342.  

[quote]I have a similar issue. I built a whole machine just for Milkyway with a 5900X and a Radeon Pro VII (6.528 TFLOPS FP64) that finishes the stack of 300 separation GPU tasks in around 40 minutes doing 6 concurrent tasks. After that, even with Milkyway as the only enabled project, it just runs CPU tasks, even after the 10 minute runoff time that I see in "properties" of the project in BOINC Manager, which I don't understand. The only way for me to get more tasks is to hit update on the project after all 300 have been completed.

I've combed through the documentation for app_config.xml and cc_config.xml to see if there were any settings for mandating updates or something, but couldn't find anything. Is this a temporary issue with Milkyway or do you have any other advice for keeping a constant stream of GPU tasks fed to my GPU?

Best, Eric[/quote]


I believe there is a limit of 300 tasks per day per device.
ID: 75344 · Rating: 0
Link

Joined: 19 Jul 10
Posts: 601
Credit: 19,095,219
RAC: 5,477
Message 75345 - Posted: 26 Apr 2023, 11:05:32 UTC - in response to Message 75344.  

[quote]I believe there is a limit of 300 tasks per day per device[/quote]

Not per day, just per GPU. Once you return some of them, you can get new ones; you are just not allowed to have more than 300 in your cache.
ID: 75345 · Rating: 0
