WUs not downloaded in time - rig is idling - doing no work ...
Author | Message |
---|---|
Send message Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
Hi Jake and/or Tom,
I'm a little hesitant to post about this again, but the old problem is showing up again:
4/17/2019 10:49:34 AM | Milkyway@Home | Computation for task de_modfit_84_bundle5_4s_south4s_0_1554998626_1349218_1 finished
4/17/2019 10:49:34 AM | Milkyway@Home | Starting task de_modfit_85_bundle4_4s_south4s_0_1555431910_228293_0
4/17/2019 10:51:17 AM | Milkyway@Home | Computation for task de_modfit_85_bundle4_4s_south4s_0_1555431910_228293_0 finished
4/17/2019 10:51:17 AM | Milkyway@Home | Starting task de_modfit_85_bundle5_4s_south4s_0_1554998626_1593581_1
4/17/2019 10:51:18 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 10:51:18 AM | Milkyway@Home | Reporting 2 completed tasks
4/17/2019 10:51:18 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 10:51:20 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
4/17/2019 10:53:26 AM | Milkyway@Home | Computation for task de_modfit_85_bundle5_4s_south4s_0_1554998626_1593581_1 finished
4/17/2019 10:53:30 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 10:53:30 AM | Milkyway@Home | Reporting 1 completed tasks
4/17/2019 10:53:30 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 10:53:32 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
4/17/2019 11:08:14 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 11:08:14 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 11:08:16 AM | Milkyway@Home | Scheduler request completed: got 3 new tasks
4/17/2019 11:08:18 AM | Milkyway@Home | Started download of parameters-81-4s-donlon.txt
4/17/2019 11:08:18 AM | Milkyway@Home | Started download of stars-81-donlon.txt
4/17/2019 11:08:18 AM | Milkyway@Home | Starting task de_modfit_82_bundle4_4s_south4s_0_1555431910_224733_1
4/17/2019 11:08:19 AM | Milkyway@Home | Finished download of parameters-81-4s-donlon.txt
4/17/2019 11:08:20 AM | Milkyway@Home | Finished download of stars-81-donlon.txt
4/17/2019 11:09:56 AM | Milkyway@Home | Computation for task de_modfit_82_bundle4_4s_south4s_0_1555431910_224733_1 finished
4/17/2019 11:09:56 AM | Milkyway@Home | Starting task de_modfit_81_bundle4_4s_south4s_0_1555431910_224680_1
4/17/2019 11:11:33 AM | Milkyway@Home | Computation for task de_modfit_81_bundle4_4s_south4s_0_1555431910_224680_1 finished
4/17/2019 11:11:33 AM | Milkyway@Home | Starting task de_modfit_82_bundle4_4s_south4s_0_1555431910_233241_0
4/17/2019 11:11:36 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 11:11:36 AM | Milkyway@Home | Reporting 2 completed tasks
4/17/2019 11:11:36 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 11:11:38 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
4/17/2019 11:13:11 AM | Milkyway@Home | Computation for task de_modfit_82_bundle4_4s_south4s_0_1555431910_233241_0 finished
4/17/2019 11:13:13 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 11:13:13 AM | Milkyway@Home | Reporting 1 completed tasks
4/17/2019 11:13:13 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 11:13:15 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
4/17/2019 11:23:54 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 11:23:54 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 11:23:56 AM | Milkyway@Home | Scheduler request completed: got 3 new tasks
4/17/2019 11:23:58 AM | Milkyway@Home | Starting task de_modfit_83_bundle4_4s_south4s_0_1555431910_207243_1
4/17/2019 11:25:36 AM | Milkyway@Home | Computation for task de_modfit_83_bundle4_4s_south4s_0_1555431910_207243_1 finished
4/17/2019 11:25:36 AM | Milkyway@Home | Starting task de_modfit_80_bundle5_4s_south4s_0_1554998626_1188566_2
4/17/2019 11:27:33 AM | Milkyway@Home | Computation for task de_modfit_80_bundle5_4s_south4s_0_1554998626_1188566_2 finished
4/17/2019 11:27:33 AM | Milkyway@Home | Starting task de_modfit_83_bundle4_4s_south4s_0_1555431910_207173_1
4/17/2019 11:27:36 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 11:27:36 AM | Milkyway@Home | Reporting 2 completed tasks
4/17/2019 11:27:36 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 11:27:38 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
4/17/2019 11:29:11 AM | Milkyway@Home | Computation for task de_modfit_83_bundle4_4s_south4s_0_1555431910_207173_1 finished
4/17/2019 11:29:13 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 11:29:13 AM | Milkyway@Home | Reporting 1 completed tasks
4/17/2019 11:29:13 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 11:29:15 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
4/17/2019 11:43:01 AM | Milkyway@Home | Sending scheduler request: To fetch work.
4/17/2019 11:43:01 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
4/17/2019 11:43:03 AM | Milkyway@Home | Scheduler request completed: got 3 new tasks
4/17/2019 11:43:05 AM | Milkyway@Home | Starting task de_modfit_82_bundle4_4s_south4s_0_1555431910_186131_1
There are big time spans between the last WU completing and the next "fetch new tasks" request that actually delivers work; the requests sent in between download a grand total of 0 (zero) tasks.
We have tried a lot and have read this thread and others, but we don't understand what we are doing "wrong" or what we need to adjust. I would like to point out that we don't have this "problem" on other projects. Thanks and have a nice day! |
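For anyone who wants to put numbers on those gaps, here is a minimal Python sketch that scans an event log in the layout of the excerpt above and lists the stretches in which no task was running. It assumes the `timestamp | project | message` layout and the US-style date format seen in that excerpt; the function name `idle_gaps` and the 60-second threshold are made up for illustration, and logs in other formats would need a different pattern.

```python
import re
from datetime import datetime

# Assumed layout, copied from the excerpt: "<date time AM/PM> | <project> | <message>"
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M) \| [^|]+ \| (.*)$"
)
START_RE = re.compile(r"^Starting task (\S+)")
DONE_RE = re.compile(r"^Computation for task (\S+) finished")

SAMPLE_LOG = """\
4/17/2019 10:51:17 AM | Milkyway@Home | Starting task de_modfit_85_bundle5_4s_south4s_0_1554998626_1593581_1
4/17/2019 10:53:26 AM | Milkyway@Home | Computation for task de_modfit_85_bundle5_4s_south4s_0_1554998626_1593581_1 finished
4/17/2019 10:53:32 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
4/17/2019 11:08:16 AM | Milkyway@Home | Scheduler request completed: got 3 new tasks
4/17/2019 11:08:18 AM | Milkyway@Home | Starting task de_modfit_82_bundle4_4s_south4s_0_1555431910_224733_1
"""


def idle_gaps(log_text, min_seconds=60):
    """Return (idle_start, idle_end, seconds) for every stretch with no task running."""
    running = set()      # names of tasks that have started but not finished
    idle_since = None    # time the last running task finished, if nothing is running
    gaps = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        stamp = datetime.strptime(match.group(1), "%m/%d/%Y %I:%M:%S %p")
        message = match.group(2).strip()
        started = START_RE.match(message)
        done = DONE_RE.match(message)
        if started:
            if idle_since is not None and not running:
                seconds = (stamp - idle_since).total_seconds()
                if seconds >= min_seconds:  # ignore ordinary task-to-task switches
                    gaps.append((idle_since, stamp, seconds))
            idle_since = None
            running.add(started.group(1))
        elif done:
            running.discard(done.group(1))
            if not running:
                idle_since = stamp
    return gaps


for start, end, seconds in idle_gaps(SAMPLE_LOG):
    print(f"idle from {start:%H:%M:%S} to {end:%H:%M:%S} ({seconds / 60:.1f} min)")
```

Pasting a longer stretch of the event log into `SAMPLE_LOG` (or reading it from a file) gives one line per idle period, which makes it easier to compare hosts or settings.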
Send message Joined: 2 Mar 13 Posts: 1 Credit: 179,256,454 RAC: 0 |
I can confirm that. My Radeon VII needs about 40 minutes for 200 tasks and then has to wait 6-10 minutes idle for new tasks. Strange. |
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
Yes, for some reason new work is assigned ONLY when the current queue is completely empty - all tasks must be completed AND reported. After that, server will assign new work upon next contact, but for a few minutes the client is idle. I have a backup BOINC project which kicks in during that period, but would prefer to crunch only Milkyway (if the queue was maintained). Perhaps it's necessary from a scientific standpoint i.e. new tasks are created according to results from previous tasks? Obviously, if that's the case, all previous tasks must be sorted out first. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
> Yes, for some reason new work is assigned ONLY when the current queue is completely empty - all tasks must be completed AND reported. After that, server will assign new work upon next contact, but for a few minutes the client is idle. I have a backup BOINC project which kicks in during that period, but would prefer to crunch only Milkyway (if the queue was maintained).

Nope, it's not that, as there are currently 10976 tasks in the queue ready to send out. I am seeing the same thing... It looks like they are sending out WUs in batches that use some kind of 'master file', and until you finish all those tasks you don't get any new WUs; then you get a new 'master file' and a batch of WUs to go with it. |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey Everyone, This is a known issue, but I have not been able to find a solution to the problem. It's probably an obscure configuration issue or something, but unfortunately those are very hard to track down. Maybe I'll try a post on the official BOINC forums and see if this is a known issue with a specific configuration. Best, Jake |
Send message Joined: 23 Feb 18 Posts: 26 Credit: 4,744,416,145 RAC: 0 |
The problem after 5 months is still present. Any update? Want your Kids stay off from Drugs? Get them building Crunching PC's and they'll never have enough money for drugs |
Send message Joined: 16 Mar 10 Posts: 213 Credit: 108,391,661 RAC: 3,402 |
> The problem after 5 months is still present. Any update?

There has been some discussion of this in the News thread "30 Workunit Limit Per Request - Fix Implemented". However, that seems to have gone quiet - perhaps the "to do" list is rather long at present?!? To summarize, what appears to happen is that if your BOINC client sends in an update that both reports completed tasks and asks for new work, MilkyWay spits the work request out! If another update is requested after the wait time (90 seconds?) but before there's another completed task to report, you'll almost certainly get some work then. Unfortunately, if you can process a work-unit in under 90 seconds, there'll be another report (and no work) and you'll get the empty queue problem anyway. Of the projects I do work for, this seems to be peculiar to MilkyWay! In particular, SETI@Home (with relatively short work-units and a longer wait time (5 minutes)) doesn't have this issue... Indeed, someone suggested that perhaps the SETI@Home people might be able to advise on possible set-up issues. For more details look at that News thread, especially the second page. Cheers - Al. |
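The behaviour described in the post above can be played through in a toy simulation. This is only a sketch of the poster's hypothesis, not the actual server logic: it assumes the server hands out work only when a request reports no completed tasks, that the client contacts the server every 90 seconds (the poster's own guess), and that one task runs at a time. The function name `simulate` and the batch size of 3 are invented for the example.

```python
def simulate(task_seconds, backoff_seconds=90, batch=3, horizon=3600):
    """Return the number of idle seconds within `horizon` seconds.

    Assumed rule (the hypothesis from the post, not confirmed server code):
    a scheduler request is refilled only if it reports zero completed tasks.
    """
    queue = batch        # tasks waiting on the client
    remaining = 0        # seconds left on the running task (0 = none running)
    unreported = 0       # finished tasks not yet reported
    next_contact = 0     # earliest time the next scheduler request may go out
    idle = 0
    for now in range(horizon):
        if now >= next_contact:
            if unreported == 0:
                queue = batch            # refill to the assumed per-host limit
            unreported = 0               # everything finished is now reported
            next_contact = now + backoff_seconds
        if remaining == 0 and queue > 0:
            queue -= 1                   # start the next queued task
            remaining = task_seconds
        if remaining > 0:
            remaining -= 1
            if remaining == 0:
                unreported += 1
        else:
            idle += 1                    # nothing to run this second
    return idle


for seconds in (60, 200):
    print(f"{seconds:>3} s per task: {simulate(seconds)} s idle per hour")
```

Under these assumptions, tasks shorter than the wait time always leave something to report, so the queue drains completely before a refill arrives, while tasks longer than the wait time get refilled in between and never run dry. Changing `backoff_seconds` to 600 mimics the 10-minute figure mentioned later in the thread.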
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
> The problem after 5 months is still present. Any update?

PrimeGrid has some very short workunits as well and doesn't have this problem either!! One would think the Admins would talk to each other rather than just try and muddle thru on their own, this isn't the stone age!! |
Send message Joined: 23 Feb 18 Posts: 26 Credit: 4,744,416,145 RAC: 0 |
Any news from developers? Want your Kids stay off from Drugs? Get them building Crunching PC's and they'll never have enough money for drugs |
Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,461,693,501 RAC: 0 |
This is an ongoing problem and, as I understand it, fixing it requires an upgrade to the server, which could cause worse problems. I have a program that automatically issues an update about 3 minutes after the system runs out of memory, but depending on the timing there is still an idle time, as shown below, of about 6 minutes every two hours. |
Send message Joined: 14 Aug 12 Posts: 10 Credit: 10,052,995 RAC: 0 |
I mostly solved the problem by changing the "Store at least days of work" and "Store up to an additional days of work" numbers which are found on Boinc/Options/Computing preferences. This change downloads 600 work units on my computer which takes a while to run. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
> I mostly solved the problem by changing the "Store at least days of work" and "Store up to an additional days of work" numbers which are found on Boinc/Options/Computing preferences. This change downloads 600 work units on my computer which takes a while to run.

No, the problem is that MilkyWay won't send us any work as long as we already have work on our PCs, so until we are absolutely out of work we get the dreaded 'delay'! What your solution did is push that 'delay' further down the road, but until you are absolutely out of MilkyWay workunits you aren't getting any new ones either. |
Send message Joined: 14 Aug 12 Posts: 10 Credit: 10,052,995 RAC: 0 |
No, that is not what is happening for me. I might still have 200 or 300 wus to go and it will refill to the max wus I have stipulated based on a day's work as calculated by BOINC or MW@H, I don't know which. I do this every day. Last night I had it fill the "bucket, so to speak" and this morning, sometime after 5 PM CST, I had it filled again and I have 350 wus in the bucket now. I just checked. Why this isn't working for you is a puzzle I don't have an answer for. |
Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,461,693,501 RAC: 0 |
> No, that is not what is happening for me. I might still have 200 or 300 wus to go and it will refill to the max wus I have stipulated based on a day's work as calculated by BOINC or MW@H, I don't know which.

When the last MW task is uploaded, there is anywhere from a 15 to 30 minute delay before new data is downloaded. Some users get in a tizzy after 2-3 minutes of no data, and 15 minutes is a lifetime for some of them. For me, it was a programming challenge to have BoincTasks issue an update when it sees the well has run dry. I also have resources for Einstein set to 0 and can run an Einstein task or two while waiting if I want. I should also mention that there is a limit on tasks, and I and others are stuck on that limit. Adding additional time has no effect as the max has already been sent. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
> No, that is not what is happening for me. I might still have 200 or 300 wus to go and it will refill to the max wus I have stipulated based on a day's work as calculated by BOINC or MW@H, I don't know which.

So what are the numbers you have in the two boxes then? I will try it; maybe the key is more than one day of work? Currently mine is set at 0.25 and 0.25. |
Send message Joined: 14 Aug 12 Posts: 10 Credit: 10,052,995 RAC: 0 |
I started out with .1 and .1 cause I came from PrimeGrid where things worked as they should. I changed them to 1 and 1 on MW@H and have had no problems. It does download a bunch but not a problem as I can go thru them in a day or so. |
Send message Joined: 23 Feb 18 Posts: 26 Credit: 4,744,416,145 RAC: 0 |
With a Radeon VII I can crunch 2 WUs at the same time in about 20 sec. A queue of 300 WUs will run dry in more or less 50 min, and every 50 min I have 10-15 minutes of idling. The cache is set at 10+10 days, but it was 1+1 days and before that 0.1+0.1; no difference for me on any host. The problem is probably less noticeable with slower GPUs, but for high-end GPUs the penalty is about 20-25%.
Want your Kids stay off from Drugs? Get them building Crunching PC's and they'll never have enough money for drugs |
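The arithmetic in the post above is easy to check. The sketch below uses only the figures quoted there (300 WUs, 2 at a time, about 20 seconds per pair, 10 to 15 minutes of idling); the function names are made up for illustration, and the two "penalty" definitions are just two plausible readings of the 20-25% figure.

```python
def drain_minutes(queue_size, concurrent, seconds_per_round):
    """Minutes needed to empty the queue when `concurrent` WUs finish per round."""
    return queue_size / concurrent * seconds_per_round / 60


def idle_share_of_wall_clock(busy_minutes, idle_minutes):
    """Fraction of total elapsed time spent idle."""
    return idle_minutes / (busy_minutes + idle_minutes)


def idle_relative_to_busy(busy_minutes, idle_minutes):
    """Idle time expressed as a fraction of the time spent crunching."""
    return idle_minutes / busy_minutes


busy = drain_minutes(queue_size=300, concurrent=2, seconds_per_round=20)
print(f"queue drains in {busy:.0f} min")
for idle in (10, 15):
    print(
        f"{idle} min idle: {idle_share_of_wall_clock(busy, idle):.0%} of wall clock, "
        f"{idle_relative_to_busy(busy, idle):.0%} relative to crunch time"
    )
```

Depending on which definition is used, 10 to 15 minutes of idling per 50-minute cycle works out to roughly 17-23% of wall-clock time or 20-30% of crunch time, so the quoted 20-25% is in the right range.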
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
> I started out with .1 and .1 cause I came from PrimeGrid where things worked as they should.

Thank you, I will set that today and let you know how it goes. |
Send message Joined: 23 Feb 18 Posts: 26 Credit: 4,744,416,145 RAC: 0 |
I tried the recommended settings (1+1 days) but the situation is the same. The log below is from a few minutes ago, but first some considerations. The problem is not the idling when the queue is empty; the problem, as highlighted, is that when results are reported, new workunits are not downloaded until the queue is empty. THIS IS THE PROBLEM, not the idling at empty cache. If the cache was refilled (3 results, 3 new WUs) there would be no empty-queue problem. Look:
GPU Radeon VII - Host https://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=775931 (3 WUs at the same time)
72287 Milkyway@Home 17/09/2019 16:47:59 Sending scheduler request: To fetch work.
72288 Milkyway@Home 17/09/2019 16:47:59 Reporting 9 completed tasks
72289 Milkyway@Home 17/09/2019 16:47:59 Requesting new tasks for AMD/ATI GPU
72290 Milkyway@Home 17/09/2019 16:48:01 Scheduler request completed: got 0 new tasks
72291 Milkyway@Home 17/09/2019 16:48:02 Computation for task de_modfit_14_bundle4_testing_3s4f_3_1564052102_18057897_0 finished
72292 Milkyway@Home 17/09/2019 16:48:10 Computation for task de_modfit_14_bundle5_testing_4s3f_3_1564052102_18057819_0 finished
72293 Milkyway@Home 17/09/2019 16:48:18 Computation for task de_modfit_86_bundle4_4s_south4s_bgset_2_1564052102_18057556_0 finished
72294 Milkyway@Home 17/09/2019 16:49:36 Sending scheduler request: To fetch work.
72295 Milkyway@Home 17/09/2019 16:49:36 Reporting 3 completed tasks
72296 Milkyway@Home 17/09/2019 16:49:36 Requesting new tasks for AMD/ATI GPU
72297 Milkyway@Home 17/09/2019 16:49:38 Scheduler request completed: got 0 new tasks
72298 Milkyway@Home 17/09/2019 16:57:13 Sending scheduler request: To fetch work.
72299 Milkyway@Home 17/09/2019 16:57:13 Requesting new tasks for AMD/ATI GPU
72300 Milkyway@Home 17/09/2019 16:57:17 Scheduler request completed: got 300 new tasks
Same problem with the second host; here I have 10+10 days:
GPU 2x HD7970 - Host https://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=801749 (3 WUs per GPU, 6 at the same time)
122460 Milkyway@Home 17/09/2019 15:33:11 Reporting 6 completed tasks
122461 Milkyway@Home 17/09/2019 15:33:11 Requesting new tasks for AMD/ATI GPU
122462 Milkyway@Home 17/09/2019 15:33:13 Computation for task de_modfit_86_bundle4_4s_south4s_bgset_2_1564052102_17748961_1 finished
122463 Milkyway@Home 17/09/2019 15:33:13 Scheduler request completed: got 0 new tasks
122464 Milkyway@Home 17/09/2019 15:34:14 Computation for task de_modfit_14_bundle5_testing_4s3f_1_1564052102_17992266_1 finished
122465 Milkyway@Home 17/09/2019 15:34:14 Computation for task de_modfit_14_bundle4_testing_3s4f_1_1564052102_17999182_0 finished
122466 Milkyway@Home 17/09/2019 15:34:27 Computation for task de_modfit_83_bundle4_4s_south4s_bgset_2_1564052102_17998863_0 finished
122467 Milkyway@Home 17/09/2019 15:34:45 Sending scheduler request: To fetch work.
122468 Milkyway@Home 17/09/2019 15:34:45 Reporting 4 completed tasks
122469 Milkyway@Home 17/09/2019 15:34:45 Requesting new tasks for AMD/ATI GPU
122470 Milkyway@Home 17/09/2019 15:34:46 Scheduler request completed: got 0 new tasks
122471 Milkyway@Home 17/09/2019 15:47:21 Sending scheduler request: To fetch work.
122472 Milkyway@Home 17/09/2019 15:47:21 Requesting new tasks for AMD/ATI GPU
122473 Milkyway@Home 17/09/2019 15:47:28 Scheduler request completed: got 600 new tasks
Want your Kids stay off from Drugs? Get them building Crunching PC's and they'll never have enough money for drugs |
Send message Joined: 2 Oct 16 Posts: 167 Credit: 1,008,062,758 RAC: 834 |
Yes, this was reported in May. After the last result there needs to be a 10 min period with no requests before the client can get more work. https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4424&postid=68441#68441 I'm about to set up a script to turn off networking for about 11 min, resume it and do a project update, allow 30 min or so, then repeat. |
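A script along the lines described in the post above might look like the following Python sketch. It is not the poster's script: it assumes `boinccmd` is on the PATH and can authenticate to the local client (for example by being run from the BOINC data directory), and the project URL is an assumption that must match the master URL your client actually shows. The 11 and 30 minute figures come from the post; the function names are invented.

```python
import subprocess
import time

# Assumption: replace with the master URL listed by `boinccmd --get_project_status`.
PROJECT_URL = "https://milkyway.cs.rpi.edu/milkyway/"
OFFLINE_SECONDS = 11 * 60   # quiet period from the post ("like 11min")
ONLINE_SECONDS = 30 * 60    # normal operation before the next quiet period


def run_boinccmd(args):
    """Run one boinccmd command and raise if it fails."""
    subprocess.run(["boinccmd", *args], check=True)


def one_cycle(run=run_boinccmd, sleep=time.sleep, project_url=PROJECT_URL,
              offline=OFFLINE_SECONDS, online=ONLINE_SECONDS):
    """Suspend networking, wait, resume, request a project update, then wait again."""
    run(["--set_network_mode", "never"])      # no scheduler requests while offline
    sleep(offline)
    run(["--set_network_mode", "auto"])       # back to the client's normal network preferences
    run(["--project", project_url, "update"])  # report results and ask for work
    sleep(online)


def run_forever():
    """Repeat the cycle until interrupted (Ctrl+C)."""
    while True:
        one_cycle()
```

Calling `run_forever()` starts the loop. Note that tasks finished during the offline window cannot be uploaded or reported until networking resumes, and whether the first update after the quiet period actually returns work depends on the server behaviour discussed in this thread.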