WUs stuck with 'Completed, validation inconclusive' status
Author | Message |
---|---|
Send message Joined: 28 Dec 18 Posts: 14 Credit: 1,419,832 RAC: 0 |
I have 152 N-Body WUs sitting in my account with status 'Completed, validation inconclusive'. It looks like the task for the wingman was never sent. Here's an example: https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=400797800 |
Send message Joined: 31 Mar 12 Posts: 96 Credit: 152,502,177 RAC: 0 |
"I have 152 N-Body WUs sitting in my account with status 'Completed, validation inconclusive'" This issue is known; check one of the news threads. The second task will get sent... Eventually |
Send message Joined: 20 Nov 07 Posts: 54 Credit: 2,663,789 RAC: 0 |
"The second task will get sent... Eventually" You're confident about this? I haven't had a single confirmation of an inconclusive WU in a couple of days, so let's see: based on the number of outstanding WUs, I'm guessing 2024, but hey, I guess that falls under 'eventually.' Doesn't instill confidence. Is there some reason why these don't get priority? Seems to me the focus should be on WUs already completed. When a project can't mind the store with old WUs, what does that say about prospects for the new ones? At any time there could be another crash, and I haven't seen ANY provisions to address it other than a call for funds, which may be insufficient. And I'll say what many are thinking: while the software and hardware are contributors, competence has been lacking also. |
Send message Joined: 31 Mar 12 Posts: 96 Credit: 152,502,177 RAC: 0 |
"The second task will get sent... Eventually" This is how the feeder works. The output of the SQL query is First In, First Out, and the task IDs increment sequentially, so when a task is marked inconclusive its follow-up task gets added to the end of the task table, and that is what we're seeing. I've got tasks generated on the 15th of March still going, and it'll be a while before it gets to the 16th of March, for example. |
Send message Joined: 28 Dec 18 Posts: 14 Credit: 1,419,832 RAC: 0 |
"The second task will get sent... Eventually" 👍 |
Send message Joined: 20 Nov 07 Posts: 54 Credit: 2,663,789 RAC: 0 |
"This is how the feeder works. The output of the SQL query is First In, First Out, and the task IDs increment sequentially, so when a task is marked inconclusive its follow-up task gets added to the end of the task table, and that is what we're seeing. I've got tasks generated on the 15th of March still going, and it'll be a while before it gets to the 16th of March, for example." So this backlog could have been reduced considerably with quicker, proactive management. Issuance of new tasks should have stopped much sooner, when inconclusives started escalating rapidly. It's like sending a fleet of cars into the field that became stranded because they ran out of gas and are now waiting for new cars to bring refills. |
Send message Joined: 8 Nov 11 Posts: 205 Credit: 2,900,464 RAC: 0 |
"This is how the feeder works. The output of the SQL query is First In, First Out, and the task IDs increment sequentially, so when a task is marked inconclusive its follow-up task gets added to the end of the task table, and that is what we're seeing. I've got tasks generated on the 15th of March still going, and it'll be a while before it gets to the 16th of March, for example." What I find curious is that, like others, I have a big queue of inconclusives, but the few hundred I did today were all validated within a few minutes. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"This is how the feeder works. The output of the SQL query is First In, First Out, and the task IDs increment sequentially, so when a task is marked inconclusive its follow-up task gets added to the end of the task table, and that is what we're seeing. I've got tasks generated on the 15th of March still going, and it'll be a while before it gets to the 16th of March, for example." In a word, yes, BUT the Admins thought that it would never get as bad as it did, thought that the new disk would arrive sooner than it did, and then thought that, once added to the RAID array, it would rebuild MUCH MUCH faster than it did. Hindsight is 20/20, but the Admin was going on the info they had at the time, which it turns out wasn't very good at all. That happens when the old Admin leaves with little to no documentation of what they did and the results of it, leaving the new Admin to figure it out on the fly. |
Send message Joined: 13 Oct 21 Posts: 44 Credit: 226,971,416 RAC: 3,531 |
It doesn't seem like the queue is cleanly divided into first-attempt tasks and wingman tasks; groups of wingman tasks are interspersed with groups of first-attempt ones. So what's likely happening is that we got to a portion of the queue with wingman tasks. Processing those validated things immediately because the first attempt was already done. Now we've moved to another section of first-attempts (that's all I'm getting currently). Eventually we'll get to it all. Things should move quicker than before, as we have almost 2600 users, up from a low of about 100 just a few days ago. |
Send message Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0 |
"It doesn't seem like the queue is cleanly divided into first-attempt tasks and wingman tasks; groups of wingman tasks are interspersed with groups of first-attempt ones. So what's likely happening is that we got to a portion of the queue with wingman tasks. Processing those validated things immediately because the first attempt was already done. Now we've moved to another section of first-attempts (that's all I'm getting currently). Eventually we'll get to it all. Things should move quicker than before, as we have almost 2600 users, up from a low of about 100 just a few days ago." Last night I was manually prioritizing my wingman tasks ahead of the xxx_0 tasks, using 'task suspended by user': run all _4 tasks first, then _3, and so on. Yes, it takes a lot of babysitting, but it should help clear work units a little bit faster. It would be nice if the client did that on its own. Could be a force multiplier. Gonna add that to the make-a-wish list topic. Each type was jiggered separately: NVIDIA GPU, AMD GPU, N-Body MT, and single-thread Separation. When it was time to call it a night, I set all to ready to start and let the chips fall where they may. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"It doesn't seem like the queue is cleanly divided into first-attempt tasks and wingman tasks; groups of wingman tasks are interspersed with groups of first-attempt ones. So what's likely happening is that we got to a portion of the queue with wingman tasks. Processing those validated things immediately because the first attempt was already done. Now we've moved to another section of first-attempts (that's all I'm getting currently). Eventually we'll get to it all. Things should move quicker than before, as we have almost 2600 users, up from a low of about 100 just a few days ago." You are correct: both the Server and Client sides work on a FIFO basis. Since the 2nd task isn't generated until the 1st task comes back, it goes at the end of the list. Ideally, once things are 'back to normal' again, the tasks should be fairly interspersed as people complete tasks and then get new ones in the next batch; since each of us has a different cache size and different CPU and GPU speeds, there are A LOT of variables, meaning fairly widely dispersed tasks. But as the Admins keep cancelling tasks and then dropping 10k new ones at a time on us, things are a little more linear right now. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"It doesn't seem like the queue is cleanly divided into first-attempt tasks and wingman tasks; groups of wingman tasks are interspersed with groups of first-attempt ones. So what's likely happening is that we got to a portion of the queue with wingman tasks. Processing those validated things immediately because the first attempt was already done. Now we've moved to another section of first-attempts (that's all I'm getting currently). Eventually we'll get to it all. Things should move quicker than before, as we have almost 2600 users, up from a low of about 100 just a few days ago." I had to laugh when I read this... I used to do that too and sometimes still do!! One thing the Server side can do is prioritize the non-zero tasks; they can do it with any number so it gets processed quicker than the zero tasks. SETI tried sending the re-sends only to people who had returned a steady flow of tasks recently instead of newbies, but as with all good ideas, some of the 'ol vets' had PC problems and the idea fizzled out; I think in the end they just played with the deadlines instead. Part of the problem, especially for new people, is that the default cache size is 10 days, meaning they can get A LOT of tasks that in some cases they can't finish in time, so they end up either letting them expire or aborting them, both of which cause another version of the task to be created. |
Send message Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0 |
"It doesn't seem like the queue is cleanly divided into first-attempt tasks and wingman tasks; groups of wingman tasks are interspersed with groups of first-attempt ones. So what's likely happening is that we got to a portion of the queue with wingman tasks. Processing those validated things immediately because the first attempt was already done. Now we've moved to another section of first-attempts (that's all I'm getting currently). Eventually we'll get to it all. Things should move quicker than before, as we have almost 2600 users, up from a low of about 100 just a few days ago." The deadlines are remarkably generous... |
Send message Joined: 3 Mar 13 Posts: 84 Credit: 779,527,712 RAC: 0 |
"The deadlines are remarkably generous..." The N-Body Simulation tasks I have atm have a 12-day deadline. |
Send message Joined: 31 Mar 12 Posts: 96 Credit: 152,502,177 RAC: 0 |
Are they 2 months generous? :D |
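
For anyone following the FIFO discussion in this thread, here is a minimal Python sketch of the difference between strict first-in-first-out dispatch and the "send the non-zero tasks first" idea. It is not BOINC's feeder code: the `Task` class, its fields, and both ordering functions are hypothetical names made up for illustration, and it simply assumes that the `_N` suffix on a task name can be used as a priority key.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: int   # hypothetical sequential ID, assigned when the task is created
    workunit: str  # hypothetical workunit name
    replica: int   # 0 = first attempt, 1+ = wingman/resend (the _0, _1, ... suffix)

    @property
    def name(self) -> str:
        return f"{self.workunit}_{self.replica}"


def fifo_order(tasks):
    """Dispatch strictly in creation order (lowest task_id first)."""
    queue = deque(sorted(tasks, key=lambda t: t.task_id))
    order = []
    while queue:
        order.append(queue.popleft())
    return order


def resends_first_order(tasks):
    """Dispatch higher replica numbers first, then fall back to creation order."""
    return sorted(tasks, key=lambda t: (-t.replica, t.task_id))


# Assumed scenario: wu_a_0 (task_id 1) has already been sent and returned, so
# its wingman task wu_a_1 was created last and sits behind the older
# first-attempt tasks that are still unsent.
unsent = [
    Task(2, "wu_b", 0),
    Task(3, "wu_c", 0),
    Task(4, "wu_a", 1),
]

print([t.name for t in fifo_order(unsent)])           # ['wu_b_0', 'wu_c_0', 'wu_a_1']
print([t.name for t in resends_first_order(unsent)])  # ['wu_a_1', 'wu_b_0', 'wu_c_0']
```

With strict FIFO the wingman task `wu_a_1` waits behind every first attempt created before it. Sorting on the replica number first moves it to the front, which is roughly what the manual suspend-and-resume routine described in this thread does on the client side.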
©2024 Astroinformatics Group