WUs stuck with 'Completed, validation inconclusive' status

Author	Message
Hal Bregg Joined: 28 Dec 18 Posts: 14 Credit: 1,419,832 RAC: 0	Message 72779 - Posted: 14 Apr 2022, 13:12:13 UTC I have 152 N-Body WUs sitting in my account with the status 'Completed, validation inconclusive'. Here's an example: https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=400797800 It looks like the task for the wingman was never sent.

Kiska Joined: 31 Mar 12 Posts: 96 Credit: 152,502,225 RAC: 0	Message 72782 - Posted: 14 Apr 2022, 13:33:29 UTC - in response to Message 72779. This issue is known; check one of the news threads. The second task will get sent... Eventually.

arcturus Joined: 20 Nov 07 Posts: 54 Credit: 2,663,789 RAC: 0	Message 72790 - Posted: 14 Apr 2022, 18:01:23 UTC - in response to Message 72782. "The second task will get sent... Eventually" You're confident about this? I haven't had a single confirmation of an inconclusive WU in a couple of days, so let's see: based on the number of outstanding WUs, I'm guessing 2024, but hey, I guess that falls under 'eventually.' Doesn't instill confidence. Is there some reason why these don't get priority? Seems to me the focus should be on WUs already completed. When a project can't mind the store with old WUs, what does that say about prospects for the new ones? At any time there could be another crash, and I haven't seen ANY provisions to address it other than a call for funds, which may be insufficient. And I'll say what many are thinking: while the software and hardware are contributors, competence has been lacking also.

Kiska Joined: 31 Mar 12 Posts: 96 Credit: 152,502,225 RAC: 0	Message 72800 - Posted: 14 Apr 2022, 22:26:07 UTC - in response to Message 72790. This is how the feeder works. The output of the SQL query is First In, First Out, and the task IDs increment sequentially. When a task is marked inconclusive, a new task for it gets added to the end of the task table, and that is what we're seeing. I've got tasks generated on the 15th of March still going, and it'll be a while before it gets to the 16th of March, for example.
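To make the FIFO behaviour described in this post concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions (every first-attempt result comes back instantly and is always inconclusive until a wingman runs), and it is not the actual BOINC feeder code. The task names such as `wu0_0` and the function `simulate_fifo` are made up for illustration.

```python
from collections import deque


def simulate_fifo(first_attempts, sends):
    """Send tasks strictly in FIFO order.

    Assumption: each returned _0 result immediately creates its _1 wingman,
    which is appended to the END of the queue, behind all older tasks.
    """
    queue = deque(f"wu{i}_0" for i in range(first_attempts))
    sent = []
    for _ in range(sends):
        if not queue:
            break
        task = queue.popleft()  # oldest task goes out first
        sent.append(task)
        if task.endswith("_0"):
            # The wingman only exists once the first result is back,
            # so it joins the tail of the queue.
            queue.append(task[:-2] + "_1")
    return sent


order = simulate_fifo(first_attempts=3, sends=6)
print(order)
# ['wu0_0', 'wu1_0', 'wu2_0', 'wu0_1', 'wu1_1', 'wu2_1']
```

In this model no wingman task is sent until every first-attempt task ahead of it has gone out, which matches the long wait on inconclusive results reported in the thread.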

Hal Bregg Joined: 28 Dec 18 Posts: 14 Credit: 1,419,832 RAC: 0	Message 72816 - Posted: 15 Apr 2022, 8:17:15 UTC - in response to Message 72800. 👍

arcturus Joined: 20 Nov 07 Posts: 54 Credit: 2,663,789 RAC: 0	Message 72896 - Posted: 16 Apr 2022, 18:19:50 UTC - in response to Message 72800. So this backlog could have been reduced considerably with quicker, proactive management. Issuance of new tasks should have stopped much sooner, when inconclusives started escalating rapidly. It's like sending a fleet of cars into the field that became stranded because they ran out of gas and are now waiting for new cars to bring refills.

Septimus Joined: 8 Nov 11 Posts: 205 Credit: 2,905,857 RAC: 0	Message 72901 - Posted: 16 Apr 2022, 19:30:47 UTC - in response to Message 72896. What I find curious is that, like others, I have a big queue of inconclusives, but the few hundred I did today were all validated within a few minutes.

S984s5KN6muKjYePgfqf7F37RiXw5f... Joined: 8 May 09 Posts: 3339 Credit: 524,355,475 RAC: 17,168	Message 72914 - Posted: 17 Apr 2022, 11:45:50 UTC - in response to Message 72896. In a word, yes, BUT the Admins thought that it would never get as bad as it did, thought that the new disk would arrive sooner than it did, and then thought that, once it was added, the RAID array would rebuild MUCH MUCH faster than it did. Hindsight is 20/20, but the Admin was going on the info they had at the time, which it turns out wasn't very good at all. That happens when the old Admin leaves with little to no documentation of what they did and the results of it, leaving the new Admin to figure it out on the fly.

AndreyOR Joined: 13 Oct 21 Posts: 44 Credit: 232,975,711 RAC: 31,285	Message 72926 - Posted: 17 Apr 2022, 20:24:39 UTC - in response to Message 72901. It doesn't seem like the queue is cleanly divided into first-attempt tasks and wingman tasks; groups of wingman tasks are interspersed with groups of first-attempt ones. So what's likely happening is that we got to a portion of the queue with wingman tasks. Processing those validated things immediately because the first attempt was already done. Now we've moved to another section of first-attempts (that's all I'm getting currently). Eventually we'll get to it all. Things should move quicker than before, as we have almost 2600 users, up from a low of about 100 just a few days ago.

HRFMguy Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0	Message 72946 - Posted: 17 Apr 2022, 23:56:55 UTC - in response to Message 72926. Last night I was manually prioritizing my wingman tasks ahead of the xxx_0 tasks, using 'task suspended by user'. Run all _4 tasks first, then _3, and so on. Yes, it takes a lot of babysitting, but it should help clear work units a little bit faster. It would be nice if the client did that on its own. Could be a force multiplier. Gonna add that to the 'make a wish list' topic. Each type was jiggered separately: NVIDIA GPU, AMD GPU, N-Body MT, and single-thread Separation. When it was time to call it a night, I set all to ready to start and let the chips fall where they may.
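The manual ordering described in this post (all _4 tasks first, then _3, and so on down to _0) amounts to sorting task names by their trailing number. A minimal Python sketch of that idea follows. The task names are hypothetical, and nothing here talks to the BOINC client; it only shows the ordering rule.

```python
def replication_index(task_name):
    """Return the trailing _N number of a task name, or 0 if there is none."""
    _, sep, suffix = task_name.rpartition("_")
    return int(suffix) if sep and suffix.isdecimal() else 0


def run_order(task_names):
    """Order tasks so that higher _N (resends) come before _0 first attempts.

    sorted() is stable, so tasks with the same suffix keep their original order.
    """
    return sorted(task_names, key=replication_index, reverse=True)


# Hypothetical task names, for illustration only.
tasks = ["nbody_a_0", "nbody_b_2", "sep_c_1", "nbody_d_0", "sep_e_4"]
print(run_order(tasks))
# ['sep_e_4', 'nbody_b_2', 'sep_c_1', 'nbody_a_0', 'nbody_d_0']
```

To mirror the "each type separately" part of the post, the same function could be applied to each application's task list on its own.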

S984s5KN6muKjYePgfqf7F37RiXw5f... Joined: 8 May 09 Posts: 3339 Credit: 524,355,475 RAC: 17,168	Message 72970 - Posted: 18 Apr 2022, 11:33:09 UTC - in response to Message 72926. You are correct: both the Server and Client sides work on a FIFO basis. Since the 2nd task isn't generated until the 1st task comes back, it goes at the end of the list. Ideally, once things are 'back to normal' again, the tasks should be fairly interspersed as people complete tasks and then get new ones in the next batch. Since each of us has a different cache size and different CPU and GPU speeds, there are A LOT of variables, meaning fairly widely dispersed tasks. But as the Admins keep cancelling tasks and then dropping 10k new ones at a time on us, things are a little more linear right now.

S984s5KN6muKjYePgfqf7F37RiXw5f... Joined: 8 May 09 Posts: 3339 Credit: 524,355,475 RAC: 17,168	Message 72971 - Posted: 18 Apr 2022, 11:42:35 UTC - in response to Message 72946. I had to laugh when I read this... I used to do that too and sometimes still do!! One thing the Server side can do is prioritize the non-zero tasks; they can do it with any number so it gets processed quicker than the zero tasks. Seti tried sending the re-sends only to people who had returned a steady flow of tasks recently, instead of newbies, but as with all good ideas some of the 'ol vets' had PC problems and the idea fizzled out; I think in the end they just played with the deadlines instead.
Part of the problem, especially for new people, is that the default cache size is 10 days, meaning they can get A LOT of tasks that in some cases they can't finish in time, so they end up either letting them expire or aborting them, both causing another version of the task to be created.
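The server-side idea mentioned in this post, handing out non-zero (resend) tasks ahead of zero tasks, can be pictured as a priority queue instead of a plain FIFO. The Python sketch below is an illustration only: the class `ResendFirstQueue` and the task names are hypothetical, and this is not how the BOINC scheduler is actually implemented or configured.

```python
import heapq
from itertools import count


class ResendFirstQueue:
    """Toy queue that hands out resends (_1, _2, ...) before _0 tasks.

    Within each group, tasks still go out in arrival (FIFO) order.
    """

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def add(self, task_name, replication):
        rank = 0 if replication > 0 else 1  # lower rank is served first
        heapq.heappush(self._heap, (rank, next(self._arrival), task_name))

    def next_task(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


q = ResendFirstQueue()
for name, rep in [("a_0", 0), ("b_0", 0), ("a_1", 1), ("c_0", 0)]:
    q.add(name, rep)

served = [q.next_task() for _ in range(len(q))]
print(served)
# ['a_1', 'a_0', 'b_0', 'c_0']
```

Here the wingman task `a_1` jumps ahead of the first-attempt tasks that arrived before it, so its work unit can validate sooner than it would under strict FIFO.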

HRFMguy Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0	Message 72976 - Posted: 18 Apr 2022, 12:33:42 UTC - in response to Message 72971. The deadlines are remarkably generous.....

.clair. Joined: 3 Mar 13 Posts: 84 Credit: 779,527,712 RAC: 0	Message 72984 - Posted: 18 Apr 2022, 18:51:19 UTC - in response to Message 72976. "The deadlines are remarkably generous....." The N-Body Simulation tasks I have atm have a 12-day deadline.

Kiska Joined: 31 Mar 12 Posts: 96 Credit: 152,502,225 RAC: 0	Message 72989 - Posted: 19 Apr 2022, 3:02:57 UTC - in response to Message 72976. "The deadlines are remarkably generous....." Are they 2 months generous? :D