Welcome to MilkyWay@home

WUs stuck with 'Completed, validation inconclusive' status

Hal Bregg

Joined: 28 Dec 18
Posts: 14
Credit: 1,419,832
RAC: 1
Message 72779 - Posted: 14 Apr 2022, 13:12:13 UTC

I have 152 N-Body WUs sitting in my account with the status 'Completed, validation inconclusive'.

Here's an example:

https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=400797800

It looks like the task for the wingman was never sent.

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,956,434
RAC: 2,768
Message 72782 - Posted: 14 Apr 2022, 13:33:29 UTC - in response to Message 72779.  

This is a known issue; check one of the news threads.

The second task will get sent... eventually.

arcturus
Joined: 20 Nov 07
Posts: 54
Credit: 2,663,789
RAC: 0
Message 72790 - Posted: 14 Apr 2022, 18:01:23 UTC - in response to Message 72782.  

You're confident about this? I haven't had a single confirmation of an inconclusive WU in a couple of days. Based on the number of outstanding WUs, I'm guessing 2024, but hey, I guess that falls under 'eventually'. It doesn't instill confidence.

Is there some reason why these don't get priority? It seems to me the focus should be on WUs that are already completed. When a project can't mind the store with old WUs, what does that say about prospects for the new ones? At any time there could be another crash, and I haven't seen ANY provisions to address it other than a call for funds, which may be insufficient. And I'll say what many are thinking: while the software and hardware are contributors, competence has been lacking also.

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,956,434
RAC: 2,768
Message 72800 - Posted: 14 Apr 2022, 22:26:07 UTC - in response to Message 72790.  

This is how the feeder works. The output of the SQL query is first in, first out. Task IDs increment sequentially, so when a task is marked inconclusive, it gets added to the end of the task table, and that is what we're seeing. I've got tasks generated on the 15th of March still going, and it'll be a while before it gets to the 16th of March, for example.
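To make the ordering concrete, here is a minimal Python sketch of a first-in, first-out dispatch queue. It is only a toy model of the behaviour described above, not the project's actual feeder or SQL query; the function `create_task`, the workunit names, and the dictionary layout are all made up for illustration.

```python
from collections import deque
from itertools import count

# Toy model (assumption): tasks are handed out in ascending task-ID order.
next_id = count(1)
queue = deque()

def create_task(workunit, attempt):
    """Create a task with the next sequential ID and append it to the queue."""
    task = {"id": next(next_id), "workunit": workunit, "attempt": attempt}
    queue.append(task)
    return task

# Three workunits each get a first-attempt task (_0).
for wu in ("wu_a", "wu_b", "wu_c"):
    create_task(wu, 0)

# The first task is sent out and comes back inconclusive, so its
# wingman task (_1) is created only now and receives the highest ID.
first = queue.popleft()
create_task(first["workunit"], 1)

# Remaining dispatch order: the wingman task waits behind everything
# that was already queued.
dispatch_order = [f'{t["workunit"]}_{t["attempt"]}' for t in queue]
print(dispatch_order)
```

In this model the wingman task for wu_a sits behind every first-attempt task that was already in the table, which matches the waiting described in the post.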

Hal Bregg
Joined: 28 Dec 18
Posts: 14
Credit: 1,419,832
RAC: 1
Message 72816 - Posted: 15 Apr 2022, 8:17:15 UTC - in response to Message 72800.  

👍

arcturus
Joined: 20 Nov 07
Posts: 54
Credit: 2,663,789
RAC: 0
Message 72896 - Posted: 16 Apr 2022, 18:19:50 UTC - in response to Message 72800.  

So this backlog could have been reduced considerably with quicker, proactive management. Issuance of new tasks should have stopped much sooner, when inconclusives started escalating rapidly. It's like sending a fleet of cars into the field that became stranded because they ran out of gas and are now waiting for new cars to bring refills.

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,893,094
RAC: 399
Message 72901 - Posted: 16 Apr 2022, 19:30:47 UTC - in response to Message 72896.  

What I find curious is that, like others, I have a big queue of inconclusives, but the few hundred tasks I did today were all validated within a few minutes.

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,284,091
RAC: 19,644
Message 72914 - Posted: 17 Apr 2022, 11:45:50 UTC - in response to Message 72896.  

In a word, yes, BUT the Admins thought that it would never get as bad as it did, thought that the new disk would arrive sooner than it did, and then thought that, once it was added, the RAID array would rebuild MUCH, MUCH faster than it did. Hindsight is 20/20, but the Admin was going on the info they had at the time, which it turns out wasn't very good at all. That happens when the old Admin leaves with little to no documentation of what they did and the results of it, leaving the new Admin to figure it out on the fly.

AndreyOR
Joined: 13 Oct 21
Posts: 44
Credit: 225,160,613
RAC: 7,928
Message 72926 - Posted: 17 Apr 2022, 20:24:39 UTC - in response to Message 72901.  

It doesn't seem like the queue is cleanly divided into first-attempt tasks and wingman tasks; groups of wingman tasks are interspersed with groups of first-attempt ones. So what's likely happening is that we got to a portion of the queue with wingman tasks. Processing those validated things immediately because the first attempt was already done. Now we've moved to another section of first attempts (that's all I'm getting currently). Eventually we'll get to it all. Things should move quicker than before, as we now have almost 2600 users, up from a low of about 100 just a few days ago.
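A rough Python sketch of that explanation follows. It assumes a workunit validates as soon as two results are in and that the results agree; `MIN_QUORUM`, `process`, and the workunit names are hypothetical and are not taken from the project's configuration.

```python
MIN_QUORUM = 2  # assumption: two agreeing results validate a workunit

def process(queue, results_so_far):
    """Return the status shown for each returned task, in queue order.

    Note: results_so_far is updated in place as results come back.
    """
    statuses = []
    for workunit in queue:
        results_so_far[workunit] = results_so_far.get(workunit, 0) + 1
        if results_so_far[workunit] >= MIN_QUORUM:
            statuses.append((workunit, "valid"))
        else:
            statuses.append((workunit, "inconclusive"))
    return statuses

# wu1 and wu2 already have one result each from an earlier first attempt.
earlier_results = {"wu1": 1, "wu2": 1}

# A stretch of wingman tasks (wu1, wu2) followed by a stretch of
# fresh first attempts (wu3, wu4).
queue = ["wu1", "wu2", "wu3", "wu4"]
statuses = process(queue, earlier_results)
print(statuses)
```

Under these assumptions the wingman stretch validates immediately, while the first-attempt stretch only adds to the inconclusive pile until its own wingman tasks come around.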

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,038,236
RAC: 5,987
Message 72946 - Posted: 17 Apr 2022, 23:56:55 UTC - in response to Message 72926.  

Last night I was manually prioritizing my wingman tasks ahead of the xxx_0 tasks, using 'task suspended by user': run all _4 tasks first, then _3, and so on. Yes, it takes a lot of babysitting, but it should help clear work units a little bit faster. It would be nice if the client did that on its own. Could be a force multiplier. Gonna add that to the make-a-wish-list topic. Each type was jiggered separately: NVIDIA GPU, AMD GPU, N-Body MT, and single-thread Separation. When it was time to call it a night, I set all to ready to start and let the chips fall where they may.
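The ordering rule itself is easy to sketch in Python. This only computes the order in which one would resume tasks by hand; it does not talk to the BOINC client, and the task names and the helper functions `replication_index` and `run_order` are invented for illustration, assuming the attempt number is the integer after the last underscore.

```python
def replication_index(task_name):
    """Return the trailing number after the last underscore (0 = first attempt)."""
    return int(task_name.rsplit("_", 1)[1])

def run_order(task_names):
    """Highest suffix first; sorted() is stable, so ties keep their original order."""
    return sorted(task_names, key=replication_index, reverse=True)

# Made-up task names for illustration.
tasks = ["wu_alpha_0", "wu_beta_3", "wu_gamma_1", "wu_delta_4", "wu_eps_0"]
ordered = run_order(tasks)
print(ordered)
```

Running the resends first means each finished task is more likely to complete a workunit rather than open a new one.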

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,284,091
RAC: 19,644
Message 72970 - Posted: 18 Apr 2022, 11:33:09 UTC - in response to Message 72926.  

You are correct: both the Server and Client sides work on a FIFO basis. Since the 2nd task isn't generated until the 1st task comes back, it goes at the end of the list. Ideally, once things are 'back to normal' again, the tasks should be fairly interspersed as people complete tasks and then get new ones in the next batch. Since each of us has a different cache size and different CPU and GPU speeds, there are A LOT of variables, meaning fairly widely dispersed tasks. But as the Admins keep cancelling tasks and then dropping new ones on us 10k at a time, things are a little more linear right now.

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,284,091
RAC: 19,644
Message 72971 - Posted: 18 Apr 2022, 11:42:35 UTC - in response to Message 72946.  

I had to laugh when I read this... I used to do that too and sometimes still do!! One thing the Server side can do is prioritize the non-zero tasks; they can do it with any number, so those get processed quicker than the zero tasks. Seti tried sending the re-sends only to people who had returned a steady flow of tasks recently, instead of newbies, but as with all good ideas, some of the 'ol vets' had PC problems and the idea fizzled out. I think in the end they just played with the deadlines instead. Part of the problem, especially for new people, is that the default cache size is 10 days, meaning they can get A LOT of tasks that in some cases they can't finish in time, so they end up either letting them expire or aborting them, both of which cause another copy of the task to be created.
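As a back-of-the-envelope illustration of the cache problem, the Python sketch below estimates how long a cached batch takes to clear and compares that with a deadline. Every number in it is hypothetical, and the functions `days_to_clear` and `will_miss_deadline` are made up for this example; they are not part of BOINC.

```python
def days_to_clear(cached_tasks, hours_per_task, concurrent_tasks, hours_per_day=24.0):
    """Estimate the calendar days needed to finish every cached task."""
    total_hours = cached_tasks * hours_per_task / concurrent_tasks
    return total_hours / hours_per_day

def will_miss_deadline(cached_tasks, hours_per_task, concurrent_tasks,
                       deadline_days, hours_per_day=24.0):
    """True if the last cached task would finish after the deadline."""
    needed = days_to_clear(cached_tasks, hours_per_task, concurrent_tasks, hours_per_day)
    return needed > deadline_days

# Hypothetical host: 400 cached tasks, 2 hours each, 4 running at once,
# switched on 8 hours a day, with a 12-day deadline.
needed = days_to_clear(400, 2.0, 4, hours_per_day=8.0)
part_time_misses = will_miss_deadline(400, 2.0, 4, 12, hours_per_day=8.0)

# The same cache on a machine that crunches around the clock.
always_on_misses = will_miss_deadline(400, 2.0, 4, 12)
print(needed, part_time_misses, always_on_misses)
```

The same cache that an always-on machine clears comfortably can overrun the deadline on a part-time machine, and each expired or aborted task then triggers a resend.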

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,038,236
RAC: 5,987
Message 72976 - Posted: 18 Apr 2022, 12:33:42 UTC - in response to Message 72971.  

The deadlines are remarkably generous.....

.clair.
Joined: 3 Mar 13
Posts: 84
Credit: 779,527,671
RAC: 3,708
Message 72984 - Posted: 18 Apr 2022, 18:51:19 UTC - in response to Message 72976.  

The N-Body Simulation tasks I have at the moment have a 12-day deadline.

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,956,434
RAC: 2,768
Message 72989 - Posted: 19 Apr 2022, 3:02:57 UTC - in response to Message 72976.  


Are they 2 months generous? :D
