Welcome to MilkyWay@home

Validation inconclusive

Septimus

Joined: 8 Nov 11
Posts: 205
Credit: 2,900,464
RAC: 0
Message 73336 - Posted: 5 May 2022, 17:57:39 UTC - in response to Message 73294.  
Last modified: 5 May 2022, 18:27:54 UTC

Unless I am being silly, I am confused as to why the N-Body unsent count is not dropping very quickly. I have completed over 200 today that were validated. There are 4,300 users on currently, so assuming they are similar to me, that could be 860,000 completed. The unsents have dropped by around 20,000 in the last 24 hours, not what I would expect. My validation inconclusive backlog has not changed at all; in fact, it has grown very slightly. At the present rate I estimate it will take 550 days to clear the backlog. Hope I am wrong.
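For what it's worth, the back-of-envelope figures above can be written out as a quick sketch. The inputs are the numbers quoted in the post; the variable names are made up for illustration, and the backlog size is only what a 550-day estimate would imply at the observed rate, not a figure taken from the server status page.

```python
# Rough arithmetic behind the estimate; all inputs are the figures quoted above.
active_users = 4_300          # users reported as currently active
tasks_per_user_per_day = 200  # one user's validated N-Body tasks in a day
observed_daily_drop = 20_000  # fall in the unsent count over 24 hours
estimated_days = 550          # the poster's estimate for clearing the backlog

# Upper bound if every active user matched that one user's throughput.
hypothetical_completed = active_users * tasks_per_user_per_day

# Backlog size implied by the 550-day estimate at the observed rate.
implied_backlog = estimated_days * observed_daily_drop

print(hypothetical_completed)  # 860000
print(implied_backlog)         # 11000000
```

The gap between 860,000 hypothetical completions and a 20,000 drop is the puzzle being raised; the sketch does not explain it, it only makes the mismatch explicit.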

alanb1951
Joined: 16 Mar 10
Posts: 213
Credit: 108,397,914
RAC: 3,422
Message 73337 - Posted: 5 May 2022, 18:42:50 UTC - in response to Message 73336.  
Last modified: 5 May 2022, 19:38:45 UTC

Unless I am being silly, I am confused as to why the N-Body unsent count is not dropping very quickly. I have completed over 200 today that were validated. There are 4,300 users on currently, so assuming they are similar to me, that could be 860,000 completed. The unsents have dropped by around 20,000 in the last 24 hours, not what I would expect. My validation inconclusive backlog has not changed at all; in fact, it has grown very slightly. At the present rate I estimate it will take 550 days to clear the backlog.
I've been tracking this as well and can confirm that observation. For what it's worth, over 90% of the tasks I've seen recently are second tries which get validated more or less instantly, so the work units should be getting thinned out a bit!

I can think of three possibilities for the count staying high, not mutually exclusive:

  1. It might still be counting "Didn't need" tasks as unsent (in which case they won't disappear until the relevant work units are validated and assimilated);
  2. I've noticed a fairly large number of the tasks I've got in the Inconclusive category are retries for tasks recently aborted - the way N-Body seems to work at present, if someone aborts a retry the transitioner will produce another one!
  3. The work unit generator might have decided to add some new stuff again (despite claims that that shouldn't happen if there are a lot of tasks already waiting to go out!)


You'll notice there's no "the unsent tasks query has a bug in it" option there -- I've looked at the Server status PHP code and the query is really simplistic (and is almost certainly counting "Didn't need" items in the total!)

If N-Body used adaptive replication (as Separation does) this wouldn't be a problem -- "reliable" hosts would validate without a wingman most of the time! However, if N-Body is supposed to be using adaptive replication it isn't working (configuration issues?)
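To illustrate the difference, here is a minimal sketch of how a work unit's status might be decided with and without adaptive replication. It is a simplification for illustration only, not BOINC's actual validator or transitioner logic: the `Result` class, the `host_is_trusted` flag and the `workunit_state` function are all made-up names, and it assumes that any successful results agree with each other.

```python
from dataclasses import dataclass


@dataclass
class Result:
    host_is_trusted: bool  # hypothetical flag: host has a good validation record
    success: bool          # the task finished and returned output


def workunit_state(results, min_quorum=2, adaptive=False):
    """Classify a work unit from the results returned so far (simplified)."""
    good = [r for r in results if r.success]
    if not good:
        return "waiting"
    # With adaptive replication, one result from a trusted host can stand alone.
    if adaptive and any(r.host_is_trusted for r in good):
        return "valid"
    # Otherwise a full quorum of successful results is needed.
    if len(good) >= min_quorum:
        return "valid"
    return "inconclusive"


lone_result = [Result(host_is_trusted=True, success=True)]
print(workunit_state(lone_result))                 # inconclusive
print(workunit_state(lone_result, adaptive=True))  # valid
```

With a quorum of 2 and no adaptive replication, a lone successful result stays inconclusive until a wingman reports, which matches what people are seeing for N-Body.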

I had thought of asking Tom about that in the "N-Body flush" thread, but I reckon it would get lost amongst the noise there :-) Perhaps he'll see this and let us know whether wingman-free validation should be working for N-Body or not -- it might be the only way of clearing the backlog without having to delete entire work units!

Cheers - Al.

[Edited to note my validate-to-inconclusive ratio and to rephrase the bit about the D/B query and "Didn't need" tasks]


Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,900,464
RAC: 0
Message 73338 - Posted: 5 May 2022, 18:55:41 UTC - in response to Message 73337.  

Thanks alanb1951, very reassuring.

alanb1951
Joined: 16 Mar 10
Posts: 213
Credit: 108,397,914
RAC: 3,422
Message 73339 - Posted: 5 May 2022, 21:43:28 UTC - in response to Message 73337.  

Correction -- option 1 shouldn't apply; I had another look at the PHP, and provided the "Didn't need" tasks are flagged properly they shouldn't be counted. So I'm at [more of] a loss to explain the increasing count if the generator isn't adding new work units! (And I sincerely hope it isn't...)

Cheers - Al.

Observation: it's a bit difficult trying to investigate/"debug" something [in order to explain it] from the opposite side of the Atlantic, especially without detailed access to data. :-)
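A minimal sketch of the point in the correction above, using an in-memory SQLite table as a stand-in for the project's result table. This is not the actual Server status PHP query; the table layout is cut down to two columns, and the numeric codes for server state and outcome are assumptions based on the usual BOINC conventions. It only shows that a count filtered on the "unsent" server state leaves out "Didn't need" tasks as long as they have been moved to a finished state.

```python
import sqlite3

# Numeric codes assumed from the usual BOINC conventions; treat as assumptions.
SERVER_STATE_UNSENT = 2
SERVER_STATE_OVER = 5
OUTCOME_INIT = 0
OUTCOME_DIDNT_NEED = 5

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE result ("
    "id INTEGER PRIMARY KEY, server_state INTEGER, outcome INTEGER)"
)
db.executemany(
    "INSERT INTO result (server_state, outcome) VALUES (?, ?)",
    [
        (SERVER_STATE_UNSENT, OUTCOME_INIT),        # genuinely waiting to go out
        (SERVER_STATE_UNSENT, OUTCOME_INIT),        # genuinely waiting to go out
        (SERVER_STATE_OVER, OUTCOME_DIDNT_NEED),    # flagged properly: not counted
        (SERVER_STATE_UNSENT, OUTCOME_DIDNT_NEED),  # hypothetical mis-flagged row
    ],
)

# A simple count on server state alone, as a stand-in for the status page query.
unsent = db.execute(
    "SELECT COUNT(*) FROM result WHERE server_state = ?",
    (SERVER_STATE_UNSENT,),
).fetchone()[0]

print(unsent)  # 3
```

Only the hypothetical mis-flagged row inflates the count here; the properly flagged "Didn't need" row is ignored, so the status page figure would be off only if tasks were not being flagged properly.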

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73346 - Posted: 6 May 2022, 10:22:04 UTC - in response to Message 73339.  

And Tom the Admin is out of town as well, so the Server could be doing things 'on its own' again, 'on its own' being what it's supposed to do but not exactly what Tom thinks it's supposed to be doing. MW certainly needs to get Tom some more help in the form of some IT grad student or someone like that who can help decipher the complexities of BOINC.

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,900,464
RAC: 0
Message 73352 - Posted: 6 May 2022, 13:07:33 UTC - in response to Message 73346.  
Last modified: 6 May 2022, 13:18:43 UTC

Did a spot check from yesterday against today and N-Body unsents have risen by almost 120,000. Still no movement up or down on my backlog of inconclusives.

Robert Coplin
Joined: 23 Sep 13
Posts: 19
Credit: 36,223,867
RAC: 0
Message 73353 - Posted: 6 May 2022, 13:21:41 UTC - in response to Message 73352.  

I can tell you that my Validation Inconclusive has gone down by over 400 in the last 2 days, and I still have another 14,306 listed as Validation Inconclusive, so it's going to take a while to get it down to 0. I predict that if World Community Grid comes back on May 9th as scheduled, some people will go back to it, which will make the N-Body work units take even longer to validate.

GolfSierra
Joined: 11 Mar 22
Posts: 42
Credit: 21,902,543
RAC: 0
Message 73354 - Posted: 6 May 2022, 20:15:14 UTC - in response to Message 73353.  

I can tell you that my Validation Inconclusive has gone down by over 400 in the last 2 days, and I still have another 14,306 listed as Validation Inconclusive, so it's going to take a while to get it down to 0. I predict that if World Community Grid comes back on May 9th as scheduled, some people will go back to it, which will make the N-Body work units take even longer to validate.


+1

AndreyOR
Joined: 13 Oct 21
Posts: 44
Credit: 227,133,586
RAC: 11,713
Message 73355 - Posted: 6 May 2022, 21:02:13 UTC

One of the reasons validation seems slow moving (and the queue) is because of tasks like this: https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=391645869. There are probably thousands of them maybe even a few million (from the 14 million that were created after disk reconstruction). Eventually things will get cleared up but it will take time.

arcturus
Joined: 20 Nov 07
Posts: 54
Credit: 2,663,789
RAC: 0
Message 73357 - Posted: 6 May 2022, 23:18:20 UTC - in response to Message 73353.  
Last modified: 6 May 2022, 23:19:19 UTC

I predict that if World Community Grid comes back on May 9th as scheduled, some people will go back to it, which will make the N-Body work units take even longer to validate.


The damage has already been done. I stopped crunching in late March after seeing my inconclusives flatline without improvement. I'm sure I'm not alone. I moved to Universe@Home and now check here only irregularly, just to see what WUs remain.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73359 - Posted: 7 May 2022, 2:07:08 UTC - in response to Message 73355.  

One of the reasons validation seems slow moving (and the queue) is because of tasks like this: https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=391645869. There are probably thousands of them maybe even a few million (from the 14 million that were created after disk reconstruction). Eventually things will get cleared up but it will take time.


Yes, a lot of people have no clue how to reduce the cache size when they reduce the amount of time their PC can run at full speed, so they end up with sooooo many units that can't finish in time, and yet they still let them just expire instead of aborting them.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73360 - Posted: 7 May 2022, 2:08:33 UTC - in response to Message 73357.  

I'm sure your Team is liking your contribution to the BOINC Pentathlon, assuming they signed up for it.

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,038,236
RAC: 0
Message 73362 - Posted: 7 May 2022, 2:35:43 UTC - in response to Message 73346.  

MW certainly needs to get Tom some more help in the form of some IT grad student or someone like that who can help decipher the complexities of BOINC.

Amen. And Amen.

GolfSierra
Joined: 11 Mar 22
Posts: 42
Credit: 21,902,543
RAC: 0
Message 73419 - Posted: 9 May 2022, 8:14:13 UTC - in response to Message 73357.  

We will provide an update on the result of our efforts to resolve the issue Sunday evening.


I predict that if World Community Grid comes back on May 9th as scheduled, some people will go back to it, which will make the N-Body work units take even longer to validate.


Sunday evening passed by, no news. But they didn't say which Sunday ...

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73422 - Posted: 9 May 2022, 10:50:45 UTC - in response to Message 73419.  

Guessing that means they ran into more problems.

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,900,464
RAC: 0
Message 73432 - Posted: 9 May 2022, 20:45:29 UTC - in response to Message 73422.  
Last modified: 9 May 2022, 21:01:59 UTC

Think the validator has stopped; waiting for validation has shot up to over 35,000.

alanb1951
Joined: 16 Mar 10
Posts: 213
Credit: 108,397,914
RAC: 3,422
Message 73433 - Posted: 9 May 2022, 21:49:52 UTC - in response to Message 73432.  
Last modified: 9 May 2022, 21:52:34 UTC

Think the validator has stopped; waiting for validation has shot up to over 35,000.

The validator is still working -- I've just checked my N-Body tasks flagged valid, and there are plenty returned since your post... However, there is an explanation for why the "Waiting for validation" is rising...

A lot of the N-Body work units that were stalled had no successful result returned when they were put on hold. As N-Body doesn't seem to be using adaptive replication, when someone returns the first successful unit it'll end up "Inconclusive" until a wingman returns a result, and as the request for a second result ends up at the back of the [over-large] queue, those work units are going to push up the "Waiting for validation" count for quite a while :-(

For what it's worth, at the moment about 10% of the tasks my systems process each day are flagged inconclusive because of this. I have no idea how that compares with what other users are seeing, but if it's typical then (given the rate at which the unsent tasks count is being reduced) the "Waiting for validation" count could keep increasing by 20,000 or more per day until we get round to much more recent work units.

Cheers - Al.

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,900,464
RAC: 0
Message 73434 - Posted: 9 May 2022, 23:30:30 UTC - in response to Message 73433.  

Thank you.

alanb1951
Joined: 16 Mar 10
Posts: 213
Credit: 108,397,914
RAC: 3,422
Message 73437 - Posted: 10 May 2022, 5:49:08 UTC - in response to Message 73433.  

Oops -- sleep-deprived misread of source code, and too late to edit the original... Whilst the above was right regarding how long it might take for inconclusive status to be resolved, it wasn't right about the validation count!!!

I note that the waiting for validation count has dropped back again -- I wonder if it was actually a rush of Separation tasks all being returned at once...

Sorry about that - Al.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73440 - Posted: 10 May 2022, 10:52:36 UTC - in response to Message 73433.  

I have taken my machines off MilkyWay for right now, but the last tasks I finished up yesterday were all _2 tasks, no _0 or _1 tasks at all. I wonder if that means we are getting to a batch of validator tasks, or if Tom found a way to send out the validator tasks sooner than the end of the database?

©2024 Astroinformatics Group