Welcome to MilkyWay@home

Validation inconclusive

alanb1951

Joined: 16 Mar 10
Posts: 208
Credit: 105,447,129
RAC: 36,580
Message 73448 - Posted: 10 May 2022, 14:52:11 UTC - in response to Message 73440.  

I have taken my machines off MilkyWay for now, but the last tasks I finished yesterday were all _2 tasks, with no _0 or _1 tasks at all. I wonder if that means we are getting to a batch of validator tasks, or if Tom found a way to send out the validator tasks sooner than the end of the database?

Mikey,

It's almost certainly just normal processing of the queue, in order! The increase back to over 12 million tasks unsent should have been entirely down to work units becoming active again once the number of such tasks passed the limit that should stop new work units being generated -- happily, the generator doesn't seem to have over-produced again :-)

However, some of the tasks being sent out now are actually still "first attempts" in the sense that the _0 (and sometimes _1) task failed for some reason (too late?, aborted?) and when those return the validator will flag them up as needing another result, even if they return a [potentially] valid result! So those go back to the end of the work unit queue again ;-(

If the 10% I was seeing is typical, the fact that the unsent count is dropping by about 12,700 an hour would suggest that there are actually just over 14,000 results being returned each hour and about 1,400 of them get re-queued for one reason or another. At least the numbers are going down, and at a fairly steady rate; with luck it should drop below 11 million around midnight UTC tonight!
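
As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes a flat 10% re-queue fraction and a simple model in which the net drop in the unsent count equals results returned minus results re-queued; the function name and the model are illustrative assumptions, not anything taken from the server.

```python
def estimate_returns(net_drop_per_hour, requeue_fraction):
    """Back out results returned per hour from the net drop in the unsent count.

    Assumed model: net_drop = returned - requeued,
    with requeued = returned * requeue_fraction.
    """
    if not 0 <= requeue_fraction < 1:
        raise ValueError("requeue_fraction must be in [0, 1)")
    returned = net_drop_per_hour / (1 - requeue_fraction)
    requeued = returned * requeue_fraction
    return returned, requeued


# Figures quoted in the post: about 12,700 fewer unsent tasks per hour, 10% re-queued.
returned, requeued = estimate_returns(12_700, 0.10)
print(f"returned per hour:  {returned:,.0f}")
print(f"re-queued per hour: {requeued:,.0f}")
```

Under those assumptions the result lands at just over 14,000 returned and about 1,400 re-queued per hour, which matches the estimate in the post.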

Hopefully, when Tom gets back from his business travels he might be able to do something about speeding up the reduction without throwing away [too many] results... And we might find out whether N-Body was supposed to be using adaptive replication [which could have vastly reduced the size of this problem] or not :-)

Cheers - Al.

P.S. It's a pity you quoted my original message, not the one intended to correct the [significant] error in one of the conclusions therein. Ah, well, never mind...

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,946,492
RAC: 22,331
Message 73455 - Posted: 11 May 2022, 10:36:22 UTC - in response to Message 73448.  

I have taken my machines off MilkyWay for now, but the last tasks I finished yesterday were all _2 tasks, with no _0 or _1 tasks at all. I wonder if that means we are getting to a batch of validator tasks, or if Tom found a way to send out the validator tasks sooner than the end of the database?

Mikey,

It's almost certainly just normal processing of the queue, in order! The increase back to over 12 million tasks unsent should have been entirely down to work units becoming active again once the number of such tasks passed the limit that should stop new work units being generated -- happily, the generator doesn't seem to have over-produced again :-)

However, some of the tasks being sent out now are actually still "first attempts" in the sense that the _0 (and sometimes _1) task failed for some reason (too late?, aborted?) and when those return the validator will flag them up as needing another result, even if they return a [potentially] valid result! So those go back to the end of the work unit queue again ;-(

If the 10% I was seeing is typical, the fact that the unsent count is dropping by about 12,700 an hour would suggest that there are actually just over 14,000 results being returned each hour and about 1,400 of them get re-queued for one reason or another. At least the numbers are going down, and at a fairly steady rate; with luck it should drop below 11 million around midnight UTC tonight!

Hopefully, when Tom gets back from his business travels he might be able to do something about speeding up the reduction without throwing away [too many] results... And we might find out whether N-Body was supposed to be using adaptive replication [which could have vastly reduced the size of this problem] or not :-)

Cheers - Al.

P.S. It's a pity you quoted my original message, not the one intended to correct the [significant] error in one of the conclusions therein. Ah, well, never mind...


I sure hope you are correct!!! After the Pentathlon is over I will bring my machines back to get some more milestones in.

Grzegorz Roman Granowski
Joined: 20 Jul 09
Posts: 6
Credit: 3,000,120,629
RAC: 0
Message 73459 - Posted: 11 May 2022, 16:48:50 UTC

Greetings ... I have very many tasks marked as inconclusive ... on both computers recently ...

what might be the cause of that?

regards, Grzegorz Roman Granowski

GolfSierra
Joined: 11 Mar 22
Posts: 42
Credit: 21,902,543
RAC: 0
Message 73460 - Posted: 11 May 2022, 17:44:21 UTC - in response to Message 73459.  

Greetings ... I have very many tasks marked as inconclusive ... on both computers recently ...

what might be the cause of that?

regards, Grzegorz Roman Granowski


Greetings Grzegorz,
I recommend reading this thread from the start.

Grzegorz Roman Granowski
Joined: 20 Jul 09
Posts: 6
Credit: 3,000,120,629
RAC: 0
Message 73461 - Posted: 11 May 2022, 18:02:29 UTC - in response to Message 73460.  

OK, I'll do that...

alanb1951
Joined: 16 Mar 10
Posts: 208
Credit: 105,447,129
RAC: 36,580
Message 73462 - Posted: 11 May 2022, 18:28:06 UTC - in response to Message 73459.  
Last modified: 11 May 2022, 18:28:35 UTC

Greetings ... I have very many tasks marked as inconclusive ... on both computers recently ...

what might be the cause of that?

regards, Grzegorz Roman Granowski
A lot of what's in this thread is specific to N-Body tasks, and as you only seem to be doing Separation GPU work...

A random sample of your Inconclusive tasks shows that there's no real reason to worry about them; any Separation work unit whose first task doesn't get validated without a wingman (by "adaptive replication") needs three results to complete validation, and some of those wingmen won't be that quick off the mark in returning them (especially if the wingman runs a CPU job rather than a GPU one!)

For an extreme example, you processed a result for work unit 435879155 on 29th April; a second task went out within 10 minutes of your result being reported but it took that one almost 12 days to return! A third task went out about 80 minutes after that returned (and to a fast-turnaround host, as it happens) so that one will probably validate later today.

There are some even more extreme ones out there, including examples where a wingman hasn't replied before the deadline, and it looks as if the next one won't reply either. There's not an awful lot you (or the Milkyway people) can do about that :-(

The time to start worrying about inconclusive tasks is when there's a large interval between one result coming in and the next task being sent out (or, worse, the next task sitting there saying something like "not sent")...
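
One way to apply that rule of thumb is to measure, for each work unit, the gap between a result coming back and the next task going out. The sketch below is only an illustration: the timestamps are made up to loosely follow the work unit 435879155 example, and the one-day threshold and the function name are assumptions rather than anything the project publishes.

```python
from datetime import datetime, timedelta


def dispatch_gaps(tasks):
    """Return the delay between each returned result and the next task being sent.

    `tasks` is a list of (sent, returned) datetime pairs in send order;
    `returned` is None for a task that has not reported yet.
    """
    gaps = []
    for (_, returned), (next_sent, _) in zip(tasks, tasks[1:]):
        if returned is not None:
            gaps.append(next_sent - returned)
    return gaps


# Hypothetical timestamps: second task sent 10 minutes after the first result,
# returned almost 12 days later, third task sent 80 minutes after that.
tasks = [
    (datetime(2022, 4, 29, 10, 0), datetime(2022, 4, 29, 12, 0)),
    (datetime(2022, 4, 29, 12, 10), datetime(2022, 5, 11, 10, 0)),
    (datetime(2022, 5, 11, 11, 20), None),
]

WORRY_THRESHOLD = timedelta(days=1)  # assumed cut-off, adjust to taste

gaps = dispatch_gaps(tasks)
worrying = [gap for gap in gaps if gap > WORRY_THRESHOLD]
for gap in gaps:
    print(gap, "worth a look" if gap > WORRY_THRESHOLD else "fine")
```

Note that the slow wingman itself does not trip the check; only a delay in sending the next task would.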

If you want to look at the status of your most overdue units without scrolling through 35 or 40 pages to get there, simply alter the offset value in the URL to something nearer the end -- for instance, I used offset=680 to get to the page that included the examples I mentioned above, as there were just over 700 Inconclusive tasks at the time.
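
For anyone who prefers to compute the offset rather than guess, here is a small Python sketch. The page size of 20 is an assumption inferred from "35 or 40 pages" for roughly 700 tasks, the example URL is a placeholder and not the real results page address, and both function names are made up for illustration; only the `offset` parameter name comes from the post.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def last_page_offset(total_tasks, page_size=20):
    """Offset of the last page, assuming `page_size` tasks per page."""
    if total_tasks <= 0:
        return 0
    return ((total_tasks - 1) // page_size) * page_size


def with_offset(url, offset):
    """Return `url` with its `offset` query parameter set to `offset`."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if any(key == "offset" for key, _ in pairs):
        # Replace the value in place so the other parameters keep their order.
        pairs = [(k, str(offset)) if k == "offset" else (k, v) for k, v in pairs]
    else:
        pairs.append(("offset", str(offset)))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# Placeholder URL: copy the real one from your browser's address bar instead.
example = "https://example.org/results.php?userid=1&offset=0&state=3"
print(with_offset(example, last_page_offset(700)))
```

With exactly 700 tasks and 20 per page this gives offset=680, the value mentioned above.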

Hope this helps.

Cheers - Al.

P.S. I see you've just replied to GolfSierra's reading suggestion :-) So I just hope you see this before you start wading through the 200+ messages; ah, well - I tried...

GolfSierra
Joined: 11 Mar 22
Posts: 42
Credit: 21,902,543
RAC: 0
Message 73465 - Posted: 11 May 2022, 18:49:42 UTC - in response to Message 73462.  


P.S. I see you've just replied to GolfSierra's reading suggestion :-) So I just hope you see this before you start wading through the 200+ messages; ah, well - I tried...


Thanks for the wrap-up, Al.

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,030,134
RAC: 36,348
Message 73466 - Posted: 11 May 2022, 18:59:29 UTC - in response to Message 73462.  

There are some even more extreme ones out there, including examples where a wingman hasn't replied before the deadline, and it looks as if the next one won't reply either. There's not an awful lot you (or the Milkyway people) can do about that :-(
What would be the impact of shortening up the deadline? 12 days seems overly generous to me....

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,946,492
RAC: 22,331
Message 73470 - Posted: 12 May 2022, 1:37:56 UTC - in response to Message 73466.  
Last modified: 12 May 2022, 1:39:36 UTC

There are some even more extreme ones out there, including examples where a wingman hasn't replied before the deadline, and it looks as if the next one won't reply either. There's not an awful lot you (or the Milkyway people) can do about that :-(


What would be the impact of shortening up the deadline? 12 days seems overly generous to me....


That's a question for Tom the Admin, but it involves more than just that. If, say, the _1 and _2 etc. tasks get a shorter deadline than the original tasks (I don't even know if the Server version they use allows that), those tasks could be returned sooner than the original _0 tasks, meaning even more work for the Server to handle. Yes, we are just shifting the same load, but it's still a change for the Servers that needs to be considered. It could also mean fewer _0 tasks for you and me and more 'validation tasks' instead. No, the credits don't change, and yes, our RACs should stay the same, but we will get fewer _0 tasks, at least in the short term.

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,030,134
RAC: 36,348
Message 73479 - Posted: 12 May 2022, 3:13:31 UTC - in response to Message 73470.  

That's a question for Tom the Admin, but it involves more than just that. If, say, the _1 and _2 etc. tasks get a shorter deadline than the original tasks (I don't even know if the Server version they use allows that), those tasks could be returned sooner than the original _0 tasks, meaning even more work for the Server to handle. Yes, we are just shifting the same load, but it's still a change for the Servers that needs to be considered. It could also mean fewer _0 tasks for you and me and more 'validation tasks' instead. No, the credits don't change, and yes, our RACs should stay the same, but we will get fewer _0 tasks, at least in the short term.
What I was thinking was all tasks, _0 to _n, set to a shorter duration, say 5 or 6 days. Also, don't launch any wingman tasks until the previous task is returned. I wouldn't mind running wingman only tasks. As long as there were an option to accept original tasks if no wingman tasks were available. Having an option to run wingman only tasks might help out on the server side by flushing work units through a bit faster. I can see less dwell time required for a _0 task cooling its heels, if a wingman task were returned right away.

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,030,134
RAC: 36,348
Message 73485 - Posted: 12 May 2022, 3:48:01 UTC - in response to Message 73479.  

What I was thinking was all tasks, _0 to _n, set to a shorter duration, say 5 or 6 days. Also, don't launch any wingman tasks until the previous task is returned. I wouldn't mind running wingman only tasks. As long as there were an option to accept original tasks if no wingman tasks were available. Having an option to run wingman only tasks might help out on the server side by flushing work units through a bit faster. I can see less dwell time required for a _0 task cooling its heels, if a wingman task were returned right away.
OK, just had a brain fart here (can you even say that on these boards?). How about putting the option to run wingman-only tasks on the client side? Then there is no change to the servers at all. The server shouldn't give a hoot about when jobs are returned, only that they are returned before the 12-day deadline.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,946,492
RAC: 22,331
Message 73487 - Posted: 12 May 2022, 12:03:00 UTC - in response to Message 73485.  
Last modified: 12 May 2022, 12:05:36 UTC

What I was thinking was all tasks, _0 to _n, set to a shorter duration, say 5 or 6 days. Also, don't launch any wingman tasks until the previous task is returned. I wouldn't mind running wingman only tasks. As long as there were an option to accept original tasks if no wingman tasks were available. Having an option to run wingman only tasks might help out on the server side by flushing work units through a bit faster. I can see less dwell time required for a _0 task cooling its heels, if a wingman task were returned right away.


OK, just had a brain fart here (can you even say that on these boards?). How about putting the option to run wingman-only tasks on the client side? Then there is no change to the servers at all. The server shouldn't give a hoot about when jobs are returned, only that they are returned before the 12-day deadline.


That won't work; it has to be done from the Server side, because all the Client side does is say 'I need x minutes of work', and then the Server checks which kinds of work we have allowed and sends us some if it has any. But yes, they can add choices to the selection of tasks we get, and it's not that hard to do at all; they just have to want to do it.

And I too would not mind doing validator tasks as my first choice with a choice then to get original tasks if none are available. Now if they ever decide to give out more credits for the original tasks that thinking might change depending on the difference.

Bill F
Joined: 4 Jul 09
Posts: 85
Credit: 16,692,777
RAC: 4,468
Message 73498 - Posted: 14 May 2022, 2:09:20 UTC

Still all things considered we are doing better even if it is slowly. My Validation inconclusive tasks have dropped from 908 to 710 or about 22%. Seems easier on my emotions if I ignore them.

Bill F

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 73502 - Posted: 14 May 2022, 7:03:55 UTC - in response to Message 73498.  

Still all things considered we are doing better even if it is slowly. My Validation inconclusive tasks have dropped from 908 to 710 or about 22%. Seems easier on my emotions if I ignore them.

Bill F


Agreed. None of mine have been cleared yet, though my total of over 500 is probably nothing compared to the big munchers.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,946,492
RAC: 22,331
Message 73506 - Posted: 14 May 2022, 10:44:32 UTC - in response to Message 73502.  

Still all things considered we are doing better even if it is slowly. My Validation inconclusive tasks have dropped from 908 to 710 or about 22%. Seems easier on my emotions if I ignore them.

Bill F


Agreed. None of mine have been cleared yet, though my total of over 500 is probably nothing compared to the big munchers.


Mine at Validation inconclusive (811) seems to be going down by about 3 tasks per day

Robert Coplin
Joined: 23 Sep 13
Posts: 19
Credit: 36,217,133
RAC: 0
Message 73544 - Posted: 16 May 2022, 3:21:32 UTC

My Validation Inconclusive for N-Body went down from 14,734 to 11,805 so I am seeing an improvement on N-Body work units being validated

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,946,492
RAC: 22,331
Message 73546 - Posted: 16 May 2022, 3:34:41 UTC - in response to Message 73544.  

My Validation Inconclusive for N-Body went down from 14,734 to 11,805 so I am seeing an improvement on N-Body work units being validated


Mine went down by one task today

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 73553 - Posted: 16 May 2022, 20:10:25 UTC - in response to Message 73546.  

The "waiting for validation" count seems to have increased a lot; it is now over 412,000…

My backlog of waiting for validation has not changed at all.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,946,492
RAC: 22,331
Message 73554 - Posted: 17 May 2022, 3:09:02 UTC - in response to Message 73553.  

The "waiting for validation" count seems to have increased a lot; it is now over 412,000…

My backlog of waiting for validation has not changed at all.


Mine went down 5 today!!!

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 73561 - Posted: 17 May 2022, 12:32:24 UTC - in response to Message 73554.  

I think something is going wrong: server stats haven't been updated for many hours, the "waiting for validation" count is now over 600,000, and nothing I have done today has been validated; it is all going into pending validation.