Validation inconclusive
Author | Message |
---|---|
Joined: 16 Mar 10 Posts: 211 Credit: 108,029,656 RAC: 3,231 |
I have taken my machines off of MilkyWay for right now but the last tasks I finished up yesterday were all _2 tasks, no _0 or _1 tasks at all, I wonder if that means we are getting to a batch of validator tasks or if Tom found a way to send out the validator tasks sooner than the end of the database? Mikey, It's almost certainly just normal processing of the queue, in order! The increase back to over 12 million tasks unsent should have been entirely down to work units becoming active again once the number of such tasks passed the limit that should stop new work units being generated -- happily, the generator doesn't seem to have over-produced again :-) However, some of the tasks being sent out now are actually still "first attempts" in the sense that the _0 (and sometimes _1) task failed for some reason (too late?, aborted?) and when those return the validator will flag them up as needing another result, even if they return a [potentially] valid result! So those go back to the end of the work unit queue again ;-( If the 10% I was seeing is typical, the fact that the unsent count is dropping by about 12,700 an hour would suggest that there are actually just over 14,000 results being returned each hour and about 1,400 of them get re-queued for one reason or another. At least the numbers are going down, and at a fairly steady rate; with luck it should drop below 11 million around midnight UTC tonight! Hopefully, when Tom gets back from his business travels he might be able to do something about speeding up the reduction without throwing away [too many] results... And we might find out whether N-Body was supposed to be using adaptive replication [which could have vastly reduced the size of this problem] or not :-) Cheers - Al. P.S. It's a pity you quoted my original message, not the one intended to correct the [significant] error in one of the conclusions therein. Ah, well, never mind... |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 12 |
I have taken my machines off of MilkyWay for right now but the last tasks I finished up yesterday were all _2 tasks, no _0 or _1 tasks at all, I wonder if that means we are getting to a batch of validator tasks or if Tom found a way to send out the validator tasks sooner than the end of the database? I sure hope you are correct!!! After the Pentathlon is over I will bring my machines back to get some more milestones in. |
Joined: 20 Jul 09 Posts: 6 Credit: 3,000,120,629 RAC: 0 |
Greetings ... I have very many tasks marked as inconclusive ... on both computers recently ... what might be the cause of that? regards, Grzegorz Roman Granowski |
Joined: 11 Mar 22 Posts: 42 Credit: 21,902,543 RAC: 0 |
Greetings ... I have very many tasks marked as inconclusive ... on both computers recently ... Greetings Grzegorz, I recommend reading this thread from the start. |
Joined: 20 Jul 09 Posts: 6 Credit: 3,000,120,629 RAC: 0 |
OK, I'll do that... |
Joined: 16 Mar 10 Posts: 211 Credit: 108,029,656 RAC: 3,231 |
Greetings ... I have very many tasks marked as inconclusive ... on both computers recently ... A lot of what's in this thread is specific to N-Body tasks, and as you only seem to be doing Separation GPU work... A random sample of your Inconclusive tasks shows that there's no real reason to worry about them; any Separation work unit whose first task doesn't get validated without a wingman (by "adaptive replication") needs three results to complete validation, and some of those wingmen won't be that quick off the mark in returning them (especially if the wingman runs a CPU job rather than a GPU one!) For an extreme example, you processed a result for work unit 435879155 on 29th April; a second task went out within 10 minutes of your result being reported but it took that one almost 12 days to return! A third task went out about 80 minutes after that returned (and to a fast-turnaround host, as it happens) so that one will probably validate later today. There are some even more extreme ones out there, including examples where a wingman hasn't replied before the deadline, and it looks as if the next one won't reply either. There's not an awful lot you (or the Milkyway people) can do about that :-( The time to start worrying about inconclusive tasks is when there's a large interval between one result coming in and the next task being sent out (or, worse, the next task sitting there saying something like "not sent")... If you want to look at the status of your most overdue units without scrolling through 35 or 40 pages to get there, simply alter the offset value in the URL to something nearer the end -- for instance, I used offset=680 to get to the page that included the examples I mentioned above, as there were just over 700 Inconclusive tasks at the time. Hope this helps. Cheers - Al. P.S. I see you've just replied to GolfSierra's reading suggestion :-) So I just hope you see this before you start wading through the 200+ messages; ah, well - I tried... |
Joined: 11 Mar 22 Posts: 42 Credit: 21,902,543 RAC: 0 |
Thanks for the wrap-up, Al. |
Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0 |
There are some even more extreme ones out there, including examples where a wingman hasn't replied before the deadline, and it looks as if the next one won't reply either. There's not an awful lot you (or the Milkyway people) can do about that :-( What would be the impact of shortening the deadline? 12 days seems overly generous to me... |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 12 |
There are some even more extreme ones out there, including examples where a wingman hasn't replied before the deadline, and it looks as if the next one won't reply either. There's not an awful lot you (or the Milkyway people) can do about that :-( That's a question for Tom the admin, but it involves more than just that. If, say, the _1 and _2 etc. tasks get a shorter deadline than the original tasks (I don't even know if the server version they use allows that), those tasks could be returned sooner than the original _0 tasks, meaning even more work for the server to handle. Yes, we are just shifting the same load, but it's still a change for the servers that needs to be considered. It could also mean fewer _0 tasks for you and me and more 'validation tasks' instead; no, the credits don't change, and yes, our RACs should stay the same, but we will get fewer _0 tasks, at least in the short term. |
Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0 |
That's a question for Tom the admin, but it involves more than just that. If, say, the _1 and _2 etc. tasks get a shorter deadline than the original tasks (I don't even know if the server version they use allows that), those tasks could be returned sooner than the original _0 tasks, meaning even more work for the server to handle. Yes, we are just shifting the same load, but it's still a change for the servers that needs to be considered. It could also mean fewer _0 tasks for you and me and more 'validation tasks' instead; no, the credits don't change, and yes, our RACs should stay the same, but we will get fewer _0 tasks, at least in the short term. What I was thinking was all tasks, _0 to _n, set to a shorter duration, say 5 or 6 days. Also, don't launch any wingman tasks until the previous task is returned. I wouldn't mind running wingman only tasks. As long as there were an option to accept original tasks if no wingman tasks were available. Having an option to run wingman only tasks might help out on the server side by flushing work units through a bit faster. I can see less dwell time required for a _0 task cooling its heels, if a wingman task were returned right away. |
Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0 |
What I was thinking was all tasks, _0 to _n, set to a shorter duration, say 5 or 6 days. Also, don't launch any wingman tasks until the previous task is returned. I wouldn't mind running wingman only tasks. As long as there were an option to accept original tasks if no wingman tasks were available. Having an option to run wingman only tasks might help out on the server side by flushing work units through a bit faster. I can see less dwell time required for a _0 task cooling its heels, if a wingman task were returned right away. OK, just had a brain fart here (can you even say that on these boards?). How about putting the option to run wingman only tasks on the client side? Then no change to the servers at all. The server shouldn't give a hoot about when jobs are returned, only that they are returned before the 12 day deadline. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 12 |
What I was thinking was all tasks, _0 to _n, set to a shorter duration, say 5 or 6 days. Also, don't launch any wingman tasks until the previous task is returned. I wouldn't mind running wingman only tasks. As long as there were an option to accept original tasks if no wingman tasks were available. Having an option to run wingman only tasks might help out on the server side by flushing work units through a bit faster. I can see less dwell time required for a _0 task cooling its heels, if a wingman task were returned right away. That won't work; it has to be done from the server side, because all the client does is say 'I need x minutes of work', and then the server checks to see what kinds of work we have allowed and sends it to us if it has any. But yes, they can add choices to the selection of tasks we get, and it's not that hard to do at all; they just have to want to do it. And I too would not mind doing validator tasks as my first choice, with the option to get original tasks if none are available. Now, if they ever decide to give out more credits for the original tasks, that thinking might change, depending on the difference. |
Joined: 4 Jul 09 Posts: 90 Credit: 17,217,377 RAC: 1,804 |
Still all things considered we are doing better even if it is slowly. My Validation inconclusive tasks have dropped from 908 to 710 or about 22%. Seems easier on my emotions if I ignore them. Bill F |
Joined: 8 Nov 11 Posts: 205 Credit: 2,900,464 RAC: 0 |
Still all things considered we are doing better even if it is slowly. My Validation inconclusive tasks have dropped from 908 to 710 or about 22%. Seems easier on my emotions if I ignore them. Agreed. None of mine have been cleared yet, though my total of over 500 is probably nothing compared to the big munchers. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 12 |
Still all things considered we are doing better even if it is slowly. My Validation inconclusive tasks have dropped from 908 to 710 or about 22%. Seems easier on my emotions if I ignore them. Mine at Validation inconclusive (811) seems to be going down by about 3 tasks per day |
Joined: 23 Sep 13 Posts: 19 Credit: 36,223,867 RAC: 8 |
My Validation Inconclusive for N-Body went down from 14,734 to 11,805 so I am seeing an improvement on N-Body work units being validated |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 12 |
My Validation Inconclusive for N-Body went down from 14,734 to 11,805 so I am seeing an improvement on N-Body work units being validated Mine went down by one task today |
Joined: 8 Nov 11 Posts: 205 Credit: 2,900,464 RAC: 0 |
The waiting for validation seems to have increased a lot now over 412,000… My backlog of waiting for validation has not changed at all. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 12 |
The waiting for validation seems to have increased a lot now over 412,000… Mine went down 5 today!!! |
Joined: 8 Nov 11 Posts: 205 Credit: 2,900,464 RAC: 0 |
I think something is going wrong: server stats haven’t been updated for many hours, waiting for validation is now over 600,000, and nothing I have done today has been validated; it is all going into pending validation.
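
For anyone who wants to check Al's back-of-envelope numbers from earlier in the thread (unsent count falling by about 12,700 an hour, roughly 10% of returned results being re-queued), here is a minimal sketch of the arithmetic. It assumes a steady state in which tasks are sent out about as fast as results come back, so the net drop per hour is the returned results minus the re-queued ones. The function name is made up for illustration and is not part of any BOINC tool.

```python
def estimate_hourly_flow(net_drop_per_hour: float, requeue_fraction: float) -> tuple[float, float]:
    """Return (results_returned, results_requeued) per hour.

    Assumption: net_drop = returned - requeued, and
    requeued = requeue_fraction * returned, which gives
    returned = net_drop / (1 - requeue_fraction).
    """
    if not 0 <= requeue_fraction < 1:
        raise ValueError("requeue_fraction must be in [0, 1)")
    returned = net_drop_per_hour / (1 - requeue_fraction)
    requeued = returned * requeue_fraction
    return returned, requeued


# Figures quoted in the thread: ~12,700/hour net drop, ~10% re-queued.
returned, requeued = estimate_hourly_flow(12_700, 0.10)
print(f"returned ~{returned:,.0f}/h, re-queued ~{requeued:,.0f}/h")
# prints: returned ~14,111/h, re-queued ~1,411/h
```

That lines up with the "just over 14,000" and "about 1,400" quoted above; a different re-queue fraction would shift both figures.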
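
Al's tip about altering the offset value in the URL can also be sketched in a few lines. This is only an illustration under stated assumptions: the page size of 20 is inferred from "35 or 40 pages" for just over 700 tasks, the URL below is a made-up placeholder rather than the project's real address, and both helper names are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def last_page_offset(total_tasks: int, page_size: int = 20) -> int:
    """Offset of the final page of a task list (page_size=20 is an assumption)."""
    if total_tasks <= 0:
        return 0
    return ((total_tasks - 1) // page_size) * page_size


def with_offset(url: str, offset: int) -> str:
    """Return url with its 'offset' query parameter set to the given value."""
    parts = urlsplit(url)
    # Drop any existing offset, keep every other parameter untouched.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "offset"]
    query.append(("offset", str(offset)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; substitute the address of your own task list page.
example_url = "https://example.org/results.php?userid=12345&offset=0&state=3"
print(with_offset(example_url, 680))
# prints: https://example.org/results.php?userid=12345&state=3&offset=680
```

With exactly 700 tasks, `last_page_offset(700)` gives 680, the value Al used; with a few more than 700 it gives 700, so 680 would then be the page just before the last.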
©2024 Astroinformatics Group