What is the meaning of "Consecutive valid tasks" statistic?
Author | Message |
---|---|
Send message Joined: 9 May 11 Posts: 15 Credit: 22,575,796 RAC: 0 |
I've been trying desperately to figure out my error rate on this project. I'd love to know if I'm producing useful results. Hell, I'd love to know if I'm producing useless results. But there is very little feedback and all my results disappear so quickly that I'm never sure what's going on. Am I contributing to science? Am I churning uselessly? Because of the mild annoyance of never knowing what is going on, I've been swimming through menus, looking for hints. Under "Application details for host..." I found one variable that lists "Number of tasks completed" and another that lists "Consecutive valid tasks." Missing app version: Number of tasks completed 4083; Max tasks per day 13490; Number of tasks today 0; Consecutive valid tasks 5; Average processing rate 88.65894888594; Average turnaround time 0.10 days. MilkyWay@Home 0.80 windows_x86_64 (cuda_opencl): Number of tasks completed 20; Max tasks per day 10020; Number of tasks today 0; Consecutive valid tasks 4; Average processing rate 96.326203241313; Average turnaround time 0.17 days. MilkyWay@Home 0.82 windows_x86_64 (cuda_opencl): Number of tasks completed 98; Max tasks per day 10098; Number of tasks today 18; Consecutive valid tasks 17; Average processing rate 95.334224612124; Average turnaround time 0.09 days. If I truly can't string together more than five valid results in a row on one app version, and only seventeen on another, well, that would imply that I'm getting terrible results. Ideally I'd have hundreds of consecutive valid tasks, no? But, of course, I don't know. Thus all the question marks. |
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
i'm wondering the same thing...here's what my "application details" page looks like:
obviously a low number of consecutive valid tasks doesn't necessarily mean i'm getting computation errors - it could just as well be validation errors, or perhaps it can be triggered by a successful result that was submitted too late to be counted as valid (possibly due to a wingman returning a resent task before my resent task gets returned). also, i thought of another scenario that might affect "consecutive valid tasks," but i'd like someone to confirm or deny it for reassurance purposes. suppose i have 5 tasks that are "ready to report," and they all get reported to the server simultaneously. now suppose 4 of them get validated right away, but the 5th task ends up in the "pending" state for some time. now, despite the fact that there is nothing noticeably wrong with the pending 5th task (i.e. we know it completed successfully, but we don't yet know if it'll get validated), could this scenario result in the server recognizing only 4 consecutive valid tasks simply due to the fact that the 5th task has not yet been validated? the reason i think that something so silly might affect the "consecutive valid task" calculation is b/c i cannot find a more logical reason for having such a low "consecutive valid tasks" value all the time. that is to say, when i go to my MW@H account page, click on the tasks link, and monitor/filter through the various types of tasks (in progress/pending/valid/invalid/error), i don't find errors or invalids often enough to explain my low "consecutive valid tasks" value. and when i say i'm monitoring the data, i mean i'm sitting in front of the computer for a significant amount of time (30 minutes or so), refreshing the view every few minutes to ensure that i don't miss a page update. when i do this for upwards of 30 minutes and i see zero errors and/or invalids, i have to wonder why my number of consecutive valid tasks is so low... |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
[quote]also, i thought of another scenario that might affect "consecutive valid tasks," but i'd like someone to confirm or deny it for reassurance purposes. suppose i have 5 tasks that are "ready to report," and they all get reported to the server simultaneously. now suppose 4 of them get validated right away, but the 5th task ends up in the "pending" state for some time. now, despite the fact that there is nothing noticeably wrong with the pending 5th task (i.e. we know it completed successfully, but we don't yet know if it'll get validated), could this scenario result in the server recognizing only 4 consecutive valid tasks simply due to the fact that the 5th task has not yet been validated?[/quote] If this helps answer your question: on my 7 MW machines the current lowest consecutive valid task count is 53. That would mean that pendings aren't counted as invalid. Yesterday the bad test WUs were a problem. Otherwise you may have an issue on 1 or more GPUs. |
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
[quote]...That would mean that pendings aren't counted as invalid. Yesterday the bad test WUs were a problem. Otherwise you may have an issue on 1 or more GPUs.[/quote] thank you for confirming that pending tasks don't hold up the "consecutive valid tasks" count. also, i just noticed my "consecutive valid tasks" count go from 19 to 1, and confirmed that a computation error occurred...here's the task info: Stderr output i bolded the lines that looked strange to me. i don't know if it's pertinent, but the original task (de_separation_13_3s_free_2_595178_1308064239_0) was a validate error, the first resend (de_separation_13_3s_free_2_595178_1308064239_1) was my computation error, and the 2nd resend (de_separation_13_3s_free_2_595178_1308064239_2) was also a computation error. does anyone else here see the "bigger picture" and know what's going on here? also, here's another interesting bit of info: the above computation error was reported almost 25 minutes ago. since then several tasks have been reported, and yet the server is currently showing 3 consecutive valid tasks. yet i've manually counted more than 3 valid tasks since the error got reported...something is fishy here. |
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
*update* got up to 4 consecutive valid tasks before it dropped back down to 0. this time, the task completed successfully, but got marked invalid. upon viewing the work unit info page, 2 other wingmen got credit for the same task...yet i returned my result before both wingmen did. that would mean that my task had to have been "pending" until one or both of these wingmen returned their results. 1) why would my result get marked invalid then? 2) how are some folks returning hundreds of consecutive valid tasks (i.e. how are some folks able to get so few validate errors and computation errors)? |
Send message Joined: 19 Jul 10 Posts: 578 Credit: 18,845,239 RAC: 856 |
From my observation (and I'm looking quite carefully right now, as I'm testing a new system), the real number of consecutive valid tasks is "Max tasks per day" - 10000. That's the part I'm 99% sure about. What is counted here as "Consecutive valid tasks" is, I think, those tasks that become inconclusive and get resent. When such a task validates, the number of "Consecutive valid tasks" goes up by one, but only if you were the first one to get that WU, not if you're the wingman. I'm not so sure about this theory, though; I didn't follow it exactly, but the number of my consecutive valid tasks would fit it pretty well. |
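The arithmetic in this theory can be checked against the figures posted at the top of the thread. The following is only a sketch: the baseline of 10000 is the poster's guess, not a documented server constant, and the name `quota_offset` is made up for illustration.

```python
# Baseline taken from the poster's theory; an assumption, not a value
# confirmed by the project or by BOINC documentation.
ASSUMED_BASE_QUOTA = 10000


def quota_offset(max_tasks_per_day, base=ASSUMED_BASE_QUOTA):
    """Return how far "Max tasks per day" sits above the assumed baseline."""
    return max_tasks_per_day - base


# (app version, "Number of tasks completed", "Max tasks per day"),
# copied from the first post in this thread.
posted = [
    ("Missing app version", 4083, 13490),
    ("0.80 (cuda_opencl)", 20, 10020),
    ("0.82 (cuda_opencl)", 98, 10098),
]

for app, completed, max_per_day in posted:
    print(f"{app}: completed={completed}, offset={quota_offset(max_per_day)}")
```

For the two named app versions the offset happens to equal "Number of tasks completed"; for the "Missing app version" entry it does not, so these three rows alone neither confirm nor rule out the theory.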
Send message Joined: 9 May 11 Posts: 15 Credit: 22,575,796 RAC: 0 |
[quote]From my observation (and I'm looking quite carefully right now, as I'm testing a new system), the real number of consecutive valid tasks is "Max tasks per day" - 10000. That's the part I'm 99% sure about.[/quote] What makes you 99% sure of that? Everything that you said sounds theoretically possible, but I'm having a hell of a time trying to prove things on my PC. I never know when a wingman's going to validate or when the results will head off to database heaven, unreachable by us mortals. Regardless, I am gratified to hear that other people are puzzling over the same things as I. Thanks for your and Sunny's and Beyond's input. |
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
ok, i'm not so sure i'm having the problems i thought i was. since my last update, "consecutive valid tasks" has dropped back to 0 twice. first, i had a run of 7 consecutive valid tasks, which was cut short by an invalid task. then i had a single valid task, followed by a computation error. since then, i haven't had an error or an invalid, and i'm up to 67 consecutive valid tasks. so now i'm beginning to explore the possibility that something was wrong with a particular batch of tasks i crunched earlier today, resulting in some errors and some invalids. the only thing that leaves me quite uncertain of this is that one would think a bad batch of work would result in either all errors or all invalids, and not necessarily a mix of both. at any rate, i guess the only thing to do at this point is to continue monitoring the statistics to be sure that the hiccups earlier today were nothing more than a statistical glitch in the work. |
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
while my "consecutive valid tasks" continues to grow for the time being, i'm still noticing some funny behavior. for instance, i've been monitoring and documenting the "number of tasks today" and "consecutive valid tasks" values at not-so-regular time intervals. i've noticed that my "number of tasks today" grows much faster than my "consecutive valid tasks" do. one might be tempted to dismiss this phenomenon as nothing more than the fact that a fraction of all tasks completed and reported will validate right away, while the rest remain in the "pending" state while they wait to be validated... ...however, i noticed the following behavior going on over the course of the last hour or so: at 12:16am local time, my "number of tasks today" was at 3 and my "consecutive valid tasks" was at 69. then at 12:39am, my "number of tasks today" was at 18 and my "consecutive valid tasks" was at 71. only 2 tasks validated out of the 15 that completed and reported in that 23-minute span? not only did i find this a bit fishy, but i also checked my valid tasks tab and found that 6 tasks, all completed and reported between 12:34am and 12:39am, had been validated!!! in fact, here they are:
...so if 6 tasks were clearly validated between 12:34am and 12:38am, then why does my application details web page claim that only 2 tasks validated between 12:16am and 12:39am?!? |
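The comparison above amounts to differencing two readings of the "Application details" page. A minimal sketch using the two readings quoted in the post; the `Snapshot` class and `delta` helper are hypothetical names, not part of any BOINC tool.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One manual reading of the "Application details" page."""
    label: str
    tasks_today: int
    consecutive_valid: int


def delta(earlier, later):
    """Return (change in tasks today, change in consecutive valid tasks)."""
    return (
        later.tasks_today - earlier.tasks_today,
        later.consecutive_valid - earlier.consecutive_valid,
    )


# Readings taken from the post above.
first = Snapshot("12:16am", tasks_today=3, consecutive_valid=69)
second = Snapshot("12:39am", tasks_today=18, consecutive_valid=71)

tasks_delta, valid_delta = delta(first, second)
print(f"{first.label} -> {second.label}: "
      f"tasks today +{tasks_delta}, consecutive valid +{valid_delta}")
```

The two counters only have to move together if they count the same events at the same time, which is exactly what the post is questioning.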
Send message Joined: 9 May 11 Posts: 15 Credit: 22,575,796 RAC: 0 |
Well, I'm starting to build consecutive valid tasks: Missing app version: Number of tasks completed 4083; Max tasks per day 13490; Number of tasks today 0; Consecutive valid tasks 5; Average processing rate 88.65894888594; Average turnaround time 0.10 days. MilkyWay@Home 0.80 windows_x86_64 (cuda_opencl): Number of tasks completed 20; Max tasks per day 10020; Number of tasks today 0; Consecutive valid tasks 4; Average processing rate 96.326203241313; Average turnaround time 0.17 days. MilkyWay@Home 0.82 windows_x86_64 (cuda_opencl): Number of tasks completed 282; Max tasks per day 10284; Number of tasks today 9; Consecutive valid tasks 53; Average processing rate 96.115415942437; Average turnaround time 0.07 days. It's all kind of fishy. Didn't a developer or moderator make a post saying it would be easy for them to display our error rates? It seems like a necessity for a project that burns after reading. |
Send message Joined: 19 Jul 10 Posts: 578 Credit: 18,845,239 RAC: 856 |
[quote]From my observation (and I'm looking quite carefully right now, as I'm testing a new system), the real number of consecutive valid tasks is "Max tasks per day" - 10000. That's the part I'm 99% sure about.[/quote] Make that 100% now. After I saw this thread I looked at it more carefully, and with my HD3850 finishing one WU every 10 minutes it's pretty easy to follow; I didn't see any WU that would not fit this theory. The second part I'm still not sure about, but my best guess at the moment is that anything that becomes inconclusive and is then resent counts as a "Consecutive valid task", in addition to increasing "Max tasks per day" by one (if it's valid). So it seems not to matter here whether you are the first one who got that WU, as I thought at first. It would be really nice if a moderator could explain how the values are calculated; it's obviously not like on other projects. @Sunny129: "number of tasks today" is the number of tasks that were assigned to this machine today (whenever that starts). If this number reaches the value of "Max tasks per day", you won't get any more tasks that day. This value is reset to 0 once a day. Well, that's at least how I know it from SETI. |
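The daily quota behaviour described for "number of tasks today" can be written down as a toy model. This is a sketch of the poster's description only (which is itself based on experience with SETI), not of the actual BOINC scheduler; the `DailyQuota` class and its method names are hypothetical.

```python
class DailyQuota:
    """Toy model: tasks are handed out until the daily limit is reached."""

    def __init__(self, max_tasks_per_day):
        self.max_tasks_per_day = max_tasks_per_day
        self.tasks_today = 0  # tasks assigned to this host so far today

    def can_assign(self):
        """True while the host is still under its daily limit."""
        return self.tasks_today < self.max_tasks_per_day

    def assign(self):
        """Try to hand out one task; return True if it was assigned."""
        if not self.can_assign():
            return False
        self.tasks_today += 1
        return True

    def new_day(self):
        """Reset the counter, assumed to happen once a day."""
        self.tasks_today = 0


quota = DailyQuota(max_tasks_per_day=3)
results = [quota.assign() for _ in range(4)]  # the fourth request is refused
print(results)
quota.new_day()
print(quota.assign())  # allowed again after the reset
```

Under this model "number of tasks today" counts tasks assigned rather than tasks validated, so it would be expected to grow independently of "consecutive valid tasks".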
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
[quote]ok, i'm not so sure i'm having the problems i thought i was. since my last update, "consecutive valid tasks" has dropped back to 0 twice. first, i had a run of 7 consecutive valid tasks, which was cut short by an invalid task. then i had a single valid task, followed by a computation error. since then, i haven't had an error or an invalid, and i'm up to 67 consecutive valid tasks.[/quote] It's hard to follow the tasks because of insta-purge. I think you're having some kind of problem though. Just checked the consecutive valid tasks on my 7 machines and they're: 129, 102, 33, 32, 217, 62, 55. |
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
[quote]It's hard to follow the tasks because of insta-purge. I think you're having some kind of problem though. Just checked the consecutive valid tasks on my 7 machines and they're: 129, 102, 33, 32, 217, 62, 55.[/quote] yeah, i've come to realize that following the "consecutive valid tasks" calculation and trying to reconcile it with other statistics (such as "number of tasks today" or "valid tasks" for instance) is going to lead to some inconsistencies due to both the way certain statistics are calculated and the fact that the database servers tend to fall behind and catch back up over and over again. that being the case, i'll point out again that shortly after i started to think i had a problem (due to several short runs of consecutive valid tasks throughout the day yesterday that never seemed to grow larger than ~20 tasks), i stopped getting errors and invalids. in fact, i'm currently up to 103 consecutive valid tasks, which is why i think i may have encountered some bad WU's yesterday (as opposed to something being wrong with my software or hardware configuration). |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
[quote]that being the case, i'll point out again that shortly after i started to think i had a problem (due to several short runs of consecutive valid tasks throughout the day yesterday that never seemed to grow larger than ~20 tasks), i stopped getting errors and invalids. in fact, i'm currently up to 103 consecutive valid tasks, which is why i think i may have encountered some bad WU's yesterday (as opposed to something being wrong with my software or hardware configuration).[/quote] There are definitely some bad tasks being sent out here and there. Most likely some runs of the test WUs. It's so hard to determine what's going wrong though because of insta-purge. |
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
[quote]Well, I'm starting to build consecutive valid tasks:...[/quote] what was the highest "consecutive valid tasks" value you ever saw before it dropped back to zero and started the climb toward 53? i ask b/c 53 is a pretty good number of consecutive valid tasks...that is, our situations may be one and the same - at first it appeared that there might be something wrong with our hosts, but now it's starting to look more and more like we both just happened upon some bad WU's the other day...granted, i noticed your consecutive valid tasks has been recently reset by yet another error or invalid, whereas i have not had a problem since then. i'd monitor it for another 48 hours or so...hopefully we'll see a run of tasks that aren't error-prone during that period. if we do, and you continue to get errors or invalids often enough to prevent your "consecutive valid tasks" stat from climbing very high, then we'll be that much closer to ruling out bad batches of tasks as the culprit (at least with your specific host). |
Send message Joined: 11 Jun 10 Posts: 329 Credit: 1,166,222,661 RAC: 0 |
MilkyWay@Home 0.82 windows_x86_64 (ati14): Number of tasks completed 7653; Max tasks per day 17748; Number of tasks today 2016; Consecutive valid tasks 662; Average processing rate 431.94021858927; Average turnaround time 0.01 days. |
Send message Joined: 9 May 11 Posts: 15 Credit: 22,575,796 RAC: 0 |
53 is the highest I saw at any point. I've been on and off my computer though. I'm sure it got higher overnight. This latest reset happened while I was asleep, and -- obviously not a coincidence -- there was a validate error in the morning: http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=50033167 Name: de_separation_10_3s_fix20_2_868311_1308125661_1; Workunit: 34658604; Created: 15 Jun 2011, 11:04:59 UTC; Sent: 15 Jun 2011, 11:08:50 UTC; Received: 15 Jun 2011, 12:57:02 UTC; Server state: Over; Outcome: Validate error; Client state: Done; Exit status: 0 (0x0); Computer ID: 286349; Report deadline: 27 Jun 2011, 11:08:50 UTC; Run time: 516.94; CPU time: 11.75; Validate state: Invalid; Credit: 0.00; Application version: MilkyWay@Home v0.82 (cuda_opencl); Stderr output: <core_client_version>6.10.60</core_client_version> <![CDATA[ <stderr_txt> </stderr_txt> ]]> Clearly, every compute error, validate error, or bad workunit from the server drops this count back to zero. It's looking to me like the figure means exactly what it sounds like: "Consecutive valid tasks" is a running count of the number of valid results you've been able to string together without interruption. Whether the periodic reset is due to milkyway hiccuping and sending out invalid tasks, or our own computers flipping a one instead of a zero, is something for us to find out for ourselves. I have a mild overclock on my graphics card, which I've decreased slightly. We'll see if I build more validity by the end of today as a result. |
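The interpretation reached here, a running count that any error or invalid result resets to zero, can be sketched in a few lines. This models the thread's conclusion, not the server's actual code; the outcome labels and the function name are hypothetical, and treating "pending" as neutral follows the earlier observation that pendings do not appear to count as invalid.

```python
# Outcomes assumed to reset the streak, per the conclusion in the post above.
RESETTING_OUTCOMES = {"compute_error", "validate_error", "invalid"}


def consecutive_valid(outcomes):
    """Return the current streak of valid results, in reporting order."""
    streak = 0
    for outcome in outcomes:
        if outcome == "valid":
            streak += 1
        elif outcome in RESETTING_OUTCOMES:
            streak = 0
        # Anything else (e.g. "pending") is assumed to leave the streak alone.
    return streak


# The sequence described earlier in the thread: 7 valid, an invalid,
# 1 valid, a computation error, then 67 valid in a row.
history = ["valid"] * 7 + ["invalid", "valid", "compute_error"] + ["valid"] * 67
print(consecutive_valid(history))
```

Because a single bad result wipes the count, a low value says little about the overall error rate: a host with one error in a hundred tasks can show anything from 0 to 99 depending on when the error landed.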
Send message Joined: 9 May 11 Posts: 15 Credit: 22,575,796 RAC: 0 |
[quote]MilkyWay@Home 0.82 windows_x86_64 (ati14)[/quote] Your valid task count dwarfs mine, but it is also much lower than the number of tasks (2016) you've been sent today. Not sure if that's a useful comparison, but a broader range of results is helpful. |
Send message Joined: 12 Aug 09 Posts: 262 Credit: 92,631,041 RAC: 0 |
The dictionary says this: consecutive: following in regular unbroken order. So that could mean tasks with errors, tasks without errors, tasks in the order they were sent out or sent back... or...? I would go for tasks without errors. But I think Travis or Matt have to explain what they mean by that. The previous app had 483 of these and the current app (0.82) just 2 at the moment; however, I am almost sure I had more valid tasks. Greetings from, TJ |
Send message Joined: 12 Aug 09 Posts: 262 Credit: 92,631,041 RAC: 0 |
I have monitored the results page, refreshing every time results dropped out of the BOINC tasks list, and my PC shows 4 "Consecutive valid tasks", but there were many more than that, with no errors either. And the "Waiting for validation" list has not become longer... Greetings from, TJ |