Welcome to MilkyWay@home

Validate errors

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 46784 - Posted: 28 Mar 2011, 13:47:02 UTC - in response to Message 46782.  
Last modified: 28 Mar 2011, 13:50:01 UTC

Yep, all free_ WUs have a validate error. Off to DNETC until this gets resolved...

so it's not just the "_10_3s_free" WUs? it's all "_free" WUs? the reason i ask is b/c my queue of MW@H tasks is all de_separation tasks at the moment, most of which are "_10_3s_free" WUs, and the few remaining are "_13_3s_free" WUs. i'm currently at work, and my host in question is at home. the project is also currently suspended, so i can't do any experimenting or troubleshooting at the moment.


The only ones I saw were the 10_3s_free WUs; that's not to say there were no other _free variants. However, those alone were bad enough: 60-80% errored out on initial validation. I spent 14 hours babysitting them, zapping the errant ones and hoping someone would at least stop them at the server end pending review. Gave in last night, and switched projects until resolved. Babysitting at weekends - for me - is practical enough as I'm always sitting at a PC, but come weekdays, there are other things to do. Yesterday was also my first time back on this Project after being away for a while - looks like my personal demon struck again rofl.

If they errored out after a few seconds I wouldn't fuss about it, but they go through crunching and fail initial validation even before comparison with another cruncher's efforts. That's just a total waste of time and effort. It's life, these things happen, but there's only so many crosses I'll burn myself on before reaching for the fire hose :)

Regards
Zy
ID: 46784 · Rating: 0

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 46785 - Posted: 28 Mar 2011, 13:55:29 UTC - in response to Message 46784.  
Last modified: 28 Mar 2011, 14:15:30 UTC

If they errored out after a few seconds I wouldn't fuss about it, but they go through crunching and fail initial validation even before comparison with another cruncher's efforts. That's just a total waste of time and effort. It's life, these things happen, but there's only so many crosses I'll burn myself on before reaching for the fire hose :)

Regards
Zy

i hear ya...i just want to be sure which of my WUs are going to end up with validate errors before i start aborting them. while my error rate is far less than yours (mine is more like 20-25%), that's still 20-25% of my GPU cycles completely wasted. so i may switch back to SETI@Home MB and AP tasks for my HD 5870 GPU until MW@H tasks are back to normal...

*EDIT* - i'm also relieved to know that so many users are getting these validate errors. you see, we had one heck of a rainstorm last night here in Sarasota, FL, with tons of lightning. i would have shut down my rigs had i known the storm was coming, but it hit in the middle of the night. when i awoke this morning, my home office rig was frozen. when i restarted the machine and noticed all the MW@H validate errors, i thought for sure that the storm had fried parts of my machine. fortunately i had enough common sense to research validate errors on the message boards this morning, and stumbled across this thread. i think i can rest assured that nothing is wrong with my GPU or other components, and that these validate errors are simply due to a server-side issue that hopefully gets fixed real soon.
ID: 46785 · Rating: 0

Matthew
Volunteer moderator
Project developer
Project scientist
Joined: 6 May 09
Posts: 217
Credit: 6,856,375
RAC: 0
Message 46787 - Posted: 28 Mar 2011, 16:22:39 UTC

I've stopped the "de_separation_10_3s_free_1" until we can figure out what's wrong with them. Let me know if the '13_3s_free' WUs are also causing problems.

(It may take a bit for the existing 10_3s_free WUs to filter out of the system)

-Matthew
ID: 46787 · Rating: 0

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 46789 - Posted: 28 Mar 2011, 17:36:49 UTC - in response to Message 46787.  

I've stopped the "de_separation_10_3s_free_1" until we can figure out what's wrong with them. Let me know if the '13_3s_free' WUs are also causing problems.

(It may take a bit for the existing 10_3s_free WUs to filter out of the system)

-Matthew

will do. of the 31 remaining MW@H tasks in my queue, 23 are "de_separation_10_3s_free_1" tasks, while the other 8 are "de_separation_13_3s_free_1" tasks. again, MW@H is currently suspended on my host, so i won't be able to confirm/verify anything until this evening. i'll post up as soon as i know more...
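
For anyone who wants to split their own queue by run the same way, here is a rough Python sketch. It assumes task names look like the ones quoted in this thread (run name, then a numeric sequence field, then a 10-digit timestamp-like field, then an optional result index). That layout is inferred from the names posted here, not taken from any project documentation, and `run_of` and `count_by_run` are made-up helper names.

```python
import re
from collections import Counter

# Assumed layout, inferred from names in this thread:
#   <run>_<sequence>_<10-digit stamp>[_<result index>]
TASK_NAME = re.compile(
    r"^(?P<run>.+?)_(?P<seq>\d+)_(?P<stamp>\d{10})(?:_(?P<copy>\d+))?$"
)


def run_of(name):
    """Return the run part of a task or workunit name, or None if it does not fit."""
    match = TASK_NAME.match(name)
    return match.group("run") if match else None


def count_by_run(names):
    """Tally how many names belong to each run (unparseable names land under None)."""
    return Counter(run_of(name) for name in names)


if __name__ == "__main__":
    queue = [
        "de_separation_13_3s_free_1_1278565_1301291624_0",
        "de_separation_13_3s_free_1_65998_1301362073",
    ]
    for run, total in count_by_run(queue).items():
        print(run, total)
```

Paste in the task names from the BOINC Manager task list and the tally shows how many belong to a suspect run before anything gets aborted.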
ID: 46789 · Rating: 0

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 46790 - Posted: 28 Mar 2011, 19:10:19 UTC

You're a star. Thanks Matthew.

I'll run out the WUs on the other project and flip back tonight and yell if any pesky non-10-3s _free validate out.

Regards
Zy
ID: 46790 · Rating: 0

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 46791 - Posted: 28 Mar 2011, 19:33:04 UTC

Just ran a couple of 13_3s_free_1 tasks to confirm what I remembered from yesterday: all of them validated.
ID: 46791 · Rating: 0

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 46792 - Posted: 28 Mar 2011, 20:01:32 UTC
Last modified: 28 Mar 2011, 20:02:55 UTC

Same here, just restarted and the 13_3s_free's appear fine.

10_3s_free's now very rare, so they appear to have all but worked their way through the system now.

Regards
Zy
ID: 46792 · Rating: 0

Ed.T
Joined: 1 Feb 11
Posts: 17
Credit: 16,245,184
RAC: 0
Message 46793 - Posted: 28 Mar 2011, 20:29:03 UTC

So, I had a few 10_3 errors, and now I've got a quadrillion "Completed, validation inconclusive" results as punishment. The system, it seems, sends these flagged WUs to other folks, scores their results as "Completed, validation inconclusive" too, and then sends the WU off to another, and so on and so on ... this one is on its 5th machine -

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=260891871

... will it ever end?

- Ed.T
Please: WCG - Help Cure Muscular Dystrophy
ID: 46793 · Rating: 0

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 46797 - Posted: 28 Mar 2011, 23:05:27 UTC

ok, so of the 5 "de_separation_13_3s_free_1" tasks i had left, 3 were valid and 2 are still pending. of the 6 "de_separation_10_3s_free_1" tasks i had left, 5 were valid and 1 is still pending.
ID: 46797 · Rating: 0

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 46799 - Posted: 28 Mar 2011, 23:41:17 UTC

Just swapped here with my ATI GPUs, as DNETC is having WU validation issues on all crunching. I've lost over 500K in credit because of it.

My Milkyway WUs are reporting and waiting in pending. But, one WU on each of my quads has validated, so I hope the rest follow suit.
Go away, I was asleep


ID: 46799 · Rating: 0

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 46802 - Posted: 28 Mar 2011, 23:59:11 UTC - in response to Message 46799.  

The suspect WUs were only one type, "_10_3s_free_1", and they have been taken out of the system now. All other WUs were validating as normal prior to the saga, and all appear ok now. Should be fine, so you haven't jumped out of the frying pan into the fire :)

Regards
Zy
ID: 46802 · Rating: 0

Blurf
Volunteer moderator
Project administrator
Joined: 13 Mar 08
Posts: 804
Credit: 26,380,161
RAC: 0
Message 46804 - Posted: 29 Mar 2011, 0:36:18 UTC
Last modified: 29 Mar 2011, 0:36:51 UTC

Sunny - The de_separation_10_3s_free_1 WUs are failing to validate because the run is over. As Matt stated, it'll take a little bit of time for the whole run to work its way through.

ID: 46804 · Rating: 0

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 46805 - Posted: 29 Mar 2011, 2:03:20 UTC

yes i know. that's why it is of interest that 4 of my 5 remaining de_separation_10_3s_free_1's still validated and gave me credit for the work after completing, uploading, and reporting to the server. and this is after the de_separation_10_3s_free_1 run was disabled. nevertheless, it's of little consequence. ever since Matthew ended the run of bad tasks, everything else i've crunched has validated and earned credit.
ID: 46805 · Rating: 0

Viking69
Joined: 12 Sep 07
Posts: 17
Credit: 6,447,519
RAC: 4,100
Message 46806 - Posted: 29 Mar 2011, 2:27:22 UTC - in response to Message 46792.  
Last modified: 29 Mar 2011, 2:29:19 UTC

>>>>>>Same here, just restarted and the 13_3s_free's appear fine.

10_3s_free's now very rare, so they appear to have all but worked their way through the system now.

Regards
Zy
<<<<<<<

I disagree...

03/27/2011 10:53:26 PM|Milkyway@home|Started download of de_separation_13_3s_free_1_1278565_1301291624_search_parameters
03/27/2011 10:53:27 PM|Milkyway@home|Finished download of de_separation_13_3s_free_1_1278565_1301291624_search_parameters
03/27/2011 10:53:28 PM|Milkyway@home|Starting de_separation_13_3s_free_1_1278565_1301291624_0
03/27/2011 10:53:28 PM|Milkyway@home|[error] Process creation failed:
03/27/2011 10:53:29 PM|Milkyway@home|[error] Process creation failed:
03/27/2011 10:53:30 PM|Milkyway@home|[error] Process creation failed:
03/27/2011 10:53:30 PM|Milkyway@home|[error] Process creation failed:
03/27/2011 10:53:30 PM|Milkyway@home|[error] Process creation failed:
03/27/2011 10:53:30 PM|Milkyway@home|Computation for task de_separation_13_3s_free_1_1278565_1301291624_0 finished
03/27/2011 10:54:31 PM|Milkyway@home|Sending scheduler request: To fetch work. Requesting 21601 seconds of work, reporting 1 completed tasks
03/27/2011 10:54:36 PM|Milkyway@home|Scheduler request succeeded: got 0 new tasks
03/27/2011 10:54:36 PM|Milkyway@home|Message from server: No work sent
03/27/2011 10:54:36 PM|Milkyway@home|Message from server: (reached daily quota of 97 tasks)

on my host http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=38865
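
A quick way to see how many "[error]" lines belong to each task in a log like the one above is to tally them. The Python sketch below is only an illustration: it assumes the `timestamp|project|message` layout shown in this post, attributes each "[error]" line to the most recent "Starting ..." line, and uses a made-up helper name, `errors_per_task`.

```python
from collections import Counter


def errors_per_task(lines):
    """Count '[error]' messages logged between a task's start and its finish."""
    counts = Counter()
    current = None  # task named by the most recent "Starting ..." message
    for line in lines:
        parts = line.rstrip("\n").split("|", 2)
        if len(parts) != 3:
            continue  # not in the timestamp|project|message layout
        _stamp, _project, message = parts
        if message.startswith("Starting "):
            current = message[len("Starting "):].strip()
        elif message.startswith("[error]") and current is not None:
            counts[current] += 1
        elif message.startswith("Computation for task "):
            current = None  # task finished; later errors are not attributed to it
    return counts


if __name__ == "__main__":
    import sys

    # Usage: pipe or paste copied message-log lines on standard input.
    for task, total in errors_per_task(sys.stdin).items():
        print(f"{task}: {total} error line(s)")
```

Errors that show up within a second or two of "Starting", as they do here, happen before any crunching, so they are a different kind of failure from a validate error reported after a full run.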
Hey, I'm trying to do work here !
ID: 46806 · Rating: 0

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46812 - Posted: 29 Mar 2011, 5:25:51 UTC

Here is one that had a validate error, and it was not a member of the de_separation_10_3s_free_1 run: de_separation_13_3s_free_1_65998_1301362073 also known as work unit 261694197.
ID: 46812 · Rating: 0

TJ
Joined: 12 Aug 09
Posts: 262
Credit: 92,631,041
RAC: 0
Message 46813 - Posted: 29 Mar 2011, 9:00:13 UTC

Well, I got new 10_3s units to crunch, and once they finished some were already validated.
Greetings from,
TJ
ID: 46813 · Rating: 0

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46814 - Posted: 29 Mar 2011, 10:47:03 UTC - in response to Message 46806.  

Seeing that you have plenty of compute errors, I think that it might be your computer at fault. I think you are looking in the wrong area.
ID: 46814 · Rating: 0

©2024 Astroinformatics Group