Welcome to MilkyWay@home

Validate errors

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46750 - Posted: 27 Mar 2011, 11:36:41 UTC

I am getting plenty of validate errors lately. Could someone look into the server to see what is going on? Example work units that are getting validate errors for their results include work units 260747742, 260762949, 260774277, and 260745047.

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46751 - Posted: 27 Mar 2011, 11:45:23 UTC
Last modified: 27 Mar 2011, 11:45:32 UTC

Another work unit that resulted in validate errors is work unit 260747648.

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46752 - Posted: 27 Mar 2011, 12:02:53 UTC

New validate error: work unit 260786638

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46753 - Posted: 27 Mar 2011, 12:05:25 UTC
Last modified: 27 Mar 2011, 12:05:43 UTC

And another one bites the dust due to validate errors: work unit 260784171

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46754 - Posted: 27 Mar 2011, 12:10:11 UTC

One more that looks like it will die due to a validate error: work unit 260791824

I will not report more for a while. BOINC 6.10.60 was just released, so I am draining my work unit queue in preparation for the upgrade, and my MilkyWay@home queue is already empty.

krahulik
Joined: 7 Nov 08
Posts: 14
Credit: 180,768,799
RAC: 0
Message 46756 - Posted: 27 Mar 2011, 12:40:42 UTC

I also have new validate errors (my error rate is between 10% and 15%).

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 46757 - Posted: 27 Mar 2011, 12:48:33 UTC

Yes, I think they are all of the de_separation_10_3s_free_1 type. They may also have max # of error/total/success tasks of 1, 9, 6. This has been happening for some time now; I aborted many before switching to another project.

cncguru
Joined: 11 Jun 10
Posts: 329
Credit: 1,166,222,661
RAC: 0
Message 46758 - Posted: 27 Mar 2011, 12:48:34 UTC

OK, I am getting loads of validate errors on all 5 of mine!
On one, maybe it's too much overclocking, but on 5?
We have a server-side problem, methinks!

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 46761 - Posted: 27 Mar 2011, 13:23:34 UTC
Last modified: 27 Mar 2011, 13:23:49 UTC

Yep, de_separation_10_3s_free_1 are causing trouble.
Checked a couple of them and had like half of them not validating :(

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 46762 - Posted: 27 Mar 2011, 13:58:41 UTC
Last modified: 27 Mar 2011, 13:59:43 UTC

Been away from the project for a while, restarted today - reckon I hexed it :)

Same here - it's the _10_3s_free WUs. Happening on both machines, on 5850s as well as 5970s. Validate errors mostly - a couple of other error types too, but that small number on "non _free" WUs was me settling in. Runs to about 40% dead WUs overall.

The majority of those _10_3s_free types are falling over. Going to switch projects for a short while until this one settles back again.

Regards
Zy
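
The pattern described above (failures concentrated in one family of work units) can be checked by tallying outcomes per family. The following is only a minimal Python sketch, assuming task names and validation outcomes have been copied by hand from a results page. The function name, the `_free` substring test, and the sample data are illustrative assumptions, not anything taken from the project.

```python
from collections import Counter

def failure_rate_by_family(results, marker="_free"):
    """results: iterable of (task_name, validated) pairs.

    Groups tasks into two families by whether the name contains `marker`
    and returns {family: fraction of tasks that failed validation}.
    """
    total = Counter()
    failed = Counter()
    for name, validated in results:
        family = "free" if marker in name else "other"
        total[family] += 1
        if not validated:
            failed[family] += 1
    return {family: failed[family] / total[family] for family in total}

# Hypothetical sample data, made up for illustration only.
sample = [
    ("de_separation_10_3s_free_1_example_a", False),
    ("de_separation_10_3s_free_1_example_b", False),
    ("de_separation_10_3s_free_1_example_c", True),
    ("some_other_type_example_a", True),
]
rates = failure_rate_by_family(sample)
print(rates)
```

If the "free" rate is far above the "other" rate on more than one host, that points at the work units rather than at a single machine.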

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 46763 - Posted: 27 Mar 2011, 14:06:50 UTC
Last modified: 27 Mar 2011, 14:14:36 UTC

Trying one more thing: set NNT (no new tasks) on both machines and aborted all the _free WUs, to see if the ones left go through error free. I suspect so, but giving it a whirl should nail it one way or the other.

EDIT
It's definitely the _10_3s_free WUs. I aborted those in the queue on both machines and the validate errors stopped; all other types went through no bother. Something is amiss with the _free WUs, and they need stopping at the server.

Regards
Zy
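
The workaround above (set no new tasks, then abort the queued _free WUs) can also be scripted. This Python sketch only builds the command lines and prints them; it does not run anything. It assumes a BOINC client whose `boinccmd` accepts `--project URL nomorework` and `--task URL task_name abort` (older clients may spell the second one `--result`). The project URL and task names are placeholders, and `plan_abort_commands` is a hypothetical helper.

```python
def plan_abort_commands(project_url, task_names, marker="_free"):
    """Return boinccmd argument lists: first stop new work for the project,
    then abort every queued task whose name contains `marker`."""
    commands = [["boinccmd", "--project", project_url, "nomorework"]]
    for name in task_names:
        if marker in name:
            commands.append(["boinccmd", "--task", project_url, name, "abort"])
    return commands

# Placeholder URL and made-up task names, for illustration only.
queued = [
    "de_separation_10_3s_free_1_example_a",
    "some_other_type_example_a",
    "de_separation_10_3s_free_1_example_b",
]
cmds = plan_abort_commands("http://example.invalid/project/", queued)
for cmd in cmds:
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review exactly which tasks would be aborted before running any of them by hand.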

vandiesel
Joined: 10 May 10
Posts: 27
Credit: 43,104,187
RAC: 0
Message 46764 - Posted: 27 Mar 2011, 14:16:51 UTC
Last modified: 27 Mar 2011, 14:18:04 UTC

Same here on both machines: 2x 4870 and 1x 6950.

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 46765 - Posted: 27 Mar 2011, 14:27:27 UTC
Last modified: 27 Mar 2011, 14:32:15 UTC

Watch out with these. I was originally worried about temperatures, so I monitored them: the non-_free WUs run at normal temperatures, but the _free WUs are all crunching at around 10 degrees above normal on the card VRMs.

That will put the card into very hot territory without the user knowing, unless the VRM temperature is being monitored. The card will show a slightly raised GPU temperature that still appears OK, but the VRM temperatures are a different matter.

The _free WUs are heating up the VRMs alarmingly - if you are still running them, watch your VRMs like a hawk.

It's time to freeze all WUs; the _free WUs are too dangerous for the VRMs.

Regards
Zy

Mark Doom
Joined: 27 Mar 11
Posts: 1
Credit: 309,533
RAC: 0
Message 46766 - Posted: 27 Mar 2011, 19:46:04 UTC

I've been having all the same issues today.

I have just been keeping an eye on it, though, and aborting any WU that comes up with "free_1" in its name.

Seems everything else is working just fine.

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46775 - Posted: 28 Mar 2011, 2:06:52 UTC

If you have a Radeon HD 6xxx series card, you should not have to abort results because of the risk of overheating the VRMs. AMD implemented PowerTune in these cards: if the code being sent to the GPU would force it over its power usage limit, PowerTune automatically underclocks the GPU in steps until it reaches a level that keeps it under the limit, and then restores the clock once the power-hungry code goes away.
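
The step-down behaviour described above can be pictured with a toy model. This is not AMD's actual algorithm: it assumes, purely for illustration, that power draw is proportional to clock speed, and the clock levels, per-MHz power figures, and power cap are made-up numbers. `throttle_clock` is a hypothetical name.

```python
def throttle_clock(levels_mhz, watts_per_mhz, power_cap_w):
    """Walk the clock levels from highest to lowest and return the first one
    whose estimated power (clock * watts_per_mhz) fits under the cap.
    Falls back to the lowest level if none fits."""
    for clock in sorted(levels_mhz, reverse=True):
        if clock * watts_per_mhz <= power_cap_w:
            return clock
    return min(levels_mhz)

levels = [880, 800, 700, 600]   # made-up clock steps in MHz
cap = 190.0                     # made-up power cap in watts

light = throttle_clock(levels, 0.20, cap)  # ordinary workload
heavy = throttle_clock(levels, 0.25, cap)  # power-hungry workload
print(light, heavy)
```

In this model the ordinary workload keeps the full clock, while the power-hungry one is stepped down until it fits under the cap. Calling the function again with the lighter load returns the full clock, which corresponds to the clock being restored.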

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46778 - Posted: 28 Mar 2011, 5:23:54 UTC

Two more validate errors: work units 261192073 and 261168214

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46779 - Posted: 28 Mar 2011, 6:49:02 UTC

One more: work unit 261208862

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46780 - Posted: 28 Mar 2011, 8:00:57 UTC

Two more: 261218718 and 261221074

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 46781 - Posted: 28 Mar 2011, 9:59:37 UTC

Yep, all free_ wu's have a validate error. Off to DNETC until this gets resolved...

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 46782 - Posted: 28 Mar 2011, 13:12:59 UTC - in response to Message 46781.  
Last modified: 28 Mar 2011, 13:13:15 UTC

Yep, all free_ wu's have a validate error. Off to DNETC until this gets resolved...

So it's not just the "_10_3s_free" WUs? It's all "_free" WUs? The reason I ask is that my queue of MW@H tasks is all de_separation tasks at the moment, most of which are "_10_3s_free" WUs, and the few remaining are "_13_3s_free" WUs. I'm currently at work, and the host in question is at home. The project is also currently suspended, so I can't do any experimenting or troubleshooting at the moment.

©2024 Astroinformatics Group