Validate errors
Author | Message |
---|---|
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
Another work unit that resulted in validate errors is work unit 260747648. |
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
New validate error: work unit 260786638 |
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
And another one bites the dust due to validate errors: work unit 260784171 |
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
One more that looks like it will die due to a validate error: 260791824. I will not report more for a while, because BOINC 6.10.60 just got released and I am draining my work unit queue in preparation for the upgrade; my MilkyWay@home queue is already drained. |
Joined: 7 Nov 08 Posts: 14 Credit: 180,768,799 RAC: 0 |
I also have new validate errors (error rate is between 10 and 15%). |
Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0 |
Yes, I think they are all of the de_separation_10_3s_free_1 type. They may also have max # of error/total/success tasks of 1, 9, 6. This has been happening for some time now; I aborted many before switching to another project. |
Joined: 11 Jun 10 Posts: 329 Credit: 1,166,222,661 RAC: 0 |
OK, I am getting loads of validate errors on all 5 of mine! One, maybe too much overclocking, but 5? We have a server-side problem, methinks! |
Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
Yep, de_separation_10_3s_free_1 are causing trouble. Checked a couple of them and had like half of them not validating :( |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Been away from the Project for a while, restarted today - reckon I hexed it :) Same here - it's the _10_3s_free WUs. Happening on both machines, on 5850s as well as 5970s. Validate errors mostly - a couple of other types, but that small number of "non _free" ones was me settling in. Runs to about 40% dead WUs overall, with the majority of the _10_3s_free types falling over. Going to switch projects for a short while until this one settles back again. Regards Zy |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Trying one more thing .... set NNT (no new tasks) on both machines and aborted all _free's, to see if the ones left go through error free - I suspect so, but giving it a whirl should nail it one way or the other. EDIT: It's definitely the _10_3s_free's. I aborted those in the queue on both machines, and the validate errors stopped; all other types went through no bother. Something is amiss with the _free's, and they need stopping at the server. Regards Zy |
Joined: 10 May 10 Posts: 27 Credit: 43,104,187 RAC: 0 |
Same here, both machines (2x4870, 1x6950). |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Watch out with these ...... I was originally worried about temperatures. I monitored the non _free's and they are normal; the _free's are all crunching at around 10 degrees above normal on the card VRMs. That will put the card into very hot territory without the user knowing, unless the card VRM is being monitored. The card will show a slightly raised GPU temperature but still appear OK; the VRM temperatures are a different matter. The _free's are heating up the VRMs alarmingly - if you are still running the WUs, watch your VRMs like a hawk. It's time to freeze all WUs; it's too dangerous for the VRMs on the _free's. Regards Zy |
Joined: 27 Mar 11 Posts: 1 Credit: 309,533 RAC: 0 |
I've been having all the same issues today. I have been just keeping an eye on it though and aborting any WU that comes up as the "free_1" name. Seems everything else is working just fine. |
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
If you have a Radeon HD 6xxx series card, you should not have to abort results because of the risk of overheating the VRM. AMD implemented PowerTune in these cards: if the code being sent to the GPU would force it over its power usage limit, PowerTune automatically underclocks the GPU, level by level, until it runs under the limit, and then restores the clock once the power-hungry code goes away. |
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
One more: work unit 261208862 |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
Yep, all free_ wu's have a validate error. Off to DNETC until this gets resolved... |
Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
"Yep, all free_ wu's have a validate error. Off to DNETC until this gets resolved..." so it's not just the "_10_3s_free" WUs? it's all "_free" WUs? the reason i ask is b/c my queue of MW@H tasks is all de_separation tasks at the moment, most of which are "_10_3s_free" WUs, and the few remaining are "_13_3s_free" WUs. i'm currently at work, and my host in question is at home. the project is also currently suspended, so i can't do any experimenting or troubleshooting at the moment. |
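
For anyone sorting a queue by hand as described above, the "abort the _free's, keep the rest" rule could be sketched roughly like this. It is only an illustration: the helper names (`is_free_task`, `split_queue`) and the sample queue are made up, the numeric suffixes after `_free_1` are an assumption about how task names end, and nothing here talks to BOINC. It just splits a list of task names.

```python
import re

# Assumption: a "_free" task name contains the token "_free", optionally
# followed only by numeric suffixes (e.g. "de_separation_10_3s_free_1").
FREE_TASK = re.compile(r"_free(_\d+)*$")


def is_free_task(name: str) -> bool:
    """Return True if the task name looks like one of the "_free" WUs."""
    return FREE_TASK.search(name) is not None


def split_queue(names):
    """Split a list of task names into (abort, keep) lists."""
    abort = [n for n in names if is_free_task(n)]
    keep = [n for n in names if not is_free_task(n)]
    return abort, keep


if __name__ == "__main__":
    # Hypothetical queue; the trailing numbers and the "_fix_" name are invented.
    queue = [
        "de_separation_10_3s_free_1_12345_0",
        "de_separation_13_3s_free_1_67890_1",
        "de_separation_13_3s_fix_1_11111_0",
    ]
    abort, keep = split_queue(queue)
    print("abort:", abort)
    print("keep:", keep)
```

Matching on the `_free` token rather than on `_10_3s_free` catches the `_13_3s_free` WUs as well; if only the `_10_3s_free` batch turns out to be bad, the pattern would need narrowing.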
©2024 Astroinformatics Group