Welcome to MilkyWay@home

Validation Problem

Message boards : Number crunching : Validation Problem

Profile Sabroe_SMC
Joined: 2 Aug 08
Posts: 24
Credit: 374,440,641
RAC: 0
Message 51003 - Posted: 11 Sep 2011, 8:11:55 UTC

Hi Folks
Since this weekend I have had many WUs that were not validated. Take a look here:
http://milkyway.cs.rpi.edu/milkyway/results.php?userid=6702&offset=240&show_names=0&state=2&appid=
I saw that there may be problems with the anonymous platform (app_info), so I removed MW and added it again. But even with the stock app there are many WUs with no validation.
Anyone with a hint?
Greetz to all
ID: 51003 · Rating: 0
Profile kashi

Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 51004 - Posted: 11 Sep 2011, 8:49:16 UTC

You could perhaps try a more recent Catalyst driver on the computer that is having trouble. As I remember, version 11.1 gave problems for some.
ID: 51004 · Rating: 0
Profile Sabroe_SMC
Joined: 2 Aug 08
Posts: 24
Credit: 374,440,641
RAC: 0
Message 51005 - Posted: 11 Sep 2011, 12:11:26 UTC - in response to Message 51004.  

I have used this driver since it was rolled out with no problems. Only since this weekend has it caused trouble. Maybe the new app is not compatible with the old driver?
And it's not only one computer affected. Both of mine with ATI cards have the same trouble.
Greetz to all
ID: 51005 · Rating: 0
Profile kashi

Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 51006 - Posted: 11 Sep 2011, 15:14:26 UTC
Last modified: 11 Sep 2011, 15:18:33 UTC

I only saw one computer ID: 246261 with all tasks processed initially having a status of "Completed, validation inconclusive" which then become invalid after they have been reported by a wingman. The other 2 computers that had the later version of the Catalyst driver looked OK with just the usual number of pendings. Hard to tell though with a quick look as work units clear extremely quickly once they are completed.

OK, I had another look and the other 2 computers have had a few tasks validated. The Cypress 58xx in computer 246261 is still having trouble though as it continues to produce invalid results after you have updated the driver. Is it running hot? Perhaps you could try it at a much lower core speed and reduce the memory speed to 500 MHz or so as well. This may show whether it is a hardware issue.
ID: 51006 · Rating: 0
Profile Sabroe_SMC
Joined: 2 Aug 08
Posts: 24
Credit: 374,440,641
RAC: 0
Message 51010 - Posted: 11 Sep 2011, 16:57:52 UTC - in response to Message 51006.  
Last modified: 11 Sep 2011, 16:58:58 UTC

All GPUs were at stock clocks and none of them is running hot.
Both of my 5870s are updated to driver 11.8 and BM 6.12.x.
I took a look at the results of my third one, with a 6950 GPU, and it has the same problems. Maybe it is a spreading problem?

BTW, last weekend I had no problems with any of the 3 GPUs. I got around 500K per day. Now it's down to 300K.
Greetz to all
ID: 51010 · Rating: 0
Profile kashi

Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 51011 - Posted: 11 Sep 2011, 18:13:52 UTC
Last modified: 11 Sep 2011, 18:14:24 UTC

There is a single task with a computation error currently showing on your HD 6950 but other tasks appear to be validating OK. This is a different type of error to your HD 5870 on computer 246261 which is completing and reporting tasks but giving incorrect, invalid results.

Trying with a lower core clock and memory clock on the 5870 that is producing invalid results was just a suggestion to try and find if it is a hardware problem. If there is something wrong with a video card when processing tasks, sometimes reducing the load by reducing the speeds can enable it to work successfully. If it starts producing valid results at lower core and memory speeds then you know it is the hardware at fault and not software. It is just a way of trying to diagnose the problem, that's all. Just like you tried a newer Catalyst driver to see if that was causing the problem.

If you prefer you could swap the 5870 into the computer with the other 5870 that is working correctly. If it still produces invalid results after you swap it into the other computer then it is likely to be a fault with the 5870 itself and not software related.
ID: 51011 · Rating: 0
Profile Sabroe_SMC
Joined: 2 Aug 08
Posts: 24
Credit: 374,440,641
RAC: 0
Message 51012 - Posted: 11 Sep 2011, 19:52:45 UTC - in response to Message 51011.  
Last modified: 11 Sep 2011, 19:56:04 UTC

Thx Kashi
I will change the power supply of the rig. Maybe it's too weak. I had some problems getting it to work. It's a 400W PSU. An i7 920@stock + 5870 may be too much for it.
I will swap the cards between the rigs later, but I will give it a try. Next weekend I will know more.

BTW, the core clock of the card is 875 MHz, factory overclocked by XFX. I tried it with 850 MHz core / 1200 MHz memory clock.
Greetz to all
ID: 51012 · Rating: 0
Profile kashi

Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 51013 - Posted: 12 Sep 2011, 3:07:59 UTC

Yes, it could be related to the power supply. I would think 400W with an i7 920 and a 5870 is really pushing your luck. AMD system requirements for 5870 = "500 Watt or greater power supply". MilkyWay in its standard configuration puts a lot of strain on ATI/AMD cards. Not only are they at full load, but it is an unusually heavy load. The current draw often exceeds that of the most demanding stress test, and this doesn't just last for a short time but continues all day, every day, while MilkyWay is being processed. Cards that come slightly overclocked from the factory are tuned with gaming in mind. The same small factory overclock that is fine for playing games may be unstable when the card is used for MilkyWay.
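
As a rough sanity check on those numbers, here is a minimal sketch in Python. The 400 W rating and AMD's 500 W recommendation come from this thread; the 130 W CPU TDP, the 188 W GPU board power and the 50 W allowance for the rest of the system are assumed nominal figures, not measurements from this rig, so treat the result as a ballpark only.

```python
# Rough power-budget estimate for a 400 W PSU feeding an i7 920 and an HD 5870.
# Only the PSU ratings come from the thread; the component figures are assumptions.

PSU_RATED_W = 400            # from the thread: the PSU in the affected rig
AMD_RECOMMENDED_PSU_W = 500  # from the thread: AMD's stated minimum for a 5870

# Assumed nominal figures (adjust for your own hardware):
CPU_TDP_W = 130              # assumed rated TDP of an i7 920 at stock
GPU_BOARD_POWER_W = 188      # assumed rated maximum board power of an HD 5870
REST_OF_SYSTEM_W = 50        # guess for motherboard, RAM, drives and fans


def estimated_load_w(cpu_w, gpu_w, rest_w):
    """Sum the nominal component figures into one sustained-load estimate."""
    return cpu_w + gpu_w + rest_w


def load_fraction(load_w, psu_w):
    """Return the share of the PSU's rated output that the load would use."""
    return load_w / psu_w


load = estimated_load_w(CPU_TDP_W, GPU_BOARD_POWER_W, REST_OF_SYSTEM_W)

print(f"Estimated sustained load: {load} W")
print(f"Share of a {PSU_RATED_W} W PSU: {load_fraction(load, PSU_RATED_W):.0%}")
print(f"Share of a {AMD_RECOMMENDED_PSU_W} W PSU: "
      f"{load_fraction(load, AMD_RECOMMENDED_PSU_W):.0%}")
```

With these assumed figures the estimate is 368 W, which is about 92% of a 400 W unit but about 74% of a 500 W one. That leaves very little headroom on the smaller PSU for a load that runs all day, and a factory overclock only makes it tighter.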

When I suggested trying a much lower core speed with 500 MHz memory speed as a test, I meant substantially lower, for example 100-200 MHz or so below default core speed.

The reduction in memory speed is important too as it usually reduces heat/power draw a noticeable amount. Many who process MilkyWay on ATI/AMD cards use a low memory speed all the time. This reduces power consumption and heat and potentially helps the card last longer. I use 500 MHz memory speed and have done so for a long time now.

Although one or two people with heavily overclocked cards have claimed that reduced memory speed reduced their processing speed, the majority have found that it has no effect and does not slow MilkyWay at all. Lower memory speed in MilkyWay = lower electricity bill, less heat and a more stable, durable GPU.
ID: 51013 · Rating: 0
Profile Sabroe_SMC
Joined: 2 Aug 08
Posts: 24
Credit: 374,440,641
RAC: 0
Message 51101 - Posted: 17 Sep 2011, 22:59:03 UTC - in response to Message 51013.  

Problem solved. It was the weak PSU.
The PSU is an FSP400-60EMDN with 18A + 18A on 12 VDC.
Now the 5870 is overclocked to 965/1000 and it runs well.
Greetz to all
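
For anyone wondering what "18A + 18A on 12VDC" means in watts, here is a small Python sketch of the arithmetic. It uses only the rail figures quoted above and the 400 W total rating mentioned earlier in the thread. The combined 12 V limit of this PSU is not given here, so the sketch does not guess at it.

```python
# Convert the quoted 12 V rail limits into watts (P = V * I).
# Rail figures and the 400 W total are from the thread; nothing else is assumed.

RAIL_VOLTAGE_V = 12.0
RAIL_CURRENT_LIMITS_A = [18.0, 18.0]  # "18A + 18A on 12VDC"
PSU_TOTAL_RATED_W = 400.0


def rail_power_w(voltage_v, current_a):
    """Maximum power one rail can deliver at its current limit."""
    return voltage_v * current_a


per_rail_w = [rail_power_w(RAIL_VOLTAGE_V, amps) for amps in RAIL_CURRENT_LIMITS_A]
label_sum_w = sum(per_rail_w)

print(f"Per-rail limit: {per_rail_w} W")
print(f"Sum of rail labels: {label_sum_w} W")
print(f"Exceeds total PSU rating: {label_sum_w > PSU_TOTAL_RATED_W}")
```

Each rail tops out at 216 W, and the two labels add up to 432 W, which is more than the 400 W total rating. So both rails cannot be run at their limit at the same time, and whatever the GPU draws has to fit within one or both of those 216 W rails alongside the CPU.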
ID: 51101 · Rating: 0


©2024 Astroinformatics Group