Welcome to MilkyWay@home

VPU recovery & invalid WU results

Message boards : Number crunching : VPU recovery & invalid WU results

John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 40372 - Posted: 13 Jun 2010, 16:14:17 UTC

Hi Guys

A bit of advice please on 2 questions -

First some background on changes

Yesterday I decided to move the crunching on my HD3850 and HD4850 (different machines, 1 GPU each) from Collatz to Milkyway. As I was running the ATI Catalyst 8.12 drivers, this meant I needed to uninstall V8.12, and I chose to install 10.3 on both PCs.

Both PCs run WinXP pro 32bit as the OS.

Questions

1. Since the changeover to the ATI 10.3 drivers the HD3850 runs fine, but the HD4850 is showing VPU recoveries (3 in the last 16 hours). The card is running at 82°C with clocks of 650 MHz GPU core and 1050 MHz memory (bottom end of the CCC settings). The GPU load is 92% to 96% and the fan speed is 1648 rpm (97% fan speed).

What might lead to these intermittent, but fairly frequent, VPU errors, and what can I do about it (if anything)?

2. This is probably linked to the VPU recoveries, but about a third of the completed WUs are being marked as invalid. Any thoughts as to why?
Go away, I was asleep


The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 40373 - Posted: 13 Jun 2010, 19:00:41 UTC

Did you uninstall V8.12, reboot, then install 10.3? 10.5 is now the latest driver version. I see the invalid WUs appear to have gone away. Did you do anything? I always try to reduce the temperature, as I think anything above 80°C is too hot.
John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 40374 - Posted: 13 Jun 2010, 19:18:46 UTC - in response to Message 40373.  

The Gas Giant wrote: "Did you uninstall V8.12, reboot then install 10.3? 10.5 is now the latest driver version. I see the invalid WU's appear to have gone away - did you do anything? I always try and reduce the temperature as I think anything above 80°C is too hot."



After looking at other people's posts, I set a restore point first, then uninstalled 8.12, shut down and rebooted, installed 10.3 from a local drive, and shut down and rebooted again.

I see the GPU temperature is down to 76C, and the last 50 completed WUs were OK.

I wonder if the heat is causing VPU recovery and invalid results?

I will see in the morning when the day gets warmer.

Thanks TGG.

When the pocket book allows (years from now) I will pick your brain on multiple GPUs and how to get them working in harmony.
Go away, I was asleep


Werkstatt

Joined: 19 Feb 08
Posts: 350
Credit: 141,284,369
RAC: 0
Message 40375 - Posted: 13 Jun 2010, 19:30:51 UTC

Hi John,

A couple of months ago I had the same problems. I downgraded the driver to 10.2, and both my PCs have worked fine since then. And yes, I uninstalled the previous drivers first.
John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 40376 - Posted: 13 Jun 2010, 19:35:34 UTC

ATM the GPU is running fine without any invalid results. So, I will stay with V10.3, but this leaves the option to uninstall 10.3 and then move back to 10.2.

At least this works, and is comfortably above the 9.3+ demanded by the servers on new work requests.

Thanks for the pointer Werkstatt.
Go away, I was asleep


James Nunley

Joined: 29 Nov 07
Posts: 39
Credit: 74,300,629
RAC: 0
Message 40443 - Posted: 16 Jun 2010, 18:59:45 UTC - in response to Message 40376.  
Last modified: 16 Jun 2010, 19:03:12 UTC

Are you overclocking with CCC?

My 4850 crashes and returns invalid results if I overclock it to 650 MHz using CCC. Incidentally, 650 MHz is the maximum CCC will allow.


Overclocking with ATI Tray Tools v1.6.9.1486, I can get a perfectly stable 725 MHz overclock, with no crashes and all results valid.


GPU temp is in the 70 to 75 degree range. This is a PowerColor 4850, which comes with the Arctic Cooling Accelero L2 Pro attached. It is by no means a huge beast of a cooler like the Accelero S2, but it is definitely better than most stock coolers.

ATI Tray Tools also allows reducing the voltage to the GPU, which can only help if you are having heat issues. Stock voltage on the 4850 is 1.120 V; mine runs at 725 MHz downvolted to 1.082 V and is perfectly stable.

Give it a go and see what you think.

James

/edit Oh yes, and fan speed control also works on my card with it, so you can set the fan speed to whatever you want. /edit
John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 40447 - Posted: 16 Jun 2010, 21:02:24 UTC
Last modified: 16 Jun 2010, 21:04:27 UTC

My version of CCC shows the HD4850 GPU at 650 MHz and the memory clock at 1050 MHz. These are the minimums for the card, and CCC can take the GPU to 750 MHz and the memory to 1150 MHz.

Running at these settings the card is giving valid results and no VPU recovery issues.

I think, as the card is set, overclocking (not me ATM) would result in too much heat.

Where can I find ATI Tray Tools?
Go away, I was asleep



