Welcome to MilkyWay@home

All(?) results are Validate Errors

curiously_indifferent

Joined: 15 Dec 09
Posts: 8
Credit: 171,824,676
RAC: 7,037
Message 47856 - Posted: 14 Apr 2011, 21:04:00 UTC

I need some guidance on what I need to do to get my GPU to actually be useful to MW again.

My setup:

Vista Home Premium x64 Edition, Service Pack 2
CAL ATI Radeon HD 4700/4800 (RV740/RV770) (512MB) driver: 1.4.1332
Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz

Until the change this past weekend, this setup was solid for well over a year. I upgraded the driver on the GPU (MW only uses the GPU) in the hopes that it would resolve all of the Validate Errors. It has not. Below is a sample error:

Name de_separation_13_3s_free_1_558584_1302647729_0
Workunit 1505084
Created 12 Apr 2011 | 22:35:31 UTC
Sent 12 Apr 2011 | 22:36:54 UTC
Received 14 Apr 2011 | 20:44:58 UTC
Server state Over
Outcome Validate error
Client state Done
Exit status 0 (0x0)
Computer ID 129540
Report deadline 20 Apr 2011 | 22:36:54 UTC
Run time 1,558.91
CPU time 1,586.64
Validate state Invalid
Credit 0.00
Application version MilkyWay@Home
Anonymous platform (ATI GPU)
Stderr output

<core_client_version>6.10.18</core_client_version>
<![CDATA[
<stderr_txt>
Running Milkyway@home ATI GPU application version 0.20b (Win64, CAL 1.4) by Gipsel
ignoring unknown input argument in app_info.xml: -np
ignoring unknown input argument in app_info.xml: 14
ignoring unknown input argument in app_info.xml: -p
ignoring unknown input argument in app_info.xml: 0.4872334339940767000000000
ignoring unknown input argument in app_info.xml: 1.8807146890464890000000000
ignoring unknown input argument in app_info.xml: 0.4343906314590382000000000
ignoring unknown input argument in app_info.xml: 220.3354939711574300000000000
ignoring unknown input argument in app_info.xml: 45.5980476520267150000000000
ignoring unknown input argument in app_info.xml: 5.1723717971970320000000000
ignoring unknown input argument in app_info.xml: -5.5680541993174480000000000
ignoring unknown input argument in app_info.xml: 19.9999499598045320000000000
ignoring unknown input argument in app_info.xml: 0.1605799132292079500000000
ignoring unknown input argument in app_info.xml: 193.7431431834900800000000000
ignoring unknown input argument in app_info.xml: 10.3163052637960280000000000
ignoring unknown input argument in app_info.xml: -4.1338652689211735000000000
ignoring unknown input argument in app_info.xml: 2.7940179261407447000000000
ignoring unknown input argument in app_info.xml: 6.6926061846224060000000000
scaling the wait times with 10
instructed by BOINC client to use device 0
APP: error reading search parameters file (for read): data_file == NULL
CPU: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz (8 cores/threads) 2.65998 GHz (470ms)

CAL Runtime: 1.4.1332
Found 1 CAL device

Device 0: ATI Radeon HD4700/4800 (RV740/RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 993 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Starting WU on GPU 0

main integral, 640 iterations
predicted runtime per iteration is 247 ms (33.3333 ms are allowed), dividing each iteration in 8 parts
borders of the domains at 0 200 400 600 800 1000 1200 1400 1600
Calculated about 2.39368e+013 floatingpoint ops on GPU, 2.47165e+008 on FPU. Approximate GPU time 1586.64 seconds.

probability calculation (stars)
Calculated about 3.11921e+009 floatingpoint ops on FPU.

WU completed.
CPU time: 1.32601 seconds, GPU time: 1586.64 seconds, wall clock time: 1587.94 seconds, CPU frequency: 2.66 GHz

</stderr_txt>
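
A note on the "dividing each iteration in 8 parts" line in the log above: the figure is consistent with simply rounding up the predicted time over the allowed time, and 33.3333 ms is 1/30 of a second. The sketch below reproduces the numbers from this log under that assumption. `split_iteration` is a hypothetical helper, not part of the application, and the width of 1600 is just the last border printed in the log.

```python
import math


def split_iteration(predicted_ms: float, allowed_ms: float, width: int):
    """Split one iteration into equal chunks that each fit the allowed time.

    Assumption: the number of parts is the predicted time divided by the
    allowed time, rounded up. The real application may use another rule.
    """
    parts = math.ceil(predicted_ms / allowed_ms)
    # Evenly spaced borders from 0 to width, inclusive.
    borders = [round(i * width / parts) for i in range(parts + 1)]
    return parts, borders


# Values taken from the log: 247 ms predicted, 33.3333 ms (1/30 s) allowed.
parts, borders = split_iteration(247, 1000 / 30, 1600)
print(parts)    # 8
print(borders)  # [0, 200, 400, 600, 800, 1000, 1200, 1400, 1600]
```

Both printed values match what the log reports, but that only shows the assumption fits this one result.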

I assume all of my results are errors of some sort since my RAC has flatlined since the weekend. The reason I write 'assume' is because I am not really sure what is going on - I can't find many results for some reason.

From the error, I figure something is wrong with the app_info.xml file - but I am not sure what I need to modify. Any help would be appreciated.

Chris S
Joined: 20 Sep 08
Posts: 1391
Credit: 203,563,566
RAC: 0
Message 47857 - Posted: 14 Apr 2011, 21:14:18 UTC

Running Milkyway@home ATI GPU application version 0.20b (Win64, CAL 1.4) by Gipsel


All the old applications are deprecated. Try detaching then re-attaching.
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 47858 - Posted: 14 Apr 2011, 21:14:33 UTC - in response to Message 47856.  

From the error, I figure something is wrong with the app_info.xml file - but I am not sure what I need to modify. Any help would be appreciated.

The old ATI applications don't work anymore. They produce a separate output file, but all results are now returned through stderr. The validator no longer looks for results in the old output file, so nothing from the old applications ever validates. You need to update to the current version (0.62).
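
One quick way to tell whether a result came from one of the old applications is to read the version banner in its stderr output. The sketch below assumes the banner format seen in the log in the first post ("application version 0.20b"). The helper names are hypothetical, and the 0.62 application may print a different banner, in which case `app_version` returns `None`.

```python
import re

# Matches e.g. "application version 0.20b"; a trailing letter is ignored.
BANNER = re.compile(r"application version (\d+)\.(\d+)")


def app_version(stderr_text: str):
    """Return (major, minor) from the version banner, or None if absent."""
    match = BANNER.search(stderr_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_outdated(stderr_text: str, minimum=(0, 62)) -> bool:
    """True if a banner is found and its version is below the minimum."""
    version = app_version(stderr_text)
    return version is not None and version < minimum


sample = ("Running Milkyway@home ATI GPU application version 0.20b "
          "(Win64, CAL 1.4) by Gipsel")
print(app_version(sample))  # (0, 20)
print(is_outdated(sample))  # True
```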
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 47859 - Posted: 14 Apr 2011, 21:15:43 UTC - in response to Message 47857.  

Running Milkyway@home ATI GPU application version 0.20b (Win64, CAL 1.4) by Gipsel


All the old applications are deprecated. Try detaching then re-attaching.
Deprecated was really the wrong word to use; they're just gone.
curiously_indifferent

Joined: 15 Dec 09
Posts: 8
Credit: 171,824,676
RAC: 7,037
Message 47862 - Posted: 14 Apr 2011, 21:31:56 UTC - in response to Message 47857.  

Thanks. I detached/reattached and downloaded a new 0.62 file. I now have a new question: How do I throttle the GPU usage? Previously, I dialed the GPU back to about 10%. The 0.62 file seems to ignore the throttling which causes my computer to sound like a turbine and the GPU temperature went to 101C after 30 seconds of running. I needed to suspend the processing.

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 47864 - Posted: 14 Apr 2011, 21:34:32 UTC - in response to Message 47862.  

Thanks. I detached/reattached and downloaded a new 0.62 file. I now have a new question: How do I throttle the GPU usage? Previously, I dialed the GPU back to about 10%. The 0.62 file seems to ignore the throttling which causes my computer to sound like a turbine and the GPU temperature went to 101C after 30 seconds of running. I needed to suspend the processing.

If you check the news post about the new release, there are command line flags you can add in app_info.xml to regulate GPU usage.
The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 47865 - Posted: 14 Apr 2011, 21:36:22 UTC

Download the app from Arkayn's site - see the optimised app sticky thread. It has the new app_info file, with one small error: the app version should be 62 to 57. See the options file that comes with the 7zip file.
curiously_indifferent

Joined: 15 Dec 09
Posts: 8
Credit: 171,824,676
RAC: 7,037
Message 47868 - Posted: 14 Apr 2011, 23:24:25 UTC - in response to Message 47864.  

To be clear, I have read the news posts on the new release and I am unable to throttle the GPU.

If I understand correctly, --gpu-target-frequency <number> should throttle the GPU. I have entered this in the <cmdline>. I do not see any change in the GPU when I put in any number other than 30.

I know I am doing something wrong. Am I putting --gpu-target-frequency in the wrong place? What <number> would throttle the GPU back to say 10%?


Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 47883 - Posted: 15 Apr 2011, 10:38:17 UTC - in response to Message 47868.  
Last modified: 15 Apr 2011, 10:38:36 UTC

..... GPU core clock: 625 MHz, memory clock: 993 MHz ....


One thing you can do immediately, whatever the cause of the problems (and for many reasons it may be just this), is to bring the memory speed down. It's not needed high at MW; it's just a waste of power and heats the card up for no reason.

The next bit (and I have no idea if this is what is causing your problem) can be an issue. If the PSU output is on the edge of its maximum, it would not take much to push it over, and it will drop the supply voltage, or just plain fail to do its job if it's a "budget" PSU. A PSU failing to meet demand causes all sorts of unpredictable "weirdness".

When power draw is on the edge of the PSU's limits, it does not take much to push it over, and even slightly different usage patterns can trigger it. If downclocking the memory (bring it down to circa 200) does not solve the main issue, no worries; it will at least get the card running cooler and save you cash.

Regards
Zy
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 47893 - Posted: 15 Apr 2011, 14:10:12 UTC - in response to Message 47868.  

To be clear, I have read the news posts on the new release and I am unable to throttle the GPU.

If I understand correctly, --gpu-target-frequency <number> should throttle the GPU. I have entered this in the <cmdline>. I do not see any change in the GPU when I put in any number other than 30.

I know I am doing something wrong. Am I putting --gpu-target-frequency in the wrong place? What <number> would throttle the GPU back to say 10%?


That would be quite the slowdown. Increasing the target frequency or increasing the polling interval both have the effect of reducing GPU usage. Try setting --gpu-target-frequency to something like 60, and --gpu-polling-mode to something higher (in milliseconds). I'd guess if you want to get it down that much, maybe try setting the --gpu-target-frequency around 300.
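
As a rough illustration of where the flags go, the sketch below writes them into the `<cmdline>` element of every `<app_version>` block in an app_info.xml document. The flag names come from this thread. The helper `set_cmdline`, the app name `milkyway` and the polling value of 100 are placeholders chosen for the example, and the sample XML is a stripped-down stand-in, not the file shipped with the 0.62 application.

```python
import xml.etree.ElementTree as ET


def set_cmdline(app_info_xml: str, target_frequency: int, polling_mode: int) -> str:
    """Return app_info.xml text with throttling flags set in each <cmdline>."""
    root = ET.fromstring(app_info_xml)
    flags = (f"--gpu-target-frequency {target_frequency} "
             f"--gpu-polling-mode {polling_mode}")
    for app_version in root.iter("app_version"):
        cmdline = app_version.find("cmdline")
        if cmdline is None:
            # No <cmdline> yet: create one inside this <app_version>.
            cmdline = ET.SubElement(app_version, "cmdline")
        cmdline.text = flags
    return ET.tostring(root, encoding="unicode")


# Minimal stand-in document; a real app_info.xml has more elements.
example = """<app_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>62</version_num>
  </app_version>
</app_info>"""

result = set_cmdline(example, target_frequency=60, polling_mode=100)
print(result)
```

Back up the real file before editing it. This overwrites any existing `<cmdline>` content, so merge the flags by hand if other options are already set.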
curiously_indifferent

Joined: 15 Dec 09
Posts: 8
Credit: 171,824,676
RAC: 7,037
Message 47923 - Posted: 16 Apr 2011, 16:20:26 UTC - in response to Message 47883.  

One thing you can do immediately, whatever the cause of the problems (and for many reasons it may be just this), is to bring the memory speed down. It's not needed high at MW; it's just a waste of power and heats the card up for no reason.


I brought the memory speed down to 750 MHz (seems to be the lowest it will go) and it has taken the memory temperature down a few C. As you said, it does not have any effect on MW. Thanks Zydor!

That would be quite the slowdown. Increasing the target frequency or increasing the polling interval both have the effect of reducing GPU usage. Try setting --gpu-target-frequency to something like 60, and --gpu-polling-mode to something higher (in milliseconds). I'd guess if you want to get it down that much, maybe try setting the --gpu-target-frequency around 300.


After a lot of testing, I have settled on 300 for the gpu target and 350 for the gpu polling. These inputs definitely throttle the gpu: a file that would normally take just under 5 minutes to complete now takes about 40 minutes.

I may continue to experiment with the inputs, but these are giving me a tolerable noise level. I won't know for a couple of days, but I suspect the current setup has a slightly better performance output than the setup I had previous to last weekend. Thanks Matt!
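
For a rough sense of how far these settings throttle the card, the runtimes above can be turned into a slowdown factor and an approximate GPU duty fraction. This is a sketch that assumes the work per task is unchanged and that the extra time is spent with the GPU idle. "Just under 5 minutes" is rounded to 5, and `throttle_summary` is a hypothetical helper.

```python
def throttle_summary(unthrottled_min: float, throttled_min: float):
    """Return (slowdown factor, approximate fraction of time the GPU is busy)."""
    slowdown = throttled_min / unthrottled_min
    duty = unthrottled_min / throttled_min
    return slowdown, duty


slowdown, duty = throttle_summary(5, 40)
print(f"{slowdown:.1f}x slower, roughly {duty:.1%} GPU duty")
# 8.0x slower, roughly 12.5% GPU duty
```

That lands close to the 10% target mentioned earlier in the thread.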
Conan
Joined: 2 Jan 08
Posts: 122
Credit: 69,480,026
RAC: 1,421
Message 47955 - Posted: 17 Apr 2011, 12:30:06 UTC

After finding that all my work was getting validate errors, I went looking for an answer and found that I have been having this problem for the last 4 days to maybe a week.
I did not realise that the old applications were no longer usable and had kept processing with the 0.23 app, which it now seems has been wasting electricity without my knowing it.

Have now downloaded the new application and adjusted the app_info.xml file for the correct application number, and all is now running fine.

That is quite a few thousand lost credits.

Conan.
Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 47960 - Posted: 17 Apr 2011, 13:37:21 UTC - in response to Message 47923.  
Last modified: 17 Apr 2011, 13:39:30 UTC

...I brought the memory speed down to 750 MHz (seems to be the lowest it will go) and it has taken the memory temperature down a few C. As you said, it does not have any effect on MW ....


It will go lower with little effort given the right software. Try downloading MSI Afterburner, load it up, run it, and, leaving it literally at defaults, look to the lower left. You will see a bar showing the memory setting - yours is currently 750.

MSI Afterburner 2.2.0 Beta 2 (it's a good Beta release so far, I have no concerns)

Grab the bar slider with the mouse and drag it left as low as it will go, but no further than 200 (for now) even if it lets you. Look down slightly, press the Apply button, and it will set the memory speed to what you requested.

At that point, open up Catalyst Control Centre and, with the settings shown (which now include your tweaked memory setting), create a Preset (or Profile, depending on the version running). Give it a meaningful name so that you will remember what it's for, save it, and you're done. Each time you need to reset the driver for whatever reason, you will have the Preset (Profile) lurking to enable you to do it.

Sorry .... a bit wordy of necessity - but it's really easy to do, 30 seconds once you have MSI Afterburner running.

The software is safe, well-respected tweaking software that is widely used. Don't mess with other parts of it until you have had a look at functionality/usage etc.

Regards
Zy


©2024 Astroinformatics Group