Problem Clients
Author | Message |
---|---|
Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
There are many CPU clients spewing out invalid results at a rate of ~1/second. This is causing a lot of WUs to be flagged as invalid because the number of retries exceeds what's allowed. The worst offenders seem to be: "Error while computing 0.00 0.00 --- MilkyWay@Home v0.50" and "Error while computing 0.00 0.00 --- MilkyWay@Home v0.50 (sse2)". Because finished WUs are cleared from the database so quickly, this may not be obvious. Can't the server be set NOT to send WUs to clients that are throwing massive numbers of errors? |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Interesting .... I thought the BOINC Server had an automatic choke that tests the rate of invalids and restricts the supply of new WUs for a set time frame until good valids return from the errant host, to prevent run-aways. Maybe that parameter has been knocked a bit during the recent hassles? Regards Zy |
Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
"Interesting .... I thought the BOINC Server had an automatic choke, testing the rate of invalids, and restricting supply of new ones for a set time frame until good valids return from the errant host, to prevent run-aways." (Zy) It does, but maybe it's not set, or not set aggressively enough, at the moment. |
Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
Kashi also posted about this problem here: Still getting "Completed, can't validate" invalids. This is due to wingmen exceeding the maximum number of errors of 3. http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2353&nowrap=true#47949 |
Joined: 16 Aug 10 Posts: 15 Credit: 32,160,978 RAC: 0 |
I have downloaded and installed the latest drivers from the AMD web site and still get errors on all GPU calculations. I have disabled GPU until I can find a resolution to this issue. Any ideas besides the latest drivers? Cards: ATI 4870, ATI 3200 |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
"I have downloaded and installed the latest drivers from the AMD web site and still get errors on all GPU calculations. I have disabled GPU until I can find a resolution to this issue. Any ideas besides the latest drivers?" Can you unhide your computers so we can have a look at their details and the WU errors? |
Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
More machines still putting out massive numbers of errors: http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=169654 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=245341 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=160357 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=89895 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=234579 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=117575 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=110171 Just a few of the many... |
Joined: 16 Aug 10 Posts: 15 Credit: 32,160,978 RAC: 0 |
Not sure if I did what you requested, but see if you can see what you needed now. |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
"Not sure if I did what you requested, but see if you can see what you needed now." It's about unhiding the computers so that the details of the WUs can be seen. Go to the bottom of this page and click on Account. In Account, halfway down, you will see an option "Preferences for this project"; click on the blue words "MilkyWay@home preferences". Inside preferences, go to the sixth option line from the top, "Should MilkyWay@home show your computers on its web site?", and change that to "yes". You do that by clicking the blue words "Edit MilkyWay@home preferences" and following your nose. Save the changes afterwards, else any change made is lost. Once you've done that, update your BAM client, and we can then see the details, which may point at what is happening. Regards Zy |
Joined: 16 Aug 10 Posts: 15 Credit: 32,160,978 RAC: 0 |
I had already done everything but the BAM client step with my last post. I just completely shut down BOINC and restarted it, which I assume will update the BAM client. |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
All failed WUs are showing the same error: "Device does not support double precision". MW needs double-precision capable cards; it will not work on single-precision cards. What is the card type number of the AMD GPU card(s) you are using? Did you load the APP driver set as well as the main drivers when you loaded 11.3? Regards Zy |
Joined: 16 Aug 10 Posts: 15 Credit: 32,160,978 RAC: 0 |
I have an ATI 4875 (Gigabyte branded) as my primary display, and yes, the APP driver set was loaded. I am wondering if the ATI 3200 built into the 780G chipset is the problem and, short of disabling it completely (I use it to drive a secondary monitor), whether there is a means to tell MilkyWay not to use it. That would also explain why one GPU process seems to run OK while the other GPU work units quickly error out: the 4875 is processing correctly, and the 3200 errors out over and over, quickly consuming all the WUs. The $64 question is why, up until the recent server issues, both GPUs ran fine; in fact they ran at similar speeds. |
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
"I have a ATI4875 (Gigabyte branded) as my primary display and yes the APP Driver set was loaded. I am wondering if the ATI 3200 built into the 780G chipset is the problem, and if short of disabling it completely (I use it to drive a secondary monitor) is there a means to tell milkyway not to use it. [...]" You can disable the 3200 in BOINC to prevent it from trashing more WUs. Shut down BOINC, then create a file called cc_config.xml (using Notepad) with the following content: <cc_config> <options> <ignore_ati_dev>1</ignore_ati_dev> </options> </cc_config> Copy that file to "C:\Documents and Settings\All Users\Application Data\BOINC" and start BOINC again. |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
"..... The $64 question is that up until the recent server issues both GPUs ran fine, in fact they ran at similar speeds." The new application will not work on 3XXX GPUs; the minimum level is 4XXX. That's why it started falling over from the start of the changes on the server. Crunch3r's post disables the 3200 for BOINC only (it will still work fine as a display adapter), so you can work as usual, except that the 3200 can no longer crunch MW. Regards Zy |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
That's not true. See my hosts. I have a 3850 working... |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Oops, my error, apologies: 38XX will work ..... Keep your eye on development though, as the writing is on the wall for 38XX at MW in the not too distant future. See Matt's post on the future development trend. Regards Zy |
Joined: 20 Sep 08 Posts: 1391 Credit: 203,563,566 RAC: 0 |
"Keep your eye on development though as the writing is on the wall for 38XX at MW in not too distant future." I have found that it is the number of shaders that gives ATI cards their power. 3850 cards have 320 shaders; 4770 cards have 640 shaders. 4770 cards are available on eBay for very little more than 3850 ones. |
Joined: 16 Aug 10 Posts: 15 Credit: 32,160,978 RAC: 0 |
Thank you, the de_seperation WUs are running again without erroring out. |
Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
"More machines still putting out massive numbers of errors:" And a few more of the many more trashing a WU every 1-2 seconds: http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=244669 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=272109 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=194972 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=271150 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=160056 http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=248855 |
Joined: 25 May 09 Posts: 6 Credit: 23,439,967 RAC: 4,848 |
Hi. Apologies for this silly question: how? I have a 3850 AGP (with ATI hotfix 11.3, BOINC manager 6.12.22 and XP SP2), but WUs with MW application 0.62 always finish with a Compute Error. That GPU had crunched a lot of WUs (for me... :-)) with the Optimized Application by Gipsel up to version 0.23. Can you help me? Thanks a lot, regards. Marco. |
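
For reference, here is the cc_config.xml from Crunch3r's post earlier in the thread, laid out one element per line. The device number 1 is simply the value given in that post; which number BOINC assigns to the 3200 on any particular machine is an assumption here, so it is worth checking the GPU list in BOINC's startup messages before relying on it.

```xml
<cc_config>
  <options>
    <!-- Ask BOINC to ignore the ATI GPU with this device number.
         The value 1 is taken from the post above; it may differ on other machines. -->
    <ignore_ati_dev>1</ignore_ati_dev>
  </options>
</cc_config>
```

As described in that post, the file goes into the BOINC data directory ("C:\Documents and Settings\All Users\Application Data\BOINC" in the example given) while BOINC is shut down, and takes effect when BOINC is started again.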
©2024 Astroinformatics Group