Message boards :
Number crunching :
Compute Errors
Author | Message |
---|---|
Send message Joined: 2 Jan 08 Posts: 23 Credit: 495,882,464 RAC: 0 |
Looks like the searches are stopped, we'll not do 3 stream runs until the ATI code is fixed :) Thanks Travis. It was a very good decision. Now, Cluster Physik gave us fixed code 3 days ago. Perhaps it's time to restart 3 stream runs. This kind of WU takes longer to calculate than the others, so it could give more work to everyone. Thank you, Thierry. |
Send message Joined: 28 Sep 08 Posts: 1 Credit: 18,371,048 RAC: 71,619 |
I'm getting a lot of SIGSEGV errors on 2s-4 and 2s-6. Is this a related problem? |
Send message Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
"I'm getting a lot of SIGSEGV errors on 2s-4 and 2s-6." No. It's Ubuntu causing it. Get a proper Linux distribution or downgrade to 8.xx. |
Send message Joined: 26 Jan 09 Posts: 589 Credit: 497,834,261 RAC: 0 |
|
Send message Joined: 13 Feb 08 Posts: 1124 Credit: 46,740 RAC: 0 |
"I'm getting a lot of SIGSEGV errors on 2s-4 and 2s-6." Ha, that was the conclusion I came to as well ;-) It runs Einstein okay but practically nothing else. |
Send message Joined: 22 Nov 08 Posts: 136 Credit: 319,414,799 RAC: 0 |
Something caused both of my GPU crunchers to freeze up around 5:30 UTC. I say that time because that was when the last result remaining on my systems was sent out to me. I have no idea what it was. That was 12:30 AM my time; I just got up this morning and found it. I don't see any errors, but insta-purge probably took the units away before I could see them. Might be something worth looking into. <edit> It was only Milkyway that froze. I'm also running Prime Grid on both systems, and it was still running. <edit 2> It just froze up on one machine again. It was running ps_sgr_208_3s_6, ps_sgr_210_3s_5 and ps_sgr_235_2s_6 when it froze. This is definitely worth looking into. <edit 3> It was either the 210_3 or the 235_2 that locked it up. 4870 GPU 4870 GPU |
Send message Joined: 11 May 09 Posts: 30 Credit: 81,093 RAC: 0 |
KWSN, are you using the code recently released by Cluster Physik ( http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=886#24282 ) that included a fix for 3 stream runs on ATI GPUs? This is most likely the issue if it's the *_3s_* runs crashing, and only on GPU. Thanks, John Vickers |
Send message Joined: 9 Nov 07 Posts: 151 Credit: 8,391,608 RAC: 0 |
I'm not so much getting a system freeze as a GPU reset! These seem to be happening with the 3s, specifically the '3s 6'. I have seen this happen myself; I haven't noticed it on '3s 5', but will be watching. It's difficult to catch, as it has only happened on three separate occasions in the last 36 hours. |
Send message Joined: 22 Nov 08 Posts: 136 Credit: 319,414,799 RAC: 0 |
Yes, I am using the latest version, 0.19f. I was able to play around with it a little more last night. It seems that absolutely everything is running at high priority for some reason. I made no changes to my BOINC preferences either. I also found that if I suspend my other project (Prime Grid), everything starts back up. That will, however, have a negative impact on PG. None of this started happening until a recent Windows update. I'm very much open to suggestions on how to correct it. I thought I might try to reinstall 0.19f as soon as I get a chance, in case something got messed up with the update. If that doesn't work, maybe reinstalling BOINC. The two systems are running 6.4.7. 4870 GPU 4870 GPU |
Send message Joined: 12 Dec 08 Posts: 56 Credit: 269,889,439 RAC: 0 |
"Yes, I am using the latest version, 0.19f. I was able to play around with it a little more last night. It seems that absolutely everything is running at high priority for some reason. I made no changes to my BOINC preferences either. I also found that if I suspend my other project (Prime Grid), everything starts back up. That will, however, have a negative impact on PG. None of this started happening until a recent Windows update. I'm very much open to suggestions on how to correct it. I thought I might try to reinstall 0.19f as soon as I get a chance, in case something got messed up with the update. If that doesn't work, maybe reinstalling BOINC. The two systems are running 6.4.7." I have seen that with my 4850 in my i7. It seems like the WU hangs at some point, either from the CPU getting overloaded (all MW tasks 'running' but only 3 crunching) or BOINC trying to do task switching. Click to help Seti City. |
Send message Joined: 9 Sep 08 Posts: 96 Credit: 336,443,946 RAC: 0 |
I am getting the 0.19f GPU lock-up as well, on 2 different PCs this morning. The MW GPU app locks up, but SETI AP on the CPU keeps going without problems. Stopping BOINC and starting it again gets the GPU app going again. I am also seeing GPU WUs running at high priority. |
Send message Joined: 22 Nov 08 Posts: 136 Credit: 319,414,799 RAC: 0 |
Look at this one. Note the GPU time and the wall clock time. <EDIT> I would definitely call that a hang!
Task ID: 77725727
Name: ps_sgr_235_2s_6_1603847_1244722735_0
Workunit: 76446733
Created: 11 Jun 2009 12:18:59 UTC
Sent: 11 Jun 2009 12:20:07 UTC
Received: 11 Jun 2009 15:50:10 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 39176
Report deadline: 14 Jun 2009 12:20:07 UTC
CPU time: 11878.53
stderr out:
<core_client_version>6.4.7</core_client_version>
<![CDATA[
<stderr_txt>
Running Milkyway@home ATI GPU application version 0.19f by Gipsel
CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz (4 cores/threads) 3.54598 GHz (227ms)
CAL Runtime: 1.3.145
Found 1 CAL device
Device 0: ATI Radeon HD 4800 (RV770) 1024 MB local RAM (remote 28 MB cached + 1024 MB uncached)
GPU core clock: 750 MHz, memory clock: 900 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision
3 WUs already running on GPU 0
No free GPU! Waiting ... 93.7969 seconds.
Starting WU on GPU 0
main integral, 160 iterations
predicted runtime per iteration is 145 ms (33.3333 ms are allowed), dividing each iteration in 5 parts
borders of the domains at 0 320 640 960 1280 1600
Calculated about 3.70012e+012 floatingpoint ops on GPU, 6.34181e+007 on FPU. Approximate GPU time 11878.5 seconds.
probability calculation (stars)
Calculated about 1.20373e+009 floatingpoint ops on FPU.
WU completed.
CPU time: 1.35938 seconds, GPU time: 11878.5 seconds, wall clock time: 12032.2 seconds, CPU frequency: 3.546 GHz
</stderr_txt>
]]>
Validate state: Valid
Claimed credit: 82.9045335151938
Granted credit: 27.75994
application version: 0.19
4870 GPU 4870 GPU |
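To put a number on "hang", the figures in that stderr output can be checked with a little arithmetic. The following is only a rough sketch: the rule used for the number of parts (round predicted/allowed up to the next integer) and the evenly spaced borders are assumptions that happen to reproduce the logged values, not the app's confirmed logic, and the function names are made up for illustration. The expected runtime also ignores the other 3 WUs sharing the GPU, so the slowdown factor is an upper bound on how far off this task was.

```python
import math
import re

# Values copied from the stderr output quoted above.
ITERATIONS = 160        # "main integral, 160 iterations"
PREDICTED_MS = 145.0    # "predicted runtime per iteration is 145 ms"
ALLOWED_MS = 33.3333    # "(33.3333 ms are allowed)"
DOMAIN_END = 1600       # last value in "borders of the domains at ..."

SUMMARY = ("CPU time: 1.35938 seconds, GPU time: 11878.5 seconds, "
           "wall clock time: 12032.2 seconds")


def split_parts(predicted_ms, allowed_ms):
    """Assumed rule: fewest parts that keep each part within the allowed time."""
    return math.ceil(predicted_ms / allowed_ms)


def domain_borders(end, parts):
    """Evenly spaced borders from 0 to end (assumes end is divisible by parts)."""
    step = end // parts
    return [i * step for i in range(parts + 1)]


def parse_summary(line):
    """Pull CPU, GPU and wall-clock seconds out of the 'WU completed' summary."""
    pattern = (r"CPU time: ([\d.]+) seconds, GPU time: ([\d.]+) seconds, "
               r"wall clock time: ([\d.]+) seconds")
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("summary line not recognised")
    return tuple(float(value) for value in match.groups())


parts = split_parts(PREDICTED_MS, ALLOWED_MS)
borders = domain_borders(DOMAIN_END, parts)

# Runtime the app itself predicted for the main integral, in seconds.
expected_s = ITERATIONS * PREDICTED_MS / 1000.0
cpu_s, gpu_s, wall_s = parse_summary(SUMMARY)
slowdown = gpu_s / expected_s

print("parts:", parts, "borders:", borders)
print("expected about %.1f s, reported %.1f s, roughly %dx longer"
      % (expected_s, gpu_s, round(slowdown)))
```

With the logged numbers, the predicted 160 x 145 ms comes to about 23 seconds, against a reported GPU time of 11878.5 seconds. Sharing the GPU with three other WUs cannot explain a gap of that size, which supports reading this as a hang rather than a slow run.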
Send message Joined: 22 Nov 07 Posts: 285 Credit: 1,076,786,368 RAC: 0 |
Imcrazy, this happens quite a bit on hosts that are shared with other projects. It seems that the shorter the other project's WUs, the more MW hangs. I believe it has something to do with the way BOINC handles debt. I think you mentioned you are also crunching Prime Grid and Aqua. The shorter WUs will suspend your MW WUs until your short-term and long-term debt is cleared. The new multi-threaded Aqua can play havoc with the ATI app, since an Aqua WU now wants to use multiple CPUs and will occasionally put MW in suspend mode. To test this, when you see MW hung up, just suspend the other projects; MW should take off and start crunching again without your having to reset or reboot your box. |
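The test described above (suspend the other projects when MW hangs, then resume them afterwards) can also be scripted rather than clicked through in BOINC Manager. This is a minimal sketch, assuming the `boinccmd` command-line tool that ships with the BOINC client is on the PATH (older clients may install it under a different name) and that Prime Grid is attached under the URL shown. Both are assumptions, not details given in this thread, and the helper names are invented for illustration. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Assumption: the project URL exactly as it appears in your own BOINC client.
PRIMEGRID_URL = "http://www.primegrid.com/"


def project_command(url, action, boinccmd="boinccmd"):
    """Build the argument list for `boinccmd --project URL suspend|resume`."""
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")
    return [boinccmd, "--project", url, action]


def run(argv, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)
    return argv


# Dry run: shows what would be sent to the client without touching it.
run(project_command(PRIMEGRID_URL, "suspend"))
run(project_command(PRIMEGRID_URL, "resume"))
```

If MW starts crunching again within seconds of the suspend and stalls again after the resume, that points at scheduling between projects rather than at the GPU app itself.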
Send message Joined: 12 Dec 08 Posts: 56 Credit: 269,889,439 RAC: 0 |
Imcrazy, Thanks, Kevin...a good explanation, since I am running PG on my i7 with the GPU doing MW, and I see the PSP sieve WUs jumping into EDF mode, even though the due date is 7 days off and I have a .5 day cache. Click to help Seti City. |
Send message Joined: 22 Mar 08 Posts: 38 Credit: 48,762,331 RAC: 0 |
I've had 3 systems hang up also...and I'm not running any other project. Do you think the shorter WU that Travis took care of was causing this? |
Send message Joined: 22 Nov 08 Posts: 136 Credit: 319,414,799 RAC: 0 |
The only other project that I have running on that system is Prime Grid. After the Windows update it seems like everything, PG and MW, went into high priority. It did that on both quads. Last night I reset my debts to 0 on one system using BOINC DV. I'll try that on the other (the one this long-winded WU came from) tonight. 4870 GPU 4870 GPU |
Send message Joined: 22 Nov 08 Posts: 136 Credit: 319,414,799 RAC: 0 |
Thanks for the info! I did notice that I could suspend the Prime Grid PSP Sieve units and MW would immediately start back up and run for an extended period. Now it's happening with Prime Grid PSP LLR units (Prime Grid Challenge). Not exactly a short-running task: 30+ hours for each one. I'm only running 3 at a time to leave 1 core free for MW. I did start a new thread, "Hanging Work Units". Please look there for any new developments or suggestions. 4870 GPU 4870 GPU |
Send message Joined: 19 Mar 09 Posts: 27 Credit: 117,670,452 RAC: 0 |
Can those of you using a 3850 tell me what settings you are using in app_info.xml? I have tried many different settings, but still get VPU Recovery events. I am running Win7 64-bit, ATI 0.19f and BOINC 6.6.36. Everything is fine and the WUs are processed at peak efficiency IF I don't do anything on the computer. But if it is used normally (watching videos, browsing the web with Firefox, etc.), then I constantly get a blank screen + VPU recovery + a jammed WU that I have to either kill with Task Manager or clear by restarting BOINC. I don't seem to have this problem on my 48xx series cards, only the 3850. I get somewhat better functionality (still problems, but fewer) with f60 w1.7 n1, but then GPU utilization is only 50-60%. |
Send message Joined: 28 Nov 08 Posts: 4 Credit: 47,239,720 RAC: 418 |
Hi, this might be of only little help, as both my PCs with a 3850 are old and dedicated to crunching, but anyway... I have two AGP 3850s on MW, one in an AMD Athlon and the other in an Intel Celeron, both running XP Home. Both are running on standard settings except n2, as the cards are 512 MB and the motherboards have only 256 MB each. Basic functionality (web etc.) is OK, even if, as said, these are dedicated to MW; I pretty much took them back into use when I noted that I could fit an AGP 3850 into them and do some crunching. |