maximum time limit elapsed bug
Author | Message |
---|---|
Send message Joined: 13 Mar 09 Posts: 5 Credit: 1,366,490 RAC: 0 |
It's working with the file "app_info.xml". Thank you :) But one CPU WU was deleted, I think because of this :( 5 hours lost :( Look at this: 19/07/2011 22:36:01 | | ATI GPU 0: ATI Radeon HD 4700/4800 (RV740/RV770) (CAL version 1.4.1417, 512MB, 1000 GFLOPS peak) 19/07/2011 22:36:01 | Milkyway@home | Found app_info.xml; using anonymous platform 19/07/2011 22:36:01 | Milkyway@home | [error] State file error: missing application milkyway_nbody 19/07/2011 22:36:01 | Milkyway@home | [error] Can't handle workunit in state file 19/07/2011 22:36:01 | Milkyway@home | [error] State file error: missing task ps_nbody_test3_499724 19/07/2011 22:36:01 | Milkyway@home | [error] Can't link task ps_nbody_test3_499724_0 in state file 19/07/2011 22:36:01 | Milkyway@home | [error] State file error: result ps_nbody_test3_499724_0 not found for task |
Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,461,693,501 RAC: 0 |
Switching to BOINC 6.13 got me my first valid unit for my ATI 4890: http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=261689 Previously, the timeout interval was 82 seconds (from my understanding of the stderr) and my units would fail between 40% and 60% complete. After switching from 6.12.26 to 6.13, all work units are validating. I have not yet tested PrimeGrid, which also failed with the 4890 board. The driver is 11.6 from ATI, and when I brought up the Catalyst Control Center, I got a driver failure and a message from CCC that it was switching to compatibility mode (whatever that means). I continued to get MW failures even after rebooting and then decided to switch to 6.13. The ATI 4890 was an xfxforce warranty replacement for my defective nVidia gtx280. Seems they ran out of gtx280s. [EDIT] hmm ... spoke too soon. Got failures on two MW units on the 5850. Had not had any errors on that system before upgrading to 6.13. PrimeGrid is still failing on this 4890, but so far all Milkyway units are getting to completion. |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
hmm ... spoke too soon. Got failures on two MW units on the 5850. Had not had any errors on that system before upgrading to 6.13. Stay away from 6.13.0 ..... it was withdrawn very quickly with major bugs present. They seem to have fixed most of the ones that crept into 6.13.0 with the release of 6.13.1. However, they are all Alpha releases, so I doubt 6.13.1 is really trustworthy just yet. Stick to 6.12.33, which is the latest stable release. I doubt it will solve all (any?) of your current ills, but it's a racing certainty that 6.13.0 will not help in any way, and it is highly likely to be detrimental, to say the least. Regards Zy |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
6.13.1 is really really bad too :( |
Send message Joined: 4 Oct 08 Posts: 1734 Credit: 64,228,409 RAC: 0 |
Tried to upgrade the client to the Milkyway client - milkyway_separation_0.82_windows_intelx86_ati14, with the app_info file. It seemed to be OK but would not download any work, and when update requests were made, FreeHAL kept resetting the work being crunched. In the end I gave up and returned to DNETC. I hope the stock GPU client is reworked to overcome this time lapse problem. Go away, I was asleep |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Nothing dramatically new, but posting an observation as a lot of work has gone into the beasts lately. Had three go bang in fairly short order for some reason. Temperatures are fine, PC seems stable, came out of the blue really. <core_client_version>6.12.33</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' Error reading astronomy parameters from file 'astronomy_parameters.txt' Trying old parameters file Using SSE3 path Found 4 CAL devices Chose device 2 I did notice the card was decidedly "draggy" after them; I had to switch tabs a few times to encourage the counter to get going. In the end I stopped and restarted the BOINC Client; that seemed to clear it fine, and away she went again. No other problems. Whether or not the bad WU(s) were stuck in the GPU after crashing, and caused delays loading fresh ones, no idea, but it gave that impression as all was well when the Client restarted. Edit: Whoa .... Welcome home Murphy .... Just had a Blue Screen, first one for weeks. No CPU WUs running, so it's a "clean run" for MW. Too fast to notice the Blue Screen notes, but it was definitely "ati....something" as the nominal errant file, so something is still lurking inside the GPU application-wise. Driver appears solid at present - no driver resets happening. Regards Zy |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
John, one problem is that you need to update your BOINC version to current. Then we can perhaps offer suggestions. |
Send message Joined: 13 Mar 09 Posts: 5 Credit: 1,366,490 RAC: 0 |
My computer's graphics become slow when my GPU is working (windows move slowly, for example). Can we adjust the GPU usage percentage like we can for the CPU? With a limit of 80%, for example? |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
Had three go bang in fairly short order for some reason. Temperatures are fine, PC seems stable, came out of the blue really. Interesting because I had WUs on 3 machines do the same thing a few hours ago after those boxes had been error free for quite a while. Maybe a rash of bad WUs? |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
...... and another BSOD, the nominal file was "atikpmag.sys", and seeing it again, it was that one last time. I'll turn down the card a bit and see what happens, but it ran fine at 760/300 for well over an hour. All a bit strange, see how she goes. EDIT: Been running for 15 mins after that last BSOD @750/300 - Murphy allowing, seems ok again. Maybe 760/300 was just that little over the top, but seems strange after running with no problems for well over an hour, and 760/300 is hardly fast and furious for 2x5970s...... still, Crunch on as they say :) Regards Zy |
Send message Joined: 12 Aug 09 Posts: 262 Credit: 92,631,041 RAC: 0 |
John, one problem is that you need to update your BOINC version to current. Then we can perhaps offer suggestions. I don't think so. I run 6.10.58 and that works fine for MW via ATI and for Einstein, Rosetta, and MW via CPU without any issues. I never update to the latest version as they usually have more or newer bugs. Greetings from, TJ |
Send message Joined: 22 Apr 11 Posts: 5 Credit: 5,578,008 RAC: 0 |
I encountered errors with PrimeGrid after installing Catalyst 11.6 as well, and rolled back to 11.5 on my primary Win7 installation. I have not tested BOINC on my secondary Win7 installation, which is running Catalyst 11.6. I think it's an ATI driver problem, as CCC 11.6 seems to be getting a few a, b, c, and d patches made available for it. But I don't know yet if they'll patch the problem with BOINC and PrimeGrid. The dual GPUs aren't both clocking to the maximum; one insists on remaining at default clocks ... and the crashed PrimeGrid WUs seem to occur when the video card switches to 3D mode, or begins computation during the 3D mode switch too abruptly, so BOINC or PrimeGrid may have to adjust a delayed response time in ms. Again, it is unclear whether this is a BOINC problem or an ATI problem. 4870 X2 here, sometimes in quadfire. I think it's faster than the 6990 in computation due to having more cores in the 4870 X2, and clocks are secondary to speed. BOINC 6.12.33, as far as I can tell, is stable, with one new behavior: it seems to turn work units in soon after completion rather than allowing completed work units across projects to stack up for a manual update. |
Send message Joined: 22 Apr 11 Posts: 5 Credit: 5,578,008 RAC: 0 |
My 4870 X2 cores run at max 800 MHz ... it's just that the memory won't OC to the max anymore like it did for a few older CCC versions. Otherwise the X2 cores are stable even at temps of 175-190 Fahrenheit; it's also my gaming rig. |
Send message Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
Currently both my MW-crunching systems run fine, but ... - my mainsys: since I upgraded to 11.6, my second screen flickers sometimes. The problem is described in the hotfix list for the update to 11.6b; however, the flicker problem is not solved. - my Integrator PC, which usually has one ATI and one nVidia card, was unusable after upgrading to 11.6. It failed with a blue screen twice an hour, with or without BOINC running, always with an ATI file shown in the dump (sorry, I cannot tell you which one). I removed ALL drivers, nVidia and ATI, uninstalled all vendor software and reinstalled CCC 11.6b. It has been running clean for three days now. |
Send message Joined: 15 Jul 11 Posts: 14 Credit: 5,978,191 RAC: 0 |
Can I ask a stupid question? Why are people discussing other problems? This is a thread for "maximum time limit elapsed bug". |
Send message Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
The problem is that no one can say for sure what the cause of the problem is. In this case it makes sense to eliminate all possible sources. The one that cannot be eliminated is most likely the one causing the pain. |
Send message Joined: 15 Jul 11 Posts: 14 Credit: 5,978,191 RAC: 0 |
I agree, but not if people are talking about BSODs, "Incorrect function. (0x1) - exit code 1 (0x1)", etc. instead of the maximum time limit error. On that note: I wonder what will happen if, instead of <flops>1.0e11</flops> (100 GFLOPS) in my app_info file, I put in a number greater than the estimated task size received from the server. So if the server estimates 50000 GFLOPS for the WU and I enter a value greater than that, the program should be finished in less than a second, which of course is not true; it takes longer. Perhaps a test for later. |
Send message Joined: 15 Jul 11 Posts: 14 Credit: 5,978,191 RAC: 0 |
I just did a small test and I got: <core_client_version>6.12.33</core_client_version> <![CDATA[ <message> Maximum elapsed time exceeded </message> ]]> I changed <flops>1.0e11</flops> to <flops>6002.0e11</flops> in the app_info file. So if the program runs without app_info and the estimated application speed is greater than the estimated task size, you will receive this error. |
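The `<flops>` edit described in the post above can be checked mechanically before the client is restarted. What follows is a minimal Python sketch, not part of the original thread, that reads the `<flops>` value of each `<app_version>` in an app_info.xml fragment and flags values above the card's peak. The XML fragment, the app name `milkyway`, the version number and the helper names are hypothetical; the 1000 GFLOPS peak is the figure from the client log quoted earlier in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed app_info.xml fragment for illustration only.
# A real file also lists the application files, which are omitted here.
APP_INFO_XML = """
<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>82</version_num>
    <flops>1.0e11</flops>
  </app_version>
</app_info>
"""

# Peak speed reported for the card in the client log (1000 GFLOPS).
PEAK_FLOPS = 1000e9


def read_flops(xml_text):
    """Return {app_name: flops} for every <app_version> that sets <flops>."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for app_version in root.iter("app_version"):
        name = app_version.findtext("app_name", default="unknown")
        flops_text = app_version.findtext("flops")
        if flops_text is not None:
            result[name] = float(flops_text)
    return result


def exceeds_peak(flops, peak_flops):
    """True when the declared speed is higher than the hardware peak."""
    return flops > peak_flops


if __name__ == "__main__":
    for name, flops in read_flops(APP_INFO_XML).items():
        status = "suspicious" if exceeds_peak(flops, PEAK_FLOPS) else "plausible"
        print(f"{name}: {flops:.4g} flops ({status})")
```

With the value from the test in this post, 6002.0e11, the check reports the setting as suspicious, since it is far above what the card can deliver.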
Send message Joined: 19 Jul 10 Posts: 644 Credit: 19,475,991 RAC: 718 |
Not surprising. If you increase the flops of the app, BOINC thinks it is faster and expects it to finish the task sooner than with a lower value (= slower app). You either have to decrease the flops of the app or increase rsc_fpops_bound so that BOINC allows the app to run longer. |
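The relationship described above can be put in numbers. This is a minimal sketch, assuming the client aborts a task once its elapsed time exceeds rsc_fpops_bound divided by the app's flops value, and estimates run time as rsc_fpops_est divided by flops. That is a reading of the behavior discussed in this thread, not code taken from BOINC. The 50000 GFLOP estimate and the two flops values come from the posts above; the bound of ten times the estimate and the function names are assumptions.

```python
def estimated_seconds(rsc_fpops_est, app_flops):
    """Run time the client would expect for the task (assumed: est / flops)."""
    if app_flops <= 0:
        raise ValueError("app_flops must be positive")
    return rsc_fpops_est / app_flops


def max_elapsed_seconds(rsc_fpops_bound, app_flops):
    """Elapsed-time limit before the task is aborted (assumed: bound / flops)."""
    if app_flops <= 0:
        raise ValueError("app_flops must be positive")
    return rsc_fpops_bound / app_flops


# Task size estimate mentioned in the thread: 50000 GFLOP.
RSC_FPOPS_EST = 50000e9
# Assumption for illustration: the server sets the bound to 10x the estimate.
RSC_FPOPS_BOUND = 10 * RSC_FPOPS_EST

if __name__ == "__main__":
    # The two <flops> settings tried in the thread.
    for app_flops in (1.0e11, 6002.0e11):
        expected = estimated_seconds(RSC_FPOPS_EST, app_flops)
        limit = max_elapsed_seconds(RSC_FPOPS_BOUND, app_flops)
        print(f"flops={app_flops:.4g}: expected {expected:.3f} s, limit {limit:.3f} s")
```

Under these assumptions the limit at 1.0e11 is well over an hour, while at 6002.0e11 it drops below one second, so any real GPU task would hit "Maximum elapsed time exceeded" almost immediately.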
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
John, one problem is that you need to update your BOINC version to current. Then we can perhaps offer suggestions. If you looked you might have seen that he's running 6.12.22 which was quite buggy. Your version is more stable but 6.12.33 is better yet. It's hard to properly diagnose the type of problem he's having when he has a BOINC version with many known problems. I really don't make suggestions to hear myself talk. If you guys want me to stop helping people I will. Don't really want to waste time on senseless arguments. |