another change for the maximum time limit elapsed bug

Author	Message
Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 48626 - Posted: 9 May 2011, 10:24:30 UTC I've tried yet another fix: the rsc_fpops_bound is now 10,000 times higher than our estimate. I'm hoping this covers almost everyone who is still seeing workunits error out immediately. Let me know if it works.
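A rough sketch of why a larger rsc_fpops_bound should help, assuming the BOINC client aborts a task with "Maximum elapsed time exceeded" once its elapsed time passes rsc_fpops_bound divided by the host's estimated speed in FLOPS. The function names and all numbers below are hypothetical and chosen only for illustration; they are not the project's actual values.

```python
def max_elapsed_seconds(rsc_fpops_bound: float, host_flops: float) -> float:
    """Elapsed-time limit implied by the bound (assumed rule: bound / speed)."""
    if host_flops <= 0:
        raise ValueError("host_flops must be positive")
    return rsc_fpops_bound / host_flops


def would_abort(elapsed_seconds: float, rsc_fpops_bound: float, host_flops: float) -> bool:
    """True if a task that has run for elapsed_seconds would exceed the limit."""
    return elapsed_seconds > max_elapsed_seconds(rsc_fpops_bound, host_flops)


# Hypothetical workunit: the server estimates 1e12 floating-point operations.
estimate_fpops = 1e12
bound_fpops = 10_000 * estimate_fpops  # the "10000 times higher" setting

# Hypothetical host whose speed is estimated at 2 GFLOPS.
host_flops = 2e9

limit = max_elapsed_seconds(bound_fpops, host_flops)
print(f"limit: {limit:.0f} s")
print("aborted after 1 hour?", would_abort(3600, bound_fpops, host_flops))
```

Under this assumed rule, a host whose speed estimate is far too high gets a proportionally shorter limit, which is one way a task could be aborted within seconds of starting.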

Alinator Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0	Message 48656 - Posted: 9 May 2011, 21:32:14 UTC Last modified: 9 May 2011, 21:40:21 UTC The latest tweaks are still not working for 2 of the 4 multi-core hosts I have running MW CPU apps. The two that run them successfully are a Dell Latitude notebook with an Intel CD T2400 processor running XPP x86 SP3 and the Ph II X4 955 running W7U x64. One note about the 955: since it's running in protected app mode (service), BOINC cannot currently detect that the IGP is enabled. The IGP is also the only GPU it has. The other Ph II X4s fail N-Body instantly with a 'code 128; No child processes to wait for' error. One is a 945 with the IGP enabled as its only GPU, and the other is the 955 with the HD 4850 (both enabled in BIOS, IGP primary, but the IGP disabled for BOINC in cc_config). Both of those hosts are running XPP x64 SP2, and all the hosts with ATI graphics are running Cat 11.3. I was able to have Process Monitor running just before the 955 in front of me tried to run the last N-Body it got, but I haven't looked over the capture file yet, so there might be some more intel to be gained.

ChrisS Joined: 10 Feb 09 Posts: 1 Credit: 1,771,086 RAC: 0	Message 48659 - Posted: 9 May 2011, 22:43:42 UTC I've just had approximately 10 work units fail with 'computational errors' after approx. 5-15 minutes. Is this a related issue?

Alinator Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0	Message 48660 - Posted: 9 May 2011, 22:54:08 UTC - in response to Message 48659. It would seem some systems have an issue unrelated to the resource bounds problems Travis has been working on. Based on other posts I've seen, the fix obviously helped some folks (code 177s), but I'd almost bet good money it hasn't made any difference for hosts throwing code 128s, 185s and 226s.

Jesse Viviano Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0	Message 48661 - Posted: 9 May 2011, 23:17:29 UTC You might need to double the deadline. I just got a computation error on an N-body work unit after getting around 63% of the work unit done. I am using a Core i7 980X with hyper-threading enabled. Nothing is overclocked.

Gumpokc Joined: 7 Sep 09 Posts: 3 Credit: 147,071 RAC: 0	Message 48672 - Posted: 10 May 2011, 6:10:11 UTC
5/10/2011 1:04:36 AM Milkyway@home Starting de_nbody_orphan_test_2model_4_20631_1304909701_1
5/10/2011 1:04:36 AM Milkyway@home Starting task de_nbody_orphan_test_2model_4_20631_1304909701_1 using milkyway_nbody version 40
5/10/2011 1:04:41 AM Milkyway@home Computation for task de_nbody_orphan_test_2model_4_20631_1304909701_1 finished
It has been doing the exact same thing ever since nbody 40 came out. I've tried setting it so I don't even get nbody tasks anymore, but it keeps downloading them. I'm going to drop MW@H; maybe I'll check back in 6 months and see if things have been worked out.
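The log above shows the task ending almost as soon as it starts. A minimal sketch for measuring that gap from two client log timestamps, assuming the month/day/year 12-hour format seen in this post and an English locale for the AM/PM marker; the helper name is hypothetical.

```python
from datetime import datetime

# Timestamp layout as it appears in the quoted log lines (assumed M/D/Y, 12-hour clock).
LOG_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def seconds_between(start: str, end: str) -> float:
    """Return the number of seconds from the start timestamp to the end timestamp."""
    t0 = datetime.strptime(start, LOG_FORMAT)
    t1 = datetime.strptime(end, LOG_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps copied from the "Starting task" and "Computation ... finished" lines.
elapsed = seconds_between("5/10/2011 1:04:36 AM", "5/10/2011 1:04:41 AM")
print(f"task ran for {elapsed:.0f} s")
```

A run of only a few seconds on an N-body workunit points to an immediate failure rather than a completed computation.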

olav Joined: 31 Mar 10 Posts: 2 Credit: 248,732 RAC: 0	Message 48674 - Posted: 10 May 2011, 7:20:58 UTC Hi! It seems to work on my machine now. The latest n_body task is at 8% progress, a point quite a few earlier ones never reached due to some software error. Good job! Cheers, Olav

olav Joined: 31 Mar 10 Posts: 2 Credit: 248,732 RAC: 0	Message 48675 - Posted: 10 May 2011, 7:30:02 UTC - in response to Message 48674. Sorry, white smoke too early... It crashed again, after completing roughly 60% of the unit.

dskagcommunity Joined: 26 Feb 11 Posts: 170 Credit: 205,557,553 RAC: 0	Message 48733 - Posted: 13 May 2011, 6:59:47 UTC I let the MW computer run overnight; all seems fine again :) DSKAG Austria Research Team: http://www.research.dskag.at

Anton Rang Joined: 25 Feb 11 Posts: 2 Credit: 1,353,635 RAC: 0	Message 48737 - Posted: 13 May 2011, 15:00:40 UTC - in response to Message 48626. On my Intel Mac, I'm still seeing the nbody computations error out (though after about 19 seconds rather than 3 seconds of elapsed time), but now the separation jobs have estimated completion times that are two orders of magnitude longer than they actually take. (It looks like those nbody jobs were received yesterday and today, May 12 & 13.)

POPSIE Joined: 25 Jan 11 Posts: 12 Credit: 16,960,651 RAC: 0	Message 48845 - Posted: 18 May 2011, 6:04:26 UTC Since May 7, n-body has been producing errors.
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
<search_application>milkywayathome nbody 0.40 Windows x86_64 double OpenMP Crlibm</search_application>
07:19:24: Using OpenMP 4 max threads on a system with 4 processors
</stderr_txt>
]]>
For more info, look at this