1)
Message boards :
Number crunching :
Computational Error?
(Message 58474)
Posted 1 Jun 2013 by TLSI2000 Post: To get back to operational, I downgraded to 13.1 and it works fine. For this downgrade to succeed, you *must* use the separate Catalyst Uninstall Utility, which is downloadable from the same page as the installers. The Uninstall option included in the Catalyst Install Manager does not do the clean-up necessary to get back to a solid re-install condition. I am now back in operation. Good luck.
Uninstall link: http://support.amd.com/us/gpudownload/windows/Pages/catalyst-uninstall-utility.aspx
2)
Message boards :
Number crunching :
Computational Error?
(Message 58407)
Posted 26 May 2013 by TLSI2000 Post: As I updated from AMD Catalyst 13.1 to 13.4 today, the MilkyWay work units are all failing. This was something that I did not expect, so searching the forum brought up this thread.
Running BOINC 7.0.64 on Win7 64-bit with an AMD 69xx Cayman GPU.
The error log indicates an exception, and dumps a BOINC debug trace to the log. The error seems to occur at the end of processing.
--------------------------------------------
Example for Task: 481431579 WorkUnit: 368827958 computer: 366518
Using AMD IL kernel
Binary status (0): CL_SUCCESS
Estimated AMD GPU GFLOP/s: 2765 SP GFLOP/s, 691 DP FLOP/s
Using a target frequency of 30.0
Using a block size of 6144 with 121 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 1600, r_steps = 1400 }
Iteration area: 2240000
Chunk estimate: 3
Num chunks: 4
Chunk size: 743424
Added area: 733696
Effective area: 2973696
Initial wait: 27 ms
Integration time: 37.905295 s. Average time per iteration = 118.454047 ms
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x000007FEDE811DCD read attempt to address 0x00000010
Engaging BOINC Windows Runtime Debugger...
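The figures in the log above are internally consistent, which suggests the integration itself ran to completion before the access violation. A minimal sketch of that cross-check follows. The relationships between the fields (for example, that the iteration area is mu_steps times r_steps) are inferred from the numbers in the log, not taken from the application's source, and the variable names are illustrative only.

```python
import math

# Values copied from the log in the post.
nu_steps, mu_steps, r_steps = 320, 1600, 1400
block_size, blocks_per_chunk = 6144, 121
num_chunks = 4
integration_time_s = 37.905295

# Assumed relationships, inferred from the logged numbers.
iteration_area = mu_steps * r_steps             # logged "Iteration area"
chunk_size = block_size * blocks_per_chunk      # logged "Chunk size"
effective_area = num_chunks * chunk_size        # logged "Effective area"
added_area = effective_area - iteration_area    # logged "Added area"
chunk_estimate = iteration_area // chunk_size   # logged "Chunk estimate"
chunks_needed = math.ceil(iteration_area / chunk_size)  # logged "Num chunks"
avg_ms_per_iteration = integration_time_s / nu_steps * 1000

print(iteration_area, chunk_size, effective_area, added_area)
print(chunk_estimate, chunks_needed, round(avg_ms_per_iteration, 6))
```

Under these assumptions every derived value matches the log, so the failure appears to be in the teardown after integration rather than in the chunking arithmetic.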
3)
Message boards :
News :
N-Body 1.08
(Message 57584)
Posted 21 Mar 2013 by TLSI2000 Post: These seem to be running fine, with a run time coming in at 2 to 4 hours for an older AMD 2.4 GHz. But the credit calculation seems to be a bit odd:
Run time (s)   CPU time (s)   Credit   Application
6,357.59       6,357.59       26.84    MilkyWay@Home N-Body Simulation v1.08
6,404.13       6,404.13       27.04    MilkyWay@Home N-Body Simulation v1.08
9,509.63       9,496.64       13.22    MilkyWay@Home N-Body Simulation v1.08
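To make the oddity concrete, the three results can be normalized to credit per hour of run time. This is only a sketch using the figures from the post; the helper name `credit_per_hour` is made up for illustration and says nothing about how the project actually grants credit.

```python
# (run time in seconds, credit granted), copied from the post.
results = [
    (6357.59, 26.84),
    (6404.13, 27.04),
    (9509.63, 13.22),
]

def credit_per_hour(run_time_s: float, credit: float) -> float:
    """Credit earned per hour of wall-clock run time."""
    return credit / (run_time_s / 3600.0)

rates = [credit_per_hour(run, credit) for run, credit in results]
for (run, credit), rate in zip(results, rates):
    print(f"{run:>9.2f} s  {credit:>6.2f} credit  {rate:5.2f} credit/hour")
```

The first two tasks come out at nearly the same hourly rate, while the third, which ran the longest, earns roughly a third of that rate.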
4)
Message boards :
News :
might have found the error
(Message 55782)
Posted 14 Oct 2012 by TLSI2000 Post: And I have seen a number of errors on the version '3' WUs overnight as well. These seem to have an #IND in the result, as in:
<stream_only_likelihood> -3.638176510600306 -10.877692835483177 -1.#IND00000000000 </stream_only_likelihood>
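The `-1.#IND` token is how older Microsoft C runtimes print an indeterminate NaN, so a result like the one above contains a non-numeric likelihood. A minimal sketch of how such a line could be screened is shown below. The function names are hypothetical, and this is not the project's validator, just an illustration of the check.

```python
import math

def parse_msvc_float(token: str) -> float:
    """Parse a float, accepting old MSVC-style NaN/infinity spellings."""
    if "#IND" in token or "#QNAN" in token or "#SNAN" in token:
        return math.nan
    if "#INF" in token:
        return -math.inf if token.startswith("-") else math.inf
    return float(token)

def has_invalid_likelihood(values_text: str) -> bool:
    """Return True if any whitespace-separated value is NaN or infinite."""
    values = [parse_msvc_float(tok) for tok in values_text.split()]
    return any(not math.isfinite(v) for v in values)

# Values copied from the post.
sample = "-3.638176510600306 -10.877692835483177 -1.#IND00000000000"
print(has_invalid_likelihood(sample))
```

A plain `float()` call would raise `ValueError` on the `#IND` token, which is why the sketch special-cases those spellings first.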
5)
Message boards :
News :
might have found the error
(Message 55773)
Posted 14 Oct 2012 by TLSI2000 Post: A follow-up... After two hours of the version '3' WUs, I have seen *zero* errors on them. I am still having a few errors on the version '1' and '2' WUs as those batches run their course. It looks like the problem is solved. Thx.
6)
Message boards :
News :
another test run 'de_separation_22_3s_edge_1'
(Message 55754)
Posted 13 Oct 2012 by TLSI2000 Post: I have been seeing computational errors that occur at the end of processing on roughly one out of ten WUs.
Error examples:
Milkyway@Home 1.02 MilkyWay@Home (opencl_amd_ati) de_separation_22_3s_edge_1_1350087199_193416_2 00:00:51 (00:00:01) 10/13/2012 9:41:15 AM 10/13/2012 9:43:56 AM 0.05 CPUs + 1 ATI GPU 1.96 Reported: Computation error (1,) Sagita
Milkyway@Home 1.02 MilkyWay@Home (opencl_amd_ati) ps_separation_22_3s_free_1_1350087199_232226_2 00:00:56 (00:00:02) 10/13/2012 9:30:33 AM 10/13/2012 9:32:29 AM 0.05 CPUs + 1 ATI GPU 3.57 Reported: Computation error (1,) Sagita
7)
Message boards :
News :
NBody Update and New Runs
(Message 55668)
Posted 7 Oct 2012 by TLSI2000 Post: As we wait for the new N-body tasks, does the sub-project have an intended restart date?
8)
Message boards :
News :
New NBody test searches
(Message 55327)
Posted 10 Aug 2012 by TLSI2000 Post: When the outstanding N-body work units finally get down to zero, is there to be another series? The count is now at 2.
9)
Message boards :
News :
Nbody updated to 0.60
(Message 49419)
Posted 19 Jun 2011 by TLSI2000 Post: Thank you!!!!! I have had all of these fail on my WinXP64 systems since the last version. They are now going through. Thanks!
10)
Message boards :
News :
another attempt at the max time limit elapsed fix
(Message 48925)
Posted 22 May 2011 by TLSI2000 Post: Ever since the new version of the NBody .40, I have been getting another error:
<core_client_version>6.12.26</core_client_version>
<![CDATA[
<message>
There are no child processes to wait for. (0x80) - exit code 128 (0x80)
</message>
]]>
This means that three of my servers cannot process for MW. The tasks just come up for processing and exit immediately. There is no app_info file, just normal processing.
11)
Message boards :
Number crunching :
annoying pop-up
(Message 48658)
Posted 9 May 2011 by TLSI2000 Post: I know it is not in there at version 6.12.15, so I just upgraded to 6.12.26 to be able to make that thing go away. It really was becoming annoying. Thanks for the info on it.
12)
Message boards :
News :
fix to the invalid workunit problem
(Message 48595)
Posted 8 May 2011 by TLSI2000 Post: Three up, three down: all immediate computation errors. Thanks for the effort. I'm not such a big player, so I think that I will go elsewhere for a while and come back later. Thanks.
13)
Message boards :
News :
fix to the invalid workunit problem
(Message 48586)
Posted 8 May 2011 by TLSI2000 Post: I have three systems, all running the MT version of the NBody code on CPUs (no GPUs). The 32-bit version runs fine on the dual-processor system, without an app_info file. Both 64-bit systems with 12 cores fail immediately, both with and without an app_info file. So the problem is not isolated to just the GPU systems.
14)
Message boards :
News :
N-body updated to 0.40
(Message 48443)
Posted 2 May 2011 by TLSI2000 Post: I am looking at two servers that will not calculate an N-body task correctly at all. I have reset the project on each (twice), and am currently running with no XML file for these. They all error out immediately with an exit status of 128. I have tried the several versions of the XML file presented here, but to no avail. The version I am using is the one automatically downloaded on the resets for a 64-bit XP server, milkyway_nbody_0.40_windows_x86_64__mt, and the two associated DLLs are there as well.
15)
Message boards :
News :
updated the CPU applications
(Message 42748)
Posted 11 Oct 2010 by TLSI2000 Post: A few examples:
this Milkyway@home 0.40 MilkyWay@Home de_16_2s_5_19106_1286482216_0 22:17:35 (22:13:34) 10/9/2010 4:14:37 PM 10/9/2010 6:00:44 PM Reported: Computation error (0,)
this Milkyway@home 0.40 MilkyWay@Home de_13_2s_5_609171_1286475411_0 21:31:01 (21:27:20) 10/9/2010 1:03:55 PM 10/9/2010 2:00:26 PM Reported: Computation error (0,)
this Milkyway@home 0.40 MilkyWay@Home de_16_3s_5_612076_1286476176_0 20:20:48 (20:16:59) 10/9/2010 12:24:45 PM 10/9/2010 12:54:46 PM Reported: Computation error (0,)
16)
Message boards :
News :
updated the CPU applications
(Message 42708)
Posted 9 Oct 2010 by TLSI2000 Post: What is really painful is to watch one of the new WUs process, get to that 20-22 hour mark, and end in what looks like a 'normal' end of processing, only to show up as a computation error. So far, this is about 6 CPU-days of processing that has been thrown away. I am close to aborting all MW in the queue and going elsewhere for a while.
17)
Message boards :
News :
started a new nbody search: de_nbody_model1_1
(Message 42052)
Posted 11 Sep 2010 by TLSI2000 Post: Most (about 70%) abort in the first second. On my two systems, the few that don't abort immediately are taking 20-40 minutes. And I have had a couple of 'runaways' that completed less than 1% after 20-25 minutes, with an ever-increasing estimated time of completion well over an hour. I aborted these manually.
©2024 Astroinformatics Group