Error while computing again and again
Author | Message |
---|---|
Joined: 29 Nov 09 Posts: 1 Credit: 824,834 RAC: 0 |
Hi, I recently updated my NVIDIA driver to work with the 0.50 GPU version, and it worked for some time, but now I get "error while computing" over and over again, like before. It doesn't seem to come from the same problem, though, since others are crashing with CPU versions of MilkyWay, or so it seems. Here are my latest workunits: http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=226592741 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=226587309 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=226576105 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=226572017 Why are those workunits not working? |
Joined: 30 Dec 08 Posts: 30 Credit: 6,999,702 RAC: 0 |
Same results for me too. Computation errors every time. Somebody needs to check on this ASAP! |
Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
The CUDA version is known to have a problem with de_separation_23_3s WUs. The next version of the app should fix that. |
Joined: 12 Sep 07 Posts: 2 Credit: 10,025,948 RAC: 0 |
For my two systems, I'm seeing: GTX260: errors on the _23_3s_fix series; GTX460: success on the _23_3s_fix series. I'll keep an eye on it, but so *far* it seems that might be a clue: my Fermi card works, but the older one does not. |
Joined: 19 Aug 09 Posts: 23 Credit: 631,303 RAC: 0 |
I've recently started N-body work on a CPU after upgrading from an optimized app. I'm seeing lots of computing errors as well, some after a long time crunching. Anyone have any ideas? http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=226730964 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=226593122 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=225091853 |
Joined: 18 Oct 07 Posts: 35 Credit: 4,684,314 RAC: 0 |
Same issue for me on MW v0.50 with an NVIDIA GTX 260. All tasks run up to 100% and crash at the finish line. I just updated the driver to 266.58 on one machine, to no avail. I checked my wingmen, and it seems that MW v0.23 with ATI is crashing, too. Here's the log:
<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
<search_application> milkywayathome separation 0.50 Windows x86 double OpenCL </search_application>
Found 1 platforms
Platform 0 information:
Platform name: NVIDIA CUDA
Platform version: OpenCL 1.0 CUDA 3.2.1
Platform vendor:
Platform profile:
Platform extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Using device 0 on platform 0
Found 2 CL devices
Device GeForce GTX 260 (NVIDIA Corporation:0x10de)
Type: CL_DEVICE_TYPE_GPU
Driver version: 266.58
Version: OpenCL 1.0 CUDA
Compute capability: 1.3
Little endian: CL_TRUE
Error correction: CL_FALSE
Image support: CL_TRUE
Address bits: 32
Max compute units: 27
Clock frequency: 1104 Mhz
Global mem size: 939327488
Max mem alloc: 234831872
Global mem cache: 0
Cacheline size: 0
Local mem type: CL_LOCAL
Local mem size: 16384
Max const args: 9
Max const buf size: 65536
Max parameter size: 4352
Max work group size: 512
Max work item dim: 3
Max work item sizes: { 512, 512, 64 }
Mem base addr align: 2048
Min type align size: 128
Timer resolution: 1000 ns
Double extension: MW_CL_KHR_FP64
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
Found a compute capability 1.3 device.
Using -cl-nv-maxrregcount=32
Compiler flags: -cl-mad-enable -cl-no-signed-zeros -cl-strict-aliasing -cl-finite-math-only -DUSE_CL_MATH_TYPES=0 -DUSE_MAD=1 -DUSE_FMA=0 -cl-nv-verbose -cl-nv-maxrregcount=32 -DDOUBLEPREC=1 -DMILKYWAY_MATH_COMPILATION -DNSTREAM=3 -DFAST_H_PROB=1 -DAUX_BG_PROFILE=0 -DUSE_IMAGES=1 -DI_DONT_KNOW_WHY_THIS_DOESNT_WORK_HERE=0
Build status: CL_BUILD_SUCCESS
Build log:
: Considering profile 'compute_13' for gpu='sm_13' in 'cuModuleLoadDataEx_4'
: Retrieving binary for 'cuModuleLoadDataEx_4', for gpu='sm_13', usage mode=' --verbose --maxrregcount 32 '
: Considering profile 'compute_13' for gpu='sm_13' in 'cuModuleLoadDataEx_4'
: Control flags for 'cuModuleLoadDataEx_4' disable search path
: Ptx binary found for 'cuModuleLoadDataEx_4', architecture='compute_13'
: Ptx compilation for 'cuModuleLoadDataEx_4', for gpu='sm_13', ocg options=' --verbose --maxrregcount 32 '
ptxas info : Compiling entry function 'mu_sum_kernel' for 'sm_13'
ptxas info : Used 32 registers, 800+0 bytes lmem, 48+16 bytes smem, 56 bytes cmem[1], 4 bytes cmem[2], 4 bytes cmem[3], 4 bytes cmem[4], 4 bytes cmem[5], 4 bytes cmem[6]
Kernel work group info:
Work group size = 512
Kernel local mem size = 64
Compile work group size = { 0, 0, 0 }
Group size = 64, per CU = 8, threads per CU = 512
Block size = 13824
Desired = 163
Min sol: 163 13312
Lower n solution: n = 163, x = 13312
Higher n solution: n = 163, x = 13312
Using solution: n = 163, x = 13312
Range: { nu_steps = 640, mu_steps = 1600, r_steps = 1400 }
Iteration area: 2240000
Chunk estimate: 163
Num chunks: 163
Added area: 13312
Effective area: 2253312
Integration time: 957.124801 s. Average time per iteration = 1495.507502 ms
Kernel work group info:
Work group size = 512
Kernel local mem size = 64
Compile work group size = { 0, 0, 0 }
Group size = 64, per CU = 8, threads per CU = 512
Block size = 13824
Desired = 21
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Didn't find a solution. Using fallback solution n = 20, x = 0
Using solution: n = 20, x = 0
Range: { nu_steps = 160, mu_steps = 400, r_steps = 700 }
Iteration area: 280000
Chunk estimate: 21
Num chunks: 20
Added area: 0
Effective area: 280000
Global dimensions not divisible by local
Failed to find good run sizes
Failed to calculate integral 1
12:30:48 (2372): called boinc_finish
</stderr_txt>
]]>
It can't be a BOINC client version issue, because the ATIs are crashing on 6.10.56 while I'm using 6.10.17. I will set my boxes to NNW until further notice. |
Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
WU: de_separation_23_3s, GPU: NVIDIA GTX 260. This looks like the problem with GTX2xx cards and de_separation_23_3s WUs that Matt is aware of and is going to fix in the next version. Other WUs should run on your GPU. The only WU with an error I could still find in your list is Workunit 228247254, and there an ATI card (HD5850?) finished the WU and is waiting for validation (it did not crash). |
Joined: 18 Oct 07 Posts: 35 Credit: 4,684,314 RAC: 0 |
Yes, you're right. They have all been de_separation_23_3s WUs; the last one crashed just now. So I will wait for the fix to come. In the meantime, there are other projects waiting for my GPUs ;-))) WU: de_separation_23_3s |
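For anyone trying to read the GTX 260 stderr log posted above, the run-size numbers can be checked by hand. The sketch below is only a reading of the figures printed in that log, not the application's actual code: the function names, and the divisibility rule in `check_run_sizes`, are assumptions made for illustration. It assumes the block size is compute units times threads per CU, that the iteration area is `mu_steps * r_steps`, and that each chunk must be a whole multiple of the group size.

```python
import math

# Values copied from the stderr log (GTX 260, separation 0.50).
COMPUTE_UNITS = 27      # "Max compute units: 27"
THREADS_PER_CU = 512    # "threads per CU = 512"
GROUP_SIZE = 64         # "Group size = 64"

# Assumption: block size = compute units * threads per CU.
# This matches the logged "Block size = 13824".
BLOCK_SIZE = COMPUTE_UNITS * THREADS_PER_CU


def iteration_area(mu_steps, r_steps):
    """Assumed meaning of the log's 'Iteration area': mu_steps * r_steps."""
    return mu_steps * r_steps


def desired_chunks(area):
    """Assumed meaning of 'Desired': blocks needed to cover the area."""
    return math.ceil(area / BLOCK_SIZE)


def check_run_sizes(area, n_chunks, added_area):
    """Hypothetical check: every chunk must be a multiple of GROUP_SIZE.

    Returns (effective_area, items_per_chunk, ok).
    """
    effective = area + added_area
    per_chunk, leftover = divmod(effective, n_chunks)
    ok = leftover == 0 and per_chunk % GROUP_SIZE == 0
    return effective, per_chunk, ok


# First integral: mu_steps = 1600, r_steps = 1400, solution n = 163, x = 13312.
area_0 = iteration_area(1600, 1400)
print(desired_chunks(area_0), check_run_sizes(area_0, 163, 13312))
# prints: 163 (2253312, 13824, True)

# Second integral: mu_steps = 400, r_steps = 700, fallback n = 20, x = 0.
area_1 = iteration_area(400, 700)
print(desired_chunks(area_1), check_run_sizes(area_1, 20, 0))
# prints: 21 (280000, 14000, False)
```

Under that reading, the first integral splits into 163 chunks of 13824 items, which is a multiple of 64, while the fallback for the second integral gives 20 chunks of 14000 items, which 64 does not divide (the remainder is 48). That would be consistent with the "Global dimensions not divisible by local" message, though the log alone does not confirm the exact check the application performs.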
©2024 Astroinformatics Group