Nvidia OpenCL updated
Author | Message |
---|---|
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
I've updated the Nvidia/OpenCL application to 0.52 which should fix the failures on the 23* tasks. |
Joined: 24 Jan 11 Posts: 715 Credit: 555,377,969 RAC: 38,101 |
Matt, how do I find the name of the new 0.52 OpenCL app so I can update my app_info file to download it? I see it listed in the project apps list but can't figure out how to get it without reverting back to no app_info. I use app_info to change my count to .5. Thanks, Keith |
Joined: 24 Jan 11 Posts: 715 Credit: 555,377,969 RAC: 38,101 |
Never mind, I found the download directory. Keith |
Joined: 6 Oct 07 Posts: 1 Credit: 76,523,762 RAC: 10,718 |
All my Linux hosts seem to be failing all tasks with the new 0.52 OpenCL app. I have set them to no new work. So far my Windows hosts are doing fine with the new 0.52 OpenCL app. |
Joined: 24 Jan 11 Posts: 715 Credit: 555,377,969 RAC: 38,101 |
Yes, I am having computation errors with all tasks using the new .52 Linux OpenCL app also. Will revert back to the .50 app until it gets figured out. Keith |
Joined: 5 Jun 08 Posts: 21 Credit: 245,803,013 RAC: 0 |
I also have the same problem: all the 0.52 WUs are failing on my Ubuntu 10.10 Linux host http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=66288 I am using the NVIDIA 270.18 x64 driver with a GTX 260 and BOINC 6.10.58 x64, and have reset the project. Any help, please? |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"All my Linux hosts seem to be failing all tasks with the new 0.52 OpenCL app." I made a really dumb mistake in the Linux build. Should be fixed now (0.54). |
Joined: 24 Jan 11 Posts: 715 Credit: 555,377,969 RAC: 38,101 |
Matt, thanks for making the new build so quickly and fixing the problem. Running 0.54 quite well now. Keith |
Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
Matt, Thanks for the update. I too was having huge problems. Looking forward to producing good WUs again. Regards, Steve Ubuntu 10.04 |
Joined: 8 Jan 11 Posts: 1 Credit: 1,584,923 RAC: 0 |
Matt, I'm running the 0.52 version on two GTX 460s, and for some reason the cards will not finish a unit. It keeps working until I close BOINC. One unit ran up to 175%. I used to complete a unit in 12 minutes; now it takes hours. Any ideas how to fix this problem? Happy crunching, Landon Oswalt |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"Matt, I'm running the 0.52 version on two GTX 460s, and for some reason the cards will not finish a unit. It keeps working until I close BOINC. One unit ran up to 175%. I used to complete a unit in 12 minutes; now it takes hours. Any ideas how to fix this problem?" A bunch of workunits were started which were way too big and were taking too long on CPUs and many weaker GPUs. The total number of steps in the progress calculation was overflowing the 32-bit limit and wrapping around, causing progress bars to go over 100%. These should go away soon. |
Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
I'm still having major problems on my machines (Ubuntu 10.04 with GTX 460 cards). Up until about 5 days ago, things were running great, the cards were switching off with Einstein running 2 WUs simultaneously. Life was good. Now, despite detaching and re-attaching through BoincStats and doing anything else I can think of, I can't even get work units or the apps downloaded. The one machine that had a big backlog of v0.50 WUs was unaffected but has now finished all of those WUs. It has been showing 3 WUs for nbodySim 0.21 in "downloading" status for days, but nothing has come through. Einstein is having a field day with all of the GPUs to itself. Is there anything that I can do from here to get things going again with MW? Thanks for the help. Regards, Steve |
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
"I'm still having major problems on my machines (Ubuntu 10.04 with GTX460 cards). Up until about 5 days ago, things were running great, the cards were switching off with Einstein running 2 WU's simultaneously. Life was good." If you checked the server status page, you will have noticed that there are no work units to download. Something is messed up with the server at this moment. |
Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
Well, at least I won't be sending back bad WUs anymore due to computational errors. Hope it all gets sorted out. Judging from the various postings, it sounds like multiple unrelated problems. Regards, Steve |
Joined: 24 Jan 11 Posts: 715 Credit: 555,377,969 RAC: 38,101 |
I just switched back over to the Linux side and now have 14 tasks that exited with a compute error on the new 0.54 Linux OpenCL app. Could this be because of the recent incorrectly sized work that was sent out? Here is a shortened result from a task that errored out: <core_client_version>6.12.12</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> <stderr_txt> <search_application> milkywayathome separation 0.54 Linux x86_64 double OpenCL </search_application> Found 1 platforms Platform 0 information: Platform name: NVIDIA CUDA Platform version: OpenCL 1.0 CUDA 3.2.1 Platform vendor: Platform profile: Platform extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll Using device 0 on platform 0 Found 1 CL devices Device GeForce GTX 460 (NVIDIA Corporation:0x10de) Type: CL_DEVICE_TYPE_GPU Driver version: 260.19.36 Version: OpenCL 1.0 CUDA Compute capability: 2.1 Little endian: CL_TRUE Error correction: CL_FALSE Image support: CL_TRUE Address bits: 32 Max compute units: 7 Clock frequency: 1430 Mhz Global mem size: 1072889856 Max mem alloc: 268222464 Global mem cache: 114688 Cacheline size: 128 Local mem type: CL_LOCAL Local mem size: 49152 Max const args: 9 Max const buf size: 65536 Max parameter size: 4352 Max work group size: 1024 Max work item dim: 3 Max work item sizes: { 1024, 1024, 64 } Mem base addr align: 4096 Min type align size: 128 Timer resolution: 1000 ns Double extension: MW_CL_KHR_FP64 Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 Compiler flags: -cl-mad-enable -cl-no-signed-zeros -cl-strict-aliasing -cl-finite-math-only -DUSE_CL_MATH_TYPES=0 -DUSE_MAD=1 -DUSE_FMA=0 -cl-nv-verbose -DDOUBLEPREC=1 -DMILKYWAY_MATH_COMPILATION -DNSTREAM=1 -DFAST_H_PROB=1 -DAUX_BG_PROFILE=0 -DUSE_IMAGES=1 -DI_DONT_KNOW_WHY_THIS_DOESNT_WORK_HERE=1 Build status: CL_BUILD_SUCCESS Build log: : Considering profile 'compute_20' for gpu='sm_21' in 'cuModuleLoadDataEx_4' Kernel work group info: Work group size = 576 Kernel local mem size = 0 Compile work group size = { 0, 0, 0 } Group size = 64, per CU = 32, threads per CU = 2048 Block size = 14336 Desired = 367 Min sol: 1 0 Min sol: 1 0 Min sol: 1 0 Min sol: 1 0 Didn't find a solution. Using fallback solution n = 375, x = 0 Using solution: n = 375, x = 0 Range: { nu_steps = 1500, mu_steps = 3500, r_steps = 3000 } Iteration area: 10500000 Chunk estimate: 367 Num chunks: 375 Added area: 0 Effective area: 10500000 Block size: 14336 Global dimensions not divisible by local Failed to find good run sizes Failed to finish: CL_INVALID_COMMAND_QUEUE Failed to run nu step: CL_INVALID_COMMAND_QUEUE Failed to calculate integral 0 02:49:02 (2522): called boinc_finish </stderr_txt> ]]> |
Joined: 4 Aug 08 Posts: 1 Credit: 526,155 RAC: 0 |
All my 0.52 (cuda_opencl) WUs end within a second, or at most a few seconds, with a computation error. My laptop uses an NVIDIA GeForce GT 445M. What could be wrong? |
Joined: 6 Oct 09 Posts: 39 Credit: 78,881,405 RAC: 0 |
I'm seeing 100% failure with Win7 64-bit and GTX 260s. |
Joined: 24 Jan 11 Posts: 715 Credit: 555,377,969 RAC: 38,101 |
Other than the few WUs that were too large and errored out, I seem to be running OpenCL WUs successfully on my 64-bit Linux 0.54 app. I just looked at two WUs that were just reported, and they validated. Maybe Matt needs to look at the build of the 0.52 Windows app and see if he missed something like the obvious goof he made on the Linux 0.52 app. Keith |
Joined: 30 Apr 09 Posts: 101 Credit: 29,874,293 RAC: 0 |
Oh, what a pity. Has MW@h canceled the CUDA apps in favor of OpenCL? AFAIK, at least the 197.x NVIDIA driver is needed, but my machines need to stay on 190.38, which gives the best performance with the S@h stock CUDA23 app. I tested one MW@h WU with the new OpenCL app and it errored out immediately. Why did my machine with 190.38 get the OpenCL app at all? Can't the server send the OpenCL app only to hosts with at least the 197.x NVIDIA driver? If there are a lot of hosts with drivers older than 197.x out there, that wastes project server capacity: download, error, download, error, and so on. Is it possible to use the old app (IIRC 0.24 CUDA23) via app_info.xml, or does that app not work with the new WUs? BTW, in the past I saw a German translation of this site, but for a few months it has been English only. Is that a mistake or intentional? |
Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
"Ohh.. a pity, MW@h canceled the CUDA apps?" Please read this: http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=1505&nowrap=true#46230 CUDA should still work. |
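
Matt's explanation in the thread, that the total step count overflowed the 32-bit limit and wrapped around, can be illustrated with a minimal sketch. It assumes that only the total is held in a 32-bit unsigned counter; the thread does not say which variable wrapped or whether it was signed. The names `wrap_u32` and `progress_percent` and the step counts are hypothetical, picked so the result lands near the 175% reported above.

```python
UINT32_MODULUS = 2 ** 32  # a 32-bit unsigned counter holds 0 .. 2**32 - 1


def wrap_u32(value: int) -> int:
    """Return value as it would be stored in a 32-bit unsigned integer."""
    return value % UINT32_MODULUS


def progress_percent(steps_done: int, total_steps: int) -> float:
    """Progress as a percentage of the (possibly wrapped) total."""
    return 100.0 * steps_done / total_steps


# Hypothetical oversized workunit: the true step count exceeds 2**32.
true_total = 6_000_000_000
stored_total = wrap_u32(true_total)  # wraps to 1_705_032_704

# Halfway through the real work, the wrapped total is already exceeded.
steps_done = 3_000_000_000
print(f"true progress:      {progress_percent(steps_done, true_total):.1f}%")
print(f"displayed progress: {progress_percent(steps_done, stored_total):.1f}%")
```

Because the stored total is smaller than the real one, the ratio passes 100% long before the work is done, which matches the symptom of a progress bar that keeps climbing.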
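
The "Global dimensions not divisible by local" line in Keith's stderr output refers to an OpenCL 1.x rule: when a local work size is given, the global work size must be an exact multiple of it. The sketch below checks that rule and shows the common workaround of padding the global size up to the next multiple. It takes the effective area, chunk count, and group size from the log, but treating area divided by chunks as the per-launch global size is an assumption for illustration only; the log does not show how the application actually derives its run sizes, and the function names are hypothetical.

```python
def is_valid_ndrange(global_size: int, local_size: int) -> bool:
    """OpenCL 1.x requires global_size to be a multiple of local_size."""
    return global_size % local_size == 0


def round_up_to_multiple(global_size: int, local_size: int) -> int:
    """Pad global_size up to the next multiple of local_size."""
    return ((global_size + local_size - 1) // local_size) * local_size


effective_area = 10_500_000  # "Effective area" from the log
num_chunks = 375             # "Num chunks" from the log
group_size = 64              # "Group size" from the log

# Assumption: each chunk is launched with area / chunks work items.
chunk_size = effective_area // num_chunks
padded_size = round_up_to_multiple(chunk_size, group_size)

print(chunk_size, is_valid_ndrange(chunk_size, group_size))
print(padded_size, is_valid_ndrange(padded_size, group_size))
```

A kernel launched with a padded size has to skip the extra work items with a bounds check, so padding is one possible fix, not necessarily the one the project used.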
©2024 Astroinformatics Group