Welcome to MilkyWay@home

Nvidia OpenCL updated

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46058 - Posted: 8 Feb 2011, 1:00:11 UTC

I've updated the Nvidia/OpenCL application to 0.52 which should fix the failures on the 23* tasks.

Keith Myers
Joined: 24 Jan 11
Posts: 712
Credit: 553,280,724
RAC: 55,371
Message 46067 - Posted: 8 Feb 2011, 20:36:35 UTC - in response to Message 46058.  

Matt, how do I find the name of the new 0.52 OpenCL app so I can update my app_info file to download it? I see it listed in the project apps list but can't figure out how to get it without reverting back to no app_info. I use app_info to change my count to .5.

Thanks, Keith

Keith Myers
Joined: 24 Jan 11
Posts: 712
Credit: 553,280,724
RAC: 55,371
Message 46068 - Posted: 8 Feb 2011, 20:50:17 UTC - in response to Message 46067.  

Never mind, I found the download directory.

Keith

Paul Sands
Joined: 6 Oct 07
Posts: 1
Credit: 76,124,642
RAC: 583
Message 46069 - Posted: 8 Feb 2011, 21:04:17 UTC

All my Linux hosts seem to be failing all tasks with the new 0.52 OpenCL app.
I have set them to no new work. So far my Windows hosts are doing fine with the new 0.52 OpenCL app.

Keith Myers
Joined: 24 Jan 11
Posts: 712
Credit: 553,280,724
RAC: 55,371
Message 46070 - Posted: 8 Feb 2011, 21:23:51 UTC - in response to Message 46069.  

Yes, I am having computation errors with all tasks using the new .52 Linux OpenCL app also. Will revert back to the .50 app until it gets figured out.

Keith

[AF>EDLS]GuL
Joined: 5 Jun 08
Posts: 21
Credit: 245,803,013
RAC: 0
Message 46071 - Posted: 8 Feb 2011, 23:04:24 UTC - in response to Message 46070.  

I also have the same problem: all the 0.52 WUs are failing on my Ubuntu 10.10 Linux host: http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=66288
I am using Nvidia 270.18 x64 with a GTX 260 and BOINC 6.10.58 x64, and I have reset the project. Any help, please?

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46072 - Posted: 8 Feb 2011, 23:07:56 UTC - in response to Message 46069.  

Paul Sands wrote: "All my Linux hosts seem to be failing all tasks with the new 0.52 OpenCL app."
I made a really dumb mistake in the Linux build. Should be fixed now (0.54).

Keith Myers
Joined: 24 Jan 11
Posts: 712
Credit: 553,280,724
RAC: 55,371
Message 46076 - Posted: 9 Feb 2011, 7:09:35 UTC - in response to Message 46072.  

Matt, thanks for making the new build so quickly and fixing the problem. Running 0.54 quite well now.

Keith

europa
Joined: 29 Oct 10
Posts: 89
Credit: 39,246,947
RAC: 0
Message 46085 - Posted: 9 Feb 2011, 10:12:52 UTC - in response to Message 46072.  

Matt,

Thanks for the update. I too was having huge problems. Looking forward to producing good WUs again.

Regards,
Steve
Ubuntu 10.04

Landon Oswalt
Joined: 8 Jan 11
Posts: 1
Credit: 1,584,923
RAC: 0
Message 46103 - Posted: 9 Feb 2011, 20:40:03 UTC

Matt, I'm running version 0.52 on two GTX 460s and for some reason the cards will not finish a unit. It keeps working until I close BOINC. One unit ran up to 175%. I used to complete a unit in 12 minutes; now it takes hours. Any ideas how to fix this problem?

Happy crunching,
Landon Oswalt

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46107 - Posted: 9 Feb 2011, 20:53:56 UTC - in response to Message 46103.  

Landon Oswalt wrote: "Matt, I'm running version 0.52 on two GTX 460s and for some reason the cards will not finish a unit. It keeps working until I close BOINC. One unit ran up to 175%. I used to complete a unit in 12 minutes; now it takes hours. Any ideas how to fix this problem?"
A bunch of workunits were started which were way too big and taking too long on CPUs and many weaker GPUs. The total number of steps in the progress calculation was overflowing the 32 bit limit and wrapping around, causing progress bars to go over 100%. These should go away soon.
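
The wraparound described here can be illustrated with a short sketch. It assumes, purely for illustration, that the step total is the product of three step counts held in an unsigned 32-bit counter. The step counts are borrowed from the stderr log posted further down this thread, and the function names are hypothetical rather than taken from the application source.

```python
# Illustrative only: mimics an unsigned 32-bit step counter in Python.
# The step counts and the "total = product of steps" rule are assumptions,
# not taken from the MilkyWay@home source.

UINT32_MODULUS = 2 ** 32


def wrap_uint32(value: int) -> int:
    """Reduce an integer the way an unsigned 32-bit counter would."""
    return value % UINT32_MODULUS


def progress_fraction(steps_done: int, total_steps: int) -> float:
    """Fraction complete as a naive progress bar would compute it."""
    return steps_done / total_steps


# Hypothetical oversized workunit.
nu_steps, mu_steps, r_steps = 1500, 3500, 3000

true_total = nu_steps * mu_steps * r_steps   # exceeds 2**32
wrapped_total = wrap_uint32(true_total)      # what a 32-bit total would hold

# Once the work done passes the wrapped total, the bar exceeds 100%.
steps_done = true_total // 2                 # actually half finished
reported = progress_fraction(steps_done, wrapped_total)
actual = progress_fraction(steps_done, true_total)

print(f"true total:    {true_total}")
print(f"wrapped total: {wrapped_total}")
print(f"reported: {reported:.0%}, actual: {actual:.0%}")
```

Holding the total in a 64-bit integer would be one way to keep the reported fraction at or below 100% for workunits of this size.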

europa
Joined: 29 Oct 10
Posts: 89
Credit: 39,246,947
RAC: 0
Message 46113 - Posted: 9 Feb 2011, 23:02:50 UTC - in response to Message 46107.  

I'm still having major problems on my machines (Ubuntu 10.04 with GTX460 cards). Up until about 5 days ago, things were running great, the cards were switching off with Einstein running 2 WU's simultaneously. Life was good.

Now, despite detaching and re-attaching through BoincStats and doing anything else I can think of, I can't even get work units or the apps downloaded. The one machine that had a big backlog of v0.50 WUs was unaffected but has now finished all of those WUs. It has been showing 3 WUs for nbodySim 0.21 in "downloading" status for days, but nothing has come through.

Einstein is having a field day with all of the GPUs to itself.

Is there anything that I can do from here to get things going again with MW?

Thanks for the help.

Regards,
Steve

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46114 - Posted: 9 Feb 2011, 23:14:34 UTC - in response to Message 46113.  

europa wrote: "I'm still having major problems on my machines (Ubuntu 10.04 with GTX460 cards). [...] I can't even get work units or the apps downloaded. [...] Is there anything that I can do from here to get things going again with MW?"

If you checked the server status page, you will have noticed that there are no work units to download. Something is messed up with the server at this moment.

europa
Joined: 29 Oct 10
Posts: 89
Credit: 39,246,947
RAC: 0
Message 46137 - Posted: 10 Feb 2011, 17:41:05 UTC - in response to Message 46114.  

Well, at least I won't be sending back bad WU's anymore due to computational errors.

Hope it all gets sorted out. Judging from the various postings, it sounds like multiple unrelated problems.

Regards,
Steve

Keith Myers
Joined: 24 Jan 11
Posts: 712
Credit: 553,280,724
RAC: 55,371
Message 46139 - Posted: 10 Feb 2011, 19:08:03 UTC - in response to Message 46137.  

I just switched back over to the Linux side and now have 14 tasks that exited with a compute error on the new 0.54 Linux OpenCL app. Could this be because of the recent incorrectly sized work that was sent out? Here is a shortened result from a task that errored out:

<core_client_version>6.12.12</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
<search_application> milkywayathome separation 0.54 Linux x86_64 double OpenCL </search_application>
Found 1 platforms
Platform 0 information:
Platform name: NVIDIA CUDA
Platform version: OpenCL 1.0 CUDA 3.2.1
Platform vendor:
Platform profile:
Platform extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Using device 0 on platform 0
Found 1 CL devices
Device GeForce GTX 460 (NVIDIA Corporation:0x10de)
Type: CL_DEVICE_TYPE_GPU
Driver version: 260.19.36
Version: OpenCL 1.0 CUDA
Compute capability: 2.1
Little endian: CL_TRUE
Error correction: CL_FALSE
Image support: CL_TRUE
Address bits: 32
Max compute units: 7
Clock frequency: 1430 Mhz
Global mem size: 1072889856
Max mem alloc: 268222464
Global mem cache: 114688
Cacheline size: 128
Local mem type: CL_LOCAL
Local mem size: 49152
Max const args: 9
Max const buf size: 65536
Max parameter size: 4352
Max work group size: 1024
Max work item dim: 3
Max work item sizes: { 1024, 1024, 64 }
Mem base addr align: 4096
Min type align size: 128
Timer resolution: 1000 ns
Double extension: MW_CL_KHR_FP64
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64

Compiler flags:
-cl-mad-enable -cl-no-signed-zeros -cl-strict-aliasing -cl-finite-math-only -DUSE_CL_MATH_TYPES=0 -DUSE_MAD=1 -DUSE_FMA=0 -cl-nv-verbose -DDOUBLEPREC=1 -DMILKYWAY_MATH_COMPILATION -DNSTREAM=1 -DFAST_H_PROB=1 -DAUX_BG_PROFILE=0 -DUSE_IMAGES=1 -DI_DONT_KNOW_WHY_THIS_DOESNT_WORK_HERE=1

Build status: CL_BUILD_SUCCESS
Build log:

: Considering profile 'compute_20' for gpu='sm_21' in 'cuModuleLoadDataEx_4'
Kernel work group info:
Work group size = 576
Kernel local mem size = 0
Compile work group size = { 0, 0, 0 }
Group size = 64, per CU = 32, threads per CU = 2048
Block size = 14336
Desired = 367
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Min sol: 1 0
Didn't find a solution. Using fallback solution n = 375, x = 0
Using solution: n = 375, x = 0
Range: { nu_steps = 1500, mu_steps = 3500, r_steps = 3000 }
Iteration area: 10500000
Chunk estimate: 367
Num chunks: 375
Added area: 0
Effective area: 10500000
Block size: 14336
Global dimensions not divisible by local
Failed to find good run sizes
Failed to finish: CL_INVALID_COMMAND_QUEUE
Failed to run nu step: CL_INVALID_COMMAND_QUEUE
Failed to calculate integral 0
02:49:02 (2522): called boinc_finish

</stderr_txt>
]]>
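
The "Global dimensions not divisible by local" line in the log above matches a standard OpenCL 1.0 rule: when a local work size is passed to clEnqueueNDRangeKernel, each global dimension must be a whole multiple of it. Below is a minimal sketch of that check and of the usual round-up remedy. The helper names are hypothetical, and pairing the log's effective area with its group size of 64 is an assumption about how the application uses those numbers.

```python
# Sketch of the OpenCL 1.0 NDRange divisibility rule.
# Helper names are hypothetical; which log values play the roles of
# "global" and "local" size inside the application is an assumption.


def is_divisible(global_size: int, local_size: int) -> bool:
    """True if the global work size is a whole number of work groups."""
    return global_size % local_size == 0


def round_up(global_size: int, local_size: int) -> int:
    """Smallest multiple of local_size that is >= global_size."""
    groups = (global_size + local_size - 1) // local_size
    return groups * local_size


effective_area = 10_500_000   # "Effective area" from the log
group_size = 64               # "Group size" from the log

padded_area = round_up(effective_area, group_size)
print(is_divisible(effective_area, group_size))
print(padded_area, padded_area - effective_area)
```

If the global size is padded this way, the kernel has to skip the extra work items, typically with a bounds check on the global ID.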

Roel
Joined: 4 Aug 08
Posts: 1
Credit: 526,155
RAC: 0
Message 46217 - Posted: 13 Feb 2011, 14:46:57 UTC

All my 0.52 (cuda_opencl) WUs end within one or at most a few seconds with a computation error. My laptop uses an NVIDIA GeForce GT 445M. What could be wrong?

DanNeely
Joined: 6 Oct 09
Posts: 39
Credit: 78,881,405
RAC: 0
Message 46227 - Posted: 13 Feb 2011, 22:41:01 UTC

I'm seeing 100% failure with Win7-64 and GTX 260s.

Keith Myers
Joined: 24 Jan 11
Posts: 712
Credit: 553,280,724
RAC: 55,371
Message 46229 - Posted: 14 Feb 2011, 2:03:56 UTC - in response to Message 46227.  

Other than the few WUs that were too large and errored out, I seem to be running OpenCL WUs on my 64-bit Linux 0.54 app successfully. I just looked at two WUs that were just reported, and they validated. Maybe Matt needs to look at the build of the 0.52 Windows app and see if he missed something like the obvious goof he made on the Linux 0.52 app.

Keith

Dirk Sadowski
Joined: 30 Apr 09
Posts: 101
Credit: 29,874,293
RAC: 0
Message 46231 - Posted: 14 Feb 2011, 3:39:12 UTC

Oh, a pity. Has MW@h canceled the CUDA apps?

Now OpenCL? AFAIK, at least the 197.x Nvidia driver is needed.
But my machines need to stay with 190.38, which gives the best performance with the S@h stock CUDA23 app.

I tested one MW@h WU with the new OpenCL app: immediate error.
But why did my machine with 190.38 get the OpenCL app?
Is it not possible (via the server) to send out the OpenCL app only to hosts with at least the 197.x Nvidia driver?
If there are a lot of hosts with drivers older than 197.x out there, that wastes project server performance: download/error, download/error and download/error.

Is it possible via app_info.xml to use the old (IIRC 0.24 CUDA23) app?
Or does this app not work with the new WUs?

BTW, in the past I saw a German translation of this site. For the past few months it has been English only. Is that a mistake or intended?

Werkstatt
Joined: 19 Feb 08
Posts: 350
Credit: 141,284,369
RAC: 0
Message 46234 - Posted: 14 Feb 2011, 6:13:10 UTC - in response to Message 46231.  

Dirk Sadowski wrote: "Oh, a pity. Has MW@h canceled the CUDA apps? [...] Is it possible via app_info.xml to use the old (IIRC 0.24 CUDA23) app? Or does this app not work with the new WUs?"

Please read this: http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=1505&nowrap=true#46230
CUDA should still work.

©2024 Astroinformatics Group