OpenCL for Nvidia available for testing

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44588 - Posted: 3 Dec 2010 | 2:07:51 UTC
Last modified: 8 Dec 2010 | 7:20:57 UTC

The OpenCL application for Nvidia GPUs is ready for testing on Windows and Linux x86_64. I'm particularly interested in the performance / responsiveness tradeoff on mid- to low-range GPUs.

Many thanks to cncguru for donating his GTX 480. Without it, the application would be about 30% slower than it is.

http://milkyway.cs.rpi.edu/milkyway/download/test/milkyway_separation_0.48_x86_64-pc-linux-gnu__cuda_opencl.tar.gz
http://milkyway.cs.rpi.edu/milkyway/download/test/milkyway_separation_0.48_windows_intelx86__cuda_opencl.zip

Extract these to the project directory. On Windows this is something like C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway
On Ubuntu for me, this is /var/lib/boinc-client/projects/milkyway.cs.rpi.edu_milkyway

Minor update:
http://milkyway.cs.rpi.edu/milkyway/download/test/milkyway_separation_0.48.1_windows_intelx86__cuda_opencl.zip
http://milkyway.cs.rpi.edu/milkyway/download/test/milkyway_separation_0.48.1_x86_64-pc-linux-gnu__cuda_opencl.tar.gz

Another minor update:
http://milkyway.cs.rpi.edu/milkyway/download/test/milkyway_separation_0.48.2_windows_intelx86__cuda_opencl.zip
http://milkyway.cs.rpi.edu/milkyway/download/test/milkyway_separation_0.48.2_x86_64-pc-linux-gnu__cuda_opencl.tar.gz

Profile mdhittle*
Avatar
Send message
Joined: 25 Jun 10
Posts: 284
Credit: 260,490,091
RAC: 0
Message 44590 - Posted: 3 Dec 2010 | 2:39:10 UTC

@ Matt - do you have a link for the download or will it download automatically as a stock app?

-Mike
____________

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44591 - Posted: 3 Dec 2010 | 2:42:52 UTC - in response to Message 44590.

@ Matt - do you have a link for the download or will it download automatically as a stock app?
You have to manually download from the links I posted. I'm not putting it up as stock yet. I think the server still needs to be updated for that to happen, and I want to hear if the system responsiveness is a problem on some lower end GPUs.

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 914
Credit: 74,781,320
RAC: 237
Message 44592 - Posted: 3 Dec 2010 | 4:01:27 UTC

I just installed it on my GTX460 host and will see what it does on there.
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=179088
____________

Profile DoctorNow
Avatar
Send message
Joined: 28 Aug 07
Posts: 146
Credit: 5,183,509
RAC: 0
Message 44604 - Posted: 3 Dec 2010 | 10:47:58 UTC
Last modified: 3 Dec 2010 | 10:49:23 UTC

Just tried to run that version on my GTX260.
But I only get WUs that error out immediately like this one.
Do I need something else for it to run?
____________
Member of BOINC@Heidelberg and ATA!

My BOINCstats

Profile Werkstatt
Send message
Joined: 19 Feb 08
Posts: 297
Credit: 105,742,273
RAC: 84
Message 44606 - Posted: 3 Dec 2010 | 12:15:01 UTC

I just downloaded and installed it. It fails immediately. Is there anything more to do?
win7-64, nVidia 460. Forceware 259.19, BM 6.10.58
Alexander

Profile [AF>EDLS] Polynesia
Avatar
Send message
Joined: 5 Apr 09
Posts: 71
Credit: 3,192,490
RAC: 2,617
Message 44607 - Posted: 3 Dec 2010 | 14:51:36 UTC

Hello,

It works for me; however, I see only one unit in my stats, which is validated, and none in error or pending?!

Even though I returned well over a ....
____________
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470

CTAPbIi
Send message
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 44608 - Posted: 3 Dec 2010 | 16:24:46 UTC

Thanks a lot, Matt :-) I'll try it later 2day on ubuntu x64.
____________

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 914
Credit: 74,781,320
RAC: 237
Message 44609 - Posted: 3 Dec 2010 | 16:25:17 UTC

If I remember correctly, you will need at least a 26x.00 driver to use the openCL app.

Upgrade to the newest and see if it runs correctly.
____________

CTAPbIi
Send message
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 44611 - Posted: 3 Dec 2010 | 16:59:02 UTC

BTW, Matt, can I run OpenCL app on GTX275 and which driver's and CUDA versions I need? I've got 195.30 and cuda 2.3
____________

Profile Werkstatt
Send message
Joined: 19 Feb 08
Posts: 297
Credit: 105,742,273
RAC: 84
Message 44612 - Posted: 3 Dec 2010 | 17:04:39 UTC - in response to Message 44609.

If I remember correctly, you will need at least a 26x.00 driver to use the openCL app.

Upgrade to the newest and see if it runs correctly.


Good idea ...
THX, seems running now.

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44613 - Posted: 3 Dec 2010 | 17:06:12 UTC - in response to Message 44611.

BTW, Matt, can I run OpenCL app on GTX275 and which driver's and CUDA versions I need? I've got 195.30 and cuda 2.3
The minimum driver which is supposed to work is 197.13, but I've only actually tested with the latest drivers.
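
For anyone who wants to check a driver string against that minimum, here is a minimal sketch. The 197.13 figure is simply the one quoted above and is untested on older drivers; other posts in this thread suggest 260.x is what works in practice. The function names are made up for illustration.

```python
def parse_version(text):
    """Turn a dotted driver string such as '260.19.21' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Assumption: minimum driver as quoted in this thread, not independently verified.
MIN_DRIVER = parse_version("197.13")


def driver_is_new_enough(installed, minimum=MIN_DRIVER):
    """Compare versions component by component rather than as strings."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    for driver in ("195.30", "197.13", "260.19.21"):
        print(driver, driver_is_new_enough(driver))
```

Comparing tuples avoids the string-comparison trap where "97.10" would sort after "260.99".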

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 914
Credit: 74,781,320
RAC: 237
Message 44614 - Posted: 3 Dec 2010 | 17:06:59 UTC - in response to Message 44611.

BTW, Matt, can I run OpenCL app on GTX275 and which driver's and CUDA versions I need? I've got 195.30 and cuda 2.3


http://www.nvidia.com/object/linux-display-amd64-260.19.21-driver.html
____________

CTAPbIi
Send message
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 44615 - Posted: 3 Dec 2010 | 17:10:59 UTC

Thanks, guys :-)

The only thing - I tested that on GPUGRID, and there CUDA 2.3 is way faster than CUDA 3.0 and just a bit faster than 3.1... But NP, I'll try the latest drivers...
____________

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44616 - Posted: 3 Dec 2010 | 17:15:37 UTC - in response to Message 44604.

Just tried to run that version on my GTX260.
But I only get WUs that error out immediately like this one.
Do I need something else for it to run?
The workunit is gone now, so I can't tell.

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44617 - Posted: 3 Dec 2010 | 17:18:09 UTC - in response to Message 44588.

Is anyone running it in a system with both ATI and Nvidia drivers installed? I just realized a hypothetical problem that might happen.

CTAPbIi
Send message
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 44618 - Posted: 3 Dec 2010 | 17:21:46 UTC
Last modified: 3 Dec 2010 | 17:27:25 UTC

ok, I understand - I have to move to CUDA 3.0 at least.

The latest driver (260.19.21) is on CUDA 3.2, so I think it's necessary to get cudatoolkit 3.2 from here
____________

Profile [AF>EDLS] Polynesia
Avatar
Send message
Joined: 5 Apr 09
Posts: 71
Credit: 3,192,490
RAC: 2,617
Message 44619 - Posted: 3 Dec 2010 | 17:22:51 UTC

Why don't the results for these test units stay listed in the stats on my project account?
____________
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470

Profile DoctorNow
Avatar
Send message
Joined: 28 Aug 07
Posts: 146
Credit: 5,183,509
RAC: 0
Message 44620 - Posted: 3 Dec 2010 | 17:43:12 UTC - in response to Message 44609.

If I remember correctly, you will need at least a 26x.00 driver to use the openCL app.

Upgrading the driver did the trick, you were right on this. I had a version beneath 260.x. Thanx for that hint!
The WUs are running fine now. :-)
____________
Member of BOINC@Heidelberg and ATA!

My BOINCstats

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44621 - Posted: 3 Dec 2010 | 17:46:29 UTC - in response to Message 44619.

Why don't the results for these test units stay listed in the stats on my project account?
I don't know. I consistently have a problem with this.

CTAPbIi
Send message
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 44622 - Posted: 3 Dec 2010 | 17:46:54 UTC - in response to Message 44620.
Last modified: 3 Dec 2010 | 17:47:24 UTC

The WUs are running fine now. :-)

how many secs on which card?
____________

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 914
Credit: 74,781,320
RAC: 237
Message 44623 - Posted: 3 Dec 2010 | 17:53:12 UTC

Here is the stderr that came up on my 460.

Run time 614.917938
CPU time 31.6875
stderr out

<core_client_version>6.12.6</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkywayathome separation 0.48 Windows x86 double OpenCL </search_application>
Found 1 platforms
Platform 0 information:
Platform name: NVIDIA CUDA
Platform version: OpenCL 1.0 CUDA 3.2.1
Platform vendor:
Platform profile:
Platform extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Using device 0 on platform 0
Found 1 CL devices
Device GeForce GTX 460 (NVIDIA Corporation:0x10de)
Type: CL_DEVICE_TYPE_GPU
Driver version: 263.06
Version: OpenCL 1.0 CUDA
Compute capability: 2.1
Little endian: CL_TRUE
Error correction: CL_FALSE
Image support: CL_TRUE
Address bits: 32
Max compute units: 7
Clock frequency: 1600 Mhz
Global mem size: 804847616
Max mem alloc: 201211904
Global mem cache: 114688
Cacheline size: 128
Local mem type: CL_LOCAL
Local mem size: 49152
Max const args: 9
Max const buf size: 65536
Max parameter size: 4352
Max work group size: 1024
Max work item dim: 3
Max work item sizes: { 1024, 1024, 64 }
Mem base addr align: 4096
Min type align size: 128
Timer resolution: 1000 ns
Double extension: MW_CL_KHR_FP64
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64

Compiler flags:
-cl-mad-enable -cl-no-signed-zeros -cl-strict-aliasing -cl-finite-math-only -DUSE_CL_MATH_TYPES=0 -DUSE_MAD=0 -DUSE_FMA=0 -cl-nv-verbose -DDOUBLEPREC=1 -DMILKYWAY_MATH_COMPILATION -DNSTREAM=3 -DFAST_H_PROB=1 -DAUX_BG_PROFILE=0 -DUSE_IMAGES=1 -DI_DONT_KNOW_WHY_THIS_DOESNT_WORK_HERE=1

Build status: CL_BUILD_SUCCESS
Build log:

: Considering profile 'compute_20' for gpu='sm_21' in 'cuModuleLoadDataEx_4'
Kernel work group info:
Work group size = 512
Kernel local mem size = 0
Compile work group size = { 0, 0, 0 }
Lower n solution: n = 40, x = 0
Higher n solution: n = 40, x = 0
Using solution: n = 40, x = 0
Range: { nu_steps = 640, mu_steps = 1600, r_steps = 1400 }
Iteration area: 2240000
Chunk estimate: 40
Num chunks: 40
Added area: 0
Effective area: 2240000
Integration time: 584.577879 s. Average time per iteration = 913.402936 ms
<background_integral> 0.00057622614096211838 </background_integral>
<stream_integrals> 105.68890505238272000000 172.60556751830708000000 161.27537590087846000000 </stream_integrals>
<background_only_likelihood> -3.29958249045773530000 </background_only_likelihood>
<stream_only_likelihood> -37.34692154949922100000 -4.75605645991618700000 -3.79966623105932080000 </stream_only_likelihood>
<search_likelihood> -3.00484287881831240000 </search_likelihood>
10:44:59 (2892): called boinc_finish

</stderr_txt>
]]>

Validate state Valid
Claimed credit 0.177164871378224
Granted credit 213.760359413782
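
A quick sanity check of the numbers in that log, as a sketch. It assumes (this is an inference from the figures, not something the application documents here) that one "iteration" corresponds to one nu step and that the iteration area is mu_steps times r_steps. The excerpt is copied from the stderr above; the helper name is hypothetical.

```python
import re

# Lines copied from the stderr output above.
STDERR_EXCERPT = """
Range: { nu_steps = 640, mu_steps = 1600, r_steps = 1400 }
Iteration area: 2240000
Num chunks: 40
Integration time: 584.577879 s. Average time per iteration = 913.402936 ms
"""


def field(pattern, text):
    """Return the first captured number in text as a float."""
    return float(re.search(pattern, text).group(1))


nu_steps = field(r"nu_steps = (\d+)", STDERR_EXCERPT)
mu_steps = field(r"mu_steps = (\d+)", STDERR_EXCERPT)
r_steps = field(r"r_steps = (\d+)", STDERR_EXCERPT)
num_chunks = field(r"Num chunks: (\d+)", STDERR_EXCERPT)
integration_s = field(r"Integration time: ([\d.]+) s", STDERR_EXCERPT)

# Assumed relationships; both reproduce the values printed in the log.
iteration_area = mu_steps * r_steps
ms_per_iteration = integration_s / nu_steps * 1000.0
area_per_chunk = iteration_area / num_chunks

if __name__ == "__main__":
    print(iteration_area, area_per_chunk, round(ms_per_iteration, 3))
```

If those assumptions hold, each of the 40 chunks covers 56,000 points of the mu/r grid, and the roughly 913 ms per iteration is spread over those chunks.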

____________

CTAPbIi
Send message
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 44624 - Posted: 3 Dec 2010 | 18:02:28 UTC
Last modified: 3 Dec 2010 | 18:04:44 UTC

615 secs??? wow... even if u run 2 WUs concurrently, it's too much. My 4870 runs it in 325 secs and my 4890 in 312 secs. But taking into consideration that dp was cropped by nvidia on fermi cards...

But anyway I'll try it on GTX275
____________

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 914
Credit: 74,781,320
RAC: 237
Message 44625 - Posted: 3 Dec 2010 | 18:06:01 UTC

Yes I know, my 5830 runs the same type of unit in 130 seconds.
____________

CTAPbIi
Send message
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 44628 - Posted: 3 Dec 2010 | 18:19:45 UTC
Last modified: 3 Dec 2010 | 18:22:57 UTC

So, there is room for improvement :-)
____________

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44630 - Posted: 3 Dec 2010 | 18:24:45 UTC - in response to Message 44628.

So, there is some room for improvement :-)
The theoretical performance of doubles on ATI hardware is much higher than on Nvidia, so it's not expected to match that. It's expected to be about the same or marginally faster than the old CUDA application.

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 914
Credit: 74,781,320
RAC: 237
Message 44631 - Posted: 3 Dec 2010 | 18:30:19 UTC - in response to Message 44630.

So, there is some room for improvement :-)
The theoretical performance of doubles on ATI hardware is much higher than on Nvidia, so it's not expected to match that. It's expected to be about the same or marginally faster than the old CUDA application.


With the old CUDA app that Crunch3r fixed for Fermi cards I was running around 11 minutes for the 213 point units.
____________

CTAPbIi
Send message
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 44632 - Posted: 3 Dec 2010 | 18:32:30 UTC

yep, agree. dp is 1/8 of sp instead of 1/2... so, there are no changes in my plans to get a pair of 6970s :-)

when do u plan to release the OpenCL app for ATI? :-)
____________

Profile Werkstatt
Send message
Joined: 19 Feb 08
Posts: 297
Credit: 105,742,273
RAC: 84
Message 44636 - Posted: 3 Dec 2010 | 21:27:07 UTC

I have now a couple of wu's finished.
11min 34sec is a typical time for de_separation_16_3s_fix_1_1137162_...
GTX460 @ 715MHz, win7-64, E8400, BM 6.10.58, Forceware 260.99
GPU-Usage ~98%, Mem usage 255MB (Afterburner)
No invalid or errors logged (but they don't stay long..)

One World, One Dream
Send message
Joined: 26 Dec 09
Posts: 1
Credit: 615,993
RAC: 0
Message 44638 - Posted: 3 Dec 2010 | 22:39:42 UTC

I have tested the new OpenCL app with a Geforce GT 420m notebook GPU.

When I used Crunch3r's optimized app in the past, my work units took either 49 or 73 minutes to complete. With the new OpenCL app, work units need either 51 or 75 minutes to complete. I do not know why the new app is actually a bit slower; maybe it is because I was surfing the web and displaying some images while running the work units? (Though simple tasks like this never seemed to affect the work units of the old app by Crunch3r.)

Furthermore, there is a significant difference in system responsiveness and a slight difference in GPU temperature between the two apps.

While Crunch3r's app did not cause any negative effects regarding system responsiveness, the new OpenCL app causes the system to react very sluggishly, so that comfortably writing something or surfing the web is not possible anymore.

Lastly, GPU temperature with the OpenCL app is about 3 degrees Celsius higher than with Crunch3r's app. That is why I have now switched back to the old app for the time being.

I hope my information gathered about the performance of the new app is helpful.

europa
Send message
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44644 - Posted: 4 Dec 2010 | 1:00:55 UTC - in response to Message 44630.

Matt,

I followed the link above to extract the tar file into the MW sub-dir under boinc-client on /var however I am still getting the exact same error message about the app not finding a double-precision card even though it accurately identifies the Fermi. I'm on 64-bit Ubuntu.

In addition, all of the WU's are id'd as cuda 23 units, there is no mention in the error message or in the WU log of Open CL.

I notice that in the app_info.xml it refers to:

<file_info>
<name>milkyway_separation_0.48_x86_64-pc-linux-gnu__cuda_opencl</name>
<executable/>
</file_info>

However, there is no such file. The only executable that it unpacked was:
milkyway_0.24_x86_64-pc-linux-gnu__cuda23

Also, does:
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>

refer to the number of processors or to its ID number? Mine always comes up as GPU0 which is why I ask.

Thanks,
Steve

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44645 - Posted: 4 Dec 2010 | 1:11:20 UTC - in response to Message 44644.

Matt,

I followed the link above to extract the tar file into the MW sub-dir under boinc-client on /var however I am still getting the exact same error message about the app not finding a double-precision card even though it accurately identifies the Fermi. I'm on 64-bit Ubuntu.

In addition, all of the WU's are id'd as cuda 23 units, there is no mention in the error message or in the WU log of Open CL.

I notice that in the app_info.xml it refers to:

<file_info>
<name>milkyway_separation_0.48_x86_64-pc-linux-gnu__cuda_opencl</name>
<executable/>
</file_info>

However, there is no such file. The only executable that it unpacked was:
milkyway_0.24_x86_64-pc-linux-gnu__cuda23
Well BOINC is rather eager to delete anything that isn't mentioned in any of the XML files. It looks like something else was wrong, and then this got deleted and it attempted to download and use the CUDA one. You might need to chown what you extract to boinc:boinc for it to work. It seems to be unhappy when the boinc user doesn't own the files.
Also, does:
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>

refer to the number of processors or to its ID number?
It's the count of GPUs that will be used. The application only uses 1, so it should always be 1.
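
On the missing-executable question above: since BOINC removes files it does not know about, it can help to confirm that every file declared in app_info.xml is actually present in the project directory before restarting the client. A rough sketch using only the standard library; the sample XML is a trimmed illustration built from the fragments quoted in this thread, not the full file shipped in the archive, and the function names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET

# Trimmed, illustrative sample; a real app_info.xml has more fields.
SAMPLE_APP_INFO = """
<app_info>
  <app><name>milkyway</name></app>
  <file_info>
    <name>milkyway_separation_0.48_x86_64-pc-linux-gnu__cuda_opencl</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <plan_class>cuda_opencl</plan_class>
    <coproc><type>CUDA</type><count>1</count></coproc>
    <file_ref>
      <file_name>milkyway_separation_0.48_x86_64-pc-linux-gnu__cuda_opencl</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""


def declared_files(xml_text):
    """Names listed in <file_info><name> elements directly under <app_info>."""
    root = ET.fromstring(xml_text.strip())
    return [el.text.strip() for el in root.findall("file_info/name")]


def missing_on_disk(xml_text, project_dir):
    """Declared files that do not exist as regular files in project_dir."""
    return [name for name in declared_files(xml_text)
            if not os.path.isfile(os.path.join(project_dir, name))]


if __name__ == "__main__":
    # The current directory stands in for the BOINC project directory here.
    print(missing_on_disk(SAMPLE_APP_INFO, "."))
```

An empty list means every declared file is present; it says nothing about whether the boinc user can read or execute them.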

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44646 - Posted: 4 Dec 2010 | 2:07:15 UTC - in response to Message 44645.

Well BOINC is rather eager to delete anything that isn't mentioned in any of the XML files. It looks like something else was wrong, and then this got deleted and it attempted to download and use the CUDA one. You might need to chown what you extract to boinc:boinc for it to work. It seems to be unhappy when the boinc user doesn't own the files.
Actually I just checked this. The files don't need to be owned by boinc, but if they aren't, you need to be in the boinc group and the files need to be group readable and executable.
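
A small sketch for checking the permission bits described here. It only looks at the group read and execute bits of a file; it does not check which group owns the file or whether you are a member of the boinc group. The function name is made up for illustration.

```python
import os
import stat


def group_can_read_and_execute(path):
    """True if the file's mode grants both read and execute to its group."""
    mode = os.stat(path).st_mode
    needed = stat.S_IRGRP | stat.S_IXGRP
    return mode & needed == needed


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        print(path, group_can_read_and_execute(path))
```

Run it with the extracted files as arguments; anything reported as False is a candidate for a chmod.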

tolafoph
Send message
Joined: 24 Nov 10
Posts: 1
Credit: 41,702
RAC: 0
Message 44654 - Posted: 4 Dec 2010 | 10:14:19 UTC - in response to Message 44646.

Hi,
I'm new to milkyway@home, so I don't know exactly how long the WUs took with the CUDA app, but here are the numbers for the new OpenCL app.

Boinc 6.10.58, E6750@3.2GHz, GTX 260-216, driver 260.99, vista 64

1,689.29 sec 320.63 credits
914.38 sec 213.76 credits
960.59 sec 213.76 credits

GPU usage @ ~90%

europa
Send message
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44657 - Posted: 4 Dec 2010 | 12:38:46 UTC - in response to Message 44645.

Matt,

Thanks for the feedback. I see where I wasn't the owner for some of the files.

I think that I've caught all of them.

It sounds like I should purge the WU's in progress and re-extract the tar?

Thanks,
Steve

Evil Penguin
Send message
Joined: 9 Nov 09
Posts: 9
Credit: 19,725,158
RAC: 0
Message 44660 - Posted: 4 Dec 2010 | 14:43:30 UTC

Sorry to go a bit off topic, but will the ATi OpenCL version come out soon?

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 44664 - Posted: 4 Dec 2010 | 16:22:00 UTC - in response to Message 44624.

615 secs??? wow... even if u run 2 WUs concurrently, it's too much. My 4870 runs it in 325 secs and my 4890 in 312 secs. But taking into consideration that dp was cropped by nvidia on fermi cards...

But anyway I'll try it on GTX275


nvidia GPUs aren't nearly as fast as the ATI GPUs for double precision calculations. So that's really not too bad.
____________

Zeddicus
Send message
Joined: 30 May 10
Posts: 2
Credit: 2,351
RAC: 0
Message 44675 - Posted: 4 Dec 2010 | 18:13:46 UTC - in response to Message 44664.

I'm taking part in some cpu-based projects (like climateprediction.net and yoyo@home) and was looking for another project to run on my gpu (besides SETI). In the past milkyway told me that my gpu was lacking memory so i thought that maybe the OpenCL version would run. After installing the package and updating my NVIDIA driver to 260.99 I was happy to get some WUs - but they all ended up with "calculating error". - Okay, let's do it step by step... So at first I've updated Boinc to 6.10.58. Now milkyway says at start-up "Message from server: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation."

Hmmh, what does that tell me? Did I make any mistake? Or do I need the previously mentioned 3.2 cudatoolkit from that guy "Crunch3r"? Any help is appreciated!

By the way: GeForce 8800 GTS (driver 26099, CUDA version 3020, compute capability 1.0).

Greetings from Germany,
Axel

P.S.: Bad English? Maybe that's because I've left school 25 years ago...

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44676 - Posted: 4 Dec 2010 | 18:44:15 UTC - in response to Message 44675.

By the way: GeForce 8800 GTS (driver 26099, CUDA version 3020, compute capability 1.0).
That GPU doesn't have doubles and won't work. It needs at least compute capability 1.3.
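
As a sketch of that rule: BOINC and the application's stderr both report compute capability as a "major.minor" string, so the check comes down to a tuple comparison against 1.3. The function name is hypothetical; the threshold is the one stated above.

```python
# Threshold as stated in this thread: doubles need compute capability 1.3+.
MIN_DOUBLE_CC = (1, 3)


def supports_doubles(compute_capability):
    """Parse a 'major.minor' string and compare it against the minimum."""
    major, minor = (int(part) for part in compute_capability.split("."))
    return (major, minor) >= MIN_DOUBLE_CC


if __name__ == "__main__":
    # 8800 GTS reports 1.0; the GTX 460 log earlier in the thread reports 2.1.
    for cc in ("1.0", "1.3", "2.1"):
        print(cc, supports_doubles(cc))
```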

Profile [AF>EDLS] Polynesia
Avatar
Send message
Joined: 5 Apr 09
Posts: 71
Credit: 3,192,490
RAC: 2,617
Message 44677 - Posted: 4 Dec 2010 | 18:45:41 UTC
Last modified: 4 Dec 2010 | 18:49:39 UTC

app_info: try with this file:

<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <name>milkyway_0.45_windows_x86_64.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>45</version_num>
    <file_ref>
      <file_name>milkyway_0.45_windows_x86_64.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>

  <app>
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody Simulation</user_friendly_name>
  </app>
  <file_info>
    <name>milkyway_nbody_0.21_windows_x86_64__sse2.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>21</version_num>
    <file_ref>
      <file_name>milkyway_nbody_0.21_windows_x86_64__sse2.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>

  <app>
    <name>milkyway</name>
    <user_friendly_name>Milkyway@home Separation</user_friendly_name>
  </app>
  <file_info>
    <name>milkyway_separation_0.48_windows_intelx86__cuda_opencl.exe</name>
    <executable/>
  </file_info>

  <app_version>
    <app_name>milkyway</app_name>
    <version_num>47</version_num>
    <plan_class>cuda_opencl</plan_class>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>0.05</max_ncpus>
    <flops>1.0e11</flops>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>milkyway_separation_0.48_windows_intelx86__cuda_opencl.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>

  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <name>milkyway_windows_intelx86__cuda23.exe</name>
    <executable/>
  </file_info>
  <file_info>
    <name>cudart.dll</name>
    <executable/>
  </file_info>
  <file_info>
    <name>cutil32.dll</name>
    <executable/>
  </file_info>

  <app_version>
    <app_name>milkyway</app_name>
    <version_num>24</version_num>
    <plan_class>cuda23</plan_class>
    <flops>1.0e11</flops>
    <avg_ncpus>0.1</avg_ncpus>
    <max_ncpus>0.1</max_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>1.0</count>
    </coproc>
    <cmdline></cmdline>
    <file_ref>
      <file_name>milkyway_windows_intelx86__cuda23.exe</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>cudart.dll</file_name>
    </file_ref>
    <file_ref>
      <file_name>cutil32.dll</file_name>
    </file_ref>
  </app_version>

</app_info>
____________
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470

Zeddicus
Send message
Joined: 30 May 10
Posts: 2
Credit: 2,351
RAC: 0
Message 44678 - Posted: 4 Dec 2010 | 18:56:24 UTC - in response to Message 44676.
Last modified: 4 Dec 2010 | 18:56:53 UTC

By the way: GeForce 8800 GTS (driver 26099, CUDA version 3020, compute capability 1.0).
That GPU doesn't have doubles and won't work. It needs at least compute capability 1.3.

Thanks for the information! But why did milkyway send me any WUs then?

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44681 - Posted: 4 Dec 2010 | 20:31:43 UTC - in response to Message 44678.

By the way: GeForce 8800 GTS (driver 26099, CUDA version 3020, compute capability 1.0).
That GPU doesn't have doubles and won't work. It needs at least compute capability 1.3.

Thanks for the information! But why did milkyway send me any WUs then?
If you manually installed this, the server will try sending workunits for it. Normally you wouldn't be sent the Nvidia applications, since your GPU doesn't have doubles.

Profile Werkstatt
Send message
Joined: 19 Feb 08
Posts: 297
Credit: 105,742,273
RAC: 84
Message 44682 - Posted: 4 Dec 2010 | 22:19:57 UTC - in response to Message 44617.

Is anyone running it in a system with both ATI and Nvidia drivers installed? I just realized a hypothetical problem that might happen.


I have now a system running which has GTX460 and HD4850, running both MW and Collatz on both GPU's. No problems seen in the last 12 hours.
http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=168415

Profile cenit
Send message
Joined: 16 Mar 09
Posts: 58
Credit: 1,122,610
RAC: 0
Message 44683 - Posted: 4 Dec 2010 | 22:29:31 UTC - in response to Message 44681.
Last modified: 4 Dec 2010 | 22:29:53 UTC

By the way: GeForce 8800 GTS (driver 26099, CUDA version 3020, compute capability 1.0).
That GPU doesn't have doubles and won't work. It needs at least compute capability 1.3.

Thanks for the information! But why did milkyway send me any WUs then?
If you manually installed this, the server will try sending workunits for it. Normally you wouldn't be sent the Nvidia applications, since your GPU doesn't have doubles.

To be honest, this is not the way BOINC should be set up to work.
On your server, you should analyze the host requesting work, and even if the user has installed the app manually, the server should not send work when the host doesn't satisfy all the requirements.
If you leave it this way, it is really easy to trick your server into doing really bad things!

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44684 - Posted: 4 Dec 2010 | 23:12:14 UTC - in response to Message 44683.

To be honest, this is not the way BOINC should be set up to work.
On your server, you should analyze the host requesting work, and even if the user has installed the app manually, the server should not send work when the host doesn't satisfy all the requirements.
If you leave it this way, it is really easy to trick your server into doing really bad things!
This is related to the problem I think is most annoying in BOINC. I think everything involving the version and capability management in BOINC should be better; pretty much everything about app_info.xml and the scheduling isn't user friendly and is inflexible. The plan class system is inflexible; adding versions with different system requirements involves modifying the server code and it isn't composable for different features.

app_info.xml doesn't handle updates and requires far too much manual intervention for what most people want. You might only want to run GPU workunits, or only N-body on the CPU and separation on the GPU, but right now you can't do that. You can say no CPU or no GPU, but not on a per application basis. You have to manually download files, put in the same information in several places in app_info.xml, and then you're stuck with whatever version you happened to install unless you go out of your way to update it.

There should be a finer grain way of telling the server what capabilities different applications and possibly workunits require, and on the client there should be a way of specifying capabilities you want to use with an actual interface of some sort beyond just stating that manually installed application X should be used in an XML file, and it should handle updates automatically. I kind of want to work on a replacement that's more intelligent, but I don't really have the time.

bill
Send message
Joined: 15 Jul 09
Posts: 10
Credit: 42,684,620
RAC: 210
Message 44686 - Posted: 5 Dec 2010 | 1:28:43 UTC - in response to Message 44588.

OK Matt, if you're still interested, your new app takes a little over 17 minutes to complete 1 opencl wu. This is with Seti@Home working on both cpus (e6600@2.4GHz) with Folding@home working also. Windows XP32, Nvidia driver 206.63.

Palit GeForce GTS 450 (Fermi) Sonic Platinum 1GB

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44690 - Posted: 5 Dec 2010 | 3:10:01 UTC - in response to Message 44588.

The update I've posted should help with better errors when the GPU doesn't have doubles, and should help with system responsiveness on lower end GPUs.

Profile [AF>EDLS] Polynesia
Avatar
Send message
Joined: 5 Apr 09
Posts: 71
Credit: 3,192,490
RAC: 2,617
Message 44716 - Posted: 5 Dec 2010 | 22:35:48 UTC
Last modified: 5 Dec 2010 | 22:51:30 UTC

What updates? an updated application?
____________
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44717 - Posted: 5 Dec 2010 | 22:50:35 UTC - in response to Message 44716.

What updates? an updated application?
The second set of links I added to the original post.

Profile [AF>EDLS] Polynesia
Avatar
Send message
Joined: 5 Apr 09
Posts: 71
Credit: 3,192,490
RAC: 2,617
Message 44718 - Posted: 5 Dec 2010 | 22:56:44 UTC

OK, thank you, but I did not understand: what are the minor updates to the application?
____________
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470

Astromancer.
Avatar
Send message
Joined: 21 Nov 09
Posts: 46
Credit: 19,439,348
RAC: 1
Message 44749 - Posted: 6 Dec 2010 | 22:34:27 UTC
Last modified: 6 Dec 2010 | 22:38:15 UTC

With 0.48 I found that the runtimes are over 2 minutes longer than with the CUDA app on my GTX260: 15:40 - 16:00 for OpenCL and 13:22 - 13:31 for CUDA.

The memory bandwidth used on the card is up to about 50% as well, which is a HUGE jump from CUDA (0%). It also has a larger system memory footprint. Just found that interesting, it doesn't particularly concern me.

And it also uses more CPU time than the CUDA app, though I didn't notice the cpu being used while I was watching the task manager. Does it use most / all of the cycles at the start or end of the WU or something?
____________

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44754 - Posted: 6 Dec 2010 | 23:38:54 UTC - in response to Message 44749.

With 48.0 I found that the runtimes are over 2m longer than with the CUDA app on my GTX260. 15:40 - 16:00 for OpenCL and 13:22 - 13:31 for CUDA.
Are you sure you're comparing the same things? Some of the newer workunits are quite a bit larger than they have been in the past. If it's not that, there might be a problem I haven't quite figured out, where performance drops mysteriously at some points depending on how I break the problem up to keep the system responsive. There seemed to be strange peaks in run time at some of the points I tried, and I haven't quite figured out a good rule for different GPUs. I think I had something, but I haven't actually played with it on slower GPUs. For me on the 285, it seems to be about 2% faster than the CUDA one.

The memory bandwidth used on the card is up to about 50% as well which is a HUGE jump from CUDA (0%). As well as having a larger system memory footprint. Just found that interesting, it doesn't particularly concern me.
I don't see why that would happen. There's basically no transfer done except at the beginning / end.

And it also uses more CPU time than the CUDA app, though I didn't notice the cpu being used while I was watching the task manager. Does it use most / all of the cycles at the start or end of the WU or something?
Quite likely. I think the old CUDA one did the final likelihood calculation on the GPU. The OpenCL one does it on the CPU, which takes a few seconds at the end, because it isn't actually worth the effort to do it on the GPU.

Astromancer.
Joined: 21 Nov 09
Posts: 46
Credit: 19,439,348
RAC: 1
Message 44755 - Posted: 6 Dec 2010 | 23:46:48 UTC - in response to Message 44754.

Matt,

I downloaded the WU's all today and within about an hour of each other. I did 4 OpenCL then 3 CUDA before posting and another few CUDA after posting (With the same type of run time seen).

The memory bandwidth usage struck me as odd as well, which is why I posted it. I was running GPU-Z to watch what was going on, to make sure it wasn't, say, using 50% of the GPU core or something, and noticed that the "Memory Controller Load" was reading at 50% or over. I've only ever seen that with SETI before.

I'll give a try with 48.1 and see if anything different happens. If there is some kind of test WU I can run through the command line or the like to help you out any, I'd be more than willing to do it.
____________

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44757 - Posted: 6 Dec 2010 | 23:59:31 UTC - in response to Message 44755.

I downloaded the WU's all today and within about an hour of each other. I did 4 OpenCL then 3 CUDA before posting and another few CUDA after posting (With the same type of run time seen).
The current workunits aren't uniformly larger. There is a mixture of different sizes out right now, so it's hard to tell if this means anything without knowing which workunits you ran on each.

Astromancer.
Joined: 21 Nov 09
Posts: 46
Credit: 19,439,348
RAC: 1
Message 44761 - Posted: 7 Dec 2010 | 2:36:23 UTC - in response to Message 44757.

After I posted the last one I figured you would need more details since the deleter is deleting things right away. (Looks like I was right) So I went about getting them for you.

One other thing I noticed on my system is that the OpenCL tasks sit at 100% with the clock still going for about 30s.

I'll PM you with all the info so I don't make a huge post full of data useless to anyone but you and Travis.
____________

europa
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44805 - Posted: 7 Dec 2010 | 22:39:51 UTC - in response to Message 44676.

Matt,

I wanted to give you an update on my use of the OpenCL app.

To re-cap, I'm running 64-bit Ubuntu 10.10 on an AMD quad-core with 8GB of DDR2 RAM and a GTX460 Fermi card with the latest Nvidia driver from their website.

1. Simply extracting the tar to the top-level MW folder did not work, since the executable that it put there was for cuda23, not OpenCL. The WU's continued to be retrieved as cuda23 WU's and the completed ones continued to fail validation.

2. Having noticed the app.xml and correct executable had been extracted into a sub-folder in the MW main folder, I first moved them to another non-MW folder and then copied the contents into the main MW folder with the rest of the extracted files.

3. I then deleted the cuda23 executable.

4. Based on discussion at Collatz, I copied libcudart32_23.so into the MW folder.

5. I suspended the WU's and exited Boinc Manager.

6. I then opened a terminal window as root and typed "service boinc-client restart" [Enter] and closed the terminal window.

7. I re-started Boinc Manager and the apps and quickly saw work units identified as "open_cl" work units, along with the notation that they were using 0.5CPU and 1.0 GPU. The WU's so far have taken about 15-20 min. to process vs. a few hours before.

8. However, at first, the WU's were coming back with the message "completed, validation inconclusive". Since then, it has changed to "Successful" so I guess I'm ok.
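
The file shuffling in steps 2 and 3 above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `install_opencl_app` is made up for illustration, the tar is assumed to have unpacked into a sub-folder of the project folder, and the old CUDA executable is assumed to have "cuda23" somewhere in its file name.

```python
import shutil
from pathlib import Path


def install_opencl_app(project_dir, extracted_subdir):
    """Copy the extracted files up into the project folder, then drop cuda23 binaries.

    Hypothetical helper: `extracted_subdir` is the sub-folder the tar unpacked
    into, and matching on "cuda23" in the file name is an assumption.
    """
    project_dir = Path(project_dir)
    source = project_dir / extracted_subdir

    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():
            # copy2 preserves permissions, so the executable bit survives
            shutil.copy2(item, project_dir / item.name)
            copied.append(item.name)

    removed = []
    for item in sorted(project_dir.iterdir()):
        if item.is_file() and "cuda23" in item.name:
            item.unlink()
            removed.append(item.name)

    return copied, removed
```

The client still has to be restarted afterwards (step 6, `service boinc-client restart`) before it will read the new files.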

Here is the stderr_text for one of them.
Task 263496737

Name de_separation_17_3s_fix_1_1719797_1291478684_1
Workunit 198514068
Created 4 Dec 2010 16:09:29 UTC
Sent 4 Dec 2010 16:10:24 UTC
Received 4 Dec 2010 19:06:06 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 228451
Report deadline 12 Dec 2010 16:10:24 UTC
Run time 1519.988412
CPU time 8.83
stderr out

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkywayathome separation 0.48 Linux x86_64 double OpenCL </search_application>
Found 1 platforms
Platform 0 information:
Platform name: NVIDIA CUDA
Platform version: OpenCL 1.0 CUDA 3.2.1
Platform vendor:
Platform profile:
Platform extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Using device 0 on platform 0
Found 1 CL devices
Device GeForce GTX 460 (NVIDIA Corporation:0x10de)
Type: CL_DEVICE_TYPE_GPU
Driver version: 260.19.21
Version: OpenCL 1.0 CUDA
Compute capability: 2.1
Little endian: CL_TRUE
Error correction: CL_FALSE
Image support: CL_TRUE
Address bits: 32
Max compute units: 7
Clock frequency: 1502 Mhz
Global mem size: 804454400
Max mem alloc: 201113600
Global mem cache: 114688
Cacheline size: 128
Local mem type: CL_LOCAL
Local mem size: 49152
Max const args: 9
Max const buf size: 65536
Max parameter size: 4352
Max work group size: 1024
Max work item dim: 3
Max work item sizes: { 1024, 1024, 64 }
Mem base addr align: 4096
Min type align size: 128
Timer resolution: 1000 ns
Double extension: MW_CL_KHR_FP64
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64

Compiler flags:
-cl-mad-enable -cl-no-signed-zeros -cl-strict-aliasing -cl-finite-math-only -DUSE_CL_MATH_TYPES=0 -DUSE_MAD=0 -DUSE_FMA=0 -cl-nv-verbose -DDOUBLEPREC=1 -DMILKYWAY_MATH_COMPILATION -DNSTREAM=3 -DFAST_H_PROB=1 -DAUX_BG_PROFILE=0 -DUSE_IMAGES=1 -DI_DONT_KNOW_WHY_THIS_DOESNT_WORK_HERE=1

Build status: CL_BUILD_SUCCESS
Build log:

: Considering profile 'compute_20' for gpu='sm_21' in 'cuModuleLoadDataEx_4'
Kernel work group info:
Work group size = 512
Kernel local mem size = 0
Compile work group size = { 0, 0, 0 }
Lower n solution: n = 40, x = 0
Higher n solution: n = 40, x = 0
Using solution: n = 40, x = 0
Range: { nu_steps = 640, mu_steps = 1600, r_steps = 1400 }
Iteration area: 2240000
Chunk estimate: 40
Num chunks: 40
Added area: 0
Effective area: 2240000
Integration time: 795.240523 s. Average time per iteration = 1242.563318 ms
Kernel work group info:
Work group size = 512
Kernel local mem size = 0
Compile work group size = { 0, 0, 0 }
Lower n solution: n = 38, x = 1792
Higher n solution: n = 50, x = 0
Using solution: n = 38, x = 1792
Range: { nu_steps = 640, mu_steps = 400, r_steps = 1400 }
Iteration area: 560000
Chunk estimate: 40
Num chunks: 38
Added area: 1792
Effective area: 561792
Integration time: 343.485032 s. Average time per iteration = 536.695363 ms
Kernel work group info:
Work group size = 512
Kernel local mem size = 0
Compile work group size = { 0, 0, 0 }
Lower n solution: n = 38, x = 1792
Higher n solution: n = 50, x = 0
Using solution: n = 38, x = 1792
Range: { nu_steps = 640, mu_steps = 400, r_steps = 1400 }
Iteration area: 560000
Chunk estimate: 40
Num chunks: 38
Added area: 1792
Effective area: 561792
Integration time: 369.818321 s. Average time per iteration = 577.841126 ms
<background_integral> 0.00049448328945061429 </background_integral>
<stream_integrals> 98.00489805211894633885 736.88664944943548107403 0.00661912358812184829 </stream_integrals>
<background_only_likelihood> -3.28428541304979715321 </background_only_likelihood>
<stream_only_likelihood> -35.67127293699287093887 -4.01220139598113423318 -231.34968260617702640047 </stream_only_likelihood>
<search_likelihood> -3.04403874542270713732 </search_likelihood>
12:51:02 (3471): called boinc_finish

</stderr_txt>
]]>

Validate state Checked, but no consensus yet
Claimed credit 0.0784035175711632
Granted credit 0
application version Anonymous platform

I assume since it's not failing, that it is using the Fermi at the double-precision level.
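
For anyone reading the chunking lines in that log, the printed numbers are consistent with a few simple relationships: the iteration area equals mu_steps × r_steps, the effective area equals the iteration area plus the added area, and the average time per iteration equals the integration time divided by nu_steps. The sketch below only re-checks that arithmetic. The relationships are inferred from the output above, not taken from the application's source, and the rule the application uses to choose n and x is not shown in the log, so it is not reproduced here. The function name `summarize_integral` is hypothetical.

```python
def summarize_integral(nu_steps, mu_steps, r_steps, added_area, num_chunks, integration_time_s):
    """Recompute the per-integral figures printed in the stderr log.

    The formulas are assumptions inferred from the logged values.
    """
    iteration_area = mu_steps * r_steps            # work items per nu step
    effective_area = iteration_area + added_area   # padded area actually run
    return {
        "iteration_area": iteration_area,
        "effective_area": effective_area,
        "chunk_size": effective_area // num_chunks,
        "remainder": effective_area % num_chunks,  # 0 if the chunks divide evenly
        "ms_per_iteration": 1000.0 * integration_time_s / nu_steps,
    }


# Second integral from the log: n = 38, x = 1792
stats = summarize_integral(640, 400, 1400, 1792, 38, 343.485032)
print(stats["effective_area"])  # 561792, matching "Effective area"
print(stats["chunk_size"])      # 14784 work items per chunk, remainder 0
```

With n = 38 the padded area splits into 38 equal chunks, which fits the idea of breaking each iteration into smaller kernel launches to keep the display responsive.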

Sorry for the delay in posting this, the computer quit and it took me a couple of days to fix it. Thanks for getting me back in the game.

Regards,
Steve

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44812 - Posted: 7 Dec 2010 | 23:48:17 UTC - in response to Message 44805.

Thanks to everyone who posted information. I've looked at the pieces, and I think I've worked out why I made the slower GPUs slower; I half missed something obvious.

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44820 - Posted: 8 Dec 2010 | 7:21:52 UTC - in response to Message 44588.

I've posted another minor update which should hopefully fix being slower on some GPUs.

Profile [AF>EDLS] Polynesia
Joined: 5 Apr 09
Posts: 71
Credit: 3,192,490
RAC: 2,617
Message 44853 - Posted: 9 Dec 2010 | 18:40:21 UTC
Last modified: 9 Dec 2010 | 18:55:06 UTC

Big trouble!!

I'm now trying 0.48.2 and I am getting slowdowns on my PC that I did not have with 0.48.1 ... In addition, the unit does not and this calculation is boosting the temperature of my card ...

GPU load is nevertheless at 99% ... but CPU is at 0%

I do not think so in this case recalculate these units 0.48.2 ...

I still do not think that just my version of boinc: 6.12.8?
____________
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470

europa
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44859 - Posted: 9 Dec 2010 | 22:57:17 UTC - in response to Message 44820.

Matt,

The OpenCL setup continues to work like a charm on the first machine.

I've assembled a second machine that is virtually identical. Aside from being an AMD 6 core vs. AMD quad-core and having DDR3 RAM vs. DDR2, they are the same. I even made a point of getting the same model graphics card (MSI Twin-Frozr GTX-460). Both are 64-bit Ubuntu 10.10 and also have the backward compatibility 32-bit libraries. I updated the Nvidia driver and matched the permissions with those on the working machine, BUT no matter what I do, I cannot get milkyway_separation_0.48_x86_64-pc-linux-gnu__cuda_opencl to run on this machine. The cuda 23 variant keeps re-appearing in the folder and executing. I've tried multiple bulk deletes of the MW folder contents and re-extractions of the OpenCL tar but I always end up with the cuda23 executable reappearing in the folder and taking over.

When I start up Boinc it sees the Fermi card.

As I said earlier, things continue to run like a charm on the first machine and this one is virtually identical. I can't figure out the problem.

Any suggestions? Thanks for any help that you can provide.

Regards,
Steve

Profile arkayn
Joined: 14 Feb 09
Posts: 914
Credit: 74,781,320
RAC: 237
Message 44860 - Posted: 9 Dec 2010 | 23:36:12 UTC - in response to Message 44859.

You are copying the app_info.xml file as well, I hope? That is what tells BOINC which app to use with the project.
____________

europa
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44863 - Posted: 10 Dec 2010 | 0:20:35 UTC - in response to Message 44860.

Matt,

Yes, indeed.

Here it is.

<app_info>
−
<app>
<name>milkyway</name>
<user_friendly_name>Milkyway@home Separation</user_friendly_name>
</app>
−
<file_info>
−
<name>
milkyway_separation_0.48_x86_64-pc-linux-gnu__cuda_opencl
</name>
<executable/>
</file_info>
−
<app_version>
<app_name>milkyway</app_name>
<version_num>48</version_num>
<plan_class>cuda_opencl</plan_class>
<avg_ncpus>0.05</avg_ncpus>
<max_ncpus>0.05</max_ncpus>
<flops>1.0e11</flops>
−
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
−
<file_ref>
−
<file_name>
milkyway_separation_0.48_x86_64-pc-linux-gnu__cuda_opencl
</file_name>
<main_program/>
</file_ref>
</app_version>
</app_info>


It appears identical to the one on the first machine.
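
Comparing two app_info.xml files by eye is error-prone, so a small check can help. The sketch below is illustrative only (`check_app_info` is a hypothetical name, not a BOINC tool). It assumes the layout shown above and tests just three things: that app_info.xml sits directly in the project folder, that it parses as XML, and that every file it names is present next to it. Names are stripped of surrounding whitespace because the `<name>` elements above span several lines.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_app_info(project_dir):
    """Return a list of problems with project_dir/app_info.xml (empty if none found)."""
    project_dir = Path(project_dir)
    app_info = project_dir / "app_info.xml"
    if not app_info.is_file():
        return ["app_info.xml not found in " + str(project_dir)]

    try:
        root = ET.parse(str(app_info)).getroot()
    except ET.ParseError as err:
        return ["app_info.xml is not well-formed XML: " + str(err)]

    # Files declared in <file_info> and files referenced by <app_version>
    declared = {(n.text or "").strip() for n in root.findall("file_info/name")}
    referenced = {(n.text or "").strip()
                  for n in root.findall("app_version/file_ref/file_name")}

    problems = []
    for name in sorted(declared | referenced):
        if not (project_dir / name).is_file():
            problems.append("missing file: " + name)
    for name in sorted(referenced - declared):
        problems.append("file_ref without matching file_info: " + name)
    return problems
```

An empty list only means these basic checks passed. It does not show that the client has actually picked the file up.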


Here's the body of one of the stderr.txt's from machine #2. Seems like old times!! :)

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
Device index specified on the command line was 0
Looking for a Double Precision capable NVIDIA GPU
The device GeForce GTX 460 from the command line cannot be used because a device supporting compute capability 1.3 (Double Precision) is required
Found 1 CUDA cards
Found a GeForce GTX 460
Device cannot be used, it does not have compute capability 1.3 support
No compute capability 1.3 cards have been found, exiting...

//////
I installed the latest Nvidia driver from their website but am I missing some sort of cuda library file? The last time, it was the installation of the OpenCL app that solved things. I can't figure out why and where the cuda23 executable keeps coming from or gets called for.

But, I think that's symptomatic of it still trying to run cuda23 WU's rather than OpenCL ones. Is there any way to keep the cuda23 executable from returning to the MW folder?

I noticed that the WU's have changed to MilkyWay@Home N-Body Simulation v0.21 (sse2). Maybe the MW server has given up sending me cuda23 WUs?

Regards,
Steve

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44866 - Posted: 10 Dec 2010 | 1:12:17 UTC - in response to Message 44853.

I'm now trying 0.48.2 and I am getting slowdowns on my PC that I did not have with 0.48.1 ... In addition, the unit does not and this calculation is boosting the temperature of my card ...
Apparently the formula that I used to keep the system responsive fell apart for the 470's specifications. The temperature is most definitely expected to go up.

Profile arkayn
Joined: 14 Feb 09
Posts: 914
Credit: 74,781,320
RAC: 237
Message 44870 - Posted: 10 Dec 2010 | 3:07:49 UTC

What are you using to open the .xml files? It almost looks like a browser.

Make sure you are opening them in Gedit instead.

Also try the 48.2 version that is also in the first post of the thread.

The app_info should look something like this.


<app_info>
<app>
<name>milkyway</name>
<user_friendly_name>Milkyway@home Separation</user_friendly_name>
</app>

<file_info>
<name>milkyway_separation_0.48.2_x86_64-pc-linux-gnu__cuda_opencl</name>
<executable/>
</file_info>

<app_version>
<app_name>milkyway</app_name>
<version_num>48</version_num>
<plan_class>cuda_opencl</plan_class>
<avg_ncpus>0.05</avg_ncpus>
<max_ncpus>0.05</max_ncpus>
<flops>1.0e11</flops>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>

<file_ref>
<file_name>milkyway_separation_0.48.2_x86_64-pc-linux-gnu__cuda_opencl</file_name>
<main_program/>
</file_ref>
</app_version>
</app_info>

____________

europa
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44887 - Posted: 10 Dec 2010 | 14:59:57 UTC - in response to Message 44870.

It does.

It occurred to me that there are two things that I did last time that I haven't tried yet. I haven't tried re-installing Boinc from a BOINC app site. This installation is from the Ubuntu repository. I did both last time.

The other thing was that I attached to Collatz which I understand has some good cuda libraries included. I remember copying over one of their cuda files the last time.

I'll have to try both later today.

Regards,
Steve

europa
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44890 - Posted: 10 Dec 2010 | 17:37:45 UTC - in response to Message 44812.

Matt,

After doing a couple of Boinc re-installs and detach/attach's to MW, I think I've spotted the problem on the second machine but I don't know what's causing it or how to fix it.

On the good machine (both are virtually identical including the exact same model GPU card) at the start of the messages on the BOINC Manager it says:

Fri 10 Dec 2010 12:21:13 PM EST NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 3020, compute capability 2.1, 767MB, 673 GFLOPS peak)

Found app_info.xml; using anonymous platform

Fri 10 Dec 2010 12:21:13 PM EST Milkyway@home URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 243370; resource share 100


But, on the problem machine it is missing the reference line to "app_info.xml; using anonymous platform" at that spot.

Fri 10 Dec 2010 12:21:13 PM EST NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 3020, compute capability 2.1, 767MB, 673 GFLOPS peak)

Fri 10 Dec 2010 12:21:13 PM EST Milkyway@home URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 243370; resource share 100
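
The same comparison can be made mechanically on a saved copy of the startup messages. This is a minimal sketch: the marker string is copied from the working machine's messages above, and the function name `uses_anonymous_platform` is made up for illustration.

```python
def uses_anonymous_platform(messages):
    """True if the startup messages show that an app_info.xml was picked up."""
    marker = "Found app_info.xml; using anonymous platform"
    return any(marker in line for line in messages.splitlines())


good = """NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 3020, compute capability 2.1, 767MB, 673 GFLOPS peak)
Found app_info.xml; using anonymous platform
Milkyway@home URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 243370; resource share 100"""

print(uses_anonymous_platform(good))  # True
```

If the check returns False, the client never read app_info.xml at startup, which matches the cuda23 application continuing to run.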

I've tried using both the original 48.0 version and the 48.2 but no luck. I just saw your posting about the new version being pushed out by the server and the removal of the cuda version. I'm hoping that will fix things on this second machine.

Regards,
Steve

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 44893 - Posted: 10 Dec 2010 | 17:56:43 UTC - in response to Message 44890.

But, on the problem machine it is missing the reference line to "app_info.xml; using anonymous platform" at that spot.
Are you sure you're putting it in the right place? It needs to be in the milkyway directory under projects.
I just saw your posting about the new version being pushed out by the server and the removal of the cuda version. I'm hoping that will fix things on this second machine.
The CUDA removal hasn't happened yet.

europa
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44898 - Posted: 10 Dec 2010 | 19:00:31 UTC - in response to Message 44893.

Matt,

It's there. That's the nice thing about having the working machine right next to it. I've checked everything multiple times.

So, right now, I'm hanging on for the server refresh with the OpenCl version and the removal of the cuda23. I'm hoping that will "force" whatever correction is needed.

Thanks for the suggestions.

Regards,
Steve

europa
Joined: 29 Oct 10
Posts: 63
Credit: 23,977,739
RAC: 7,595
Message 44919 - Posted: 11 Dec 2010 | 14:47:48 UTC - in response to Message 44893.

Matt,

An update. Everything is now working!

I did an uninstall/re-boot/re-install of the installation from the Ubuntu repository. I did not un-install the manual install from the BOINC site.

I then extracted the last OpenCL tar and deleted the cuda23 executable. I re-started the BOINC client and was greeted by the previously missing OpenCL config line in the messages. A WU popped up with the notation that it is using 0.5CPU and 1.0 GPU, and it is chugging along.

A bonus is that this has fixed all of the problems on this machine. Previously, all of the other projects on this machine, with the exception of Seti, were returning bad WU's.

It's been 2 hrs. and I'm now seeing credits for all of the projects for this machine.

I guess there's something that I needed from both installations.

Hooray!

Regards,
Steve


Copyright © 2013 AstroInformatics Group