Welcome to MilkyWay@home

GTX670's and the MilkyWay project

mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56477 - Posted: 12 Dec 2012, 23:13:44 UTC

I am running GTX670's, driver 301.42, cuda 4.2.1. The system has an i7-3930k processor and 16 gigs DRAM.

I was advised that this computer might run MilkyWay very inefficiently in GPU processing. I am running the project on CPU on four machines; but I was looking to add to my GPU work where I only have Einstein and SETI on GPU.

My ScienceSprings blog has been tilting toward Astronomy lately, so I am beefing up that side of BOINC in my projects.

Any advice?


http://sciencesprings.wordpress.com
http://facebook.com/sciencesprings


ID: 56477
JHMarshall

Joined: 24 Jul 12
Posts: 40
Credit: 7,123,301,054
RAC: 0
Message 56478 - Posted: 13 Dec 2012, 4:28:11 UTC - in response to Message 56477.  
Last modified: 13 Dec 2012, 5:14:47 UTC

I am running GTX670's, driver 301.42, cuda 4.2.1. The system has an i7-3930k processor and 16 gigs DRAM.

I was advised that this computer might run MilkyWay very inefficiently in GPU processing. I am running the project on CPU on four machines; but I was looking to add to my GPU work where I only have Einstein and SETI on GPU.

My ScienceSprings blog has been tilting toward Astronomy lately, so I am beefing up that side of BOINC in my projects.

Any advice?



Your GTX 670s will run fine on MW. The Nvidia GPU apps will run many times faster than your CPU. I run several AMD 7950s and one Nvidia GTX 560 Ti. The 560 Ti has no problems but is definitely slower than the 7950s on MW. High-end AMD cards have much faster double-precision units than the Nvidia cards, and MW requires DP, so the high-end AMD cards are several times faster than the Nvidia cards. That may be what you heard about inefficiency. The Nvidia cards do perform on MW and will give you much more throughput than the CPU apps.

Joe

Edit: Additional information.
GTX 560 Ti: ~500 seconds per task
AMD 7950: 60-64 seconds per task (slightly underclocked)

I think your 670 would be about twice as fast as my 560 Ti.
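Joe's timings can be sanity-checked against peak double-precision numbers. A back-of-the-envelope sketch, assuming approximate vendor spec-sheet figures (the peak SP GFLOPS and DP:SP ratios below are assumptions from published specs, not measurements from this thread):

```python
# Back-of-the-envelope DP comparison. The peak single-precision GFLOPS
# and DP:SP ratios below are approximate vendor spec-sheet numbers
# (assumptions), not measurements from this thread.
cards = {
    "GTX 560 Ti": (1263.0, 1 / 12),  # Fermi GF114: DP at ~1/12 the SP rate
    "GTX 670":    (2460.0, 1 / 24),  # Kepler GK104: DP at ~1/24 the SP rate
    "HD 7950":    (2870.0, 1 / 4),   # Tahiti: DP at 1/4 the SP rate
}

def peak_dp_gflops(name):
    """Approximate peak double-precision throughput in GFLOPS."""
    sp, ratio = cards[name]
    return sp * ratio

for name in cards:
    print(f"{name}: ~{peak_dp_gflops(name):.0f} DP GFLOPS")
```

On these rough numbers the 7950 has roughly a 7x peak-DP edge over the 560 Ti, which is in the same ballpark as the observed ~500 s vs. ~62 s task times above.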
ID: 56478
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56485 - Posted: 14 Dec 2012, 1:25:18 UTC - in response to Message 56478.  

ID: 56485
JHMarshall

Joined: 24 Jul 12
Posts: 40
Credit: 7,123,301,054
RAC: 0
Message 56487 - Posted: 14 Dec 2012, 4:44:42 UTC - in response to Message 56485.  

You're welcome. Crunch away!
ID: 56487
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56514 - Posted: 15 Dec 2012, 11:35:53 UTC

So, I visited the project prefs and enabled the project on Nvidia, and now I have a bunch of ps_separation_9 and _10 "Computation error" work units. They have not yet reported, so I cannot give the work-unit detail. They only ran for 1 second each.

Once the data is up, I will paste in the detail, unless someone says, it is the project and not me.

CPU work units are just fine.
ID: 56514
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56520 - Posted: 15 Dec 2012, 13:08:29 UTC

Here is one of the WUs that failed.

Task 361228420, WU 28093553.

If I need to copy and paste in the task details, please let me know and I will do it.
ID: 56520
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56522 - Posted: 15 Dec 2012, 23:00:56 UTC

This situation is untenable.

GPU work is all failures, and I am getting no CPU work on a machine which was doing great CPU work.

So, until there is a solution to the GPU problem, I am changing the project preferences back to no GPU and updating the project on this machine in the hopes of getting back to successful CPU work.
ID: 56522
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56527 - Posted: 16 Dec 2012, 3:04:55 UTC

O.K., this many hours later, I finally got some CPU tasks running. I guess LTD (long-term debt) finally caught up with BOINC on this machine for this project. I am going to leave it at that until such time as I get some help with the GPU problem.
ID: 56527
JHMarshall

Joined: 24 Jul 12
Posts: 40
Credit: 7,123,301,054
RAC: 0
Message 56540 - Posted: 16 Dec 2012, 23:58:29 UTC - in response to Message 56527.  

Sorry I haven't gotten back to you, I was on vacation and shut down.

I can't see your task or work unit, and I'm probably not the best person to answer/diagnose your problem. All I know is that my GTX 560 Ti runs fine but not as fast as my AMD cards.

I'm using Boinc 7.0.28, Windows 7 Pro 64-bit, and this Nvidia driver (from the Boinc log):
"306.97, Cuda 5.0, Compute capability 2.1".

I looked in the top 100+ computers on MW and all NVidia cards were using 306.97. That said, when I first started using the NVidia card on Einstein it was at 301.42. I don't remember if I updated the driver before or after I started using the card on MW.

Joe
ID: 56540
mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,947,628
RAC: 22,118
Message 56547 - Posted: 17 Dec 2012, 13:31:38 UTC - in response to Message 56522.  

This situation is untenable.

GPU work is all failures, and I am getting no CPU work on a machine which was doing great CPU work.

So, until there is a solution to the GPU problem, I am changing the project preferences back to no GPU and updating the project on this machine in the hopes of getting back to successful CPU work.


When I was running my Nvidia 560Ti at MW I was forced to use Boinc version 7.0.25 in order to complete units successfully. The ONLY problem is if you also run other gpu projects, then that WILL be a problem, as it ALWAYS runs MW projects in HIGH PRIORITY mode. I have NOT tried since I upgraded to 7.0.40, but the one you are using, 7.0.28, NEVER worked for me. It DOES work for others, but I have no idea why it would for some but not others. IF you decide to downgrade, MAKE SURE you have NO units from ANY project in your cache, as they will ALL error out IMMEDIATELY; Boinc knows how to upgrade but NOT how to downgrade between versions!
ID: 56547
mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,947,628
RAC: 22,118
Message 56548 - Posted: 17 Dec 2012, 13:36:53 UTC - in response to Message 56540.  

Sorry I haven't gotten back to you, I was on vacation and shut down.
Joe


I now go on vacation and leave my pc's ON and crunching, if they crash they crash, but if they don't it is better for me. It is VERY hard to compete with your Team if I don't keep them running!! I am kidding of course, your Team has some VERY prolific crunchers on it!! AND I LOVE the picture of Ingrid's home on the cliff!!
ID: 56548
JHMarshall

Joined: 24 Jul 12
Posts: 40
Credit: 7,123,301,054
RAC: 0
Message 56553 - Posted: 17 Dec 2012, 16:05:35 UTC - in response to Message 56548.  

Sorry I haven't gotten back to you, I was on vacation and shut down.
Joe


I now go on vacation and leave my pc's ON and crunching, if they crash they crash, but if they don't it is better for me. It is VERY hard to compete with your Team if I don't keep them running!! I am kidding of course, your Team has some VERY prolific crunchers on it!! AND I LOVE the picture of Ingrid's home on the cliff!!


My AMD 7950 systems occasionally hang when starting a task and I hate the idea of them sucking juice and not doing anything for several days!

It sure does mess up my RAC when I shut down though!

Joe
ID: 56553
Len LE/GE

Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 56555 - Posted: 17 Dec 2012, 23:02:49 UTC - in response to Message 56520.  

Here is one of the WU's that failed.

Task 361228420, WU 28093553.

If I need to copy and paste in the task details, please let me know and I will do it.


Here is the relevant part of 1 task log:

Found 2 platforms
Platform 0 information:
Name: NVIDIA CUDA
Version: OpenCL 1.1 CUDA 4.2.1
Vendor: NVIDIA Corporation
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Profile: FULL_PROFILE
Platform 1 information:
Name: NVIDIA CUDA
Version: OpenCL 1.1 CUDA 4.2.1
Vendor: NVIDIA Corporation
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 2 CL devices
Device 'GeForce GTX 670' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Driver version: 306.97
Version: OpenCL 1.1 CUDA
Compute capability: 3.0
Max compute units: 7
Clock frequency: 1045 Mhz
Global mem size: 2147483648
Local mem size: 49152
Max const buf size: 65536
Double extension: cl_khr_fp64
Error creating context (-5): CL_OUT_OF_RESOURCES
Error getting device and context (-5): CL_OUT_OF_RESOURCES

Are you running other gpu projects at the same time on the same card?
If not, then something else is using too much of that card's resources, not leaving enough room for MW to run.
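For reference, the -5 in that log maps directly to a standard OpenCL status code, and the same value is what BOINC prints as the task's unsigned exit code (0xfffffffb). A small sketch decoding it, with an excerpt of the standard codes from CL/cl.h:

```python
# A few standard OpenCL status codes, excerpted from CL/cl.h.
CL_ERRORS = {
     0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

def decode(code):
    """Map a signed OpenCL status code to its name, plus the unsigned
    32-bit form that shows up as the BOINC task exit code."""
    name = CL_ERRORS.get(code, "unknown")
    return name, f"{code & 0xFFFFFFFF:#010x}"

print(decode(-5))  # -> ('CL_OUT_OF_RESOURCES', '0xfffffffb')
```

That is why the task page shows "exit code -5 (0xfffffffb)" alongside the CL_OUT_OF_RESOURCES lines: they are the same error reported two ways.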
ID: 56555
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56557 - Posted: 18 Dec 2012, 9:58:18 UTC

Len LE/GE

I am running Einstein and SETI GPU tasks with no problems.
ID: 56557
mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,947,628
RAC: 22,118
Message 56558 - Posted: 18 Dec 2012, 12:05:18 UTC - in response to Message 56557.  

Len LE/GE

I am running Einstein and SETI GPU tasks with no problems.


Ahh, but resources are the key, not the projects themselves. IF MW requires more resources from your gpu than you have free, then you are just flat running out and having problems. The easy answer is to suspend the other projects and see if it continues; if it does, it is most likely something else.

I was having a problem at another project trying to run dual workunits at once. I have two Nvidia 560Ti cards, and one would finish a unit in a bit over 3,000 seconds while the other was taking TWICE as long! I finally dug down into the cards' resources and found the cards themselves were the problem...the cards are different brands, and while both are 1GB cards, one was using more memory per workunit than the other and just running out of resources and slowing down! I now run one unit at a time on that card and it finishes units in 3,000 seconds, just like the other card. So on that card I am only doing half as many units at once, but each individual unit finishes twice as fast as before. The program GPU-Z helped me find the differences between the cards.
ID: 56558
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56561 - Posted: 18 Dec 2012, 12:28:20 UTC

mikey

Thanks for waking me up to the fact that I forgot to say I only run one GPU task at a time. I have twin GTX 670s, but they have SLI enabled so as to act as one big GPU (card 0), and even so, in my cc_config file I have <ignore_cuda_dev>1</ignore_cuda_dev>.

(SLI is not viewed favorably outside the gamer community, but I tried running with SLI disabled. The problem was that BOINC wants to be on device 0 and so does Maingear, my builder. This created way too many problems, so I went with this configuration.)
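For anyone replicating this, the ignore_cuda_dev line sits inside the client's cc_config.xml. A minimal sketch follows; only the ignore_cuda_dev value comes from the post above, and the use_all_gpus option (the "use all gpus" line mikey asks about later in the thread) is shown purely as an illustration of a typical dual-GPU setup, not as part of this machine's actual file:

```xml
<!-- Minimal sketch of a BOINC cc_config.xml; only ignore_cuda_dev
     is taken from the post above, the rest is illustrative. -->
<cc_config>
  <options>
    <!-- Skip CUDA device 1 (the second GTX 670 hidden behind SLI). -->
    <ignore_cuda_dev>1</ignore_cuda_dev>
    <!-- With SLI disabled, this would tell BOINC to use every GPU,
         not just the best one (device 0). -->
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

Note that ignoring device 1 and using all GPUs pull in opposite directions; you would pick one or the other depending on whether SLI is enabled.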
ID: 56561
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56562 - Posted: 18 Dec 2012, 19:31:27 UTC

A friend at BOINC says you guys should see this page https://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueNDRangeKernel.html.

He feels this explains why I (and others) might be having a problem with this project on GPU.

He says, "... It's not resources as in 'everything capable of doing calculations in your computer', or 'memory and disk values'. It's resources that OpenCL uses to set up the kernels, store kernels, do calculations on them.

So definitely something in the science application and thus something that Milkyway will want to know about. The thing I quoted from Khronos, you may want to copy & paste that in your thread at Milkyway, they may want to know about it.

Khronos are the developers of OpenCL, it's their explanation of what CL_OUT_OF_RESOURCES stands for...."

I sure would love to solve this problem.
ID: 56562
mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,947,628
RAC: 22,118
Message 56563 - Posted: 18 Dec 2012, 21:22:02 UTC - in response to Message 56561.  

mikey

Thanks for waking me up to the fact that I forgot to say I only run one GPU task a a time. I have twin GTX670's, but they have SLI enabled so as to act as one big GPU card, card 0 and even so, in cc_config file I have <ignore_cuda_dev>1</ignore_cuda_dev>.

(SLI is not viewed favorably outside the gamer community; but I tried running with SLI disabled. The problem was, BOINC wants to be on device 0 and so does Maingear, my builder. This created way too many problems, so I went with this configuration.)


If you are willing to try something, unplug the sli cable, disable the cc_config file, and try to run some units again. IF it works, then I am thinking the sli cable and your ignore line could be conflicting. An sli cable makes it seem like one big card, not two cards, and then you tell it to ignore the second card, does that make sense? A LOT of dual gpu setups must have a line in the cc_config file to use all gpus, do you have that? When you take the sli cable off, does Boinc 'see' both cards? I think this will be a trial and error thing until you hit the right combination that just works. Another idea would be to take out one card and see if you can crunch then; if so, add the second card back and see what happens. Do you need the sli cable for your other programs, or for your builder? Or do you use it for gaming?
ID: 56563
mitrichr
Joined: 21 Dec 07
Posts: 24
Credit: 4,567,143
RAC: 0
Message 56564 - Posted: 18 Dec 2012, 22:16:13 UTC

mikey

Thanks, but I have run three other projects' GPU work units with no difficulty. My configuration was worked out with help from both BOINC people and the computer builder.

I was given a link for you guys by people at BOINC who think the project has a resource problem in its compiled application. I put the link into an earlier post. The link was provided after the BOINC people looked at the following error from one of my tasks:

Stderr output
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code -5 (0xfffffffb)
</message>
<stderr_txt>
BOINC: parse gpu_opencl_dev_index 0
<search_application> milkyway_separation 1.02 Windows x86_64 double OpenCL </search_application>
Unrecognized XML in project preferences: max_gfx_cpu_pct
Skipping: 20
Skipping: /max_gfx_cpu_pct
Unrecognized XML in project preferences: allow_non_preferred_apps
Skipping: 1
Skipping: /allow_non_preferred_apps
Unrecognized XML in project preferences: nbody_graphics_poll_period
Skipping: 30
Skipping: /nbody_graphics_poll_period
Unrecognized XML in project preferences: nbody_graphics_float_speed
Skipping: 5
Skipping: /nbody_graphics_float_speed
Unrecognized XML in project preferences: nbody_graphics_textured_point_size
Skipping: 250
Skipping: /nbody_graphics_textured_point_size
Unrecognized XML in project preferences: nbody_graphics_point_point_size
Skipping: 40
Skipping: /nbody_graphics_point_point_size
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Error reading astronomy parameters from file 'astronomy_parameters.txt'
Trying old parameters file
Using AVX path
Found 2 platforms
Platform 0 information:
Name: NVIDIA CUDA
Version: OpenCL 1.1 CUDA 4.2.1
Vendor: NVIDIA Corporation
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Profile: FULL_PROFILE
Platform 1 information:
Name: NVIDIA CUDA
Version: OpenCL 1.1 CUDA 4.2.1
Vendor: NVIDIA Corporation
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 2 CL devices
Device 'GeForce GTX 670' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Driver version: 306.97
Version: OpenCL 1.1 CUDA
Compute capability: 3.0
Max compute units: 7
Clock frequency: 1045 Mhz
Global mem size: 2147483648
Local mem size: 49152
Max const buf size: 65536
Double extension: cl_khr_fp64
Error creating context (-5): CL_OUT_OF_RESOURCES
Error getting device and context (-5): CL_OUT_OF_RESOURCES
Failed to calculate likelihood
<background_integral> 1.#QNAN0000000000 </background_integral>
<stream_integral> 1.#QNAN0000000000 1.#QNAN0000000000 1.#QNAN0000000000 </stream_integral>
<background_likelihood> 1.#QNAN0000000000 </background_likelihood>
<stream_only_likelihood> 1.#QNAN0000000000 1.#QNAN0000000000 1.#QNAN0000000000 </stream_only_likelihood>
<search_likelihood> 1.#QNAN0000000000 </search_likelihood>
10:16:09 (1936): called boinc_finish

</stderr_txt>
]]>

I have zero idea how one interprets this stuff, but this was what they used.


ID: 56564
mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,947,628
RAC: 22,118
Message 56566 - Posted: 19 Dec 2012, 13:04:58 UTC - in response to Message 56564.  

mikey

Thanks, but I have run three other projects' GPU work units with no difficulty. My configuration was worked out with help from both BOINC people and the computer builder.

I was given a link for you guys by people at BOINC who think that the project as a resource problem in its compiling. I put the link into a post. This link was provided after BOINC looked at the following error from on of my tasks:

I have zero idea how one interprets this stuff, but this was what they used.


I also have no idea what all that means, except that you said your units failed. Oh well, I guess you will have to wait for an Admin to help you.
Sorry, and have fun.
ID: 56566


©2024 Astroinformatics Group