Welcome to MilkyWay@home

New Nvidia Driver 378.49 Causing Computation Errors

Profile Wrend
Joined: 4 Nov 12
Posts: 96
Credit: 251,528,484
RAC: 0
Message 66185 - Posted: 12 Feb 2017, 22:37:09 UTC

Just a heads up that this driver is causing the MilkyWay@Home 1.43 (opencl_nvidia_101) work units to instantly fail – on my system at any rate.

I went back to the 376.33 Driver and the work units are crunching fine again.

Is anyone else having this issue?

Thanks.
Profile Wrend
Joined: 4 Nov 12
Posts: 96
Credit: 251,528,484
RAC: 0
Message 66193 - Posted: 15 Feb 2017, 1:25:00 UTC

I'm now using the newer drivers that came out today – 378.66 – and everything seems to be running fine with them.
Profile Wrend
Joined: 4 Nov 12
Posts: 96
Credit: 251,528,484
RAC: 0
Message 66194 - Posted: 15 Feb 2017, 2:49:33 UTC

Sorry, it looks like I spoke too soon – and it's too late to edit my previous post. For some reason, all the tasks are now instantly erroring out again with the newer drivers that came out today as well. I'm going to have to roll the driver back again.

I'm assuming the work units will run fine again with the older driver mentioned in the first post, but I will let you know if not.
Profile Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 66206 - Posted: 19 Feb 2017, 1:34:20 UTC
Last modified: 19 Feb 2017, 1:55:06 UTC

I have the same problem: over 900 tasks in total, with 460 erroring out just today within 2 seconds. Yesterday 420 ran OK, taking 2 minutes each.

I have a pair of GTX 1070s, not in SLI. Every now and then in today's batch there is one that is valid, but I have to look hard to find it.

Here is a typical error:
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
Incorrect function.
(0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
<search_application> milkyway_separation 1.43 Windows x86 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 20 </number_params_per_WU>
Using SSE3 path
Found 1 platform
Platform 0 information:
Name: NVIDIA CUDA
Version: OpenCL 1.2 CUDA 8.0.0
Vendor: NVIDIA Corporation
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
Profile: FULL_PROFILE
Using device 1 on platform 0
Found 1 CL device
Requested device is out of range of number found devices
Failed to select a device (1): MW_CL_ERROR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Profile Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 66207 - Posted: 19 Feb 2017, 2:54:21 UTC
Last modified: 19 Feb 2017, 2:56:55 UTC

I just realized that one of my 1070s quit working, which can explain the failures. However, it was working fine on GPUGRID today and completed a long run, then switched to MilkyWay when the queue ran out. Rebooting fixed the second 1070.
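
The stderr in Message 66206 shows the mismatch behind these particular failures: the application was asked to use device 1 while OpenCL reported only one device (index 0). A minimal Python sketch for spotting this in a saved stderr file follows. The function name `check_device_index` is made up for illustration, and the patterns assume the log wording shown in this thread.

```python
import re

def check_device_index(stderr_text):
    """Return (requested, found, in_range) parsed from the stderr text,
    or None if either line is missing.

    Assumes lines shaped like 'Using device 1 on platform 0' and
    'Found 1 CL device' / 'Found 2 CL devices', as seen in this thread.
    """
    requested = re.search(r"Using device (\d+) on platform \d+", stderr_text)
    found = re.search(r"Found (\d+) CL devices?", stderr_text)
    if requested is None or found is None:
        return None
    requested_index = int(requested.group(1))
    device_count = int(found.group(1))
    # Device indices start at 0, so a valid index is strictly below the count.
    return requested_index, device_count, requested_index < device_count

if __name__ == "__main__":
    sample = "Found 1 platform\nUsing device 1 on platform 0\nFound 1 CL device\n"
    print(check_device_index(sample))
```

An out-of-range result like this points at a GPU that has dropped off the system rather than at the driver version itself, which matches what the reboot showed here.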
Profile bcavnaugh
Joined: 14 Feb 14
Posts: 22
Credit: 195,835,315
RAC: 0
Message 66225 - Posted: 6 Mar 2017, 1:14:45 UTC
Last modified: 6 Mar 2017, 1:48:47 UTC

Same problem now on my GTX Titans running driver 378.66.
They ran one time and now they all error out.

Host: http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=561866

Stderr output
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
Error performing inpage operation.
 (0x3e7) - exit code 999 (0x3e7)
</message>
<stderr_txt>
<search_application> milkyway_separation 1.43 Windows x86 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' 
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 20 </number_params_per_WU>
Using AVX path
Found 1 platform
Platform 0 information:
  Name:       NVIDIA CUDA
  Version:    OpenCL 1.2 CUDA 8.0.0
  Vendor:     NVIDIA Corporation
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_khr_gl_event
  Profile:    FULL_PROFILE
Using device 1 on platform 0
Found 2 CL devices
Device 'GeForce GTX TITAN' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Board: 
Driver version:      378.66
Version:             OpenCL 1.2 CUDA
Compute capability:  3.5
Max compute units:   14
Clock frequency:     980 Mhz
Global mem size:     6442450944
Local mem size:      49152
Max const buf size:  65536
Double extension:    cl_khr_fp64
Error creating contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Error getting device and contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Failed to calculate likelihood
Using AVX path
Found 1 platform
Platform 0 information:
  Name:       NVIDIA CUDA
  Version:    OpenCL 1.2 CUDA 8.0.0
  Vendor:     NVIDIA Corporation
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_khr_gl_event
  Profile:    FULL_PROFILE
Using device 1 on platform 0
Found 2 CL devices
Device 'GeForce GTX TITAN' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Board: 
Driver version:      378.66
Version:             OpenCL 1.2 CUDA
Compute capability:  3.5
Max compute units:   14
Clock frequency:     980 Mhz
Global mem size:     6442450944
Local mem size:      49152
Max const buf size:  65536
Double extension:    cl_khr_fp64
Error creating contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Error getting device and contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Failed to calculate likelihood
Using AVX path
Found 1 platform
Platform 0 information:
  Name:       NVIDIA CUDA
  Version:    OpenCL 1.2 CUDA 8.0.0
  Vendor:     NVIDIA Corporation
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_khr_gl_event
  Profile:    FULL_PROFILE
Using device 1 on platform 0
Found 2 CL devices
Device 'GeForce GTX TITAN' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Board: 
Driver version:      378.66
Version:             OpenCL 1.2 CUDA
Compute capability:  3.5
Max compute units:   14
Clock frequency:     980 Mhz
Global mem size:     6442450944
Local mem size:      49152
Max const buf size:  65536
Double extension:    cl_khr_fp64
Error creating contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Error getting device and contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Failed to calculate likelihood
Using AVX path
Found 1 platform
Platform 0 information:
  Name:       NVIDIA CUDA
  Version:    OpenCL 1.2 CUDA 8.0.0
  Vendor:     NVIDIA Corporation
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_khr_gl_event
  Profile:    FULL_PROFILE
Using device 1 on platform 0
Found 2 CL devices
Device 'GeForce GTX TITAN' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Board: 
Driver version:      378.66
Version:             OpenCL 1.2 CUDA
Compute capability:  3.5
Max compute units:   14
Clock frequency:     980 Mhz
Global mem size:     6442450944
Local mem size:      49152
Max const buf size:  65536
Double extension:    cl_khr_fp64
Error creating contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Error getting device and contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Failed to calculate likelihood
Using AVX path
Found 1 platform
Platform 0 information:
  Name:       NVIDIA CUDA
  Version:    OpenCL 1.2 CUDA 8.0.0
  Vendor:     NVIDIA Corporation
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_khr_gl_event
  Profile:    FULL_PROFILE
Using device 1 on platform 0
Found 2 CL devices
Device 'GeForce GTX TITAN' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Board: 
Driver version:      378.66
Version:             OpenCL 1.2 CUDA
Compute capability:  3.5
Max compute units:   14
Clock frequency:     980 Mhz
Global mem size:     6442450944
Local mem size:      49152
Max const buf size:  65536
Double extension:    cl_khr_fp64
Error creating contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Error getting device and contextTrying to show unknown cl_int 999
 (999): Unknown cl_int
Failed to calculate likelihood
17:57:07 (5188): called boinc_finish(999)

</stderr_txt>
]]>
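
When comparing failing hosts, it can help to pull the key facts out of a stderr dump like the one above. The sketch below is only an illustration and is not part of BOINC or MilkyWay@home: `summarize_stderr` is a hypothetical name, and it relies solely on the line formats visible in this log.

```python
import re

def summarize_stderr(text):
    """Collect the driver version(s), the number of failed attempts,
    and the final exit code from a MilkyWay-style stderr dump.

    Assumes the 'Driver version:', 'Failed to calculate likelihood',
    and 'called boinc_finish(N)' lines look like those in this thread.
    """
    drivers = sorted(set(re.findall(r"Driver version:\s*([\d.]+)", text)))
    failed_attempts = text.count("Failed to calculate likelihood")
    finish = re.search(r"called boinc_finish\((\d+)\)", text)
    exit_code = int(finish.group(1)) if finish else None
    return {
        "drivers": drivers,
        "failed_attempts": failed_attempts,
        "exit_code": exit_code,
    }

if __name__ == "__main__":
    sample = (
        "Driver version:      378.66\n"
        "Failed to calculate likelihood\n"
        "Driver version:      378.66\n"
        "Failed to calculate likelihood\n"
        "17:57:07 (5188): called boinc_finish(999)\n"
    )
    print(summarize_stderr(sample))
```

Run on the log above, this would report driver 378.66, five failed attempts, and exit code 999.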



Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.
Profile bcavnaugh
Joined: 14 Feb 14
Posts: 22
Credit: 195,835,315
RAC: 0
Message 66226 - Posted: 6 Mar 2017, 2:42:35 UTC

This is bigger than this project: after one set of this project's tasks ran, it seems that I cannot run any BOINC GPU tasks anymore.

http://forums.evga.com/Geforce-37866-Drivers-BOINC-GPU-Projects-Fail-m2624784.aspx

Profile bcavnaugh
Joined: 14 Feb 14
Posts: 22
Credit: 195,835,315
RAC: 0
Message 66228 - Posted: 6 Mar 2017, 20:30:10 UTC

Back to Driver 373.06 and all is fine.

Profile Wrend
Joined: 4 Nov 12
Posts: 96
Credit: 251,528,484
RAC: 0
Message 66257 - Posted: 31 Mar 2017, 16:14:52 UTC
Last modified: 31 Mar 2017, 16:15:41 UTC

I tried the newer driver 378.92 and it seemed to be running fine for at least an hour before all the work units started erroring out yet again. Definitely something unfortunate going on with these newer drivers.

I'm using two Titan Black cards in SLI, the prefer maximum performance setting, 2x, 3x, and 4x DSR, double precision optimization, and the Nvidia patch to force enable PCIe gen 3 on an i7-3930K CPU.

I'll be rolling the drivers back to 376.33 yet again and will let you know if I have any issues with it.
Profile mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,950,815
RAC: 21,521
Message 66258 - Posted: 1 Apr 2017, 10:27:03 UTC - in response to Message 66257.  

BOINC does NOT benefit from using SLI. Turn it off if you are not a gamer; it's better to treat each GPU separately.
Profile Wrend
Joined: 4 Nov 12
Posts: 96
Credit: 251,528,484
RAC: 0
Message 66275 - Posted: 5 Apr 2017, 5:15:30 UTC - in response to Message 66258.  
Last modified: 5 Apr 2017, 5:33:26 UTC

Yes, I am aware of this, and we probably had this conversation several years ago too, back when I was running SLIed 680s. :)

It's still good to mention it, though, for other users in general, as SLI is a waste of VRAM capacity and could be an issue for people with insufficient VRAM.

I was mentioning the specifics of my setup there in case it helps troubleshoot what the issues with these newer drivers are. The older drivers 376.33 are still running fine for me and crunching right along with the same settings.
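
The driver reports in this thread can be collected into a small lookup, sketched below in Python. The table only records what posters here said about their own systems (Wrend and bcavnaugh); it is not an official compatibility list, and the names `REPORTS`, `thread_status`, and `newest_reported_ok` are made up for illustration.

```python
# Driver versions as reported in this thread; not an official compatibility list.
REPORTS = {
    "373.06": "ok",      # bcavnaugh, after rolling back
    "376.33": "ok",      # Wrend, after rolling back
    "378.49": "errors",  # Wrend
    "378.66": "errors",  # Wrend and bcavnaugh
    "378.92": "errors",  # Wrend, after about an hour of normal running
}

def version_key(version):
    """Turn '376.33' into (376, 33) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def thread_status(version):
    """Return 'ok', 'errors', or 'unreported' for a driver version."""
    return REPORTS.get(version, "unreported")

def newest_reported_ok():
    """Return the highest driver version that a poster reported as working."""
    ok_versions = [v for v, status in REPORTS.items() if status == "ok"]
    return max(ok_versions, key=version_key)

if __name__ == "__main__":
    print(thread_status("378.66"))
    print(newest_reported_ok())
```

Comparing versions as integer tuples rather than strings keeps the ordering correct if a minor number ever has a different digit count.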