I have been looking into why I get so many invalid results on my S9100 graphics board. While comparing the Task Details of my work unit with those of my wingmen, I noticed a deficiency, probably in how BOINC reports the type of graphics boards and in how the project chooses to use that info.
First, this work unit shows 2 errors, 2 valid and 1 (mine) invalid. 3 of the systems are ATI and 2 are nVidia, and the overall state was marked "too many errors, possible bug".
Examining each of the "error" systems shows an attempt to use the built-in Intel graphics chipset instead of the "BOINC suggested nVidia". On both systems, the Intel GPU did not support FP64. The system with supposedly two 1060s had over 300 errors with only 27 valid, but the other system had almost 3000 errors and no other results.
I looked at the 27 valid units, and in all 27 the NVidia platform was recognized, unlike in the 364 failures. Obvious bug, probably in OpenCL? Possibly in Milkyway?
This system appears to have 2 GTX 1060s, but that is actually an error in how BOINC goes about determining what's there when there are 2 or more video boards, depending on the OS. Here is a typical Task Report:
<search_application> milkyway_separation 1.46 Windows x86 double OpenCL </search_application>
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
---
Using AVX path
Found 1 platform
Platform 0 information:
Name: Intel(R) OpenCL
Version: OpenCL 1.2
Vendor: Intel(R) Corporation
Extensions: cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing
Profile: FULL_PROFILE
Didn't find preferred platform
Using device 1 on platform 0
Failed to find number of devices (-1): CL_DEVICE_NOT_FOUND
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
-----this keeps repeating, nVidia is never found nor used although suggested----
-----that Intel graphics board does not support FP64, should have been rejected immediately----
This system has built-in Intel graphics and also an nVidia 960m, which is capable of double precision (at very low performance). BOINC, under Linux, does report the correct identity of each graphics board under the hostid, unlike on the Windows GTX 1060 systems. Here is the problem:
<search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application>
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
-----
Platform 0 information:
Name: Intel Gen OCL Driver
Version: OpenCL 1.2 beignet 1.1.1
Vendor: Intel
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_spir cl_khr_icd
Profile: FULL_PROFILE
Didn't find preferred platform
Using device 0 on platform 0
Found 1 CL device
Device 'Intel(R) HD Graphics Skylake Halo GT2' (Intel:0x8086) (CL_DEVICE_TYPE_GPU)
---this repeats and there is no further mention of the nVidia board----
---that Intel chipset does not support FP64, the 960m should have been used---
So far, those two errors occurred because an FP64 GPU was available but not used (or couldn't be found), so there were actually two "validates" and one (mine) invalid. The result should have been accepted.
IMHO the project should check for "double precision missing" and mark it as an error, but not count that error in any invalidation test.
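To make the selection I would have expected concrete, here is a rough Python sketch. It is not the Milkyway or BOINC code: the `Platform` and `Device` classes, the `pick_device` function and the sample hosts are all hypothetical stand-ins for what an OpenCL query returns. The only real name is `cl_khr_fp64`, the OpenCL extension string a device reports when it supports double precision. The idea is to try the BOINC-suggested vendor first, then any other platform, and to reject every device that lacks FP64 instead of running on it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Device:
    # Hypothetical stand-in for an OpenCL device record.
    name: str
    extensions: List[str] = field(default_factory=list)


@dataclass
class Platform:
    # Hypothetical stand-in for an OpenCL platform record.
    vendor: str
    devices: List[Device] = field(default_factory=list)


def supports_fp64(device: Device) -> bool:
    # cl_khr_fp64 is the extension name advertised for double precision.
    return "cl_khr_fp64" in device.extensions


def pick_device(platforms: List[Platform],
                preferred_vendor: str) -> Optional[Tuple[Platform, Device]]:
    # Stable sort: platforms from the preferred vendor come first,
    # everything else keeps its original order behind them.
    ordered = sorted(platforms, key=lambda p: p.vendor != preferred_vendor)
    for platform in ordered:
        for device in platform.devices:
            if supports_fp64(device):
                return platform, device
    # No FP64-capable device anywhere: refuse to run rather than
    # fall back to a device that cannot do the work.
    return None


# Made-up hosts, loosely modelled on the Linux report above.
intel = Platform("Intel", [Device("Intel HD Graphics", ["cl_khr_icd"])])
nvidia = Platform("NVIDIA Corporation",
                  [Device("GTX 960M", ["cl_khr_fp64", "cl_khr_icd"])])

both_visible = [intel, nvidia]
intel_only = [intel]

choice = pick_device(both_visible, "NVIDIA Corporation")
print(choice[1].name if choice else "no FP64 device, rejecting task")
choice = pick_device(intel_only, "NVIDIA Corporation")
print(choice[1].name if choice else "no FP64 device, rejecting task")
```

With both platforms visible the sketch picks the nVidia board. With only the Intel platform visible it returns nothing, which is the point where I think the task should be rejected immediately.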
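Roughly what I have in mind, as a toy Python model. This is not how the BOINC validator is actually written: the `Result` class, the `no_fp64_device` reason string, the `workunit_state` function and both thresholds are assumptions made up for illustration. The sketch only shows the effect of leaving "double precision missing" errors out of the error count.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical error reasons that say nothing about the work unit itself.
SETUP_FAILURES = {"no_fp64_device"}


@dataclass
class Result:
    host: str
    outcome: str                      # "valid", "invalid" or "error"
    error_reason: Optional[str] = None


def workunit_state(results: List[Result],
                   min_quorum: int = 2,        # assumed threshold
                   max_errors: int = 2,        # assumed threshold
                   count_setup_failures: bool = False) -> str:
    valid = sum(1 for r in results if r.outcome == "valid")
    errors = sum(
        1 for r in results
        if r.outcome == "error"
        and (count_setup_failures or r.error_reason not in SETUP_FAILURES)
    )
    if errors >= max_errors:
        return "too many errors"
    if valid >= min_quorum:
        return "accepted"
    return "pending"


# The work unit from this post: 2 errors, 2 valid, 1 (mine) invalid.
results = [
    Result("intel-gpu-host-1", "error", "no_fp64_device"),
    Result("intel-gpu-host-2", "error", "no_fp64_device"),
    Result("wingman-1", "valid"),
    Result("wingman-2", "valid"),
    Result("my-s9100", "invalid"),
]

print(workunit_state(results, count_setup_failures=True))
print(workunit_state(results))
```

Counting the two FP64 failures pushes this toy work unit into "too many errors". Leaving them out lets the two valid results form a quorum, so it is accepted.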
©2021 Astroinformatics Group