Welcome to MilkyWay@home

Project frequently using wrong graphics board (FP64 on FP32 only system)



JStateson
Joined: 18 Nov 08
Posts: 158
Credit: 960,357,047
RAC: 2,268,197
Message 67429 - Posted: 6 May 2018, 17:12:07 UTC
Last modified: 6 May 2018, 18:04:07 UTC

I have been looking at why I get a lot of invalid results on my S9100 graphics board. Comparing the Task Details of my work unit with those of my wingmen, I noticed a deficiency, probably in how BOINC reports the type of graphics boards and how the project chooses to use that info.

First, this work unit shows 2 errors, 2 valid and 1 (mine) invalid. 3 systems were ATI and 2 were nVidia, and the overall state was marked "too many error possible bug".

Examining each of the "error" systems shows an attempt to use the built-in Intel graphics chipset instead of the "BOINC suggested nVidia". On both systems, the Intel GPU did not support FP64. The system with supposedly two 1060s had over 300 errors with only 27 valid, but the other system had almost 3000 errors and no other results.
I looked at the 27 valid units, and in all 27 the nVidia platform was recognized, unlike the 364 failures. Obvious bug, probably OpenCL? Possibly Milkyway?

This system appears to have 2 GTX 1060s, but that is actually an error in how BOINC determines what is there when there are 2 or more video boards, depending on the OS. Here is a typical Task Report:
<search_application> milkyway_separation 1.46 Windows x86 double OpenCL </search_application>
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
---
Using AVX path
Found 1 platform
Platform 0 information:
  Name:       Intel(R) OpenCL
  Version:    OpenCL 1.2 
  Vendor:     Intel(R) Corporation
  Extensions: cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing
  Profile:    FULL_PROFILE
Didn't find preferred platform
Using device 1 on platform 0
Failed to find number of devices (-1): CL_DEVICE_NOT_FOUND
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
-----this keeps repeating, nVidia is never found nor used although suggested----
-----that Intel graphics board does not support FP64, should have been rejected immediately----
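
The "reject immediately" behaviour asked for above could look roughly like the sketch below. It is only an illustration and not the milkyway_separation source: it models platforms and devices as plain data instead of calling OpenCL, and the names `Device`, `Platform` and `pick_fp64_device` are made up for this example. The idea is to try the BOINC-suggested vendor first, accept only a device that advertises `cl_khr_fp64`, and return nothing at all when no such device is visible.

```python
from dataclasses import dataclass, field
from typing import List, Optional

FP64_EXTENSION = "cl_khr_fp64"  # OpenCL extension name for double precision


@dataclass
class Device:
    # Hypothetical stand-in for an OpenCL device and its extension list.
    name: str
    extensions: List[str] = field(default_factory=list)

    def supports_fp64(self) -> bool:
        return FP64_EXTENSION in self.extensions


@dataclass
class Platform:
    # Hypothetical stand-in for an OpenCL platform.
    vendor: str
    devices: List[Device] = field(default_factory=list)


def pick_fp64_device(platforms: List[Platform],
                     preferred_vendor: str) -> Optional[Device]:
    """Return an FP64-capable device, trying the preferred vendor first.

    Returns None when no visible device supports double precision, so the
    caller can stop early instead of running on an FP32-only device.
    """
    # Stable sort: platforms from the preferred vendor are examined first.
    ordered = sorted(platforms, key=lambda p: p.vendor != preferred_vendor)
    for platform in ordered:
        for device in platform.devices:
            if device.supports_fp64():
                return device
    return None
```

With only an Intel platform that lacks `cl_khr_fp64` in the list, `pick_fp64_device(platforms, "NVIDIA Corporation")` returns `None`, which a caller could report as "no double precision device" rather than falling back to the Intel chipset.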

This system has built-in Intel graphics and also an nVidia 960m, which is capable of double precision (at very low throughput). BOINC, under Linux, does report the correct identities of each graphics board under the host ID, unlike the Windows GTX 1060 systems. Here is the problem:
<search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application>
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
-----
Platform 0 information:
  Name:       Intel Gen OCL Driver
  Version:    OpenCL 1.2 beignet 1.1.1
  Vendor:     Intel
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_spir cl_khr_icd
  Profile:    FULL_PROFILE
Didn't find preferred platform
Using device 0 on platform 0
Found 1 CL device
Device 'Intel(R) HD Graphics Skylake Halo GT2' (Intel:0x8086) (CL_DEVICE_TYPE_GPU)
---this repeats and there is no further mention of the nVidia board----
---that Intel chipset does not support FP64, the 960m should have been used---
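
For anyone comparing Task Details by hand, a small script can flag this pattern in a task's stderr text. The sketch below is an assumption-laden helper, not part of BOINC or Milkyway: the function name `diagnose_task_log` is made up, and it relies only on the two strings visible in the reports above ("Didn't find preferred platform" and the "Extensions:" line). Note that the "Extensions:" line in these reports is printed under the platform information, so it may not match what an individual device supports.

```python
import re


def diagnose_task_log(stderr_text: str) -> dict:
    """Summarise device selection from a task's stderr output.

    Hypothetical helper: it only looks for strings that appear in the
    task reports quoted in this thread.
    """
    extensions = []
    for line in stderr_text.splitlines():
        match = re.match(r"\s*Extensions:\s*(.*)", line)
        if match:
            # Extension names are separated by whitespace on one line.
            extensions.extend(match.group(1).split())
    return {
        "preferred_platform_missing":
            "Didn't find preferred platform" in stderr_text,
        "fp64_advertised": "cl_khr_fp64" in extensions,
    }
```

Run against the Linux report above, this would give `preferred_platform_missing` as true and `fp64_advertised` as false, which is the combination that deserves a closer look.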


So far, those two errors occurred because the FP64 GPU was available but not used (or couldn't be found), so there were actually two "valid" results and one (mine) invalid. The result should have been accepted.

IMHO the project should check for "double precision missing" and mark it as an error, but not use that error as part of any invalidation test.
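
To make that suggestion concrete, here is a rough sketch of how such a tally might work. It is not BOINC's validator code: the outcome labels, the `summarise_workunit` function and the `max_errors` threshold are all invented for illustration. Failures caused by a missing double precision device are still recorded, but they are kept out of the error count that decides whether the work unit is flagged.

```python
from collections import Counter

# Hypothetical outcome labels; these are not BOINC's actual state names.
VALID = "valid"
INVALID = "invalid"
COMPUTE_ERROR = "compute_error"
NO_FP64_DEVICE = "no_fp64_device"


def summarise_workunit(outcomes, max_errors=1, ignore_setup_failures=True):
    """Tally results for one work unit.

    When ignore_setup_failures is True, "no FP64 device" failures are
    reported separately and do not count toward the error limit.
    max_errors is an assumed threshold chosen for this example.
    """
    counts = Counter(outcomes)
    counted_errors = counts[COMPUTE_ERROR]
    if not ignore_setup_failures:
        counted_errors += counts[NO_FP64_DEVICE]
    return {
        "valid": counts[VALID],
        "invalid": counts[INVALID],
        "counted_errors": counted_errors,
        "setup_failures": counts[NO_FP64_DEVICE],
        "too_many_errors": counted_errors > max_errors,
    }
```

Fed the five results from the work unit above (two setup failures, two valid, one invalid), the sketch reports no counted errors, whereas counting the setup failures with the assumed limit of 1 would trip the "too many errors" flag.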


©2019 Astroinformatics Group