Welcome to MilkyWay@home

Segmentation Fault with OpenCL

Verona Group [VENETO]
Joined: 31 Dec 09
Posts: 15
Credit: 18,268,497
RAC: 1,030
Message 74291 - Posted: 27 Sep 2022, 11:56:24 UTC

Hello,

I get a segmentation fault at the beginning of the OpenCL calculation. Here is one example:

<core_client_version>7.20.2</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)</message>
<stderr_txt>
<search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Setting process priority to 0 (13): Permission denied
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' 
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 20 </number_params_per_WU>
Using SSE3 path
ATTENTION: default value of option radeonsi_no_infinite_interp overridden by environment.
Gallium debugger active. Logging all calls.
Hang detection timeout is 1000ms.
Gallium debugger active. Logging all calls.
Hang detection timeout is 1000ms.
Gallium debugger active. Logging all calls.
Hang detection timeout is 1000ms.
Gallium debugger active. Logging all calls.
Hang detection timeout is 1000ms.
Found 1 platform
Platform 0 information:
  Name:       Clover
  Version:    OpenCL 1.2 Mesa 22.1.7
  Vendor:     Mesa
  Extensions: cl_khr_icd
  Profile:    FULL_PROFILE
Didn't find preferred platform
Using device 0 on platform 0
Found 1 CL device
Device 'AMD VERDE (LLVM 14.0.6, DRM 2.50, 5.19.10-arch1-1)' (AMD:0x1002) (CL_DEVICE_TYPE_GPU)
Board: 
Driver version:      22.1.7
Version:             OpenCL 1.2 Mesa 22.1.7
Compute capability:  0.0
Max compute units:   6
Clock frequency:     800 Mhz
Global mem size:     2143076352
Local mem size:      32768
Max const buf size:  67108864
Double extension:    cl_khr_fp64
SIGSEGV: segmentation violation

Exiting...

</stderr_txt>
]]>
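A side note on reading the first line of the log: the three numbers in "code 193 (0xc1, -63)" appear to be one value shown three ways (decimal, hex, and as a signed 8-bit number), so they add nothing beyond the SIGSEGV line further down. A minimal Python sketch of that arithmetic; the function name is made up for illustration and is not part of BOINC:

```python
def decode_exit_status(status):
    """Return the hex and signed 8-bit views of a one-byte exit status."""
    if not 0 <= status <= 255:
        raise ValueError("exit status must fit in one byte")
    # Values of 128 and above wrap to negative numbers as a signed byte.
    signed = status - 256 if status >= 128 else status
    return hex(status), signed


print(decode_exit_status(193))  # ('0xc1', -63)
```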


Is there a verbose flag I can enable?

Thank you

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,306,047
RAC: 20,507
Message 74292 - Posted: 27 Sep 2022, 12:13:53 UTC - in response to Message 74291.  

Has this setup ever crunched here before? Over at Einstein@Home they say that the Mesa drivers are NOT the way to go for crunching. I don't know if that helps or not.

Verona Group [VENETO]
Message 74293 - Posted: 27 Sep 2022, 13:05:56 UTC - in response to Message 74292.  


No, this is the first time. I have a new GPU.

The GPU architecture is GCN 1.0, and the new AMD driver doesn't support this architecture anymore, so I have to use the Mesa Clover driver.

mikey
Message 74295 - Posted: 27 Sep 2022, 17:04:57 UTC - in response to Message 74293.  


Have you installed the OpenCL packages yet? If so, then you need to find someone who's better at Linux than I am. You might try searching the other projects' forums for how to do it; you may find the answer at some other project, and then you can post it here for everyone else.

Keith Myers
Joined: 24 Jan 11
Posts: 708
Credit: 543,183,946
RAC: 142,711
Message 74296 - Posted: 27 Sep 2022, 17:11:43 UTC - in response to Message 74295.  

The Mesa drivers are showing OpenCL 1.2 support. But AFAIK, the Mesa OpenCL component has never worked on BOINC projects.

An extra concern is that the platform is the Arch distro. They lock things down and do it only their way, from what I've read.

My usual response for more common distros is to ditch the Mesa drivers and go with the official AMD driver packages.

If you were running a common Debian derivative, then I would say install the OpenCL ICD package:
sudo apt-get install ocl-icd-libopencl1
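To see what the ICD loader would pick up on a given machine, something like the following Python sketch may help. It assumes the usual Linux convention that vendor drivers register themselves with small text files in /etc/OpenCL/vendors; that path and the function name are assumptions here, not something taken from the log above.

```python
import ctypes.util
import glob
import os


def list_icd_files(vendor_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name to the driver library it names."""
    drivers = {}
    for path in sorted(glob.glob(os.path.join(vendor_dir, "*.icd"))):
        with open(path) as fh:
            # An .icd file is plain text holding a library name or path.
            drivers[os.path.basename(path)] = fh.read().strip()
    return drivers


if __name__ == "__main__":
    # find_library returns None when no OpenCL loader library is found.
    print("ICD loader library:", ctypes.util.find_library("OpenCL"))
    for name, lib in list_icd_files().items():
        print(f"{name} -> {lib}")
```

An empty result would suggest no OpenCL driver is registered at all. A Mesa entry only confirms what the log already reports (platform "Clover"); it does not explain the crash.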

About all the help I can offer.

Verona Group [VENETO]
Message 74304 - Posted: 28 Sep 2022, 18:34:48 UTC - in response to Message 74296.  

I signed up for Einstein@Home and completed a GPU task with the Mesa driver without any problem.

So I think my problem is either with the Milkyway@Home code or that M@H needs more resources (memory?).
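On the memory question, the sizes in the log are reported in bytes. A quick Python sketch to convert them into readable units; the helper name is made up for illustration. It only converts the numbers. Whether the Separation app needs more than this is not something the log shows.

```python
def human_size(num_bytes):
    """Format a byte count using binary units (KiB, MiB, GiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop at the first unit that fits, or at GiB as the largest unit.
        if size < 1024 or unit == "GiB":
            return f"{size:.2f} {unit}"
        size /= 1024


# Values copied from the stderr output of the failed task.
reported = {
    "Global mem size": 2143076352,
    "Local mem size": 32768,
    "Max const buf size": 67108864,
}
for label, value in reported.items():
    print(f"{label}: {human_size(value)}")
```

So the card reports roughly 2 GiB of global memory, 32 KiB of local memory, and a 64 MiB constant buffer.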

thank you all

Jim1348
Joined: 9 Jul 17
Posts: 100
Credit: 16,967,906
RAC: 0
Message 74305 - Posted: 28 Sep 2022, 18:57:44 UTC - in response to Message 74304.  
Last modified: 28 Sep 2022, 19:00:14 UTC

The last time I checked, there was not much point in upgrading my RX 570 (Ubuntu 20.04.5) because the timeout on MW limited the speed anyway.
Maybe that has changed now? Maybe someone with a new card will post.
https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=925986&offset=0&show_names=0&state=4&appid=

mikey
Message 74306 - Posted: 28 Sep 2022, 21:19:23 UTC - in response to Message 74305.  

Yours is doing about this (columns: status, run time in seconds, CPU time in seconds, credit, application):
Completed and validated 117.57 9.86 227.14 Milkyway@home Separation v1.46 (opencl_ati_101) x86_64-pc-linux-gnu

Here's what a 580 does:
Completed and validated 94.78 25.20 227.14 Milkyway@home Separation v1.46 (opencl_ati_101) windows_x86_64

Here's what my 980 does:
Completed and validated 246.16 90.78 227.11 Milkyway@home Separation v1.46 (opencl_nvidia_101) x86_64-pc-linux-gnu

Jim1348
Message 74307 - Posted: 28 Sep 2022, 21:31:19 UTC - in response to Message 74306.  

Thanks. That is a useful point of comparison.

mikey
Message 74311 - Posted: 29 Sep 2022, 10:02:09 UTC - in response to Message 74307.  

What model of GPU were you looking at? I have a few other models I can put on here to run a couple of tasks for comparison too, if you'd like.

Keith Myers
Message 74321 - Posted: 30 Sep 2022, 6:06:53 UTC

This is what an RTX 3080 does at 2X concurrent (two tasks running at once) on Separation.

126.72 81.02 227.14 Milkyway@home Separation v1.46 (opencl_nvidia_101) x86_64-pc-linux-gnu
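To put the rows posted in this thread on the same footing, here is a rough Python sketch. It assumes the three numbers in each row are run time (s), CPU time (s), and credit, and that every card except the 3080 ran one task at a time; the posts do not state that, so treat the comparison as approximate.

```python
# (card, run time in seconds, credit per task, tasks running at once)
# Run times and credits are copied from the rows posted in this thread;
# the "1" for the first three cards is an assumption.
results = [
    ("RX 570", 117.57, 227.14, 1),
    ("580", 94.78, 227.14, 1),
    ("980", 246.16, 227.11, 1),
    ("RTX 3080", 126.72, 227.14, 2),
]


def credit_per_hour(run_time_s, credit, concurrent=1):
    """Credit earned per hour when `concurrent` tasks finish every run_time_s."""
    return credit / run_time_s * 3600 * concurrent


for card, run_time_s, credit, concurrent in results:
    rate = credit_per_hour(run_time_s, credit, concurrent)
    print(f"{card}: {rate:.0f} credit/hour")
```

The point of the `concurrent` factor is that a 126.72 s run time at 2X works out to one finished task about every 63 s, so the 3080 row is not directly comparable to the single-task rows without it.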

©2024 Astroinformatics Group