1) Message boards : Number crunching : GPU tasks with AMD ROCm (Message 69556)
Posted 19 Feb 2020 by Šarūnas Burdulis
Post: The open source amdgpu + rocm-opencl stack is still not usable by Milkyway@home. However, the open source amdgpu kernel driver (part of the stock Linux kernel) can be combined with the OpenCL libraries from the AMDGPU-PRO driver package (download from AMD):

    amdgpu-install --opencl=legacy,pal --headless --no-dkms

This works with kernels up to the latest one, 5.6-rc2 as of today.
2) Message boards : Number crunching : GPU tasks with AMD ROCm (Message 66753)
Posted 26 Oct 2017 by Šarūnas Burdulis
Post: Any ideas on how to debug this? The OpenCL platform seems to be there. Here is /var/log/boinc.log on boinc-client startup:

26-Oct-2017 13:38:23 [---] Starting BOINC client version 7.8.3 for x86_64-pc-linux-gnu
26-Oct-2017 13:38:23 [---] log flags: file_xfer, sched_ops, task
26-Oct-2017 13:38:23 [---] Libraries: libcurl/7.55.1 OpenSSL/1.0.2g zlib/1.2.11 libidn2/2.0.2 libpsl/0.18.0 (+libidn2/2.0.2) librtmp/2.3
26-Oct-2017 13:38:23 [---] Data directory: /var/lib/boinc-client
26-Oct-2017 13:38:23 [---] OpenCL: AMD/ATI GPU 0: gfx701 (driver version 1.1 (HSA,LC), device version OpenCL 1.2, 8192MB, 8192MB available, 3696 GFLOPS peak)
26-Oct-2017 13:38:23 [---] Host name: hilbert
26-Oct-2017 13:38:23 [---] Processor: 12 AuthenticAMD AMD Ryzen 5 1600 Six-Core Processor [Family 23 Model 1 Stepping 1]
26-Oct-2017 13:38:23 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic overflow_recov succor smca
26-Oct-2017 13:38:23 [---] OS: Linux Ubuntu: Ubuntu 17.10 [4.11.0-kfd-compute-rocm-rel-1.6-180]
26-Oct-2017 13:38:23 [---] Memory: 15.67 GB physical, 15.95 GB virtual
26-Oct-2017 13:38:23 [---] Disk: 452.26 GB total, 283.99 GB free
26-Oct-2017 13:38:23 [---] Local time is UTC -4 hours
26-Oct-2017 13:38:23 [---] VirtualBox version: 5.1.30_Ubuntur118389
26-Oct-2017 13:38:23 [---] Config: GUI RPCs allowed from:
26-Oct-2017 13:38:23 [Milkyway@Home] URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 734143; resource share 100
26-Oct-2017 13:38:23 [Milkyway@Home] General prefs: from Milkyway@Home (last modified 16-Aug-2017 11:52:13)
26-Oct-2017 13:38:23 [Milkyway@Home] Host location: none
26-Oct-2017 13:38:23 [Milkyway@Home] General prefs: using your defaults
26-Oct-2017 13:38:23 [---] Reading preferences override file
26-Oct-2017 13:38:23 [---] Preferences:
26-Oct-2017 13:38:23 [---] max memory usage when active: 8023.47 MB
26-Oct-2017 13:38:23 [---] max memory usage when idle: 14442.24 MB
26-Oct-2017 13:38:23 [---] max disk usage: 283.91 GB
26-Oct-2017 13:38:23 [---] max CPUs used: 1
26-Oct-2017 13:38:23 [---] (to change preferences, visit a project web site or select Preferences in the Manager)
26-Oct-2017 13:38:23 [---] gui_rpc_auth.cfg is empty - no GUI RPC password protection
26-Oct-2017 13:38:23 Initialization completed
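When checking whether the client itself detected an OpenCL device, it can help to pull just the "OpenCL:" entries out of the startup log. The following is a minimal sketch, not part of BOINC: the function name `find_opencl_devices` is hypothetical, and the pattern is modelled only on the single OpenCL line shown in the log above, so other client versions may format that line differently.

```python
import re

# Pattern modelled on the one "OpenCL:" line in the startup log above.
# Assumption: other BOINC versions may word this line differently.
OPENCL_LINE = re.compile(
    r"OpenCL: (?P<vendor>.+?) GPU (?P<index>\d+): (?P<name>\S+) "
    r"\(driver version (?P<driver>.+?), device version (?P<device>.+?), "
    r"(?P<mem_mb>\d+)MB"
)


def find_opencl_devices(log_text):
    """Return one dict per OpenCL GPU line found in a BOINC startup log."""
    devices = []
    for match in OPENCL_LINE.finditer(log_text):
        info = match.groupdict()
        info["index"] = int(info["index"])
        info["mem_mb"] = int(info["mem_mb"])
        devices.append(info)
    return devices


sample = (
    "26-Oct-2017 13:38:23 [---] OpenCL: AMD/ATI GPU 0: gfx701 "
    "(driver version 1.1 (HSA,LC), device version OpenCL 1.2, "
    "8192MB, 8192MB available, 3696 GFLOPS peak)"
)
for device in find_opencl_devices(sample):
    print(device["index"], device["name"], device["device"], device["mem_mb"])
```

An empty result would suggest the client did not enumerate any OpenCL GPU at startup; a non-empty one, as here, only shows that the client saw the device, not that a science application can use it.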
3) Message boards : Number crunching : GPU tasks with AMD ROCm (Message 66704)
Posted 19 Oct 2017 by Šarūnas Burdulis
Post: I have used AMD GPUs with their AMDGPU-PRO drivers. This works with Linux 4.4 and 4.10 (Ubuntu). One can also use the 4.10 kernel in the latest Ubuntu 17.10b, but while AMDGPU-PRO installs, it causes some issues with desktop apps. So I switched to AMD's open source ROCm and its corresponding OpenCL implementation. Everything seems to work fine, including applications which use OpenCL, e.g. darktable. MW@home GPU tasks, however, are failing (task log below). Has anyone tried ROCm OpenCL? Any ideas on how to 'fix' this?

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
<search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application>
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Setting process priority to 0 (13): Permission denied
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 20 </number_params_per_WU>
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
09:15:21 (13641): called boinc_finish(1)
</stderr_txt>
]]>
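The task log above repeats the same two failures once per work unit, so a short tally makes it easier to see which error codes actually occur. This is a minimal sketch under stated assumptions: `tally_error_codes` is a hypothetical helper, and the pattern only recognises the "(code): UPPER_CASE_NAME" form that appears in this log.

```python
import re
from collections import Counter

# Matches fragments such as "(-1001): CL_PLATFORM_NOT_FOUND_KHR".
# Lines like "(13): Permission denied" are skipped because the name
# must be at least two upper-case/digit/underscore characters.
ERROR_CODE = re.compile(r"\((-?\d+)\): ([A-Z][A-Z0-9_]+)\b")


def tally_error_codes(stderr_text):
    """Count each (code, name) pair that appears in a task's stderr text."""
    return Counter(
        (int(code), name) for code, name in ERROR_CODE.findall(stderr_text)
    )


sample = (
    "Setting process priority to 0 (13): Permission denied\n"
    "Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR\n"
    "Error getting device and context (1): MW_CL_ERROR\n"
)
for (code, name), count in tally_error_codes(sample).items():
    print(code, name, count)
```

In the log above, every attempt fails at the first step, platform enumeration, and the MW_CL_ERROR that follows is the application's own wrapper error for the same failure.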
4) Message boards : News : GPU Issues Mega Thread (Message 66703)
Posted 19 Oct 2017 by Šarūnas Burdulis
Post: I have been successfully running GPU tasks with both AMD (amdgpu-pro) and Nvidia devices, using their provided OpenCL libraries. Yesterday I upgraded one of the AMD workstations to use ROCm and its OpenCL (amdgpu-pro doesn't work on Ubuntu 17.10). The GPU is an RX 480 (Ellesmere/Polaris, or 'gfx803' in ROCm). Since then, MilkyWay@home GPU tasks have been failing. Below is what I see in the task log, along with part of the clinfo output. Let me know if there is already a solution to this or if more info is needed.

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
<search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application>
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Setting process priority to 0 (13): Permission denied
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 20 </number_params_per_WU>
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
Using AVX path
Error getting number of platform (-1001): CL_PLATFORM_NOT_FOUND_KHR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
09:15:21 (13641): called boinc_finish(1)
</stderr_txt>
]]>

clinfo|head -20
Number of platforms                     1
  Platform Name                         AMD Accelerated Parallel Processing
  Platform Vendor                       Advanced Micro Devices, Inc.
  Platform Version                      OpenCL 2.0 AMD-APP (2508.0)
  Platform Profile                      FULL_PROFILE
  Platform Extensions                   cl_khr_icd cl_amd_event_callback
  Platform Extensions function suffix   AMD

  Platform Name                         AMD Accelerated Parallel Processing
Number of devices                       1
  Device Name                           gfx803
  Device Vendor                         Advanced Micro Devices, Inc.
  Device Vendor ID                      0x1002
  Device Version                        OpenCL 1.2
  Driver Version                        1.1 (HSA,LC)
  Device OpenCL C Version               OpenCL C 2.0
  Device Type                           GPU
  Device Profile                        FULL_PROFILE
  Max compute units                     36
  Max clock frequency                   1288MHz
...
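clinfo reports one platform here, while the task reports CL_PLATFORM_NOT_FOUND_KHR, the code that the cl_khr_icd extension defines for the case where the ICD loader finds no platform. On Linux the loader conventionally discovers platforms through .icd files under /etc/OpenCL/vendors, each naming a library to load. One thing that could be compared is what those files contain when read as the login user and as the account that runs BOINC. The sketch below is only a diagnostic aid and does not establish the cause of the failure; `list_icd_entries` is a hypothetical helper, and the default directory is an assumption that may not hold for every loader or distribution.

```python
import os


def list_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name in vendors_dir to the library it names.

    Returns an empty dict if the directory is missing or unreadable.
    A value of None marks a file this user could not read.
    """
    entries = {}
    try:
        names = sorted(os.listdir(vendors_dir))
    except OSError:
        return entries
    for name in names:
        if not name.endswith(".icd"):
            continue
        path = os.path.join(vendors_dir, name)
        try:
            with open(path, "r", encoding="utf-8") as handle:
                entries[name] = handle.read().strip()
        except OSError:
            entries[name] = None
    return entries


if __name__ == "__main__":
    for icd, library in list_icd_entries().items():
        print(icd, "->", library)
```

Running the script under both accounts and comparing the output would show whether they see the same registered libraries; identical output would point the search elsewhere, for example at how the application itself locates OpenCL.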