CL_OUT_OF_HOST_MEMORY with AMD RX 6600 XT on Xubuntu 20.04
Author | Message |
---|---|
Joined: 4 Mar 18 Posts: 23 Credit: 265,230,226 RAC: 14,976 |
Is anyone running Milkyway@Home on an AMD RX 6600 XT under Ubuntu 20.04? I am getting a lot of computation errors, and they all turn out to be "CL_OUT_OF_HOST_MEMORY". Computer info: * Xubuntu 20.04, kernel 5.13.19 * AMD OpenCL (ROCm 5.1.1) installed with amdgpu-install from: https://repo.radeon.com/amdgpu-install/22.10.1/ubuntu/focal/ * 32 GB of RAM * GPU: AMD Radeon RX 6600 XT 8 GB * The APU (AMD integrated GPU) is not enabled in the BIOS; only the discrete RX 6600 XT is used. Interestingly, the Einstein@Home project can run its GPU apps without errors. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"Is there anyone running Milkyway@Home with AMD RX 6600 XT in Ubuntu 20.04?" I believe that's a programming error in the tasks themselves, not a problem with your GPU. |
Joined: 23 May 09 Posts: 4 Credit: 16,387 RAC: 0 |
I realize that this is an old thread, but is there any word on what we should do about this error? I have the same problem on my Radeon RX 6750 XT. Every task attempted fails with the CL_OUT_OF_HOST_MEMORY error. Should I let my computer burn through the bugged tasks? Should I disable the GPU for this project? Do the researchers know about this problem? Thanks! |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"I realize that this is an old thread, but is there any word on what we should do about this error? I have the same problem on my Radeon RX 6750 XT. Every task attempted fails with the CL_OUT_OF_HOST_MEMORY error. Should I let my computer burn through the bugged tasks? Should I disable the GPU for this project? Do the researchers know about this problem?" In hindsight, both of you could be running GPUs that are just too old to crunch here, unless you have one of the newer ones with 12 GB of onboard RAM. |
Joined: 23 May 09 Posts: 4 Credit: 16,387 RAC: 0 |
I just bought my GPU last week, and it has 12GB of onboard RAM. Einstein@Home jobs run just fine. Any other thoughts? Thanks |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"I just bought my GPU last week, and it has 12GB of onboard RAM. Einstein@Home jobs run just fine. Any other thoughts?" No, I don't, sorry. |
Joined: 23 May 09 Posts: 4 Credit: 16,387 RAC: 0 |
Ok, thanks for your help. |
Joined: 23 May 09 Posts: 4 Credit: 16,387 RAC: 0 |
In case anyone else runs into this problem and happens to find this page: it took extensive research, but I think I know what's happened. In ~2017, AMD came out with a new OpenCL stack called Radeon Open Compute runtime (ROCr), and started building it into new GPUs, specifically anything newer than a Vega 10. In ~2019, the AMD GPU Linux driver was updated to deprecate the "legacy" OpenCL stack in favour of ROCr. According to the AMD GPU driver page, the legacy OpenCL stack doesn't support anything newer than the Vega 10; newer GPUs must use ROCr. Since the apps for MilkyWay@Home haven't been updated since 2019, I assume that they haven't been updated to use ROCr, and therefore won't run on any of the newer AMD GPUs under Linux. This would also explain why, if you look at the GPU Models page under the Computing menu above, all the AMD GPUs listed as running under Linux are older than the Vega 10. Long story short, MilkyWay@Home needs to update its apps. |
Joined: 28 May 22 Posts: 17 Credit: 402,111,833 RAC: 0 |
Maybe a dumb idea, but could you run Windows in a virtual machine and put BOINC and AMD's OpenCL-compatible drivers on it? Martin |
Joined: 28 Jun 16 Posts: 1 Credit: 145,448 RAC: 0 |
I got the same errors on my RX 5600XT with ROCm. |
Joined: 1 Oct 14 Posts: 3 Credit: 20,121,618 RAC: 2,129 |
Yeah, I see. Thanks, man! I have the exact same problem with my RX 6600 on Ubuntu 22.10. Installed the latest drivers using amdgpu-install. It runs handsomely on Einstein and PrimeGrid, but no luck with Milkyway. So sad. <core_client_version>7.20.2</core_client_version> <![CDATA[ <message> process exited with code 250 (0xfa, -6)</message> <stderr_txt> <search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application> Reading preferences ended prematurely BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.' Setting process priority to 0 (13): Permission denied Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' Switching to Parameter File 'astronomy_parameters.txt' <number_WUs> 5 </number_WUs> <number_params_per_WU> 20 </number_params_per_WU> Using AVX path Found 1 platform Platform 0 information: Name: AMD Accelerated Parallel Processing Version: OpenCL 2.1 AMD-APP (3513.0) Vendor: Advanced Micro Devices, Inc. Extensions: cl_khr_icd cl_amd_event_callback Profile: FULL_PROFILE Using device 0 on platform 0 Found 1 CL device Device 'gfx1032' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Board: AMD Radeon RX 6600 Driver version: 3513.0 (HSA1.1,LC) Version: OpenCL 2.0 Compute capability: 0.0 Max compute units: 14 Clock frequency: 2750 Mhz Global mem size: 8573157376 Local mem size: 65536 Max const buf size: 7287183768 Double extension: cl_khr_fp64 Error creating command queue (-6): CL_OUT_OF_HOST_MEMORY Error getting device and context (-6): CL_OUT_OF_HOST_MEMORY |
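The stderr in the post above shows the failure happening while the command queue is created, before any work is done. For anyone comparing many failed results, here is a minimal Python sketch that pulls the OpenCL error lines out of a saved copy of such output. The function name `find_cl_errors` and the regular expression are illustrative assumptions based only on the log format shown in this thread; they are not part of BOINC or the Milkyway application.

```python
import re

# Matches fragments such as:
#   "Error creating command queue (-6): CL_OUT_OF_HOST_MEMORY"
# Assumption: the pattern is inferred from the logs posted in this thread only.
CL_ERROR_RE = re.compile(
    r"Error (?P<stage>[^()]+?) \((?P<code>-?\d+)\): (?P<name>CL_[A-Z_]+)"
)


def find_cl_errors(stderr_text):
    """Return (stage, code, name) tuples for every OpenCL error in the text."""
    return [
        (m.group("stage"), int(m.group("code")), m.group("name"))
        for m in CL_ERROR_RE.finditer(stderr_text)
    ]


# Sample text copied from the tail of the log above.
sample = (
    "Double extension: cl_khr_fp64 "
    "Error creating command queue (-6): CL_OUT_OF_HOST_MEMORY "
    "Error getting device and context (-6): CL_OUT_OF_HOST_MEMORY"
)

for stage, code, name in find_cl_errors(sample):
    print(f"{name} ({code}) while {stage}")
```

Grouping results by the `stage` field would show whether every failure occurs at queue creation, which points at the driver setup rather than at the computation itself.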
Joined: 24 Jan 11 Posts: 715 Credit: 554,885,540 RAC: 36,448 |
I believe it is just a permissions issue with the ROCr drivers, which have the OpenCL component in a different location from the legacy AMD OpenCL drivers. You would have to get some AMD compute experts to chime in and verify that. I remember reading about the issue somewhere, on some project, but don't know where to point you to. |
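If permissions are indeed the cause, one thing worth checking is whether the account that runs the BOINC client can open the GPU device nodes. The following is a minimal Python sketch under stated assumptions: that the client runs as a user named `boinc`, and that the ROCm device nodes are restricted to the `render` and `video` groups. Neither assumption is confirmed anywhere in this thread, and the helper names are hypothetical.

```python
import grp
import pwd

# Assumption: access to the ROCm device nodes is limited to these groups.
REQUIRED_GROUPS = ("render", "video")


def missing_groups(user_groups, required=REQUIRED_GROUPS):
    """Return the required groups that are absent from user_groups."""
    return [g for g in required if g not in user_groups]


def groups_of(username):
    """Return the names of all groups the user belongs to."""
    primary_gid = pwd.getpwnam(username).pw_gid
    # Supplementary groups list the user explicitly as a member.
    names = {g.gr_name for g in grp.getgrall() if username in g.gr_mem}
    # The primary group is stored in the passwd entry instead.
    names.add(grp.getgrgid(primary_gid).gr_name)
    return names


if __name__ == "__main__":
    user = "boinc"  # assumption: the account the BOINC client runs under
    try:
        print(missing_groups(groups_of(user)))
    except KeyError:
        print(f"no such user or group entry for: {user}")
```

If the printed list is not empty, adding the account to those groups (for example with `usermod -aG`) and restarting the client would be the usual next step. Whether that resolves this particular error is not verified here.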
Joined: 18 Nov 22 Posts: 84 Credit: 640,530,847 RAC: 0 |
"In case anyone else runs into this problem and happens to find this page, it took extensive research, but I think I know what's happened." I think you're very confused about what ROCm and ROCr actually are. Your post implies it's something to do with hardware, with your comment "started building it into new GPUs". This is not true. ROCm and ROCr are just the software/drivers, nothing to do with hardware. People generally don't run newer AMD GPUs because AMD started nerfing the FP64 capabilities of their new cards, and it's just not worth it to run them here. The older cards just perform better. If you go out past the top 100, you'll see some Navi and Big Navi cards working on the project. The problem is drivers, not the application. I'm betting that a full, true ROCm install (NOT ROCr from the amdgpu installer) would work. Unfortunately, AMD Linux drivers are a bit of a mess in this regard, with OpenCL support coming from multiple drivers (amdgpu, ROCm, Mesa), each with its own drawbacks and limitations. The application itself does have a kind of flaw, though: not one that makes it fail outright, but one in memory management, which is likely the reason for the memory errors seen by the OP. These tasks don't use much VRAM for each context, but each task is a prepackaged group of 5 tasks. When the subsequent internal "jobs" run, the contexts, and hence the VRAM used, are not released until the task has fully completed all 5. This is undoubtedly not necessary for the task to function; it's holding old data in VRAM for no reason. By the time the task completes, the 5x tasks are taking up ~1500 MB of VRAM. If you're running multiples (as most people do) for best performance, you can easily run out of VRAM and get this error. An 8 GB card could only run 5 tasks at a time safely, MAYBE 6 if the tasks remain staggered. |
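The "5 tasks on an 8 GB card" figure in the post above follows from simple division. A small Python sketch of that arithmetic, using the ~1500 MB per-task peak quoted in the post; the function name `max_concurrent_tasks` is made up for illustration, and real usage will vary with driver overhead and how the tasks are staggered.

```python
def max_concurrent_tasks(vram_mb, per_task_peak_mb=1500):
    """How many tasks fit if every one of them is at its VRAM peak at once."""
    return vram_mb // per_task_peak_mb


# 8 GB card, worst case where all tasks hold their full ~1500 MB.
print(max_concurrent_tasks(8 * 1024))  # 5
```

The "MAYBE 6" case in the post corresponds to tasks that are staggered, so that not all of them sit at their peak at the same moment.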
Joined: 8 Nov 22 Posts: 3 Credit: 8,617,255 RAC: 1,373 |
Dear all, I believe I ran into a similar issue with an AMD Radeon RX 7900 XTX: <core_client_version>7.20.5</core_client_version> <![CDATA[ <message> process exited with code 250 (0xfa, -6)</message> <stderr_txt> <search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application> Reading preferences ended prematurely BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.' Setting process priority to 0 (13): Permission denied Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' Switching to Parameter File 'astronomy_parameters.txt' <number_WUs> 5 </number_WUs> <number_params_per_WU> 20 </number_params_per_WU> Using AVX path Found 1 platform Platform 0 information: Name: AMD Accelerated Parallel Processing Version: OpenCL 2.1 AMD-APP (3513.0) Vendor: Advanced Micro Devices, Inc. Extensions: cl_khr_icd cl_amd_event_callback Profile: FULL_PROFILE Using device 0 on platform 0 Found 2 CL devices Device 'gfx1100' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Board: Radeon RX 7900 XTX Driver version: 3513.0 (HSA1.1,LC) Version: OpenCL 2.0 Compute capability: 0.0 Max compute units: 48 Clock frequency: 3220 Mhz Global mem size: 25753026560 Local mem size: 65536 Max const buf size: 21890072576 Double extension: cl_khr_fp64 Error creating command queue (-6): CL_OUT_OF_HOST_MEMORY Error getting device and context (-6): CL_OUT_OF_HOST_MEMORY Failed to calculate likelihood Using AVX path Found 1 platform Platform 0 information: Name: AMD Accelerated Parallel Processing Version: OpenCL 2.1 AMD-APP (3513.0) Vendor: Advanced Micro Devices, Inc. 
Extensions: cl_khr_icd cl_amd_event_callback Profile: FULL_PROFILE Using device 0 on platform 0 Found 2 CL devices Device 'gfx1100' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Board: Radeon RX 7900 XTX Driver version: 3513.0 (HSA1.1,LC) Version: OpenCL 2.0 Compute capability: 0.0 Max compute units: 48 Clock frequency: 3220 Mhz Global mem size: 25753026560 Local mem size: 65536 Max const buf size: 21890072576 Double extension: cl_khr_fp64 Error creating command queue (-6): CL_OUT_OF_HOST_MEMORY Error getting device and context (-6): CL_OUT_OF_HOST_MEMORY Failed to calculate likelihood Using AVX path Found 1 platform Platform 0 information: Name: AMD Accelerated Parallel Processing Version: OpenCL 2.1 AMD-APP (3513.0) Vendor: Advanced Micro Devices, Inc. Extensions: cl_khr_icd cl_amd_event_callback Profile: FULL_PROFILE Using device 0 on platform 0 Found 2 CL devices Device 'gfx1100' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Board: Radeon RX 7900 XTX Driver version: 3513.0 (HSA1.1,LC) Version: OpenCL 2.0 Compute capability: 0.0 Max compute units: 48 Clock frequency: 3220 Mhz Global mem size: 25753026560 Local mem size: 65536 Max const buf size: 21890072576 Double extension: cl_khr_fp64 Error creating command queue (-6): CL_OUT_OF_HOST_MEMORY Error getting device and context (-6): CL_OUT_OF_HOST_MEMORY Failed to calculate likelihood Using AVX path Found 1 platform Platform 0 information: Name: AMD Accelerated Parallel Processing Version: OpenCL 2.1 AMD-APP (3513.0) Vendor: Advanced Micro Devices, Inc. 
Extensions: cl_khr_icd cl_amd_event_callback Profile: FULL_PROFILE Using device 0 on platform 0 Found 2 CL devices Device 'gfx1100' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Board: Radeon RX 7900 XTX Driver version: 3513.0 (HSA1.1,LC) Version: OpenCL 2.0 Compute capability: 0.0 Max compute units: 48 Clock frequency: 3220 Mhz Global mem size: 25753026560 Local mem size: 65536 Max const buf size: 21890072576 Double extension: cl_khr_fp64 Error creating command queue (-6): CL_OUT_OF_HOST_MEMORY Error getting device and context (-6): CL_OUT_OF_HOST_MEMORY Failed to calculate likelihood Using AVX path Found 1 platform Platform 0 information: Name: AMD Accelerated Parallel Processing Version: OpenCL 2.1 AMD-APP (3513.0) Vendor: Advanced Micro Devices, Inc. Extensions: cl_khr_icd cl_amd_event_callback Profile: FULL_PROFILE Using device 0 on platform 0 Found 2 CL devices Device 'gfx1100' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Board: Radeon RX 7900 XTX Driver version: 3513.0 (HSA1.1,LC) Version: OpenCL 2.0 Compute capability: 0.0 Max compute units: 48 Clock frequency: 3220 Mhz Global mem size: 25753026560 Local mem size: 65536 Max const buf size: 21890072576 Double extension: cl_khr_fp64 Error creating command queue (-6): CL_OUT_OF_HOST_MEMORY Error getting device and context (-6): CL_OUT_OF_HOST_MEMORY Failed to calculate likelihood 18:10:55 (9902): called boinc_finish(-6) </stderr_txt> ]]> BOINC is version 7.20.5 installed on Ubuntu 22.04, with latest ROCm drivers: https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.4.3/page/How_to_Install_ROCm.html As reported by others, Einstein@Home GPU tasks work fine. Best regards, Samuel |
Joined: 24 Jan 11 Posts: 715 Credit: 554,885,540 RAC: 36,448 |
Don't feel like you just lack the knowledge to figure out OpenCL on the Radeon 7900 XTX cards. Even Michael Larabel at Phoronix, who is an actual Linux wiz, couldn't get the ROCm drivers to run OpenCL tests without these "out of memory" errors. https://www.phoronix.com/review/nvidia-rtx4080-rtx4090-compute "Besides many of the binary-only (CUDA) benchmarks being incompatible with the AMD ROCm compute stack, even for the common OpenCL benchmarks there were problems testing the latest driver build; the Radeon RX 7900 XTX was hitting OpenCL 'out of host memory' errors when initializing the OpenCL driver with the RDNA3 GPUs. So with those issues plus the AMD ROCm compute stack still being hit or miss depending upon the particular consumer GPU, this article ended up just being a generational look at the NVIDIA compute performance on Ubuntu Linux." I really feel anyone who is still trying to tough it out getting the newer AMD cards to do BOINC OpenCL projects is just a glutton for punishment. Much simpler to use Nvidia cards which 'just work' and get on with crunching. There really is no difference in FP64 capabilities anymore in the latest generation of consumer cards from either camp. |
Joined: 2 Mar 20 Posts: 131 Credit: 320,183,524 RAC: 13,434 |
I'm certainly no authority on this problem, but have you tried an older driver version? Just maybe, the newer ones don't agree with the Milkyway app or BOINC. Sorry I can't be of any real help. Good luck! Allen |
Joined: 19 Jul 10 Posts: 623 Credit: 19,260,717 RAC: 522 |
"There really is no difference in FP64 capabilities anymore in the latest generation of consumer cards from either camp." I don't agree with that: AMD Radeon RX 7950 XTX 2.534 TFLOPS (1:32), NVIDIA GeForce RTX 4090 1,290 GFLOPS (1:64), so just half of the AMD card and nearly 100 W higher TDP. The AMD Radeon RX 7900 XTX still has 1.919 TFLOPS FP64, i.e. ~1.5x the RTX 4090 at 62% of the price. Those are huge differences. "Much simpler to use Nvidia cards which 'just work' and get on with crunching." Perhaps even simpler: use Windows. ;-) (sorry, could not resist) |
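The ratios in the post above can be checked with a few lines of Python. The FP64 figures are taken as quoted in the post and are not independently verified here; the dictionary and the function name `fp64_ratio` are illustrative only.

```python
# Theoretical FP64 throughput in TFLOPS, as quoted in the post above.
fp64_tflops = {
    "RX 7950 XTX": 2.534,
    "RX 7900 XTX": 1.919,
    "RTX 4090": 1.290,
}


def fp64_ratio(card, baseline="RTX 4090"):
    """Theoretical FP64 throughput of card relative to baseline."""
    return fp64_tflops[card] / fp64_tflops[baseline]


for card in ("RX 7950 XTX", "RX 7900 XTX"):
    print(f"{card}: {fp64_ratio(card):.2f}x the RTX 4090")
```

This gives roughly 1.96x and 1.49x, matching the "just half" and "~1.5x" statements. These are paper specifications only; actual task run times depend on the application and drivers.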
Joined: 18 Nov 22 Posts: 84 Credit: 640,530,847 RAC: 0 |
"'There really is no difference in FP64 capabilities anymore in the latest generation of consumer cards from either camp.' I don't agree with that: AMD Radeon RX 7950 XTX 2.534 TFLOPS (1:32), NVIDIA GeForce RTX 4090 1,290 GFLOPS (1:64), so just half of the AMD card and nearly 100 W higher TDP. The AMD Radeon RX 7900 XTX still has 1.919 TFLOPS FP64, i.e. ~1.5x the RTX 4090 at 62% of the price. Those are huge differences." You’d have to look at actual power use. It is very likely that the full TDP is not being pulled when running on the 4090. But the spirit of the comment is still valid. Both AMD and Nvidia are slashing the FP64 capabilities of their consumer cards. AMD not as much as Nvidia, but they are still doing it to a large extent. Older Nvidia cards still reign supreme here though: P100s for the budget option, or Titan V for higher density. |
Joined: 19 Jul 10 Posts: 623 Credit: 19,260,717 RAC: 522 |
"You’d have to look at actual power use. Very likely that full TDP is not being pulled to run on the 4090." I'm even pretty sure that the full TDP isn't pulled while crunching, in particular here with FP64 load (my GTX 275 is quite a bit warmer when crunching Moo!, for example), but that's the same for AMD cards. |
Joined: 24 Jan 11 Posts: 715 Credit: 554,885,540 RAC: 36,448 |
I couldn't care less about theoretical FP64 specifications. I would just examine the actual 1X computation times for both cards. You won't see the 4090 card turning in 2X the computation time of the 7950 XTX. |
©2024 Astroinformatics Group