Yet another computation-error problem
Author | Message |
---|---|
Joined: 8 Jan 10 Posts: 21 Credit: 33,211,690 RAC: 0 |
I stopped using my GPU in BOINC when a case fan failed from hard work, but now I've tried to resume. I get immediate computation errors on all MW tasks, but not on Einstein@Home tasks, which run fine. OS: Gentoo Linux, kernel 4.9.76-r1; GPU: Radeon Pro WX 5100 8GB GDDR5; Driver: amdgpu-pro-opencl-17.50.511655 using mesa-17.3.8; Toolkit: wxGTK-3.0.3-r300. Typical error: Computation error (0.929 CPUs + AMD/ATI GPU) ... MilkyWay@Home 1.46 (opencl_ati_101) de_modfit_14_bundle5_NoConstraintsWithDisk... (also with de_modfit_23). stdoutdae.txt shows this (T&D stripped): OpenCL: AMD/ATI GPU 0: AMD Radeon (TM) Pro WX 5100 Graphics (POLARIS10 / DRM 3.8.0 / 4.9.76-gentoo-r1, LLVM 5.0.1) (driver version 17.3.8, device version OpenCL 1.1 Mesa 17.3.8, 16029MB, 16029MB available, 2433 GFLOPS peak) [...] Memory: 31.32 GB physical, 62.47 GB virtual Disk: 39.12 GB total, 26.02 GB free Local time is UTC +1 hours VirtualBox version: 5.2.8_Gentoor120774 Config: don't compute while cc1 is running Config: don't compute while cc1plus is running Config: don't compute while cmake is running [...] Reading preferences override file Preferences: max memory usage when active: 28859.99 MB max memory usage when idle: 30463.32 MB max disk usage: 37.06 GB max download rate: 2621440 bytes/sec max upload rate: 838861 bytes/sec I've searched everywhere I can think of for clues to this, but there's nothing either recent or relevant. I've tried downgrading to amdgpu-pro-opencl-17.40.492261 but it's made no difference. Those are the only two versions available in Gentoo. What else can I try? Rgds Peter. |
Joined: 8 Jan 10 Posts: 21 Credit: 33,211,690 RAC: 0 |
I don't know what's going on here, but something has changed since I wrote the above: now I don't get the computation errors. I have another problem instead, on which I'll ask another question if I can't solve it myself. Rgds Peter. |
Joined: 24 Jan 11 Posts: 716 Credit: 558,801,556 RAC: 31,441 |
I have upgraded an older system and now the MilkyWay milkyway_separation 1.46 Linux x86_64 double OpenCL application does not run on Ubuntu 18.04. It just produces instant errors: <core_client_version>7.4.44</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> <stderr_txt> <search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application> Reading preferences ended prematurely BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation' Setting process priority to 0 (13): Permission denied Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' Switching to Parameter File 'astronomy_parameters.txt' <number_WUs> 4 </number_WUs> <number_params_per_WU> 26 </number_params_per_WU> stream sigma 0.0 is invalid Failed to get stream constants 18:16:08 (11399): called boinc_finish(1) </stderr_txt> ]]> My other projects, SETI and Einstein, are running fine on this system with their respective OpenCL applications. Can a developer look into this problem, please? I would like to continue with MilkyWay if possible. I will run into this same problem again when I upgrade another old system with identical hardware and software. |
Joined: 24 Jan 11 Posts: 716 Credit: 558,801,556 RAC: 31,441 |
I just took a closer look at the errored tasks and found one that ran for longer and has a lot more output in stderr.txt. Maybe someone can look at this and offer a suggestion. It looks like the OpenCL wisdom file couldn't be created properly. Stderr output: <core_client_version>7.4.44</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> <stderr_txt> <search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application> Reading preferences ended prematurely BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation' Setting process priority to 0 (13): Permission denied Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' Switching to Parameter File 'astronomy_parameters.txt' <number_WUs> 5 </number_WUs> <number_params_per_WU> 20 </number_params_per_WU> Using AVX path Found 1 platform Platform 0 information: Name: NVIDIA CUDA Version: OpenCL 1.2 CUDA 9.2.101 Vendor: NVIDIA Corporation Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer Profile: FULL_PROFILE Using device 1 on platform 0 Found 3 CL devices Device 'GeForce GTX 1070' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU) Board: Driver version: 396.24 Version: OpenCL 1.2 CUDA Compute capability: 6.1 Max compute units: 15 Clock frequency: 1683 Mhz Global mem size: 8513978368 Local mem size: 49152 Max const buf size: 65536 Double extension: cl_khr_fp64 Build log: -------------------------------------------------------------------------------- <kernel>:183:72: warning: unknown attribute 'max_constant_size' ignored __constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))), ^ <kernel>:185:62: warning: unknown attribute 'max_constant_size' ignored __constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))), ^ <kernel>:186:67: warning: unknown attribute 'max_constant_size' ignored __constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))), ^ <kernel>:235:26: error: use of undeclared identifier 'inf' tmp = mad((real) Q_INV_SQR, z * z, tmp); /* (q_invsqr * z^2) + (x^2 + y^2) */ ^ <built-in>:35:19: note: expanded from here #define Q_INV_SQR inf ^ -------------------------------------------------------------------------------- clBuildProgram: Build failure (-11): CL_BUILD_PROGRAM_FAILURE Error building program from source (-11): CL_BUILD_PROGRAM_FAILURE Error creating integral program from source Failed to calculate likelihood Background Epsilon (61.817300) must be >= 0, <= 1 18:13:51 (10595): called boinc_finish(1) </stderr_txt> ]]> |
Joined: 16 Mar 10 Posts: 213 Credit: 109,633,250 RAC: 1,174 |
Keith, I popped something in your "New Linux system trashes all tasks" thread in the Number Crunching forum which may or may not help... http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4288 Cheers - Al. |
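
The two stderr outputs quoted in this thread share a few recognizable lines: a "Reading preferences ended prematurely" notice, out-of-range parameter messages, and a build log in which `Q_INV_SQR` expands to `inf`, which the OpenCL compiler rejects as an undeclared identifier. For anyone sorting through many failed tasks, a minimal Python sketch of a triage helper might look like the following. It assumes the stderr text has been copied from the task page by hand; the `triage` function, the `PATTERNS` table and its symptom names are hypothetical and are not part of BOINC or MilkyWay@Home.

```python
import re

# Hypothetical symptom patterns, taken only from messages quoted in this thread.
PATTERNS = {
    "truncated_input": re.compile(r"ended prematurely"),
    "bad_parameter": re.compile(r"(stream sigma .* is invalid|must be >= 0, <= 1)"),
    "non_finite_define": re.compile(r"#define\s+\w+\s+(?:inf|nan)\b", re.IGNORECASE),
    "build_failure": re.compile(r"CL_BUILD_PROGRAM_FAILURE"),
}


def triage(stderr_text):
    """Return a dict mapping each symptom name to the matching snippets."""
    found = {}
    for name, pattern in PATTERNS.items():
        hits = [m.group(0) for m in pattern.finditer(stderr_text)]
        if hits:
            found[name] = hits
    return found


# Lines copied from the two stderr outputs above, merged here for illustration.
SAMPLE = """Reading preferences ended prematurely
stream sigma 0.0 is invalid
#define Q_INV_SQR inf
clBuildProgram: Build failure (-11): CL_BUILD_PROGRAM_FAILURE
Background Epsilon (61.817300) must be >= 0, <= 1
"""

if __name__ == "__main__":
    for symptom, hits in triage(SAMPLE).items():
        print(symptom, "->", hits)
```

Seeing `non_finite_define` or `bad_parameter` alongside `truncated_input` would suggest checking how the task parameters reached the application before suspecting the GPU driver. That reading is an inference from the logs in this thread, not a confirmed diagnosis.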
©2025 Astroinformatics Group