"Failed to calculate integral 0 Failed to calculate likelihood" errors

VictordeHollander · Joined: 9 Nov 10 · Posts: 19 · Credit: 71,077,081 · RAC: 0
Message 67324 - Posted: 11 Apr 2018, 16:31:33 UTC

Hi,

Does anybody know what is causing the "Failed to calculate integral 0 Failed to calculate likelihood" errors?

    Likelihood time = 2.088974 s
    <background_integral3> 0.000135044562893 </background_integral3>
    <stream_integral3> 73.177979946439066 191.871483289292200 122.601402164019433 </stream_integral3>
    <background_likelihood3> -3.327750849128395 </background_likelihood3>
    <stream_only_likelihood3> -3.332280687700035 -4.324665411311142 -8.682139791419360 </stream_only_likelihood3>
    <search_likelihood3> -2.929028725467456 </search_likelihood3>
    Using SSE3 path
    Found 1 platform
    Platform 0 information:
    Name: AMD Accelerated Parallel Processing
    Version: OpenCL 2.0 AMD-APP (1912.5)
    Vendor: Advanced Micro Devices, Inc.
    Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
    Profile: FULL_PROFILE
    Using device 0 on platform 0
    Found 1 CL device
    Device 'Tahiti' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
    Board: AMD Radeon HD 7900 Series
    Driver version: 1912.5 (VM)
    Version: OpenCL 1.2 AMD-APP (1912.5)
    Compute capability: 0.0
    Max compute units: 28
    Clock frequency: 800 Mhz
    Global mem size: 2896491072
    Local mem size: 32768
    Max const buf size: 65536
    Double extension: cl_khr_fp64
    Estimated AMD GPU GFLOP/s: 2867 SP GFLOP/s, 717 DP FLOP/s
    Using a target frequency of 60.0
    Using a block size of 7168 with 78 blocks/chunk
    Using clWaitForEvents() for polling (mode -1)
    Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
    Iteration area: 560000
    Chunk estimate: 1
    Num chunks: 2
    Chunk size: 559104
    Added area: 558208
    Effective area: 1118208
    Initial wait: 20 ms
    Integration time: 11.634570 s. Average time per iteration = 36.358030 ms
    Integral 0 time = 11.993249 s
    Failed to calculate integral 0
    Failed to calculate likelihood

For instance, in this WU: http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=2308241462

The task calculates a lot of streams successfully and then fails at that point, apparently at random. The system has 600+ valid tasks, some invalid ones (32 at the moment) and a few with error status (7). The tasks that failed have different de_modfit_XX names, so the failures seem to happen at random.

Is this a hardware, driver or BOINC issue?

OS: Ubuntu 14.04
GPU: AMD HD7950
BOINC: 7.9.3
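One way to check whether the failures really are random is to count how often the message appears across a set of saved task logs. The following is only a minimal sketch: it assumes the stderr text of each task has been copied by hand from the result page into its own .txt file in one directory (BOINC does not create these files), and the function name count_failed_logs is made up for illustration.

```shell
#!/bin/bash
# Count how many saved task logs contain the failure message.
# Assumption: every *.txt file in the given directory holds the stderr
# output of exactly one task, saved manually from the result page.
count_failed_logs() {
    local dir=$1 total=0 failed=0 f
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue          # directory has no .txt files
        total=$((total + 1))
        if grep -q "Failed to calculate likelihood" "$f"; then
            failed=$((failed + 1))
        fi
    done
    echo "$failed of $total logs contain a likelihood failure"
}
```

Calling count_failed_logs with the directory that holds the saved logs prints a single summary line, which makes it easy to compare the failure rate before and after a driver or priority change.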

VictordeHollander · Joined: 9 Nov 10 · Posts: 19 · Credit: 71,077,081 · RAC: 0
Message 67334 - Posted: 15 Apr 2018, 16:08:42 UTC

I installed Windows 10 Pro (1709) on the same hardware and it now runs without errors (700+ valid tasks). Previously it would produce the error above in about 1 of every 20 tasks (so 1 "Failed to calculate likelihood" in about 100 WUs/streams).

Now that I know the hardware is fine, I suspect it is one of these:

1. The AMD graphics card drivers for Linux (I used the .deb package for Ubuntu 14.04.2).
2. The BOINC client (7.9.3 on Ubuntu vs. 7.8.3 on Win10).
3. Priority. Ubuntu runs BOINC and its subprocesses at "nice 10", which is lower than normal, while Windows runs them at Normal priority (equivalent to nice 0). The lower priority could mean the task waits too long for CPU time and errors out. I can change the nice level of the boinc-client to 0 on Ubuntu with superuser commands, but every OpenCL process/WU starts with nice 10 again.

VictordeHollander · Joined: 9 Nov 10 · Posts: 19 · Credit: 71,077,081 · RAC: 0
Message 67335 - Posted: 15 Apr 2018, 16:12:45 UTC

or

4. The Milkyway OpenCL Linux executable.
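The priority hypothesis can be checked by looking at the nice value the science application actually runs with. The sketch below assumes the procps tools pgrep and ps are installed; the function name show_nice is invented for this example, and the pattern "milkyway" in the usage note is only a guess at how the executable name starts.

```shell
#!/bin/bash
# Print PID, nice value and command name of every process whose name
# matches the given pattern. Returns non-zero when nothing matches.
show_nice() {
    local pattern=$1 pids
    # pgrep -d, joins the matching PIDs with commas, the form ps -p accepts
    pids=$(pgrep -d, -- "$pattern") || return 1
    # A separate -o per column with an empty header suppresses the header line
    ps -o pid= -o ni= -o comm= -p "$pids"
}
```

Running show_nice milkyway while a task is active should list the task with its nice value in the second column, which shows directly whether a changed nice level survives the start of the next work unit.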

Keith Myers · Joined: 24 Jan 11 · Posts: 716 · Credit: 557,681,598 · RAC: 30,565
Message 67507 - Posted: 19 May 2018, 20:49:09 UTC - in response to Message 67334.

I may have a solution/suggestion for you. I run a bash script at startup that keeps CPU affinity and scheduling priority assigned for my Seti applications. It uses a tool called schedtool, which can be installed from the repository. I simply set the nice level of each application that I want to run at a fixed priority: I lower the priority of the CPU apps and raise the priority of the GPU apps. This is what the file looks like. It should give you an idea for a similar script that targets your specific application and raises its scheduling priority.

    #!/bin/bash
    # Run in a root terminal, NOT with sudo.
    nvidia-smi -pm 1    # enable persistence mode on the NVIDIA GPUs

    for (( ; ; ))
    do
        # Assign CPU priority (19 = nice/low priority, 0 = normal, -20 = high priority)
        # This was code Petri gave out
        # GPU tasks get high priority
        schedtool -n -20 $(pidof setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda90)
        schedtool -n -20 $(pidof astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100)
        # CPU tasks get (a little) below-normal priority (0 being normal)
        # to make sure they don't choke the OS
        schedtool -n 5 $(pidof ap_7.05r2728_sse3_linux64)
        schedtool -n 5 $(pidof MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu)

        # Assign CPU threads (0-15)
        # Brent added this to Petri's code
        # Keep GPU tasks on threads 1 3 5 7 9 11 13 15
        schedtool -a 1,3,5,7,9,11,13,15 $(pidof setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda90)
        schedtool -a 1,3,5,7,9,11,13,15 $(pidof astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100)
        # Keep CPU tasks on threads 0 2 4 6 8 10 12 14
        schedtool -a 0,2,4,6,8,10,12,14 $(pidof MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu)
        schedtool -a 0,2,4,6,8,10,12,14 $(pidof ap_7.05r2728_sse3_linux64)

        # CPU Priority Assignment Script
        date
        # lscpu | grep MHz
        sleep 5
        echo " CPU Priority and Assignment Script (16 Threads)"
    done

Run it from a root terminal, then minimize the terminal and leave it running. It loops every 5 seconds so that it picks up each newly started task. You would have to replace the nvidia-smi persistence-mode line with whatever equivalent, if any, your AMD cards need.
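The same idea can be adapted to a single application without schedtool. This is a hedged sketch rather than a tested fix for the MilkyWay errors: it uses the standard renice and pgrep tools, the helper name set_nice_for is made up, and the pattern "milkyway" in the commented loop is an assumption about the executable name. On Linux (util-linux) renice normally treats the value after -n as an absolute nice level; other systems may treat it as an increment. Raising priority (values below the current one) needs root, while lowering it does not.

```shell
#!/bin/bash
# Set the nice level of every process whose name matches a pattern.
# Unlike the backtick/pidof form, this copes with zero or several
# matching processes, because each PID is handled on its own.
set_nice_for() {
    local level=$1 name=$2 pid
    for pid in $(pgrep -- "$name"); do
        renice -n "$level" -p "$pid" >/dev/null
    done
}

# Hypothetical use, run as root, repeating so that new tasks are caught:
#   while true; do set_nice_for 0 milkyway; sleep 5; done
```

Handling each PID separately means a machine that runs two GPU tasks at once gets both adjusted, and an idle moment with no matching task simply does nothing instead of printing a usage error.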