Welcome to MilkyWay@home

New Separation Runs

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 71717 - Posted: 10 Feb 2022, 4:22:40 UTC
Last modified: 10 Feb 2022, 4:24:10 UTC

I've added the other runs (de_modfit_<71-86>_bundle5_3s_south_pt2_2). The first one that I put up was already returning successful completions from crunchers, so these should all behave the same way.

Thanks for your patience, and a shout-out to Al for correctly figuring out the problem.

Also, a heads-up: these tasks will have -np 100 rather than -np 104, so make sure to edit any scripts to account for that. If it does the weird 7-task thing again, it will spit out -np 400 instead of -np 416.
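
For anyone updating scripts: the two -np values match the number of bundled workunits times the parameters per workunit reported in the stderr logs further down this thread (4 x 26 for the previous runs, 5 x 20 for these). A minimal sketch of that arithmetic, assuming -np is simply that product; the function name is hypothetical and not part of any MilkyWay@home tooling:

```python
def expected_np(number_wus: int, params_per_wu: int) -> int:
    """Total parameter count, ASSUMING -np = bundled WUs x parameters per WU."""
    return number_wus * params_per_wu


# Inputs are the <number_WUs> / <number_params_per_WU> values quoted in this thread.
print(expected_np(4, 26))  # previous runs -> 104
print(expected_np(5, 20))  # new runs      -> 100
```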
ID: 71717
mg13 [HWU]
Joined: 22 Oct 09
Posts: 7
Credit: 20,215,177
RAC: 811
Message 71720 - Posted: 10 Feb 2022, 13:28:05 UTC - in response to Message 71717.  

I wanted to bring to Tom's attention some errors that appear in the stderr_txt log of successful WUs, both the most recently created ones and the earlier ones. Here is an example:

101372052 Activities
<core_client_version>7.16.20</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_separation 1.46 Windows x86 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 4 </number_WUs>
<number_params_per_WU> 26 </number_params_per_WU>
Using SSE4.1 path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 2.1 AMD-APP (3302.6)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'gfx1010:xnack-' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon RX 5700 XT 50th Anniversary
Driver version: 3302.6 (PAL,LC)
Version: OpenCL 1.2 AMD-APP (3302.6)
Compute capability: 0.0
Max compute units: 20
Clock frequency: 1830 Mhz
Global mem size: 3221225472
Local mem size: 65536
Max const buf size: 3221225472
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
C:\Users\MG\AppData\Local\Temp\comgr-2def57\input\CompileSource:183:72: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-2def57\input\CompileSource:185:62: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-2def57\input\CompileSource:186:67: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 warnings generated.

--------------------------------------------------------------------------------
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100

Using a target frequency of 60.0
Using a block size of 5120 with 7 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 14
Num chunks: 16
Chunk size: 35840
Added area: 13440
Effective area: 573440
Initial wait: 12 ms
Integration time: 22.525731 s. Average time per iteration = 70.392911 ms
Integral 0 time = 22.926961 s
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 5120 with 1 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 10, r_steps = 700 }
Iteration area: 7000
Chunk estimate: 1
Num chunks: 2
Chunk size: 5120
Added area: 3240
Effective area: 10240
Initial wait: 0 ms
Integration time: 0.622743 s. Average time per iteration = 1.946071 ms
Integral 1 time = 0.671665 s
Running likelihood with 31964 stars
Likelihood time = 0.881139 s
<background_integral> 0.000011578806962 </background_integral>
<stream_integral> 0.000000000000018 4.676746998945490 22.100569923848273 23.169267332870888 </stream_integral>
<background_likelihood> -3.504938847511503 </background_likelihood>
<stream_only_likelihood> -160.335902972520900 -88.935250948764889 -5.504389023994463 -4.553139945975357 </stream_only_likelihood>
<search_likelihood> -2.127413849501802 </search_likelihood>
Using SSE4.1 path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 2.1 AMD-APP (3302.6)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'gfx1010:xnack-' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon RX 5700 XT 50th Anniversary
Driver version: 3302.6 (PAL,LC)
Version: OpenCL 1.2 AMD-APP (3302.6)
Compute capability: 0.0
Max compute units: 20
Clock frequency: 1830 Mhz
Global mem size: 3221225472
Local mem size: 65536
Max const buf size: 3221225472
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
C:\Users\MG\AppData\Local\Temp\comgr-3e8728\input\CompileSource:183:72: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-3e8728\input\CompileSource:185:62: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-3e8728\input\CompileSource:186:67: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 warnings generated.

--------------------------------------------------------------------------------
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 5120 with 7 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 14
Num chunks: 16
Chunk size: 35840
Added area: 13440
Effective area: 573440
Initial wait: 12 ms
Integration time: 22.231152 s. Average time per iteration = 69.472350 ms
Integral 0 time = 22.597274 s
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 5120 with 1 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 10, r_steps = 700 }
Iteration area: 7000
Chunk estimate: 1
Num chunks: 2
Chunk size: 5120
Added area: 3240
Effective area: 10240
Initial wait: 0 ms
Integration time: 0.628210 s. Average time per iteration = 1.963156 ms
Integral 1 time = 0.669651 s
Running likelihood with 31964 stars
Likelihood time = 0.702059 s
<background_integral1> 0.000011298997624 </background_integral1>
<stream_integral1> 0.000000000000018 4.718137250742583 22.200288867106067 22.095712766296991 </stream_integral1>
<background_likelihood1> -3.503428995149281 </background_likelihood1>
<stream_only_likelihood1> -160.382577964957190 -92.273044140147903 -5.604255756215713 -4.575021296422328 </stream_only_likelihood1>
<search_likelihood1> -2.127744253550517 </search_likelihood1>
Using SSE4.1 path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 2.1 AMD-APP (3302.6)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'gfx1010:xnack-' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon RX 5700 XT 50th Anniversary
Driver version: 3302.6 (PAL,LC)
Version: OpenCL 1.2 AMD-APP (3302.6)
Compute capability: 0.0
Max compute units: 20
Clock frequency: 1830 Mhz
Global mem size: 3221225472
Local mem size: 65536
Max const buf size: 3221225472
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
C:\Users\MG\AppData\Local\Temp\comgr-611d43\input\CompileSource:183:72: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-611d43\input\CompileSource:185:62: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-611d43\input\CompileSource:186:67: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 warnings generated.

--------------------------------------------------------------------------------
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 5120 with 7 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 14
Num chunks: 16
Chunk size: 35840
Added area: 13440
Effective area: 573440
Initial wait: 12 ms
Integration time: 22.273608 s. Average time per iteration = 69.605025 ms
Integral 0 time = 22.595202 s
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 5120 with 1 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 10, r_steps = 700 }
Iteration area: 7000
Chunk estimate: 1
Num chunks: 2
Chunk size: 5120
Added area: 3240
Effective area: 10240
Initial wait: 0 ms
Integration time: 0.603242 s. Average time per iteration = 1.885131 ms
Integral 1 time = 0.645424 s
Running likelihood with 31964 stars
Likelihood time = 0.859699 s
<background_integral2> 0.000011026183914 </background_integral2>
<stream_integral2> 0.000000000000018 4.584594365941732 20.180881313076032 23.432775015272508 </stream_integral2>
<background_likelihood2> -3.524231368066430 </background_likelihood2>
<stream_only_likelihood2> -160.301925237461060 -90.620901173636867 -5.710452295409489 -4.465951711195169 </stream_only_likelihood2>
<search_likelihood2> -2.128120330810599 </search_likelihood2>
Using SSE4.1 path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 2.1 AMD-APP (3302.6)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'gfx1010:xnack-' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon RX 5700 XT 50th Anniversary
Driver version: 3302.6 (PAL,LC)
Version: OpenCL 1.2 AMD-APP (3302.6)
Compute capability: 0.0
Max compute units: 20
Clock frequency: 1830 Mhz
Global mem size: 3221225472
Local mem size: 65536
Max const buf size: 3221225472
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
C:\Users\MG\AppData\Local\Temp\comgr-946fa1\input\CompileSource:183:72: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-946fa1\input\CompileSource:185:62: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-946fa1\input\CompileSource:186:67: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 warnings generated.

--------------------------------------------------------------------------------
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 5120 with 7 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 14
Num chunks: 16
Chunk size: 35840
Added area: 13440
Effective area: 573440
Initial wait: 12 ms
Integration time: 22.314301 s. Average time per iteration = 69.732190 ms
Integral 0 time = 22.641366 s
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 5120 with 1 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 10, r_steps = 700 }
Iteration area: 7000
Chunk estimate: 1
Num chunks: 2
Chunk size: 5120
Added area: 3240
Effective area: 10240
Initial wait: 0 ms
Integration time: 0.606541 s. Average time per iteration = 1.895440 ms
Integral 1 time = 0.645266 s
Running likelihood with 31964 stars
Likelihood time = 0.848916 s
<background_integral3> 0.000011112955957 </background_integral3>
<stream_integral3> 0.000000000000018 4.746189554196099 22.757572218739661 23.780151288055961 </stream_integral3>
<background_likelihood3> -3.520082813378394 </background_likelihood3>
<stream_only_likelihood3> -160.242420984124520 -85.978986494325142 -5.439495829712062 -4.504009759915403 </stream_only_likelihood3>
<search_likelihood3> -2.127477774341335 </search_likelihood3>
20:38:04 (14020): called boinc_finish(0)

</stderr_txt>
]]>

I just wanted to know whether these errors are normal or whether they can cause problems, even though the WU appears to process normally.
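
As a side check, the chunk figures printed in the log above are consistent with each other. The sketch below reproduces them from the step counts and block sizes. The relationships are inferred from the printed numbers only, not from the Separation source, and the function name is hypothetical. The "Chunk estimate" line is not reproduced because the log does not show how it is derived.

```python
def chunk_layout(mu_steps: int, r_steps: int, block_size: int, blocks_per_chunk: int) -> dict:
    """Recompute the chunk figures from one 'Range' / 'block size' pair in the log."""
    area = mu_steps * r_steps                   # "Iteration area"
    chunk_size = block_size * blocks_per_chunk  # "Chunk size"
    num_chunks = -(-area // chunk_size)         # ceiling division, "Num chunks"
    effective = num_chunks * chunk_size         # "Effective area"
    return {
        "iteration_area": area,
        "chunk_size": chunk_size,
        "num_chunks": num_chunks,
        "effective_area": effective,
        "added_area": effective - area,         # padding, "Added area"
    }


# Integral 0: mu_steps = 800, r_steps = 700, block size 5120 with 7 blocks/chunk
print(chunk_layout(800, 700, 5120, 7))
# Integral 1: mu_steps = 10, r_steps = 700, block size 5120 with 1 block/chunk
print(chunk_layout(10, 700, 5120, 1))
```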
ID: 71720
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 71724 - Posted: 10 Feb 2022, 16:14:44 UTC - in response to Message 71720.  
Last modified: 10 Feb 2022, 16:17:00 UTC

I'll go through them in order:

Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 4 </number_WUs>
<number_params_per_WU> 26 </number_params_per_WU>


This isn't really an error; it's just the program saying "I didn't find a Lua script, so I'm going to use the parameters file instead". This is actually what we want (the Lua script is outdated, and we use parameter files now). This shows up in every workunit.

Build log:
--------------------------------------------------------------------------------
C:\Users\MG\AppData\Local\Temp\comgr-2def57\input\CompileSource:183:72: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-2def57\input\CompileSource:185:62: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\Users\MG\AppData\Local\Temp\comgr-2def57\input\CompileSource:186:67: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 warnings generated.


The warnings that you quoted are compiler warnings, so they aren't expected to cause any problems. They just mean that the code uses something the compiler doesn't recognize (here, the 'max_constant_size' attribute, which it ignores). This shows up in every workunit.

--------------------------------------------------------------------------------
Estimated AMD GPU GFLOP/s: 366 SP GFLOP/s, 73 DP FLOP/s
Warning: Bizarrely low flops (73). Defaulting to 100


This one seems to be related to your graphics card, because it doesn't show up in every workunit. Before it starts calculating, Separation estimates how long the WU will take to run, based on how many FLOP/s (floating-point operations per second) your GPU can do. If that figure is below some expected threshold, it apparently says so and then warns you that it is using a different number for that estimate (here, 100 instead of 73). This could be due to the graphics card that you're using, or to a quirk in how Separation calculates the FLOP/s. Either way, I don't expect it to change anything about the actual calculation.
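
For anyone scanning their own stderr output, the distinction above can be sketched as a small line classifier. This is only an illustration: the pattern lists are assumptions built from the messages quoted in this thread (the "fatal" ones come from the failed task posted further down), not an official list, and the names are hypothetical.

```python
import re

# Messages described above as harmless (assumed list, taken from this thread).
BENIGN_PATTERNS = [
    r"^Error loading Lua script",
    r"warning: unknown attribute",
    r"^Warning: Bizarrely low flops",
]

# Messages that accompany a failed task in the log posted later in this thread.
FATAL_PATTERNS = [
    r"CL_BUILD_PROGRAM_FAILURE",
    r"^Failed to calculate likelihood",
]


def classify(line: str) -> str:
    """Return 'fatal', 'benign' or 'other' for a single stderr line."""
    stripped = line.strip()
    if any(re.search(p, stripped) for p in FATAL_PATTERNS):
        return "fatal"
    if any(re.search(p, stripped) for p in BENIGN_PATTERNS):
        return "benign"
    return "other"


print(classify("Warning: Bizarrely low flops (73). Defaulting to 100"))
print(classify("clBuildProgram: Build failure (-11): CL_BUILD_PROGRAM_FAILURE"))
```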
ID: 71724
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71725 - Posted: 10 Feb 2022, 19:33:58 UTC

I'm getting a small number of tasks erroring out after a few seconds. Are these just leftovers resent from before? This is one:
https://milkyway.cs.rpi.edu/milkyway/result.php?resultid=108288769

<core_client_version>7.16.20</core_client_version>
<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>
<stderr_txt>
<search_application> milkyway_separation 1.46 Windows x86 double OpenCL </search_application>
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' 
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 20 </number_params_per_WU>
Using SSE4.1 path
Found 1 platform
Platform 0 information:
  Name:       AMD Accelerated Parallel Processing
  Version:    OpenCL 2.1 AMD-APP (3240.6)
  Vendor:     Advanced Micro Devices, Inc.
  Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices 
  Profile:    FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'Tahiti' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon R9 200 Series
Driver version:      3240.6
Version:             OpenCL 1.2 AMD-APP (3240.6)
Compute capability:  0.0
Max compute units:   32
Clock frequency:     1070 Mhz
Global mem size:     3221225472
Local mem size:      32768
Max const buf size:  65536
Double extension:    cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
"C:\Users\pc\AppData\Local\Temp\OCL12448T1.cl", line 235: error: identifier
          "inf" is undefined
          tmp = mad((real) Q_INV_SQR, z * z, tmp);   /* (q_invsqr * z^2) + (x^2 + y^2) */
                           ^

1 error detected in the compilation of "C:\Users\pc\AppData\Local\Temp\OCL12448T1.cl".
Frontend phase failed compilation.

--------------------------------------------------------------------------------
clBuildProgram: Build failure (-11): CL_BUILD_PROGRAM_FAILURE
Error building program from source (-11): CL_BUILD_PROGRAM_FAILURE
Error creating integral program from source
Failed to calculate likelihood
Background Epsilon (34.080700) must be >= 0, <= 1
17:07:46 (12448): called boinc_finish(1)

</stderr_txt>
]]>
ID: 71725
Keith Myers
Joined: 24 Jan 11
Posts: 696
Credit: 540,012,208
RAC: 86,812
Message 71726 - Posted: 10 Feb 2022, 19:46:39 UTC - in response to Message 71725.  

Yes, that one is a leftover resend from the initial bad batch. Ignore it since Tom and Al have figured out the issue and Tom has put up new stripes with the fixed parameter sets.
ID: 71726
Max_Pirx
Joined: 13 Dec 17
Posts: 46
Credit: 2,421,362,376
RAC: 0
Message 71728 - Posted: 10 Feb 2022, 20:43:47 UTC

Well, I find these bad leftovers still problematic, as they cause my client to back off exponentially for some reason.
ID: 71728
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71729 - Posted: 10 Feb 2022, 20:52:44 UTC - in response to Message 71728.  

Well, I find these bad leftovers still problematic, as they cause my client to back off exponentially for some reason.

There must be something in BOINC that does that if there are a lot of errors, to stop you from overloading a server with many mistakes which could be your fault. You can manually override it by updating the project, but that's kinda hard when we can only get a couple of hours' work at a time. I set up a Windows event on one of mine to update the project every 15 minutes.
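
To illustrate what "backing off exponentially" usually means, here is a generic sketch. It is not BOINC's actual algorithm: the base delay and the cap are made-up values (the 3-hour cap is chosen only because a 3-hour backoff is mentioned elsewhere in this thread), and real clients often add random jitter.

```python
def backoff_delay(consecutive_failures: int,
                  base_seconds: float = 60.0,
                  cap_seconds: float = 3 * 3600.0) -> float:
    """Generic exponential backoff: the delay doubles per failure, up to a cap.

    base_seconds and cap_seconds are illustrative assumptions only.
    """
    if consecutive_failures <= 0:
        return 0.0
    return min(cap_seconds, base_seconds * 2 ** (consecutive_failures - 1))


# Delay after 1, 2, 3, ... failures in a row
for n in range(1, 10):
    print(n, backoff_delay(n))
```

A manual project update, as described above, effectively resets the wait instead of letting it keep growing.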
ID: 71729
kotenok2000
Joined: 22 May 11
Posts: 66
Credit: 5,635,044
RAC: 46
Message 71730 - Posted: 10 Feb 2022, 21:46:29 UTC - in response to Message 71729.  

cd e:\Program Files\BOINC
e:
:loop
TIMEOUT /T 5 /nobreak
boinccmd.exe --project https://milkyway.cs.rpi.edu/milkyway update

TIMEOUT /T 100 /nobreak
goto loop
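
A rough cross-platform equivalent of the batch loop above, sketched in Python. It assumes boinccmd is on the PATH (otherwise pass the full path), uses a single sleep interval instead of the two TIMEOUT calls, and the function names are hypothetical. The runner and sleeper parameters exist only so the loop can be tried out without touching a real client.

```python
import subprocess
import time

PROJECT_URL = "https://milkyway.cs.rpi.edu/milkyway"


def build_update_command(boinccmd: str = "boinccmd", project_url: str = PROJECT_URL) -> list:
    """Same command line as in the batch file: boinccmd --project <URL> update."""
    return [boinccmd, "--project", project_url, "update"]


def update_loop(interval_seconds: float = 100,
                runner=subprocess.run,
                sleeper=time.sleep,
                max_iterations=None) -> int:
    """Request a project update, sleep, repeat. Returns the number of updates sent."""
    count = 0
    while max_iterations is None or count < max_iterations:
        # check=False: a failed update (e.g. client not running) should not stop the loop
        runner(build_update_command(), check=False)
        sleeper(interval_seconds)
        count += 1
    return count


# To run it for real (loops until interrupted):
#     update_loop()
```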
ID: 71730
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71731 - Posted: 10 Feb 2022, 22:03:06 UTC - in response to Message 71730.  

cd e:\Program Files\BOINC
e:
:loop
TIMEOUT /T 5 /nobreak
boinccmd.exe --project https://milkyway.cs.rpi.edu/milkyway update

TIMEOUT /T 100 /nobreak
goto loop

Doesn't that leave a command prompt window open? I use Task Scheduler (on startup, then every 15 minutes).
ID: 71731
kotenok2000
Joined: 22 May 11
Posts: 66
Credit: 5,635,044
RAC: 46
Message 71732 - Posted: 10 Feb 2022, 22:05:04 UTC - in response to Message 71731.  

I run a Tor node through a .bat file too.
ID: 71732
Keith Myers
Joined: 24 Jan 11
Posts: 696
Credit: 540,012,208
RAC: 86,812
Message 71733 - Posted: 10 Feb 2022, 23:48:06 UTC

We are still going to get the leftover resends from the last batch, and the resends from the initial new batch with the badly formatted parameter set, until they all hit their too-many-errors limit, at which point they will finally be retired and not sent back out again.

Patience is required.
ID: 71733
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71734 - Posted: 10 Feb 2022, 23:51:48 UTC - in response to Message 71733.  
Last modified: 10 Feb 2022, 23:52:14 UTC

We are still going to get the leftover resends from the last batch, and the resends from the initial new batch with the badly formatted parameter set, until they all hit their too-many-errors limit, at which point they will finally be retired and not sent back out again.

Patience is required.

Not that they bother me (it's a very small number, which isn't triggering a backoff here, and they don't waste processing time), but surely the admin can cancel them so they aren't resent? Or is the BOINC server not that clever? I'm sure some projects actually cancel tasks when you have them on your computer.
ID: 71734
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 71735 - Posted: 11 Feb 2022, 0:40:30 UTC

I just went to cancel them, but it looks like the majority of them have already left the pool. Cancelling the rest is nontrivial because I can't query on the run names: the old and new runs both contain the same substring "south_pt2". Unfortunately, we'll just have to let them error out naturally.
ID: 71735
Keith Myers
Joined: 24 Jan 11
Posts: 696
Credit: 540,012,208
RAC: 86,812
Message 71736 - Posted: 11 Feb 2022, 4:28:10 UTC - in response to Message 71735.  

That's unfortunate. I keep getting slugs of bad tasks that knock a host offline for 3 hours in backoff unless I manually intervene with an update.
ID: 71736
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 71737 - Posted: 11 Feb 2022, 4:37:01 UTC

I've started to remove them but it's slow-going - I can only cancel 999 at a time, and there are ~82k WUs from the original runs out there right now. Just going to throw on some TV and hit the cancel button over and over for a little while.
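
To put a number on "over and over": with a limit of 999 per cancel and roughly 82k WUs, that works out to about 83 rounds. A quick sketch of the arithmetic, treating ~82k as exactly 82,000 (an approximation); the function name is hypothetical:

```python
def batches_needed(total_items: int, batch_limit: int) -> int:
    """Number of rounds needed when at most batch_limit items go per round."""
    return -(-total_items // batch_limit)  # ceiling division


print(batches_needed(82_000, 999))  # 83
```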
ID: 71737
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 71738 - Posted: 11 Feb 2022, 4:57:06 UTC

I've cancelled all of the bad workunits from the job pool. Hopefully you should see them disappear from your machines shortly.
ID: 71738
Cameron
Joined: 16 Dec 07
Posts: 37
Credit: 24,358,733
RAC: 5,896
Message 71739 - Posted: 11 Feb 2022, 5:51:02 UTC - in response to Message 71738.  

I've cancelled all of the bad workunits from the job pool. Hopefully you should see them disappear from your machines shortly.

Just had the last few on my machine cancel, so that worked.
ID: 71739
Spatzthecat
Joined: 1 Dec 10
Posts: 82
Credit: 15,452,009,012
RAC: 2
Message 71742 - Posted: 11 Feb 2022, 19:31:00 UTC

Are they taking longer than the previous units?
They are on my hosts!
ID: 71742
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 71743 - Posted: 11 Feb 2022, 19:41:05 UTC

These jobs are batched into groups of 5 workunits instead of groups of 4 like the last set of runs. This is because there are fewer parameters in each workunit, so we can fit an extra workunit on each command line. You should be getting more credit for these tasks than for the last set, I think. (However, it might be roughly the same credit, because each workunit has fewer parameters and so will take less time to crunch.)
ID: 71743
JohnDK
Joined: 18 Feb 10
Posts: 53
Credit: 221,568,825
RAC: 9,748
Message 71744 - Posted: 11 Feb 2022, 20:55:34 UTC

On one of my PCs, they take about 25 secs longer and I get the same credit.
ID: 71744

©2024 Astroinformatics Group