What is the cause of these 'validate errors'


ProfileTex1954

Send message
Joined: 22 Apr 11
Posts: 61
Credit: 898,269,476
RAC: 7,699
500 million credit badge9 year member badgeextraordinary contributions badge
Message 63241 - Posted: 17 Mar 2015, 19:24:44 UTC - in response to Message 63240.  
Last modified: 17 Mar 2015, 19:26:30 UTC

I get funny waveforms in MSI Afterburner using the modded 13.2 drivers or the latest 14.2 drivers. The CPU makes no difference.

The only difference between the E3-1230 setup and the 2600K setup is the PCIe bus speed... PCIe 3.0 vs. 2.0.

Notice that it sometimes drops into low-clock mode; setting the clock speeds slightly below default corrects it... or seems to.

I've run the clocks as low as 580 MHz, and it seems the slower speeds create fewer errors...



In any case, there is something weird with the modified fit WUs for sure. Both setups perform the same whether using a reference 7970 or the latest Sapphire R9 280X.

8-)
ProfileTex1954

Send message
Joined: 22 Apr 11
Posts: 61
Credit: 898,269,476
RAC: 7,699
500 million credit badge9 year member badgeextraordinary contributions badge
Message 63247 - Posted: 20 Mar 2015, 21:38:17 UTC - in response to Message 63241.  

I might add that some playing around with clock and memory speeds on the GPUs has allowed me to tune the setups and greatly reduce errors...

We will see how it goes, but so far so good..

8-)
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63510 - Posted: 3 May 2015, 22:45:25 UTC - in response to Message 63219.  

I just revisited this thread and see that only Tex provided you a link to the errors we see on MilkyWay 1.36 tasks. Basically, the stderr.txt output gets truncated. The exit status is always [0], but because the file doesn't contain any result information, the tasks get invalidated.

http://milkyway.cs.rpi.edu/milkyway/results.php?userid=147145&offset=0&show_names=0&state=5&appid=

This is the list from my two computers. Thought I should provide some input also so it isn't from just one user.

Cheers, Keith
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63522 - Posted: 5 May 2015, 15:47:28 UTC - in response to Message 63510.  

Here is a link to some invalids you can actually see. Sorry about that.

http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=257518&offset=0&show_names=0&state=5&appid=

Cheers, Keith
Tom*

Send message
Joined: 4 Oct 11
Posts: 38
Credit: 285,340,761
RAC: 0
200 million credit badge8 year member badgeextraordinary contributions badge
Message 63526 - Posted: 5 May 2015, 18:37:34 UTC
Last modified: 5 May 2015, 18:38:11 UTC

Profilekhryl

Send message
Joined: 11 Feb 11
Posts: 57
Credit: 69,475,644
RAC: 0
50 million credit badge9 year member badge
Message 63527 - Posted: 5 May 2015, 18:59:53 UTC
Last modified: 5 May 2015, 19:00:23 UTC

I have some invalid modified fit workunits as well.
http://milkyway.cs.rpi.edu/milkyway/results.php?userid=150155&offset=0&show_names=0&state=5&appid=

R9 280X (3 WUs at a time)
No other projects running at the moment
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63528 - Posted: 5 May 2015, 19:00:46 UTC - in response to Message 63526.  

I've attracted some interest in this problem over on the SETI Number Crunching forum and have some of the BOINC and app developers looking into the issue. They seem to think they might have a handle on just what the problem might be. It is not a user equipment failure but a problem in the underlying BOINC platform code. Let's hope something fruitful comes of their investigations.

Cheers, Keith
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63535 - Posted: 6 May 2015, 21:46:29 UTC - in response to Message 63526.  

Here are mine

http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=573186&offset=0&show_names=0&state=5&appid=

18 Validate errors so far just today on one system



Hi Tom, what is interesting is that it looks like only your AMD FX-8350 system has the truncated stderr.txt results. Your Intel system is just producing results that don't validate against your wingmen. What is most interesting is that I too am running two AMD FX-8350 hosts and get the truncated stderr.txt invalids on them. I wonder if this is some commonality I hadn't noticed before.

Cheers, Keith
Tom*

Send message
Joined: 4 Oct 11
Posts: 38
Credit: 285,340,761
RAC: 0
200 million credit badge8 year member badgeextraordinary contributions badge
Message 63546 - Posted: 9 May 2015, 17:54:45 UTC
Last modified: 9 May 2015, 18:05:40 UTC

Hi Keith,

I view the validate errors on my Haswell i7 a little differently.

Yes, my AMD FX-8350 always truncates the whole stderr.

But my Haswell (also running an HD7950) truncates the stderr after the "Initial wait" line, always at the same place. The FX-8350 had many more errors per day than the Haswell, though.

Iteration area: 560000
Chunk estimate: 1
Num chunks: 2
Chunk size: 559104
Added area: 558208
Effective area: 1118208
Initial wait: 16 ms

</stderr_txt>
]]>
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63549 - Posted: 10 May 2015, 18:21:12 UTC - in response to Message 63546.  

I wasn't aware of an invalid task that produced semi-truncated stderr.txt output. Everything I've seen so far is the extreme truncated output like this:

<core_client_version>7.4.42</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>

As I stated earlier in the thread, it seems I have finally prompted an investigation by the developers into these kinds of invalid results. Previously, the answer was always that it was just an isolated incident specific to your hardware. Now the developers have acknowledged that it is a lot more common than previously thought and is a problem with the underlying BOINC code, not just with specific projects. There is also a newly recognized problem of BOINC failing to delete files larger than 4 GB in project slots. Let us hope that the BOINC developers can release a new version that fixes these issues... and doesn't introduce brand new problems.

Cheers, Keith
ProfileSutaru Tsureku

Send message
Joined: 30 Apr 09
Posts: 99
Credit: 25,715,405
RAC: 47
20 million credit badge10 year member badge
Message 63696 - Posted: 10 Jun 2015, 21:34:19 UTC

I have also had one validate error so far:

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=845965973

The <stderr_txt> isn't complete.
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63793 - Posted: 9 Jul 2015, 20:55:36 UTC - in response to Message 63549.  

Just a quick follow-up on the truncated stderr.txt problem that I started as OP. It looks like we finally understand the nature of the problem, and an analysis report has been submitted to the boinc_dev mailing list. Now we just have to wait for a fix or workaround from the BOINC developers. I'd like to thank Richard Haselgrove for working with me and for submitting the boinc_dev report. You can follow the analysis and discussion of the "race condition" over at the SETI@Home Panic Mode thread.

Cheers, Keith
Richard Haselgrove

Send message
Joined: 4 Sep 12
Posts: 219
Credit: 448,778
RAC: 0
100 thousand credit badge8 year member badge
Message 63799 - Posted: 14 Jul 2015, 15:46:26 UTC

After intensive work with Keith Myers and others (mainly in the SETI message board thread Stderr Truncations), I think I've finally traced and recorded the full life-cycle of these little beasties.

The easiest starting point is the debris left behind.



The task completed, and for 'some reason' (we'll come back to that later) BOINC couldn't delete one of the files. So it left it for later, and moved to another slot for the next task. In the message log, that looks like

14-Jul-2015 15:49:11 [---] [slot] cleaning out slots/2: handle_exited_app()
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/astronomy_parameters.txt
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/boinc_finish_called
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/boinc_task_state.xml
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/init_data.xml
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/milkyway_separation__modified_fit_1.36_windows_x86_64__opencl_nvidia_101.exe
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/separation_checkpoint
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/stars.txt
14-Jul-2015 15:49:11 [---] [slot] failed to remove file slots/2/stderr.txt: Error 32
14-Jul-2015 15:49:11 [Milkyway@Home] Computation for task ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9901989_0 finished
14-Jul-2015 15:49:11 [---] [slot] cleaning out slots/2: get_free_slot()
14-Jul-2015 15:49:11 [---] [slot] failed to remove file slots/2/stderr.txt: Error 32
14-Jul-2015 15:49:11 [Milkyway@Home] [slot] failed to clean out dir: unlink() failed
14-Jul-2015 15:49:11 [---] [slot] cleaning out slots/10: get_free_slot()
14-Jul-2015 15:49:11 [Milkyway@Home] [slot] assigning slot 10 to de_80_DR8_Rev_8_5_00004_1434551187_13360920_0

Note that the timestamps match.

According to MSDN, error 32 is

ERROR_SHARING_VIOLATION
32 (0x20)
The process cannot access the file because it is being used by another process.

- BOINC couldn't delete the file, because Milkyway was still writing to it.
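
For anyone wanting to check their own message log for the same symptom, here is a minimal Python sketch. It assumes the log lines are worded exactly like the excerpt above ("[slot] failed to remove file <path>: Error <n>"); other client versions may word them differently, and the function name is made up for illustration.

```python
import re

# Matches lines such as:
#   [slot] failed to remove file slots/2/stderr.txt: Error 32
# The wording is taken from the log excerpt in this post (an assumption
# for any other client version).
FAILED_REMOVE = re.compile(r"\[slot\] failed to remove file (\S+): Error (\d+)")


def find_failed_removals(log_text):
    """Return unique (path, error_code) pairs in the order they first appear."""
    seen = []
    for match in FAILED_REMOVE.finditer(log_text):
        entry = (match.group(1), int(match.group(2)))
        if entry not in seen:
            seen.append(entry)
    return seen


sample_log = """\
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/stars.txt
14-Jul-2015 15:49:11 [---] [slot] failed to remove file slots/2/stderr.txt: Error 32
14-Jul-2015 15:49:11 [---] [slot] cleaning out slots/2: get_free_slot()
14-Jul-2015 15:49:11 [---] [slot] failed to remove file slots/2/stderr.txt: Error 32
"""

print(find_failed_removals(sample_log))  # [('slots/2/stderr.txt', 32)]
```

A hit on stderr.txt with error code 32 is the sharing violation described above.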

On the website, we see task 1187921853: Name ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9901989_0, Received 14 Jul 2015, 14:50:08 UTC - again it matches (my timezone is UTC+1).
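
The timezone arithmetic can be checked with a few lines of Python. The two timestamps are the ones quoted in this post; the variable names are only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Local message-log time of "Computation for task ... finished" (UTC+1).
local_tz = timezone(timedelta(hours=1))
finished_local = datetime(2015, 7, 14, 15, 49, 11, tzinfo=local_tz)

# "Received" time shown on the project website (UTC).
received_utc = datetime(2015, 7, 14, 14, 50, 8, tzinfo=timezone.utc)

finished_utc = finished_local.astimezone(timezone.utc)
gap = received_utc - finished_utc

print(finished_utc.strftime("%H:%M:%S"))  # 14:49:11
print(gap.total_seconds())                # 57.0
```

So the server recorded the result 57 seconds after the client logged the task as finished.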

The stderr on the website ends

...
Initial wait: 12 ms
Integration time: 133.964844 s. Average time per iteration = 418.640136 ms
Integral 0 time = 135.042252 s
Running likelihood with 108458 stars

</stderr_txt>

- no final result or call to boinc_finish

But I just had time to copy stderr.txt to another part of my hard disk:



That copy ends

...
Initial wait: 12 ms
Integration time: 133.964844 s. Average time per iteration = 418.640136 ms
Integral 0 time = 135.042252 s
Running likelihood with 108458 stars
Likelihood time = 2.782655 s
<background_integral> 0.000265723224422 </background_integral>
<stream_integral> 209.417694469056580 135.316345272137030 37.756694047809596 </stream_integral>
<background_likelihood> -3.403332787286266 </background_likelihood>
<stream_only_likelihood> -4.236377567130232 -4.667012639515129 -4.359314280913779 </stream_only_likelihood>
<search_likelihood> -3.090730944956150 </search_likelihood>
15:49:09 (6496): called boinc_finish

Again, note that the Integration time, Average time per iteration, and Integral 0 time all match (they vary from task to task), and that the call to boinc_finish timestamp matches the message log.

If BOINC had waited until the last few lines had been appended to stderr.txt, as they later were, before preparing the report for the server, I have every reason to believe this would have been a valid report.
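
The three kinds of stderr seen in this thread (completely empty, cut off part-way, and complete) can be told apart with a rough check like the Python sketch below. The two marker strings are assumptions based only on the outputs quoted in this thread for the 1.36 separation app; this is not how the project validator decides anything.

```python
def classify_stderr(stderr_text):
    """Classify a stderr body as 'empty', 'truncated' or 'complete'.

    Assumption: a complete run contains both a <search_likelihood> result
    line and the "called boinc_finish" line, as in the copies quoted above.
    """
    body = stderr_text.strip()
    if not body:
        return "empty"
    has_result = "<search_likelihood>" in body
    has_finish = "called boinc_finish" in body
    return "complete" if has_result and has_finish else "truncated"


# Shortened examples modelled on the outputs quoted in this thread.
website_copy = "Integral 0 time = 135.042252 s\nRunning likelihood with 108458 stars\n"
disk_copy = (
    website_copy
    + "<search_likelihood> -3.090730944956150 </search_likelihood>\n"
    + "15:49:09 (6496): called boinc_finish\n"
)

print(classify_stderr(""))            # empty
print(classify_stderr(website_copy))  # truncated
print(classify_stderr(disk_copy))     # complete
```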

It took at least 3,200 tasks to reach that point (and I think a few of the early ones have already been purged). I'll take a pause from this project for a while, and let the GPU chew on a nice restful GPUGrid task (17 hours with none of this frantic uploading and downloading). But I'll come back and test any fix that David can come up with.
Richard Haselgrove

Send message
Joined: 4 Sep 12
Posts: 219
Credit: 448,778
RAC: 0
100 thousand credit badge8 year member badge
Message 63801 - Posted: 14 Jul 2015, 18:20:08 UTC
Last modified: 14 Jul 2015, 18:25:56 UTC

David has applied a possible fix for this:

client (Win): when read stderr.txt, wait for write lock to be release first.

Apparently, on Win, there is still a write lock on stderr.txt,
and its buffer isn't flushed, until shortly after the app process exits.
This is bizarre, but so be it.

and Rom has built an installer to test it.

I've built a new version of 7.6 with David's latest change to address this issue.

http://boinc.berkeley.edu/dl/boinc_7.6.6_windows_intelx86.exe
http://boinc.berkeley.edu/dl/boinc_7.6.6_windows_x86_64.exe

----- Rom

Those of you who have some experience already with v7.6.2 might like to try this and see how it compares - bearing in mind that at this point it is totally untested. (That's our job!)

I'm clocking off for the night, but I'll switch back tomorrow morning and add to the testing effort.

Edit - additional comment from David:

I checked in a workaround in which the client waits until
stderr.txt is not locked before reading it.
Can people please review this change?
-- David

Windows programmers are invited to look at

http://boinc.berkeley.edu/gitweb/?p=boinc-v2.git;a=commitdiff;h=f2d690029c6dab9d586a9ba1a2e0af03dc7f3c70
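
To illustrate the general idea of the workaround (wait until the file is no longer locked, then read it), here is a small Python sketch. It is not the BOINC change itself, which lives in the client source at the commit linked above. The function name, the `is_locked` probe and the timeout values are all hypothetical; a real probe on Windows would have to test whether another process still holds the file open for writing.

```python
import time


def wait_until_unlocked(is_locked, timeout=5.0, poll_interval=0.1,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll is_locked() until it returns False or the timeout expires.

    Returns True if the file became free, False if we gave up waiting.
    is_locked is a caller-supplied probe (hypothetical in this sketch).
    """
    deadline = clock() + timeout
    while is_locked():
        if clock() >= deadline:
            return False
        sleep(poll_interval)
    return True


# Simulated probe: the writer holds the lock for the first two polls.
lock_states = iter([True, True, False])
became_free = wait_until_unlocked(lambda: next(lock_states), sleep=lambda s: None)
print(became_free)  # True

# A probe that never releases the lock gives up once the timeout has passed.
gave_up = wait_until_unlocked(lambda: True, timeout=0.0, sleep=lambda s: None)
print(gave_up)  # False
```

The timeout matters: without one, a file that never gets released would stall the reader forever.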
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63802 - Posted: 15 Jul 2015, 0:21:44 UTC

Great news, Richard, on capturing the wild beast. I know it is tough because of how MW cycles the slots every minute for 1.36 tasks. I'm off to give the new 7.6.6 drop the acid test on MW. Thanks again.

Cheers, Keith
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63803 - Posted: 15 Jul 2015, 1:15:21 UTC

This is the first validated task with the new 7.6.6 client. Everything looks the same except for the PID callout, which I don't remember seeing before. I haven't seen an invalid, blank result yet, just some inconclusives. I haven't had the new client running long enough, and I didn't think to turn off the 1.02 tasks and suspend Einstein for 15 minutes after installing the new client. Just SETI and MW are running now. I'll be shutting the systems down soon for the night and starting fresh tomorrow morning. Here is the log with the extra flags and the task result.



446 Milkyway@Home 7/14/2015 5:51:56 PM [slot] assigning slot 0 to ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0
455 Milkyway@Home 7/14/2015 5:51:56 PM [task] task_state=EXECUTING for ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0 from start
456 Milkyway@Home 7/14/2015 5:51:56 PM Starting task ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0
457 Milkyway@Home 7/14/2015 5:51:56 PM [cpu_sched] Starting task ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0 using milkyway_separation__modified_fit version 136 (opencl_nvidia_101) in slot 0
588 Milkyway@Home 7/14/2015 5:52:58 PM [task] result ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0 checkpointed
632 Milkyway@Home 7/14/2015 5:53:41 PM [task] Process for ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0 exited, exit code 0, task state 1
633 Milkyway@Home 7/14/2015 5:53:41 PM [task] task_state=EXITED for ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0 from handle_exited_app
634 7/14/2015 5:53:41 PM [slot] cleaning out slots/0: handle_exited_app()
635 7/14/2015 5:53:41 PM [slot] removed file slots/0/astronomy_parameters.txt
636 7/14/2015 5:53:41 PM [slot] removed file slots/0/boinc_finish_called
637 7/14/2015 5:53:41 PM [slot] removed file slots/0/boinc_task_state.xml
638 7/14/2015 5:53:41 PM [slot] removed file slots/0/init_data.xml
639 7/14/2015 5:53:41 PM [slot] removed file slots/0/milkyway_separation__modified_fit_1.36_windows_x86_64__opencl_nvidia_101.exe
640 7/14/2015 5:53:41 PM [slot] removed file slots/0/separation_checkpoint
641 7/14/2015 5:53:41 PM [slot] removed file slots/0/stars.txt
642 7/14/2015 5:53:41 PM [slot] removed file slots/0/stderr.txt
643 Milkyway@Home 7/14/2015 5:53:41 PM Computation for task ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0 finished
644 Milkyway@Home 7/14/2015 5:53:41 PM [task] result state=FILES_UPLOADING for ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0 from CS::app_finished
645 Milkyway@Home 7/14/2015 5:53:41 PM [task] result state=FILES_UPLOADED for ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9990546_0 from CS::update_results
646 7/14/2015 5:53:41 PM [slot] cleaning out slots/0: get_free_slot()
650 7/14/2015 5:53:41 PM request_exit(): PID 5008 has 0 descendants
651 7/14/2015 5:53:41 PM [slot] removed file slots/0/init_data.xml
664 7/14/2015 5:53:41 PM [slot] removed file slots/0/boinc_temporary_exit

<core_client_version>7.6.6</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_separation 1.36 Windows x86_64 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File
Using AVX path
Found 1 platform
Platform 0 information:
Name: NVIDIA CUDA
Version: OpenCL 1.2 CUDA 7.5.9
Vendor: NVIDIA Corporation
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
Profile: FULL_PROFILE
Using device 1 on platform 0
Found 2 CL devices
Device 'GeForce GTX 970' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Board:
Driver version: 353.30
Version: OpenCL 1.2 CUDA
Compute capability: 5.2
Max compute units: 13
Clock frequency: 1279 Mhz
Global mem size: 4294967296
Local mem size: 49152
Max const buf size: 65536
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------

ptxas info : 0 bytes gmem
ptxas info : Compiling entry function 'probabilities' for 'sm_52'
ptxas info : Function properties for probabilities
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 96 registers, 420 bytes cmem[0], 152 bytes cmem[2]
--------------------------------------------------------------------------------
Build log:
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------
Estimated Nvidia GPU GFLOP/s: 1064 SP GFLOP/s, 133 DP FLOP/s
Using a target frequency of 60.0
Using a block size of 8320 with 8 blocks/chunk
Using clWaitForEvents() for polling with initial wait of 13 ms (mode 0)
Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 8
Num chunks: 9
Chunk size: 66560
Added area: 39040
Effective area: 599040
Initial wait: 13 ms
Integration time: 95.871923 s. Average time per iteration = 299.599760 ms
Integral 0 time = 96.521371 s
Running likelihood with 108458 stars
Likelihood time = 4.002593 s
<background_integral> 0.000342179663701 </background_integral>
<stream_integral> 3.453552708168287 228.484507039272870 24.801193300493413 </stream_integral>
<background_likelihood> -4.268041850474944 </background_likelihood>
<stream_only_likelihood> -142.051140945806140 -4.812913902331247 -4.792916224352506 </stream_only_likelihood>
<search_likelihood> -3.447714477738023 </search_likelihood>
17:53:38 (4848): called boinc_finish

</stderr_txt>
]]>
Richard Haselgrove

Send message
Joined: 4 Sep 12
Posts: 219
Credit: 448,778
RAC: 0
100 thousand credit badge8 year member badge
Message 63804 - Posted: 15 Jul 2015, 18:21:06 UTC

Not a single validate error, from over 500 tasks processed under BOINC v7.6.6 since this morning.
Tom*

Send message
Joined: 4 Oct 11
Posts: 38
Credit: 285,340,761
RAC: 0
200 million credit badge8 year member badgeextraordinary contributions badge
Message 63805 - Posted: 15 Jul 2015, 19:43:54 UTC

Results so far for 7.6.6

State: All (2458) · In progress (40) · Validation pending (0)
· Validation inconclusive (92) · Valid (2326) · Invalid (0) · Error (0)

Application: All (2458) · MilkyWay@Home (1331) · MilkyWay@Home N-Body Simulation (0) · Milkyway@Home Separation (0) ·

Milkyway@Home Separation (Modified Fit) (1127)

Thanks Keith and Richard for pushing the workaround
ProfileKeith Myers
Avatar

Send message
Joined: 24 Jan 11
Posts: 332
Credit: 203,522,173
RAC: 328,405
200 million credit badge9 year member badgeextraordinary contributions badge
Message 63806 - Posted: 15 Jul 2015, 19:53:47 UTC - in response to Message 63805.  

Over 200 valid 1.36 tasks so far since the systems came back online this morning with the new 7.6.6 client. Looking good so far, and I'm thinking of turning off the extra logging since we seem to have finally overcome the errors. Thanks for the help with the bug detection, Richard, and thanks to all the other beta testers like Jeff and Jason over at SETI for helping to define the bug, and to the BOINC developers for coming up with a solution and a quickly implemented fix.

Cheers, Keith
Richard Haselgrove

Send message
Joined: 4 Sep 12
Posts: 219
Credit: 448,778
RAC: 0
100 thousand credit badge8 year member badge
Message 63807 - Posted: 16 Jul 2015, 14:17:40 UTC

Heading close to 2,000 without error now.

One additional problem at this project: the administrators have set quite a low 'maximum errors' threshold.



Two validate errors together, plus one other glitch, and the whole workunit is killed. Once BOINC v7.6.6 (or its successor) is fully tested and released as 'recommended', I'd suggest you start a push to get as many people as possible to upgrade.