Welcome to MilkyWay@home

Errors

aad
Joined: 30 Mar 09
Posts: 63
Credit: 621,582,726
RAC: 4
Message 66608 - Posted: 15 Sep 2017, 22:07:41 UTC

My error rate is climbing.
Some of the WUs error out after 1 or 2 seconds.
Is it me, or a bad run?

JHMarshall
Joined: 24 Jul 12
Posts: 40
Credit: 7,123,301,054
RAC: 0
Message 66609 - Posted: 15 Sep 2017, 22:44:28 UTC - in response to Message 66608.  
Last modified: 15 Sep 2017, 23:06:33 UTC

I'm seeing errors also and so are other systems crunching the same WUs.

Update: rebooted the system and the first 4 WUs completed normally.
Will update after more results.

On my systems the "de_modfit_fast_20_3s_146_bundle5_" WUs are getting computation errors. Other WUs seem to be working.

aad
Joined: 30 Mar 09
Posts: 63
Credit: 621,582,726
RAC: 4
Message 66610 - Posted: 15 Sep 2017, 23:15:19 UTC
Last modified: 15 Sep 2017, 23:17:49 UTC

The first thing I did was also to reboot my system, but that did not help.
I got errors on:
"de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy"
and
"de_modfit_fast_20_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy"

DJBPace07
Joined: 21 Mar 15
Posts: 3
Credit: 47,175,569
RAC: 0
Message 66611 - Posted: 15 Sep 2017, 23:34:41 UTC

You can add me to the list; I'm also getting computation errors on WUs. The ones affected are those in the first post in this thread. All error out after a second or two.

Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 66612 - Posted: 16 Sep 2017, 1:12:51 UTC - in response to Message 66611.  

I also just got 40+ failed WUs in a row. All of them failed with the same error:

<number_WUs> 5 </number_WUs>
<number_params_per_WU> 21 </number_params_per_WU>
Number of parameters doesn't make sense
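
A minimal Python sketch of how one might check a task's stderr for this failure signature. The function name `parse_bundle_error` and the returned dictionary keys are hypothetical; only the tag names and the error string are taken from the output quoted above.

```python
import re

# Error string and tag names copied from the stderr quoted above.
ERROR_LINE = "Number of parameters doesn't make sense"
TAG_PATTERN = re.compile(r"<(number_WUs|number_params_per_WU)>\s*(\d+)\s*</\1>")


def parse_bundle_error(stderr_text):
    """Return the bundle counts and whether the known error line is present.

    Hypothetical helper for illustration; not part of BOINC or MilkyWay@home.
    """
    counts = {name: int(value) for name, value in TAG_PATTERN.findall(stderr_text)}
    return {
        "number_WUs": counts.get("number_WUs"),
        "number_params_per_WU": counts.get("number_params_per_WU"),
        "has_param_error": ERROR_LINE in stderr_text,
    }


sample = """<number_WUs> 5 </number_WUs>
<number_params_per_WU> 21 </number_params_per_WU>
Number of parameters doesn't make sense
"""
print(parse_bundle_error(sample))
```

A task whose stderr yields `has_param_error` set to true can then be counted as part of the same bad batch rather than as a local hardware problem.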

MaDdCoW
Joined: 6 Nov 13
Posts: 2
Credit: 550,672,142
RAC: 0
Message 66613 - Posted: 16 Sep 2017, 1:14:05 UTC

Add me to the list as well. I was getting errors, rebooted, and am still getting errors. About a 30% error rate? I'm not doing any more work until more info is made available.

Vortac
Joined: 22 Apr 09
Posts: 95
Credit: 4,808,181,963
RAC: 0
Message 66615 - Posted: 16 Sep 2017, 7:32:35 UTC

Another bad run, most likely.

Werkstatt
Joined: 19 Feb 08
Posts: 350
Credit: 141,284,369
RAC: 0
Message 66616 - Posted: 16 Sep 2017, 8:07:51 UTC


Chooka
Joined: 13 Dec 12
Posts: 101
Credit: 1,782,658,327
RAC: 0
Message 66617 - Posted: 16 Sep 2017, 10:23:14 UTC

Ah phew.... it's not my new Vega 56.
Same errors as above.


aad
Joined: 30 Mar 09
Posts: 63
Credit: 621,582,726
RAC: 4
Message 66619 - Posted: 16 Sep 2017, 12:08:01 UTC

The error rate is quite high: 45% error out in 1 sec.
Not much time is wasted, but I must babysit this machine because communication with the project gets deferred for 24 hours.

Cautilus
Joined: 29 Jul 14
Posts: 19
Credit: 3,451,802,406
RAC: 54
Message 66620 - Posted: 16 Sep 2017, 12:44:55 UTC

I'm getting errors on these work units as well:

"de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy"

"de_modfit_fast_20_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy"

Running a 290X and 280X, both of them seem to get errors only with these units, no others.

adrianxw
Joined: 25 May 14
Posts: 31
Credit: 56,750,059
RAC: 0
Message 66621 - Posted: 16 Sep 2017, 13:31:10 UTC

Looks like another bad batch of work units. All that have arrived today have quickly failed. I have set "No new tasks" for now.

shu
Joined: 4 Aug 17
Posts: 8
Credit: 199,494,186
RAC: 0
Message 66622 - Posted: 16 Sep 2017, 14:07:16 UTC

I can report the same issue over here. I've switched off MW for now until the problem is resolved :)

[VENETO] boboviz
Joined: 10 Feb 09
Posts: 52
Credit: 16,286,597
RAC: 0
Message 66623 - Posted: 16 Sep 2017, 16:57:23 UTC

Same here with my RX 560.


<stderr_txt>
<search_application> milkyway_separation 1.46 Windows x86 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 21 </number_params_per_WU>
Number of parameters doesn't make sense

Shai
Joined: 3 Dec 09
Posts: 1
Credit: 32,311,764
RAC: 0
Message 66624 - Posted: 16 Sep 2017, 17:49:30 UTC
Last modified: 16 Sep 2017, 17:50:21 UTC

It is easy to see something is wrong with some of the recent workunits - just look at the results returned for a workunit for which you get an error. For example:
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=1499638353
All the users got the exact same error with this workunit:
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 21 </number_params_per_WU>
Number of parameters doesn't make sense
18:05:21 (6400): called boinc_finish(1)

So this is clearly a problem with either the workunit or with the application processing it.
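
The reasoning above (identical failures on every host point at the workunit or the application, not at any one machine) can be sketched in Python. This is only an illustration: `error_signature` and `same_failure` are hypothetical names, and the choice of which lines to ignore (the timestamped `boinc_finish` line, the GPU vendor line, the application banner and the process priority line, all of which can legitimately differ between hosts) is an assumption.

```python
import re

# Lines that vary between hosts without indicating a different failure.
# Assumption: these four kinds of line are safe to ignore when comparing.
HOST_SPECIFIC = re.compile(
    r"^\d{2}:\d{2}:\d{2} \(\d+\):"   # e.g. "18:05:21 (6400): called boinc_finish(1)"
    r"|^BOINC GPU type suggests"
    r"|^<search_application>"
    r"|^Setting process priority"
)


def error_signature(stderr_text):
    """Return the host-independent lines of a stderr dump as a tuple."""
    lines = (line.strip() for line in stderr_text.splitlines())
    return tuple(l for l in lines if l and not HOST_SPECIFIC.search(l))


def same_failure(stderr_texts):
    """True if every stderr dump reduces to the same signature."""
    return len({error_signature(text) for text in stderr_texts}) == 1


host_a = """BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Switching to Parameter File 'astronomy_parameters.txt'
Number of parameters doesn't make sense
18:05:21 (6400): called boinc_finish(1)
"""
host_b = """BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
Setting process priority to 0 (13): Permission denied
Switching to Parameter File 'astronomy_parameters.txt'
Number of parameters doesn't make sense
00:40:15 (6533): called boinc_finish(1)
"""
print(same_failure([host_a, host_b]))
```

If the signatures differed from host to host, a local cause such as drivers or hardware would be more plausible.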

Saenger
Joined: 28 Aug 07
Posts: 133
Credit: 29,423,179
RAC: 0
Message 66628 - Posted: 16 Sep 2017, 23:39:22 UTC

My BOINC says this:

So 17 Sep 2017 01:03:11 CEST | Milkyway@Home | [slot] assigning slot 0 to de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2
So 17 Sep 2017 01:03:11 CEST | Milkyway@Home | [slot] linked ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_1.46_x86_64-pc-linux-gnu__opencl_nvidia_101 to slots/0/milkyway_1.46_x86_64-pc-linux-gnu__opencl_nvidia_101
So 17 Sep 2017 01:03:11 CEST | Milkyway@Home | [slot] linked ../../projects/milkyway.cs.rpi.edu_milkyway/parameters-18-3s-ModfitConstraintsWithDisk.txt to slots/0/astronomy_parameters.txt
So 17 Sep 2017 01:03:11 CEST | Milkyway@Home | [slot] linked ../../projects/milkyway.cs.rpi.edu_milkyway/stars-18.txt to slots/0/stars.txt
So 17 Sep 2017 01:03:11 CEST | Milkyway@Home | [task] ACTIVE_TASK::start(): forked process: pid 18869
So 17 Sep 2017 01:03:11 CEST | Milkyway@Home | [task] task_state=EXECUTING for de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2 from start
So 17 Sep 2017 01:03:11 CEST | Milkyway@Home | Starting task de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2
So 17 Sep 2017 01:03:11 CEST | Milkyway@Home | [cpu_sched] Starting task de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2 using milkyway version 146 (opencl_nvidia_101) in slot 0
So 17 Sep 2017 01:03:13 CEST | Milkyway@Home | [task] Process for de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2 exited, status 256, task state 1
So 17 Sep 2017 01:03:13 CEST | Milkyway@Home | [task] process exited with status 1
So 17 Sep 2017 01:03:13 CEST | Milkyway@Home | [task] task_state=EXITED for de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2 from handle_exited_app
So 17 Sep 2017 01:03:13 CEST | Milkyway@Home | [sched_op] Deferring communication for 00:03:01
So 17 Sep 2017 01:03:13 CEST | Milkyway@Home | [sched_op] Reason: Unrecoverable error for task de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2
So 17 Sep 2017 01:03:13 CEST | Milkyway@Home | [task] result state=COMPUTE_ERROR for de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2 from CS::report_result_error
So 17 Sep 2017 01:03:13 CEST | Milkyway@Home | Computation for task de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2 finished
So 17 Sep 2017 01:03:13 CEST | Milkyway@Home | [task] result state=COMPUTE_ERROR for de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy_2_1500622801_16641904_2 from CS::app_finished


The WU has this as stderr:

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
<search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
Setting process priority to 0 (13): Permission denied
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' 
Switching to Parameter File 'astronomy_parameters.txt'
<number_WUs> 5 </number_WUs>
<number_params_per_WU> 21 </number_params_per_WU>
Number of parameters doesn't make sense
00:40:15 (6533): called boinc_finish(1)

</stderr_txt>
]]>
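
A side note on the numbers above: the BOINC log shows `status 256` followed by `process exited with status 1`, and the stderr reports exit code 1. That is consistent with the conventional Linux wait-status layout, where the exit code sits in the high byte. A minimal Python sketch of that decoding follows; `decode_wait_status` is a hypothetical helper, and it assumes the conventional Linux encoding rather than anything specific to BOINC.

```python
def decode_wait_status(status):
    """Decode a raw 16-bit wait status using the conventional Linux layout.

    Low 7 bits: terminating signal (0 if the process exited normally).
    Low byte equal to 0x7F: the process was stopped, not terminated.
    Bits 8-15: exit code (or stop signal for a stopped process).
    Hypothetical helper for illustration only.
    """
    if (status & 0xFF) == 0x7F:
        return ("stopped", (status >> 8) & 0xFF)
    signal_number = status & 0x7F
    if signal_number == 0:
        return ("exited", (status >> 8) & 0xFF)
    return ("signaled", signal_number)


print(decode_wait_status(256))  # prints ('exited', 1)
```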


Something's fishy, but they only take a second, so not much is lost in terms of crunch time ;)
Greetings from Sänger

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 66631 - Posted: 17 Sep 2017, 0:38:16 UTC
Last modified: 17 Sep 2017, 1:09:51 UTC

I made a bad post because I failed to read the date on the thread I linked to. Please ignore.

SandraGerland
Joined: 31 May 16
Posts: 1
Credit: 2,356,583
RAC: 0
Message 66632 - Posted: 17 Sep 2017, 9:54:31 UTC
Last modified: 17 Sep 2017, 9:59:13 UTC

Hey,

Please add me to the list: same errors here, over 90%. Same batch of WUs:
"de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy" and
"de_modfit_fast_20_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy"

Best regards, Sandra

Saenger
Joined: 28 Aug 07
Posts: 133
Credit: 29,423,179
RAC: 0
Message 66633 - Posted: 17 Sep 2017, 9:59:50 UTC - in response to Message 66632.  
Last modified: 17 Sep 2017, 10:01:39 UTC

Hey

Add me pls to the List, same Errors here, over 90%. Same Batch of WUs
"de_modfit_fast_18_3s_146_bundle5_ModfitConstraintsWithDisk_Bouncy"

Best regards, Sandra

Wrong ;)
I've got some 20 in my list of errors as well, and some are called Random, some not. Looks a wee bit random to me ;)

P.S.: I just saw that you changed your post, so now it fits here as well ;)
Greetings from Sänger

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 66636 - Posted: 17 Sep 2017, 13:47:50 UTC
Last modified: 17 Sep 2017, 13:51:56 UTC

The ones named "fixed" work, but they are scattered among so many unfixed ones that I am aborting them all.

[EDIT] After aborting 170, my system downloaded another 110, of which only 2 were "fixed". They are downloading faster than I can abort them. I should have stuck with my bitcoin mining.