Welcome to MilkyWay@home

Computation errors

Andy_Taximan

Joined: 15 May 16
Posts: 3
Credit: 38,418,196
RAC: 0
Message 68915 - Posted: 25 Jul 2019, 13:32:00 UTC

Can somebody take a look? I seem to be getting too many errors today. I'm running the PC at stock settings now.
https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=808617&offset=0&show_names=0&state=6&appid=

Marsinph

Joined: 13 Nov 10
Posts: 23
Credit: 108,282,839
RAC: 0
Message 68916 - Posted: 25 Jul 2019, 13:52:59 UTC - in response to Message 68915.  

Hello,
It was the same for me this morning (about 50%). The admin knows and is working on it.
See https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4487

Joseph Stateson

Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 68917 - Posted: 25 Jul 2019, 14:22:11 UTC - in response to Message 68915.  
Last modified: 25 Jul 2019, 14:40:14 UTC

I thought this was fixed after reading the post under news quote: "Thanks for bringing that to my attention, none of the runs are returning data. I'll try to fix that as soon as I can."

I did get about 100-200 w/o error but this morning I see another 253 errored out and 212 more waiting to error out.

Compounding the problem is that Einstein coincidentally has a series of bad runs that affect older GPUs like my S9000 boards.

[EDIT] Only a few more errored out. Looks like there are just a few of the bad ones left. Out of those 253 "waiting" only about 5 were bad, and I now have over 500 downloaded and do not see any more errors.

Thanks for the fix.

Andy_Taximan

Joined: 15 May 16
Posts: 3
Credit: 38,418,196
RAC: 0
Message 68918 - Posted: 25 Jul 2019, 14:29:38 UTC

Ah, at least they're only taking 2 seconds to error.

Joseph Stateson

Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 68919 - Posted: 25 Jul 2019, 14:34:55 UTC - in response to Message 68918.  

Andy_Taximan wrote: "Ah, at least they're only taking 2 seconds to error."

Agreed, unlike the new batch of Einstein tasks that run for 6-7 hours and then show 35-45 days to complete.

Marsinph

Joined: 13 Nov 10
Posts: 23
Credit: 108,282,839
RAC: 0
Message 68921 - Posted: 25 Jul 2019, 15:08:14 UTC - in response to Message 68917.  

JStateSon,
I think this is more a matter of some bad series than a problem with old GPUs.
A Radeon RX580 is not old hardware!

I will let everything I have finish (about 2 hours if there are no errors) and wait for the next series to download.

Best regards





Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 68922 - Posted: 25 Jul 2019, 15:48:12 UTC

Hey all,

There was a problem with a flag in the parameters for the runs last night. I fixed it around 10 PM. You may still get some bad runs as your queues purge, but after a day or two at most things should continue to run smoothly.

Tom

Marsinph

Joined: 13 Nov 10
Posts: 23
Credit: 108,282,839
RAC: 0
Message 68924 - Posted: 25 Jul 2019, 16:03:52 UTC - in response to Message 68922.  

Hello Tom,
Thank you for the update.
It is not such a big problem on our side, because they crash after 1 or 2 seconds.
It would be more of a problem if they crashed a few seconds before the end of the crunch.
Have a nice day.
Best regards

PS: if you are cold, ask western Europe... we will send you "degrees" for free!
In Belgium, this is the first time in the last two centuries that we have reached such temperatures.
For comparison, take the Egyptian summer (which is now, and also when temperatures there are highest).
It is warmer here than in the Egyptian desert at the Abou-Simbel temple! "Only 39°C"
So it is completely normal if you see fewer returned results. We are protecting our hosts.

robertmiles

Joined: 30 Sep 09
Posts: 211
Credit: 36,977,315
RAC: 0
Message 68931 - Posted: 27 Jul 2019, 1:22:26 UTC

2 computation errors for me just now.

The following error message suggests that both were due to bad input files:

Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 68938 - Posted: 30 Jul 2019, 15:11:38 UTC - in response to Message 68931.  

robertmiles,

Did the workunits actually abort computation, or did they just output that error message? That's a typical output from the separation application that basically just says "you aren't using a lua file so I'm defaulting back to this parameter file". When I test these runs on my client I get that "error" every time because we don't use separation that way anymore.

- Tom
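
Editor's illustration: the fallback Tom describes (try the parameter file as a script first, report an error, then read it as a plain parameter file) can be sketched roughly as below. This is not the separation application's code, which is not shown in this thread. It is a minimal Python stand-in, and every function name and both file formats here are assumptions made up for the example. The only detail taken from the thread is the `number_parameters: 4` line quoted in the error message.

```python
class ScriptLoadError(Exception):
    """Stand-in for the syntax error a real script loader would raise."""


def load_as_script(text):
    # Hypothetical script format for this sketch: `name = value` lines only.
    # A real Lua loader is far more general; this just mimics "parse or fail".
    params = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        name, sep, value = line.partition("=")
        if not sep or not name.strip().isidentifier():
            raise ScriptLoadError(f"line {lineno}: cannot parse {line!r} as a script")
        params[name.strip()] = value.strip()
    return params


def load_as_legacy(text):
    # Assumed legacy format for this sketch: one `key: value` pair per line.
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            params[key.strip()] = value.strip()
    return params


def load_parameters(text):
    """Try the script loader first; on failure, log a message and fall back."""
    messages = []
    try:
        return load_as_script(text), messages
    except ScriptLoadError as exc:
        # The message is recorded, but loading continues with the legacy parser,
        # so an "error" line in the output does not by itself mean an abort.
        messages.append(f"Error loading script: {exc}")
        return load_as_legacy(text), messages


params, messages = load_parameters("number_parameters: 4\n")
```

In this sketch the legacy-style input still loads successfully, with one logged message. That matches Tom's point that the message alone does not show whether the workunit actually aborted.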