Welcome to MilkyWay@home

New Separation Runs 6/9/2021

Message boards : News : New Separation Runs 6/9/2021

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70858 - Posted: 9 Jun 2021, 23:14:57 UTC
Last modified: 9 Jun 2021, 23:16:45 UTC

Hello Everyone,

I've just put some new separation runs up on the server. Remember those stripe 84 and 85 runs that would start to throw validate errors as they became more optimized? I've been testing and comparing runs on different builds and *hopefully* that problem has been resolved.

The names of the new runs are:

de_modfit_84_bundle4_4s_south4s_gapfix
de_modfit_84_bundle4_4s_south4s_gapfix_bgset2
de_modfit_84_bundle4_4s_south4s_gapfix_bgset3
de_modfit_85_bundle4_4s_south4s_gapfix
de_modfit_85_bundle4_4s_south4s_gapfix_bgset2
de_modfit_85_bundle4_4s_south4s_gapfix_bgset3

Please keep an eye on these runs and let me know if anything odd happens (validate errors or otherwise). With any luck, everything will work perfectly! These are the last runs that need to be optimized before the latest separation results can be submitted to a journal for publication.

Additionally, I have taken down the following runs:

de_modfit_80_bundle4_4s_south4s_bgset_7
de_modfit_81_bundle4_4s_south4s_bgset_7
de_modfit_82_bundle4_4s_south4s_bgset_7
de_modfit_83_bundle4_4s_south4s_bgset_7
de_modfit_86_bundle4_4s_south4s_bgset_7

As always, the stopped runs will continue to show up in your workunit queue for a few days as they finish up. This is normal and expected. Thank you all for your support and help with this project.

Best,
Tom
ID: 70858
Socrbob
Joined: 10 Sep 12
Posts: 4
Credit: 18,297,712
RAC: 0
Message 70859 - Posted: 10 Jun 2021, 17:57:58 UTC - in response to Message 70858.  

Hello, this run, de_modfit_84_bundle4_4s_south4s_bgset_7, along with 21 other runs with different ending numbers, has shown up for the past 4-5 days as Ready to report. Please explain why. Thank you.
ID: 70859
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70860 - Posted: 10 Jun 2021, 18:29:55 UTC
Last modified: 10 Jun 2021, 18:30:53 UTC

Hello,

These types of questions are better asked in the Number Crunching (https://milkyway.cs.rpi.edu/milkyway/forum_forum.php?id=2) part of these forums. If you ask your question there, I (and others) will be happy to try to figure out the issue.
ID: 70860
Socrbob
Joined: 10 Sep 12
Posts: 4
Credit: 18,297,712
RAC: 0
Message 70861 - Posted: 11 Jun 2021, 0:01:42 UTC - in response to Message 70860.  

Since it was similar to the runs you asked us to watch, I thought I would ask what was going on. All of them are now gone from my listing. Thanks for your assistance.
ID: 70861
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70864 - Posted: 11 Jun 2021, 3:45:49 UTC

Glad to hear that the problem is resolved!
ID: 70864
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70865 - Posted: 11 Jun 2021, 14:54:16 UTC
Last modified: 11 Jun 2021, 14:54:52 UTC

I've had a report from one person whose GPU (a Quadro P620 with the default cooler) had a memory controller crash while crunching these new runs. I'm not sure if this was a fluke or if it's some problem with the runs. As far as I know, nothing was changed that should cause this problem, but if anyone else experiences something like it, please let me know.
ID: 70865
Keith Myers
Joined: 24 Jan 11
Posts: 696
Credit: 539,263,103
RAC: 95,705
Message 70866 - Posted: 12 Jun 2021, 14:10:02 UTC

I've had nary a problem with these new stripe 84/85 runs. Much better than previous attempts.
Good job!
ID: 70866
Siran d'Vel'nahr
Joined: 1 Jul 08
Posts: 88
Credit: 25,079,058
RAC: 0
Message 70869 - Posted: 13 Jun 2021, 12:33:52 UTC - in response to Message 70858.  

Hi Tom,

I'm getting the same Lua Script error on those tasks. I got 5 or 6 just this morning. :-(

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr XO - L L & P _\\//
USS Vre'kasht NCC-33187
Winders 10 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 70869
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70870 - Posted: 13 Jun 2021, 16:31:49 UTC - in response to Message 70869.  

Hello Siran,

Do the tasks actually result in errors? If you look at your workunits that do not fail, you should also see the "Lua Script error" on those. It's not an actual problem for the software, it's just a poorly phrased output. If you didn't see the Lua error I would be more concerned, actually.
ID: 70870
Siran d'Vel'nahr
Joined: 1 Jul 08
Posts: 88
Credit: 25,079,058
RAC: 0
Message 70871 - Posted: 13 Jun 2021, 20:37:48 UTC - in response to Message 70870.  

Tom Donlon wrote:
> Hello Siran,
>
> Do the tasks actually result in errors? If you look at your workunits that do not fail, you should also see the "Lua Script error" on those. It's not an actual problem for the software, it's just a poorly phrased output. If you didn't see the Lua error I would be more concerned, actually.

Hi Tom,

Here's what I found:

I clicked on a random validated task and it did indeed have the Lua Error.

I clicked on the first error work unit number and it says "Too many errors (may have bug)" in the upper section of the page.
I clicked on the task number for the same work unit above and the only error I can find is the Lua Error.

I would assume that the tasks do result in errors. :-\

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr XO - L L & P _\\//
USS Vre'kasht NCC-33187
Winders 10 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 70871
Keith Myers
Joined: 24 Jan 11
Posts: 696
Credit: 539,263,103
RAC: 95,705
Message 70872 - Posted: 13 Jun 2021, 23:20:25 UTC

All my tasks, whether invalid, valid, or errored, show the Lua error. Just as Tom stated, the printed error is innocuous and has no bearing on the real reason for invalid or errored tasks.
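
The reasoning in the last few posts can be sketched in a few lines of Python: a message that shows up in every task regardless of outcome cannot be what separates failed tasks from good ones. This is only an illustration. The function name `marker_rate_by_outcome` is hypothetical, and the sample output strings are made-up placeholders, not real MilkyWay@home task logs.

```python
from collections import Counter

def marker_rate_by_outcome(tasks, marker):
    """Return {outcome: fraction of tasks whose output contains marker}."""
    totals = Counter()
    hits = Counter()
    for outcome, output in tasks:
        totals[outcome] += 1
        if marker in output:
            hits[outcome] += 1
    return {outcome: hits[outcome] / totals[outcome] for outcome in totals}

# Made-up sample data: the output strings are placeholders, not real task logs.
sample_tasks = [
    ("valid", "Lua Script error <placeholder> ... finished"),
    ("valid", "Lua Script error <placeholder> ... finished"),
    ("invalid", "Lua Script error <placeholder> ... finished"),
    ("error", "Lua Script error <placeholder> ... <some other failure>"),
]

rates = marker_rate_by_outcome(sample_tasks, "Lua Script error")

# If the rate is identical for every outcome, the message carries no
# information about why a task failed.
uninformative = len(set(rates.values())) == 1
print(rates, uninformative)
```

With real data, the pairs would come from the task pages of valid, invalid, and errored tasks. A message is only worth investigating if its rate differs between outcomes.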
ID: 70872
FritzB
Joined: 7 Apr 15
Posts: 3
Credit: 199,342,045
RAC: 9,084
Message 70890 - Posted: 20 Jun 2021, 20:12:24 UTC
Last modified: 20 Jun 2021, 20:13:17 UTC

Some WUs run endlessly instead of finishing in ~2 min. They get stuck at different points, anywhere from 30% to 99.8%.

E.g.:
https://milkyway.cs.rpi.edu/milkyway/result.php?resultid=226249315
https://milkyway.cs.rpi.edu/milkyway/result.php?resultid=226960369 <- aborted after 11 hours and some 40%

AMD A12-9800 APU
ID: 70890
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70891 - Posted: 20 Jun 2021, 22:47:47 UTC
Last modified: 20 Jun 2021, 22:49:19 UTC

Thanks for the report, Fritz. It's curious that the other task for your first workunit finished in under 2 minutes, while yours ran indefinitely... I'll keep an eye on this moving forward.

So far I've only seen these long runtimes on Windows machines, based on the few workunits that I've looked at.
ID: 70891
FritzB
Joined: 7 Apr 15
Posts: 3
Credit: 199,342,045
RAC: 9,084
Message 70902 - Posted: 23 Jun 2021, 4:30:45 UTC - in response to Message 70891.  

Another one. 24.x% after 4:20h
https://milkyway.cs.rpi.edu/milkyway/result.php?resultid=229675684

This only happens on the A12. No problems with 280X and HD 7970 and Ryzen 3900X/5950X, all Win 10, so far.
ID: 70902
alanb1951
Joined: 16 Mar 10
Posts: 208
Credit: 105,174,898
RAC: 51,003
Message 70910 - Posted: 24 Jun 2021, 4:06:15 UTC

Tom,

You asked for notification of Invalid results...

I spotted that I'd had the following on 23rd June:

Workunit 120081435
name: de_modfit_84_bundle4_4s_south4s_gapfix_bgset3_1621277702_21931551

Workunit 120134345
name: de_modfit_84_bundle4_4s_south4s_gapfix_bgset3_1621277702_21980504

Workunit 120351731
name: de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22181699

Workunit 120388109
name: de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22214546


So I had a look at [some of] my Validation Inconclusive tasks and found the following where both my task and that of a wing-man were tagged inconclusive (so someone will end up invalid!):

Workunit 120402533
name: de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22227482

Workunit 120718751
name: de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22517852

Workunit 120351730
name: de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22181698

Workunit 120388053 (NOT bgset3!)
name: de_modfit_85_bundle4_4s_south4s_gapfix_bgset2_1621277702_22214490

Workunit 120388804 (NOT bgset3!)
name: de_modfit_85_bundle4_4s_south4s_gapfix_bgset2_1621277702_22215148


And for completeness I went through a subset of my Valid results and found the following that had an Invalid wing-man:

Workunit 119929966
name: de_modfit_85_bundle4_4s_south4s_gapfix_1621277702_21791665

Workunit 119969464
name: de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_21827692

Workunit 120355887
name: de_modfit_84_bundle4_4s_south4s_gapfix_bgset3_1621277702_22185096

Workunit 120389250
name: de_modfit_85_bundle4_4s_south4s_gapfix_1621277702_22215504

Workunit 120389268
name: de_modfit_85_bundle4_4s_south4s_gapfix_1621277702_22215522


It's time-consuming (and finger-cramping) checking these via the Web interface, so I've not checked anything further back than 23rd June...

Hope the above is of some use.

Cheers - Al.
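
For anyone collecting reports like the one above, a minimal Python sketch can group workunit names by run once they have been copied out of the web pages. It assumes every name has the form `<run name>_<10-digit number>_<index>`, which matches the names quoted in this thread; what the two trailing numbers mean is not stated here, so the sketch only strips them. The helper names `run_of` and `tally_by_run` are hypothetical.

```python
import re
from collections import Counter

# Assumed layout: <run name>_<10-digit number>_<index>. The meaning of the two
# trailing numeric fields is not documented in this thread.
WU_NAME = re.compile(r"^(?P<run>de_modfit_.+?)_(?P<stamp>\d{10})_(?P<index>\d+)$")

def run_of(workunit_name):
    """Return the run-name prefix of a workunit name, or raise ValueError."""
    match = WU_NAME.match(workunit_name.strip())
    if match is None:
        raise ValueError(f"unrecognised workunit name: {workunit_name!r}")
    return match.group("run")

def tally_by_run(workunit_names):
    """Count how many of the given workunit names belong to each run."""
    return Counter(run_of(name) for name in workunit_names)

# The four invalid workunits from the first list in the post above.
invalid = [
    "de_modfit_84_bundle4_4s_south4s_gapfix_bgset3_1621277702_21931551",
    "de_modfit_84_bundle4_4s_south4s_gapfix_bgset3_1621277702_21980504",
    "de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22181699",
    "de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22214546",
]

for run, count in sorted(tally_by_run(invalid).items()):
    print(f"{run}: {count}")
```

For these four names the tally is two workunits each for the stripe 84 and stripe 85 `bgset3` runs. Grouping this way makes it easier to see whether invalid results cluster in one run variant.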
ID: 70910
JohnDK
Joined: 18 Feb 10
Posts: 53
Credit: 221,485,241
RAC: 10,575
Message 70911 - Posted: 24 Jun 2021, 16:51:09 UTC

My 3 hosts have also started getting validate errors; it began yesterday. My 24/7 Linux host has around 200 errors.
ID: 70911
Cameron
Joined: 16 Dec 07
Posts: 37
Credit: 24,329,517
RAC: 9,197
Message 70913 - Posted: 25 Jun 2021, 3:36:00 UTC

I had one task end in an error:

Workunit 119867095
name: de_modfit_84_bundle4_4s_south4s_gapfix_bgset2_1621277702_21732796

Task in question: 229042440
ID: 70913
Jean-Pierre HARLE
Joined: 25 Sep 08
Posts: 15
Credit: 145,544,797
RAC: 0
Message 70914 - Posted: 25 Jun 2021, 8:00:34 UTC

Hi Tom,

Same problem for me. From June 22nd to this morning: 145 invalid tasks across both "de_modfit_84" and "de_modfit_85". For example:

de_modfit_84_bundle4_4s_south4s_gapfix_bgset3_1621277702_22565313
de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22692570

Best regards.

JPH
ID: 70914
Jean-Pierre HARLE
Joined: 25 Sep 08
Posts: 15
Credit: 145,544,797
RAC: 0
Message 70919 - Posted: 26 Jun 2021, 5:25:30 UTC

Hi Tom,

My number of invalid tasks is still increasing: 145 yesterday, 212 this morning! Why?

Best regards.

JPH
ID: 70919
micropro
Joined: 25 Jun 19
Posts: 1
Credit: 107,208
RAC: 0
Message 70920 - Posted: 26 Jun 2021, 13:24:20 UTC - in response to Message 70914.  

Jean-Pierre HARLE wrote:
> Hi Tom,
>
> Same problem for me. From June 22nd to this morning: 145 invalid tasks across both "de_modfit_84" and "de_modfit_85". For example:
>
> de_modfit_84_bundle4_4s_south4s_gapfix_bgset3_1621277702_22565313
> de_modfit_85_bundle4_4s_south4s_gapfix_bgset3_1621277702_22692570
>
> Best regards.
>
> JPH


Hi all,

It seems that I may have the same problem as JPH.

I know I'm not a faithful member of the MilkyWay community, but I wanted to come back to the project, and I chose to run on CPU since my GPU is busy (for now) with other things.

I had never encountered any errors until yesterday. I thought it was a hardware error on my end, so I dropped the idea of CPU computing. Still, that seems like a dumb conclusion when other projects run fine on my CPU.

I'll wait and see whether my few units validate before continuing or aborting CPU tasks.

Thank you for your time if you read this until the end ;)

Best regards,

micropro
ID: 70920

©2024 Astroinformatics Group