Welcome to MilkyWay@home

Milky Way Maintenance 2/1/2021

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70498 - Posted: 1 Feb 2021, 17:54:13 UTC

Hey Everyone,

The MilkyWay server will be going down from 4PM today until midnight. After this outage, we should have the server up and running as normal, and the workunit generation issue will hopefully be solved.

Best,
Tom
ID: 70498 · Rating: 0

Holdolin
Joined: 9 Dec 11
Posts: 38
Credit: 1,497,896,956
RAC: 0
Message 70499 - Posted: 1 Feb 2021, 18:10:03 UTC - in response to Message 70498.  

Thanks for the heads up and your work. Much appreciated.
ID: 70499 · Rating: 0

Siran d'Vel'nahr
Joined: 1 Jul 08
Posts: 88
Credit: 25,079,058
RAC: 0
Message 70500 - Posted: 1 Feb 2021, 18:16:07 UTC - in response to Message 70498.  

Greetings,

Thanks for the notification on maintenance, Tom. :)

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr XO - L L & P _\\//
USS Vre'kasht NCC-33187
Winders 10 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 70500 · Rating: 0

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70501 - Posted: 2 Feb 2021, 4:43:30 UTC

The server is back up after some tweaking of the server's database. Hopefully that fixes the problems that we were having. We are currently monitoring the system, and I will have more info for you all tomorrow.

- Tom
ID: 70501 · Rating: 0

Dark_soul_saanvi
Joined: 18 Jan 21
Posts: 1
Credit: 2,281
RAC: 0
Message 70503 - Posted: 2 Feb 2021, 7:53:16 UTC - in response to Message 70498.  

Greetings,

Thank you for informing us :]
ID: 70503 · Rating: 0

Geri
Joined: 29 Mar 15
Posts: 1
Credit: 30,671,378
RAC: 0
Message 70504 - Posted: 2 Feb 2021, 11:21:40 UTC

There are still no workunits for Milkyway@Home in BOINC; even the waiting ones have disappeared! Quitting, restarting, and updating Milkyway@Home have no effect.
ID: 70504 · Rating: 0

Cavalary
Joined: 23 Aug 11
Posts: 35
Credit: 11,459,845
RAC: 17,197
Message 70505 - Posted: 2 Feb 2021, 14:05:52 UTC

Same, still nothing, and server status lists none available. But it doesn't even seem to connect properly, because I changed the project resource share and it doesn't update.
ID: 70505 · Rating: 0

paris
Joined: 26 Apr 08
Posts: 87
Credit: 64,801,496
RAC: 0
Message 70506 - Posted: 2 Feb 2021, 14:34:14 UTC

I checked my tasks list and I have a lot of "cancelled by server" notations. Is this part of the usual restart procedure?


Plus SETI Classic = 21,082 WUs
ID: 70506 · Rating: 0

alk44
Joined: 2 Mar 20
Posts: 131
Credit: 320,872,837
RAC: 14,512
Message 70507 - Posted: 2 Feb 2021, 14:42:38 UTC - in response to Message 70506.  

Still seeing no new tasks as of this writing.
ID: 70507 · Rating: 0

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70508 - Posted: 2 Feb 2021, 15:02:02 UTC
Last modified: 2 Feb 2021, 15:02:38 UTC

We fixed the problem with the work generators, but our fix seems to have caused a problem with the transitioner, which means that no new workunits are going out at the moment. We will have to do some server maintenance again in the near future to fix this, and I'll update you all with details when I know more.

This problem was caused by running out of valid workunit IDs (since you all have crunched over 2 billion workunits!), so we had to reset the entire database ID count and delete old workunits. As a result, there are some residual ID mismatches that we are trying to find and sort out. The transitioner was working as of midnight last night, but it seems to have switched off sometime in the early morning.

Thanks for your patience,
Tom
ID: 70508 · Rating: 0

Holdolin
Joined: 9 Dec 11
Posts: 38
Credit: 1,497,896,956
RAC: 0
Message 70509 - Posted: 2 Feb 2021, 16:09:24 UTC - in response to Message 70508.  

Let me guess, we crunched 2,147,483,647 work units lol. Might I suggest you plan for this eventuality, as it will happen again and more quickly as GPUs will continue to get faster and faster. Without knowing how your DB is set up, or even what DB you are using I could not begin to offer a solution but would be happy to have a look if y'all have room for another volunteer :)
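
For context, 2,147,483,647 is 2^31 - 1, the largest value a signed 32-bit integer can hold. The following is a minimal Python sketch of that limit, not the project's code; the helper name fits_in_int32 is made up for illustration, and how a particular database reacts when an ID column reaches this value is not covered here.

```python
import struct

# Largest value a signed 32-bit integer can represent.
INT32_MAX = 2**31 - 1


def fits_in_int32(value: int) -> bool:
    """Return True if value can be stored in a signed 32-bit integer.

    Hypothetical helper for illustration only.
    """
    try:
        # "<i" means little-endian signed 32-bit; struct range-checks it.
        struct.pack("<i", value)
        return True
    except struct.error:
        return False


print(INT32_MAX)                     # 2147483647
print(fits_in_int32(INT32_MAX))      # True
print(fits_in_int32(INT32_MAX + 1))  # False
```

In other words, an ID counter stored in a 32-bit signed field has no valid value left once it reaches 2,147,483,647.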
ID: 70509 · Rating: 0

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 70510 - Posted: 2 Feb 2021, 16:25:13 UTC - in response to Message 70509.  
Last modified: 2 Feb 2021, 16:29:43 UTC

Yup, you got it. I don't think the BOINC framework really planned on encountering this problem. This happened to a different table in the past, so we are aware of the issue, but as far as I can tell other projects haven't hit this limitation yet.

We are discussing ways to prevent this problem in the future. The simplest "fix" would be to change the Primary Key of each affected table to a BIGINT instead of an INT, which we may end up doing at some point in the future once things are working again.
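
As a rough sense of how much headroom a 64-bit key would give, here is a small Python sketch. It assumes INT means a signed 32-bit integer and BIGINT a signed 64-bit integer, as in common SQL databases. The rate of one million new workunit IDs per day is a made-up figure for illustration, not a number from this thread, and years_until_exhausted is a hypothetical helper.

```python
# Maximum values, assuming signed 32-bit INT and signed 64-bit BIGINT.
INT_MAX = 2**31 - 1
BIGINT_MAX = 2**63 - 1

# Assumed rate of new IDs per day (illustrative only, not from the thread).
IDS_PER_DAY = 1_000_000


def years_until_exhausted(max_id: int, ids_per_day: float) -> float:
    """Years until an ID counter starting near zero reaches max_id."""
    return max_id / ids_per_day / 365.25


int_years = years_until_exhausted(INT_MAX, IDS_PER_DAY)
bigint_years = years_until_exhausted(BIGINT_MAX, IDS_PER_DAY)

print(f"INT lasts about {int_years:.1f} years at this rate")
print(f"BIGINT lasts about {bigint_years:.2e} years at this rate")
```

At that assumed rate a 32-bit key would last a bit under six years, while a 64-bit key would last on the order of tens of billions of years, so the same exhaustion would not realistically recur.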
ID: 70510 · Rating: 0

EPFrybar
Joined: 28 Jan 21
Posts: 8
Credit: 25,225,698
RAC: 0
Message 70513 - Posted: 2 Feb 2021, 20:54:27 UTC - in response to Message 70510.  
Last modified: 2 Feb 2021, 20:55:06 UTC

I believe that was SETI@HOME's solution (64-bit INT). (The joys of success!!)

Ed F
ID: 70513 · Rating: 0

Vester
Joined: 30 Dec 14
Posts: 34
Credit: 909,998,366
RAC: 0
Message 70514 - Posted: 3 Feb 2021, 13:45:56 UTC

All looks good on this user's end. Crunching again.
ID: 70514 · Rating: 0

Max_Pirx
Joined: 13 Dec 17
Posts: 46
Credit: 2,421,362,376
RAC: 0
Message 70515 - Posted: 3 Feb 2021, 14:16:22 UTC

Yep, seems all right for me too. Crunching WUs nicely.
Thanks for sorting this out!
ID: 70515 · Rating: 0

Holdolin
Joined: 9 Dec 11
Posts: 38
Credit: 1,497,896,956
RAC: 0
Message 70516 - Posted: 3 Feb 2021, 14:42:58 UTC

Well, it's working in the sense that I'm getting work, but only a couple of WUs at a time.
ID: 70516 · Rating: 0

paris
Joined: 26 Apr 08
Posts: 87
Credit: 64,801,496
RAC: 0
Message 70529 - Posted: 3 Feb 2021, 19:08:55 UTC

Plenty of work and verifying just fine so far.


Plus SETI Classic = 21,082 WUs
ID: 70529 · Rating: 0

©2024 Astroinformatics Group