Welcome to MilkyWay@home

server crash (July 29)

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 28509 - Posted: 29 Jul 2009, 18:05:23 UTC

Looks like labstaff completed the restore. It should have been from 4am this morning.

Let us know if you're missing anything from before then.
ID: 28509 · Rating: 0
BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,525,188
RAC: 0
Message 28511 - Posted: 29 Jul 2009, 18:15:38 UTC - in response to Message 28509.  

Looks OK here -- thanks to you and the labstaff for the quick turnaround -- I've seen database crashes cripple a BOINC project for days. Well done.


Looks like labstaff completed the restore. It should have been from 4am this morning.

Let us know if you're missing anything from before then.


ID: 28511 · Rating: 0
Cori
Joined: 27 Aug 07
Posts: 647
Credit: 27,592,547
RAC: 0
Message 28512 - Posted: 29 Jul 2009, 18:21:26 UTC

That was a really quick recovery. And not as much was lost as we had all feared, I guess. ;-)))
Lovely greetings, Cori
ID: 28512 · Rating: 0
etrecords
Joined: 15 May 08
Posts: 7
Credit: 126,077,128
RAC: 0
Message 28514 - Posted: 29 Jul 2009, 18:42:12 UTC
Last modified: 29 Jul 2009, 18:46:48 UTC

The only thing I see is that several stats sites are using old data for MilkyWay. According to BOINCstats, my number of points for MilkyWay is now 5,109,606.

When I look at my tasks, I see information from this afternoon and from the 21st of May.

The same behavior can be seen in the signatures of the two other replies. There is a big difference between the actual number of points and the number in the signature.
ID: 28514 · Rating: 0
Rob.B
Joined: 21 Mar 09
Posts: 15
Credit: 1,545,913
RAC: 0
Message 28515 - Posted: 29 Jul 2009, 18:44:43 UTC

Well done on the prompt recovery. As a DBA, I know the feeling when a database goes belly up.

Rob.B
ID: 28515 · Rating: 0
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 28516 - Posted: 29 Jul 2009, 18:48:05 UTC - in response to Message 28515.  

Well done on the prompt recovery. As a DBA, I know the feeling when a database goes belly up.

Rob.B


I was fearing the worst, but luckily labstaff was on top of their game today :)
ID: 28516 · Rating: 0
Cori
Joined: 27 Aug 07
Posts: 647
Credit: 27,592,547
RAC: 0
Message 28523 - Posted: 29 Jul 2009, 19:00:34 UTC

Ahhh... so it's time for a beer now. At least on the European side of the pond... *grin*
Lovely greetings, Cori
ID: 28523 · Rating: 0
Ananas
Joined: 19 Aug 08
Posts: 12
Credit: 2,500,263
RAC: 0
Message 28524 - Posted: 29 Jul 2009, 19:01:12 UTC
Last modified: 29 Jul 2009, 19:02:40 UTC

Um ... not everything is OK. I hope it's just the BOINC cache or an index:

A bunch of tasks from earlier today have been marked "no longer usable", and exactly those are affected; my PC's task list was empty.

My user task list was _not_ empty, but when I opened the older tasks, the PC they are attached to was not mine, e.g.:

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=63855306

from my task list is attached to

http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=73028

belonging to DrMASH

Tasks that I downloaded just now have the correct host entry.
ID: 28524 · Rating: 0
Ziffen63
Joined: 26 Mar 09
Posts: 7
Credit: 702,781
RAC: 0
Message 28525 - Posted: 29 Jul 2009, 19:05:31 UTC

Good save!
ID: 28525 · Rating: 0
aldebaran
Joined: 5 Oct 07
Posts: 2
Credit: 18,553,369
RAC: 1
Message 28528 - Posted: 29 Jul 2009, 19:15:30 UTC - in response to Message 28509.  

New WUs seem to be running OK.

ID: 28528 · Rating: 0
nikolaus
Joined: 7 Aug 08
Posts: 8
Credit: 672,049
RAC: 0
Message 28530 - Posted: 29 Jul 2009, 19:20:43 UTC

I had several crashed WUs in the last 4 hours, and the credits from the last day are lost after pressing update.
Something still is very, very wrong here.
ID: 28530 · Rating: 0
Sarge
Joined: 10 Nov 07
Posts: 441
Credit: 761,827
RAC: 0
Message 28531 - Posted: 29 Jul 2009, 19:21:15 UTC

I believe I've lost 80000 in credit and 1700-1800 in RAC and, no, this is not all work done in 1 day. So, with the RAC limits, am I even going to be able to post this message?
ID: 28531 · Rating: 0
Ananas
Joined: 19 Aug 08
Posts: 12
Credit: 2,500,263
RAC: 0
Message 28533 - Posted: 29 Jul 2009, 19:30:40 UTC - in response to Message 28532.  
Last modified: 29 Jul 2009, 19:32:30 UTC

Weak account key, not much risk.

People can join their computers to that key but not modify or even take over the account.

The weak one is meant to be used in public places.

p.s.: and how is he supposed to edit out the key if you quote it ;-)


oops, I see that he has posted the original one _and_ the weak one, sorry, now I feel a bit stupid :-/
ID: 28533 · Rating: 0
nikolaus
Joined: 7 Aug 08
Posts: 8
Credit: 672,049
RAC: 0
Message 28534 - Posted: 29 Jul 2009, 19:30:42 UTC

Just found out that I also lost some 95,000 points. Impressive, that's 2 months of work. Hopefully we get them restored, plus some premium for suffering.
ID: 28534 · Rating: 0
Jay Kz
Joined: 1 Apr 09
Posts: 4
Credit: 235,809
RAC: 0
Message 28535 - Posted: 29 Jul 2009, 19:32:26 UTC

For the second time, I've lost quite a bit of credits due to Milkyway's system problems, (14,000+ first time, 10,000* this time.) I've detached the Milkyway project, and may look into it again in a year or two to see if it has stabilized. I have volunteered my computer time, but expect it to be used wisely and effectively.
ID: 28535 · Rating: 0
[PST]Howard
Joined: 31 Aug 07
Posts: 21
Credit: 21,004,179
RAC: 0
Message 28536 - Posted: 29 Jul 2009, 19:40:47 UTC

Seem to have got the lost credits back. The BOINCstats update showed I had 105K credits, down about 44K; now it's back up to 149K, which is about right.
ID: 28536 · Rating: 0
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 28537 - Posted: 29 Jul 2009, 19:43:11 UTC - in response to Message 28535.  

For the second time, I've lost quite a bit of credits due to Milkyway's system problems, (14,000+ first time, 10,000* this time.) I've detached the Milkyway project, and may look into it again in a year or two to see if it has stabilized. I have volunteered my computer time, but expect it to be used wisely and effectively.


I re-ran the stats export so hopefully the webpages are updated with the credit values from after the restore.

Let me know if this has fixed anything.


PS. Y'all are cranky this morning, no coffee yet?
ID: 28537 · Rating: 0
Blitzkommando
Joined: 2 Jun 09
Posts: 3
Credit: 23,317,910
RAC: 0
Message 28538 - Posted: 29 Jul 2009, 19:48:50 UTC

Based on my experiences with other projects, I didn't expect the restore to go nearly as quickly, let alone to come from a backup only a couple of hours old. Truly, I'm impressed by the speed of the MW team. The stats errors at the various stats sites are a minor inconvenience really and will be corrected when they update.
ID: 28538 · Rating: 0
Odd-Rod
Joined: 7 Sep 07
Posts: 444
Credit: 5,712,523
RAC: 0
Message 28542 - Posted: 29 Jul 2009, 20:00:34 UTC - in response to Message 28523.  

Ahhh... so it's time for a beer now. At least on the European side of the pond... *grin*

Luckily I'm also on this side of the pond, even if it's on the 'far' point of Africa! So cheers, Cori!

Oh yes, well done Milkyway, on a quick recovery!
ID: 28542 · Rating: 0
Dan T. Morris
Joined: 17 Mar 08
Posts: 165
Credit: 410,228,216
RAC: 0
Message 28544 - Posted: 29 Jul 2009, 20:09:35 UTC
Last modified: 29 Jul 2009, 20:19:19 UTC

FYI, as of 3:00 pm CST I am still showing -35,468,886 on BOINC web stats and on Free-DC. Maybe when they update it will correct itself.


DD,
ID: 28544 · Rating: 0

©2024 Astroinformatics Group