Welcome to MilkyWay@home

Server Crash November 10


David Glogau*
Joined: 12 Aug 09
Posts: 172
Credit: 645,240,165
RAC: 0
Message 33177 - Posted: 10 Nov 2009, 15:52:44 UTC

Great DB restore Travis. All my 1.8 million points are back and 168 tasks are current again.

Phew! /wipes brow.

Anyway, Good luck with the thesis defense.
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 33178 - Posted: 10 Nov 2009, 16:02:16 UTC - in response to Message 33177.  

Great DB restore Travis. All my 1.8 million points are back and 168 tasks are current again.

Phew! /wipes brow.

Anyway, Good luck with the thesis defense.



Whew, at least people didn't lose too much credit. We should be putting in an order for new equipment today, so as soon as that's in we'll be up and running again. Hopefully much faster than before :P
Travis
Message 33179 - Posted: 10 Nov 2009, 16:02:41 UTC - in response to Message 33177.  

Anyway, Good luck with the thesis defense.


Yeah... All this couldn't have happened at a better time :(
David Glogau*
Message 33180 - Posted: 10 Nov 2009, 16:10:11 UTC

11/11/2009 5:06:38 a.m. Milkyway@home Reporting 24 completed tasks, requesting new tasks for GPU
11/11/2009 5:06:46 a.m. Milkyway@home Scheduler request completed: got 0 new tasks
11/11/2009 5:06:46 a.m. Milkyway@home Message from server: Server error: feeder not running

Still a few problems left to sort out it seems.
Travis
Message 33181 - Posted: 10 Nov 2009, 16:12:10 UTC - in response to Message 33180.  

11/11/2009 5:06:38 a.m. Milkyway@home Reporting 24 completed tasks, requesting new tasks for GPU
11/11/2009 5:06:46 a.m. Milkyway@home Scheduler request completed: got 0 new tasks
11/11/2009 5:06:46 a.m. Milkyway@home Message from server: Server error: feeder not running

Still a few problems left to sort out it seems.


We won't be generating new workunits until we have working hard drives again. The ones running the forums are pretty crippled. If we started generating work again they'd most likely just crash.

It'll probably be a couple days until we get the new hardware and have work flowing again.
David Glogau*
Message 33183 - Posted: 10 Nov 2009, 16:22:10 UTC - in response to Message 33181.  

Okay then. Seems like a good time for me to try converting one of my boxes over to Linux.

Don't feel too bad Travis, Cosmo's transitioner has been down over 27 hours now.

May I suggest Monday as reboot day, so you can work on your defense and we know where we stand.

Something tells me it won't be as simple as plugging in the new drives, loading software, and off we go again! JMHO.
Emanuel
Joined: 18 Nov 07
Posts: 280
Credit: 2,442,757
RAC: 0
Message 33184 - Posted: 10 Nov 2009, 16:33:26 UTC

This may not be up to you at all, but I'm curious - are you using shock mounts for the HDDs? (something like this) Hope everything goes well with the new hardware. Is there any chance you might be virtualizing/distributing the workload across several servers in the future? (or is there simply no money for this?)
David Glogau*
Message 33185 - Posted: 10 Nov 2009, 16:47:57 UTC - in response to Message 33184.  

Seems like a good product, Emanuel. If you want some, Travis, I am happy to pay for them.
PM me with a delivery address and quantity.

Cheers
David

PS: Any chance of uploading completed WUs?
SkyeHunter
Joined: 6 Mar 09
Posts: 41
Credit: 38,856,291
RAC: 0
Message 33186 - Posted: 10 Nov 2009, 16:54:45 UTC

Travis,

Thanks for the restore
Thanks for the communication
Good luck with your thesis

Something with 10% inspiration and 90% perspiration comes to mind...
John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 33187 - Posted: 10 Nov 2009, 17:11:21 UTC

As far as I can see, and remember, the credit reported for me (account, etc.) is about what I think it should be. But, as expected with this hard drive problem, I am all out of work.

Looks like Collatz and other GPU projects will be hit by host transferring until Travis tells us it's all up up and away again!

Keep up the good work Travis and colleagues.
Go away, I was asleep


Smuuth
Joined: 1 Nov 09
Posts: 3
Credit: 12,689,679
RAC: 0
Message 33188 - Posted: 10 Nov 2009, 17:28:42 UTC

Was this caused by the server crash?

11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3480689_1257840816_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3485143_1257841142_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3487110_1257841288_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_23_3489726_1257841476_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_23_3490718_1257841547_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3491606_1257841616_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_24_3495846_1257841925_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_24_3495845_1257841925_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_25_3496959_1257841992_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_25_3496958_1257841992_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_25_3496957_1257841992_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3567660_1257847294_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3567659_1257847294_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_23_3570249_1257847463_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_23_3575233_1257847820_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_24_3584353_1257848662_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_21_3599019_1257849645_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_25_3599934_1257849689_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_25_3600959_1257849772_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_25_3600907_1257849770_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_21_3601061_1257849787_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_24_3601861_1257849828_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_25_3603922_1257849975_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_21_3605040_1257850070_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_25_3605943_1257850114_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_24_3606801_1257850195_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3608657_1257850331_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_23_3610708_1257850473_0 is no longer usable
banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 33189 - Posted: 10 Nov 2009, 17:47:34 UTC - in response to Message 33188.  

Was this caused by the server crash?

11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3480689_1257840816_0 is no longer usable

Yes, it was posted on the front page that ALL current work should be cancelled, as it has been deleted from the database.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
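
For anyone who wants to see which local tasks were affected, here is a minimal sketch that pulls the result names out of client messages like the ones quoted above. It only assumes the "Result ... is no longer usable" wording shown in this thread; the function name `cancelled_results` and the `sample` text are made up for illustration and are not part of BOINC.

```python
import re

# Matches the server message wording quoted in this thread.
# The capture group is the result name between "Result" and "is no longer usable".
PATTERN = re.compile(r"Message from server: Result (\S+) is no longer usable")


def cancelled_results(log_text):
    """Return the result names the server reported as no longer usable."""
    return [match.group(1) for match in PATTERN.finditer(log_text)]


# Two lines copied from the log posted above, plus one unrelated message.
sample = """\
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3480689_1257840816_0 is no longer usable
11/10/2009 10:24:19 AM Milkyway@home Message from server: Result de_s222_3s_best_3p_07r_22_3485143_1257841142_0 is no longer usable
11/11/2009 5:06:46 a.m. Milkyway@home Scheduler request completed: got 0 new tasks
"""

names = cancelled_results(sample)
print(len(names), "results reported as no longer usable")
for name in names:
    print(name)
```

Paste your own message log into `sample` (or read it from a file) to get the list of tasks the server no longer wants.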
Smuuth
Message 33190 - Posted: 10 Nov 2009, 18:11:51 UTC - in response to Message 33189.  

Thanks. Didn't see the updated front page. Those were all tasks downloaded earlier.
Paul D. Buck
Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 33191 - Posted: 10 Nov 2009, 18:36:04 UTC

Thanks for the updates ...

There is that old item of wisdom ... things happen ... :)

Disk drives are mechanical devices and vibration with shocks is not a good thing ...
Bymark
Joined: 6 Mar 09
Posts: 51
Credit: 492,109,133
RAC: 0
Message 33192 - Posted: 10 Nov 2009, 18:51:42 UTC - in response to Message 33191.  

Yep, thanks for the briefing on the front page. This crash was handled very well indeed, and I'm in no hurry for new WUs; just fix the server so it will last for a few weeks until the new hardware arrives.
The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 33197 - Posted: 10 Nov 2009, 22:04:18 UTC

Oh well. Sh.. happens and we all get over it.

Hope you are able to get the project going again soon. And great timing in regards to your thesis defense...bugger.

Let's hope Collatz doesn't crash with all the extra crunchers over there!
SkyeHunter
Message 33206 - Posted: 11 Nov 2009, 9:30:37 UTC - in response to Message 33197.  

Let's hope Collatz doesn't crash with all the extra crunchers over there!


It crossed my mind too...

Did something like that not happen 2 weekends ago?

Makes you realize the kind of stress a strong cruncher community can induce on project servers...
David Glogau*
Message 33208 - Posted: 11 Nov 2009, 9:47:54 UTC - in response to Message 33206.  

Let's hope Collatz doesn't crash with all the extra crunchers over there!


It crossed my mind too...

Did something like that not happen 2 weekends ago?

Makes you realize the kind of stress a strong cruncher community can induce on project servers...


Well, don't blame me. Collatz won't let me attach, and doesn't appear to be accepting new users. Got Seti running on the Cudas, just need something for the ATI.
verstapp
Joined: 26 Jan 09
Posts: 589
Credit: 497,834,261
RAC: 0
Message 33209 - Posted: 11 Nov 2009, 10:12:51 UTC

F@H. :p
Cheers,

PeterV

.
David Glogau*
Message 33211 - Posted: 11 Nov 2009, 10:55:56 UTC - in response to Message 33209.  

F@H. :p


Oh yeah. Already running. Forgot about that one hehehe.
Seems to be averaging ~6 hours per WU atm. Up to 134 WUs done now. Last time I checked it was only 107.

IIRC it was you that put me onto them in the first place. Thanks

Cheers David