Aaargh! Server out of new work!
Author | Message |
---|---|
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
Work is up now. Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Joined: 4 Oct 08 Posts: 1734 Credit: 64,228,409 RAC: 0 |
Sighing with relief, and giving the raw patch at the back of my throat time to get better. Where is my voice? Go away, I was asleep |
Joined: 11 Dec 09 Posts: 17 Credit: 62,324,991 RAC: 98 |
Feeder is not running:
10-08-2010 10:51:55 Milkyway@home Sending scheduler request: Requested by user.
10-08-2010 10:51:55 Milkyway@home Reporting 2 completed tasks, requesting new tasks for GPU
10-08-2010 10:51:57 Milkyway@home Scheduler request completed: got 0 new tasks
10-08-2010 10:51:57 Milkyway@home Message from server: Server error: feeder not running |
Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
I've sent an email to the staff saying that the servers are down. Alexander |
Joined: 11 Dec 09 Posts: 17 Credit: 62,324,991 RAC: 98 |
So did I, and the feeder works again! |
Joined: 4 Oct 08 Posts: 1734 Credit: 64,228,409 RAC: 0 |
I presume a third request to the admins for a server reboot will not go amiss. The validator shows 34,273 workunits waiting for validation, so the work shortage must have started a while ago. Go away, I was asleep |
Joined: 10 Dec 09 Posts: 18 Credit: 9,456,111 RAC: 0 |
.. well, no new WUs again. Just finished my batch now... I was wondering why there were so many WUs waiting to report... |
Joined: 12 Aug 09 Posts: 172 Credit: 645,240,165 RAC: 0 |
Since I live on the other side of the planet, I think the project admin should give me a big RED button to push whenever this happens, so the servers reset. At least three of my babies switch automatically over to the backup project now. |
Joined: 8 Mar 09 Posts: 192 Credit: 10,868,615 RAC: 0 |
".. well, no new Wu's again. just finished my batch now..." I don't know about other users, but I've switched to dnetc@home. I have a cache of 12 tasks waiting to run. I will run them out when this project is running again, but not before. |
Joined: 4 Oct 08 Posts: 1734 Credit: 64,228,409 RAC: 0 |
I moved over to Collatz as the ATI HD3850 crunches DNETC incredibly slowly compared to either Milkyway (preferred) or Collatz (backup). Now awaiting the results of e-mails to the project admins to reboot the servers. Go away, I was asleep |
Joined: 20 Sep 08 Posts: 1391 Credit: 203,563,566 RAC: 0 |
I'm already crunching elsewhere with backup projects! Micro management sucks though.... Don't drink water, that's the stuff that rusts pipes |
Joined: 25 Jun 10 Posts: 284 Credit: 260,490,091 RAC: 0 |
The reliability of this project is getting almost as bad as SETI. I managed to get about 12 work units 30 minutes ago, then it stopped again. |
Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
"The reliability of this project is getting almost as bad as SETI. I managed to get about 12 work units 30 minutes ago, then it stopped again." Have you ever tried orbit@home or lhc@home? That could change your mind! For your GPU, Collatz Conjecture could be a backup project. Alexander |
Joined: 25 Jun 10 Posts: 284 Credit: 260,490,091 RAC: 0 |
I haven't tried orbit@home, and I gave up on lhc@home a while ago. As far as Collatz goes, after I installed my second ATI 5970 card, all Collatz will do for me is lock up my system. I wish I could get it to run. Evidently it doesn't like an i7 980X CPU, Win7 64-bit, and 2 ATI 5970 cards. Mike.. |
Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
Mike, Collatz likes i7, Win64 and 2 ATI cards. As you can see, my main system has a similar configuration, except that I do not have 2 5970s but one 5830 and one 4870, and 'only' 8 threads. And Collatz works fine. But when I take a look at your computers, I cannot find one with ATI GPUs. There are two listed with nVidia. Maybe you have a more basic problem? Alexander |
Joined: 25 Jun 10 Posts: 284 Credit: 260,490,091 RAC: 0 |
That's strange, when I look at my computers the first one listed at the top is the system I am talking about with the 2 ATI 5970 cards. I see 10 systems when I go to my list of computers. Mike... |
Joined: 1 Sep 08 Posts: 520 Credit: 302,528,469 RAC: 203 |
An automatic pre-emptive stop/start of the server (or server processes) is something of a brute force *work-around* which doesn't deal with what appears to be a root cause problem that could use some analysis and resolution efforts. Back in the day when Travis was more closely involved, pleas here for that sort of corrective action seemed to have more effect. Seemingly at this point, it is more a case of auto-pilot (where the best that can be had is frequent reboots) as the various admins have a lot of other things on their plate in addition to this project. |
Joined: 1 Sep 08 Posts: 520 Credit: 302,528,469 RAC: 203 |
"The reliability of this project is getting almost as bad as SETI. I managed to get about 12 work units 30 minutes ago, then it stopped again." Well not quite -- I mean the approach these days at SETI is a weekly *three day* outage -- preceded by a 12 to 24 hour traffic jam and then followed by a post-outage traffic jam of 12 to 24 hours. I believe the idea was to improve reliability when the outage wasn't going on -- it hasn't yet done that. So for SETI, what is now in place is a part-time project, but their message boards run about 162 hours a week. Rather a fair amount of resource there for message boards, it seems to me. SETI moved close to the bottom of my list rather a long time ago. I suspect a large part of the problem here is that, to a fair degree, the now *Doctor* Travis has moved on (as is to be expected) and there no longer is the motivational force behind this project. |
Joined: 1 Sep 08 Posts: 520 Credit: 302,528,469 RAC: 203 |
"I moved over to Collatz as the ATI HD3850 crunches DNETC incredibly slowly compared to either Milkyway (preferred) or Collatz (back up)." For me, Milkyway dropped to my second project simply because I have a flock of GPUs that MW doesn't support (i.e. non-double-precision cards). With DNETC now available, which provides largely the same GPU support as Collatz, MW will drop down to my number three project in terms of TC within a couple of months. It is interesting what sort of reliability Collatz and DNETC can provide with quite limited resources (of course with lower user counts). I agree with you regarding the somewhat finicky nature of DNETC -- there are clearly some GPU configurations it doesn't play well with (like the dual 5970 ATIs and your 3850), and it can push the cards to a distracting degree -- I can't run DNETC on my primary computer when I am doing even ordinary tasks, unlike Collatz. I guess we can hope that Travis is able to pass on the torch here to someone at RPI who will be 'invested' in the project as he was in the past. My other hope is that additional 'low end' GPU projects, particularly ATI GPU projects, start showing up as well. |
Joined: 20 Sep 08 Posts: 1391 Credit: 203,563,566 RAC: 0 |
"Well not quite -- I mean the approach these days at SETI is a weekly *three day* outage -- preceded by a 12 to 24 hour traffic jam and then followed by a post-outage traffic jam of 12 to 24 hours. I believe the idea was to improve reliability when the outage wasn't going on -- it hasn't yet done that." As I understand it, the 3 day outage is to let Nitpicker run on 10 years' worth of results to sift for likely candidates to re-examine. When they tried it in real time it zonked the servers and the database out. You can't upload or download work for 3 days, but the message boards are only out for 9-12 hours, as they were before. "I suspect a large part of the problem here is that to a fair degree, the now *Doctor* Travis has moved on (as is to be expected) and there no longer is the motivational force behind this project." He did say that he would be around but not have as much involvement as before, so you are about right in what you say. The point is that there is DNETC, which gives about 90% of the credits you get here, and also Collatz, which gives about 60%. That is of course running GPUs. Talking of GPUs, I have said over and over again that the basic BOINC infrastructure used by the majority of projects was just not designed for the high levels of data throughput that the onslaught of GPU crunching has unleashed. Servers were scoped out to deal with CPU work, and it is not surprising to me at all that all the popular projects are struggling. If you couple that with a general slowdown of the www/Internet due to the world population approaching 7 billion, and the fact that China has nearly 20% of that and is expanding its web presence at an exponential rate, everything is creaking at the seams. Will DC survive? Don't drink water, that's the stuff that rusts pipes |
©2024 Astroinformatics Group