Message boards : Number crunching : Aaargh! Server out of new work!
| Author | Message |
|---|---|
|
At this moment the server has no new work to send and my limited Milkyway caches are dry, and so am I! | |
| ID: 39188 | Rating: 0 | rate:
| |
|
+1 | |
| ID: 39189 | Rating: 0 | rate:
| |
|
Typical, just as I pass 200 million everything falls over! | |
| ID: 39191 | Rating: 0 | rate:
| |
|
Well Travis did acknowledge impending doom & gloom :) | |
| ID: 39195 | Rating: 0 | rate:
| |
|
So is it going to be a week of server unavailability? | |
| ID: 39197 | Rating: 0 | rate:
| |
Typical, just as I pass 200 million everything falls over! 200 million, congrats! A link from Collatz message board. http://boinc.thesonntags.com/collatz/forum_thread.php?id=472 | |
| ID: 39204 | Rating: 0 | rate:
| |
Typical, just as I pass 200 million everything falls over! congrats, man :-) good job and RAC :-) I switched to collatz for a while ____________ | |
| ID: 39206 | Rating: 0 | rate:
| |
|
Been running Collatz since I started this thread. | |
| ID: 39215 | Rating: 0 | rate:
| |
... my limited Milkyway caches are dry, and so am I! ... Thank god I'm drinking a fresh cold beer so just my ATI card has run dry so far. ____________ Lovely greetings, Cori | |
| ID: 39217 | Rating: 0 | rate:
| |
|
An odd thing, I have exactly 50 tasks waiting for credit, is this a coincidence? | |
| ID: 39225 | Rating: 0 | rate:
| |
|
It could well be related -- the number of work units awaiting validation is way up as well. I'm guessing the status page is doing the 'politician' thing and not representing things as they are (aside from the no work status). An odd thing, I have exactly 50 tasks waiting for credit, is this a coincidence? ____________ | |
| ID: 39227 | Rating: 0 | rate:
| |
|
Work available again. | |
| ID: 39259 | Rating: 0 | rate:
| |
|
Validator crapped. Also no tasks. Workunits waiting for validation 17,366 ____________ Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. | |
| ID: 39389 | Rating: 0 | rate:
| |
|
Dropped a kung fu bomb on the server last night, looks like things are running. I'm pretty close to tracking down the problem -- it seems after a couple days of uptime it can have a memory leak. Trying to track it down (but it'll take a little time as I have to wait for it to happen again). | |
| ID: 39402 | Rating: 0 | rate:
| |
Dropped a kung fu bomb on the server last night... You can ask me next time, I'm known as notorious 'Kaboom girlie' in BOINCdom... *LOL* ____________ Lovely greetings, Cori | |
| ID: 39404 | Rating: 0 | rate:
| |
|
Message from server: No work available. | |
| ID: 39491 | Rating: 0 | rate:
| |
|
+1 | |
| ID: 39492 | Rating: 0 | rate:
| |
|
+1 | |
| ID: 39493 | Rating: 0 | rate:
| |
|
+1 | |
| ID: 39494 | Rating: 0 | rate:
| |
|
+1 | |
| ID: 39495 | Rating: 0 | rate:
| |
|
Just noticed that instead of seeing FreeHAL and Einstein running on the CPU and Milkyway running on the GPU, I only see Einstein (and there are serious upload and download problems on that project). | |
| ID: 39498 | Rating: 0 | rate:
| |
|
No work available ... | |
| ID: 39501 | Rating: 0 | rate:
| |
|
I hope Travis was able to get information on the memory leak he was looking for! | |
| ID: 39502 | Rating: 0 | rate:
| |
|
Sadly, it wasn't the memory leak but something else, looking into it right now. | |
| ID: 39504 | Rating: 0 | rate:
| |
|
Looks like it is back up as I just got work. Of course I also turned on Collatz as I will be away this weekend and I want to keep the card running. | |
| ID: 39505 | Rating: 0 | rate:
| |
|
Anybody having trouble uploading results? | |
| ID: 39552 | Rating: 0 | rate:
| |
|
Must be due to the pentathlon, which is now concentrating on MW for the next week. | |
| ID: 39555 | Rating: 0 | rate:
| |
|
ARRRGH! Out again. | |
| ID: 39603 | Rating: 0 | rate:
| |
|
GDI. | |
| ID: 39604 | Rating: 0 | rate:
| |
|
Should be back up and running now. | |
| ID: 39605 | Rating: 0 | rate:
| |
|
Don't see any progress. | |
| ID: 39606 | Rating: 0 | rate:
| |
Time to add another bug to our longstanding issues list: Lets see now, gathering all the words that are common to all three of the above: "Travis causes the server to crash" ;-) I suggest you have another beer after that revelation! Travis, it would appear that you and Murphy are well acquainted. Time to remove tongue from cheek... | |
| ID: 39608 | Rating: 0 | rate:
| |
Time to add another bug to our longstanding issues list: 1+ ____________ Lovely greetings, Cori | |
| ID: 39618 | Rating: 0 | rate:
| |
Time to add another bug to our longstanding issues list: Well technically I'd have to say at least 90% of the server crashes were because of me in some way shape or form :) Updating code and causing all those pesky bugs! ____________ | |
| ID: 39637 | Rating: 0 | rate:
| |
|
Is Murphy sitting in for Travis tonight? | |
| ID: 39769 | Rating: 0 | rate:
| |
|
Murphy must have been offered a beer down the pub...things look good again. | |
| ID: 39775 | Rating: 0 | rate:
| |
Murphy must have been offered a beer down the pub...things look good again. Ssshhh, there! Tempting fate are we? ____________ Go away, I was asleep | |
| ID: 39776 | Rating: 0 | rate:
| |
|
Can't be avoided. Lots of people will run out of work with hosts like this, http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=171705. | |
| ID: 39801 | Rating: 0 | rate:
| |
Cant be avoided. Lots of people will run out of work with hosts like this, http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=171705. Don't know how that happened! I sure hope Travis takes a look at that one ... no way he should be getting more than 24 tasks at a time ... something bad is happening ... and it does not seem like he is returning any at all either ... | |
| ID: 39804 | Rating: 0 | rate:
| |
|
He's also using a very old version of Boinc and MW app. Time to pull the plug methinks.... | |
| ID: 39805 | Rating: 0 | rate:
| |
He's also using a very old version of Boinc and MW app. Time to pull the plug methinks.... I'm using 5.10.45, no problem with it. I don't see those as the problem. Seems to be 2000+ ghost wus. How about posting some of the messages from the timeframe the tasks kept downloading? ____________ Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. | |
| ID: 39815 | Rating: 0 | rate:
| |
|
Somebody poke Travis, the tasks are now 2598. :) | |
| ID: 39820 | Rating: 0 | rate:
| |
|
I would suggest putting MW on NNW until this is fixed. | |
| ID: 39831 | Rating: 0 | rate:
| |
|
That host is not mine, it belongs to Lord Nelloz. I only found out about it after I checked a task of mine that's been inconclusive for a week now. | |
| ID: 39836 | Rating: 0 | rate:
| |
|
Arrrgghhh. | |
| ID: 39970 | Rating: 0 | rate:
| |
|
Gnnnnnnnnh! :) | |
| ID: 39972 | Rating: 0 | rate:
| |
|
| |
| ID: 39973 | Rating: 0 | rate:
| |
|
| |
| ID: 39979 | Rating: 0 | rate:
| |
|
I see the Server Status shows them all working (green), but there is no work yet. | |
| ID: 39981 | Rating: 0 | rate:
| |
|
Could it be that http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=171705 is soaking it all up? That host's task count has now increased to 2920 :( | |
| ID: 39982 | Rating: 0 | rate:
| |
Could it be that http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=171705 is soaking it all up? That host's task count has now increased to 2920 :( How come that guy hasn't been banned yet? | |
| ID: 39986 | Rating: 0 | rate:
| |
|
Well boys and girls, | |
| ID: 39989 | Rating: 0 | rate:
| |
|
I also got fresh WU's (was quite dry for some hours), seems we are up and running again =) | |
| ID: 39991 | Rating: 0 | rate:
| |
Could it be that http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=171705 is soaking it all up? That host's task count has now increased to 2920 :( The information on this cruncher has been sent up to Travis to be investigated further. ____________ | |
| ID: 39992 | Rating: 0 | rate:
| |
|
Although the Server Status says all is running OK, we have been out of work for 8+ hours now. | |
| ID: 40773 | Rating: 0 | rate:
| |
|
Aaargh! | |
| ID: 40846 | Rating: 0 | rate:
| |
|
Must be... The trolls are acting up again... | |
| ID: 40847 | Rating: 0 | rate:
| |
|
Hello, | |
| ID: 40848 | Rating: 0 | rate:
| |
|
I wonder if the Milkyway server was made by the same guy that made this Alarm Clock? | |
| ID: 40849 | Rating: 0 | rate:
| |
|
Hi, | |
| ID: 40850 | Rating: 0 | rate:
| |
|
Still nowt to crunch! | |
| ID: 40852 | Rating: 0 | rate:
| |
|
Validator crapped out again. | |
| ID: 40854 | Rating: 0 | rate:
| |
|
OK, back to Collatz, sigh .... | |
| ID: 40855 | Rating: 0 | rate:
| |
|
| |
| ID: 40859 | Rating: 0 | rate:
| |
|
and again no work :( | |
| ID: 40900 | Rating: 0 | rate:
| |
|
My GPU-fan went eerily silent a couple of minutes ago. I guess all those Collatz crunchers returned the favour with Collatz down and sucked our well dry as well :-) | |
| ID: 40901 | Rating: 0 | rate:
| |
|
Argh! | |
| ID: 40902 | Rating: 0 | rate:
| |
|
Noooooooo | |
| ID: 40903 | Rating: 0 | rate:
| |
|
I also keep an account on DNETC, just in case. I downloaded 50 5xxx units to keep my GPU busy. | |
| ID: 40905 | Rating: 0 | rate:
| |
|
This is great news. Time to swap out a failing PSU and do some maintenance | |
| ID: 40906 | Rating: 0 | rate:
| |
|
out of work. this starts to happen so often. | |
| ID: 40912 | Rating: 0 | rate:
| |
out of work. this starts to happen so often. And it's going to be out of work for a while .. http://milkyway.cs.rpi.edu/milkyway/server_status.php It's got nothing to send..and with a growing number of people getting their WUs done that have yet to pass validation..I imagine it's going to be another mad rush when the server does get up again.. I got to say though..this is getting silly for server downtime. I think they need a new guy behind the scenes for the server or something.. | |
| ID: 40913 | Rating: 0 | rate:
| |
out of work. this starts to happen so often. I believe that the guy is good enough but the hardware lacks the capability to catch up. Maybe it is time to renew or upgrade the server. | |
| ID: 40914 | Rating: 0 | rate:
| |
|
Look at bright side... just think how boring the graphs would be without the occasional high hills and deep, wide valleys. | |
| ID: 40915 | Rating: 0 | rate:
| |
out of work. this starts to happen so often. More and more people use faster GPUs, which results in more WUs to be processed. Maybe the next part that lacks the required capabilities is the network (collatz!!!). This is a good example of how evolution works! @ all responsible staff: please think about a solution for at least 6 months or so! Regards, Alexander | |
| ID: 40916 | Rating: 0 | rate:
| |
Look at bright side... Hmmm ... Always look at the byte side of life ! There's a song as well .. Alexander | |
| ID: 40917 | Rating: 0 | rate:
| |
|
If the science allows it, much longer WUs could be a solution. That would reduce traffic and server load. Of course, a checkpoint system would then have to be implemented in the application code. | |
| ID: 40918 | Rating: 0 | rate:
| |
Maybe the constraint is not the server, but the brain... This is the reason why I will not join this team. | |
| ID: 40919 | Rating: 0 | rate:
| |
|
There is 1 WU available, let's see who gets it ;) | |
| ID: 40920 | Rating: 0 | rate:
| |
|
Hello, | |
| ID: 40925 | Rating: 0 | rate:
| |
There is 1 WU available, let's see who gets it ;) I just got ~20 WUs. Don't give up! | |
| ID: 40932 | Rating: 0 | rate:
| |
There is 1 WU available, let's see who gets it ;) Dang! I just downloaded WUs for Folding@home and they will take hours to finish and not just 1 minute like the Milkyway WUs. | |
| ID: 40933 | Rating: 0 | rate:
| |
Hello, Why do you need to babysit this project? Just set a backup project and let BOINC do its thing. Yes, it is annoying when MW stops, but with two other projects that support ATI cards your ATI GPUs shouldn't run out of work. | |
| ID: 40934 | Rating: 0 | rate:
| |
|
hi, | |
| ID: 40935 | Rating: 0 | rate:
| |
Hello, I think the point is that the project has been down repeatedly, and since the main application as well as the current output relies heavily on fast GPUs, it should have been expected that the server load would be heavy too. Apparently, everybody knows the downtime is caused by the feeder not catching up, the validator being overloaded, or general server load. This is no surprise. So if the project continues with the current equipment, there will certainly be many more outages. My point is to highlight the lack of hardware so that proper adjustments can be made. | |
| ID: 40937 | Rating: 0 | rate:
| |
|
Zero work again. | |
| ID: 40938 | Rating: 0 | rate:
| |
|
Just as I am returning 2 rigs to Milkyway, do I detect another stoppage with ZERO work ready to send? | |
| ID: 40939 | Rating: 0 | rate:
| |
|
Yup - same here. | |
| ID: 40940 | Rating: 0 | rate:
| |
|
lol | |
| ID: 40941 | Rating: 0 | rate:
| |
|
What we are missing is a statement from the project staff. Is somebody working on a solution? | |
| ID: 40942 | Rating: 0 | rate:
| |
|
Just thinking. | |
| ID: 40943 | Rating: 0 | rate:
| |
|
Hi, | |
| ID: 40944 | Rating: 0 | rate:
| |
|
I wish I had a 58xx card, so I could complain more often ;-)) | |
| ID: 40945 | Rating: 0 | rate:
| |
|
On POETS Day there is no need to complain | |
| ID: 40946 | Rating: 0 | rate:
| |
|
Indeed, the lack of information this time is getting a tad tedious. Oh well, at least both Dnetc and Collatz are (at the moment) running. What we are missing is a statement from the project staff. Is somebody working on a solution? ____________ | |
| ID: 40949 | Rating: 0 | rate:
| |
I wish I had a 58xx card, so I could complain more often ;-)) Jup! There are so many more important things to do in life than constantly glaring at a server when the sun is shining :) | |
| ID: 40950 | Rating: 0 | rate:
| |
|
Correction -- Collatz is encountering one of their daily Comcast can't figure it out connectivity problems. | |
| ID: 40952 | Rating: 0 | rate:
| |
|
Milkyway servers are up, but most are not running nor have any work. | |
| ID: 40953 | Rating: 0 | rate:
| |
|
We're baaaack. | |
| ID: 40954 | Rating: 0 | rate:
| |
|
Sorta, I still don't have any work from here and I also have a pretty full cache of DNETC work to keep the 5830 warm. | |
| ID: 40956 | Rating: 0 | rate:
| |
|
Kick It! | |
| ID: 41065 | Rating: 0 | rate:
| |
|
I am down as well. | |
| ID: 41066 | Rating: 0 | rate:
| |
|
Shake it! | |
| ID: 41067 | Rating: 0 | rate:
| |
|
Looks like the validator again? | |
| ID: 41068 | Rating: 0 | rate:
| |
|
So all the GPU folks returned their wu's in less than 4 hrs (60,000 wu's) and now the CPU's are returning their wu's at 10,000 per 4hrs. | |
| ID: 41070 | Rating: 0 | rate:
| |
|
Maybe we should make a collection for something like a 'Weekend Automatic Validator Kicker'. | |
| ID: 41072 | Rating: 0 | rate:
| |
|
Could something be set up to automatically restart it? | |
| ID: 41073 | Rating: 0 | rate:
| |
Could something be set up to automatically restart it? I think so. Since at least one server gets info about the number of WUs waiting for validation, a limit could be set up that triggers an auto-restart (like Windows does after updates), locally or remotely. But this may depend on the operating system and security settings they use. What should work in every situation is a remotely operated relay whose contact is wired in parallel with the reset button of the PC. Alexander | |
| ID: 41074 | Rating: 0 | rate:
| |
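For what it's worth, Alexander's idea of a limit on the validation backlog could be sketched roughly like this in Python. It is only a sketch under assumptions: the status text format is copied from the "Workunits waiting for validation 17,366" lines quoted in this thread, the limit of 20,000 is made up, and the function names are hypothetical. How the restart itself would be triggered is left out, since nobody here knows how the RPI servers are set up.

```python
import re

# Assumed format, copied from the status lines quoted in this thread:
# "Workunits waiting for validation 17,366"
WAITING_RE = re.compile(r"Workunits waiting for validation\s+(\d[\d,]*)")

# Hypothetical limit; the real project would have to pick its own value.
BACKLOG_LIMIT = 20000


def parse_waiting_for_validation(status_text):
    """Return the backlog as an int, or None if the line is not found."""
    match = WAITING_RE.search(status_text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def should_restart(status_text, limit=BACKLOG_LIMIT):
    """True when the backlog is known and above the limit."""
    waiting = parse_waiting_for_validation(status_text)
    return waiting is not None and waiting > limit


if __name__ == "__main__":
    for text in ("Workunits waiting for validation 17,366",
                 "Workunits waiting for validation 34,273"):
        print(text, "->", "restart" if should_restart(text) else "ok")
```

A check like this would only decide when a kick is due; it does nothing about whatever makes the validator back up in the first place.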
|
With apologies to DEVO ... (ah the eighties) | |
| ID: 41075 | Rating: 0 | rate:
| |
|
It's baaaaack.... | |
| ID: 41090 | Rating: 0 | rate:
| |
|
No work, again, for a little while now, and caches running down. | |
| ID: 41217 | Rating: 0 | rate:
| |
|
I need wooooooooooooooooooooooooork! ;-)))))) | |
| ID: 41227 | Rating: 0 | rate:
| |
I need wooooooooooooooooooooooooork! ;-)))))) ____________ Join BOINC United now! | |
| ID: 41229 | Rating: 0 | rate:
| |
|
It seems we have a repetitive problem -- symptoms are that work isn't being produced for download and completed work isn't getting validated. | |
| ID: 41230 | Rating: 0 | rate:
| |
It seems we have a repetitive problem -- symptoms are that work isn't being produced for download and completed work isn't getting validated. I agree, because it gets more annoying each time. There hasn't been a month without an outage at least one day long. | |
| ID: 41231 | Rating: 0 | rate:
| |
|
I can say even more - this happens on a bi-weekly basis... | |
| ID: 41232 | Rating: 0 | rate:
| |
I need wooooooooooooooooooooooooork! ;-)))))) *ROFL* PS. Better would be: ____________ Lovely greetings, Cori | |
| ID: 41233 | Rating: 0 | rate:
| |
|
I gave the server a kick, let me know if it worked. | |
| ID: 41235 | Rating: 0 | rate:
| |
I gave the server a kick, let me know if it worked. Unfortunately not. Alexander | |
| ID: 41236 | Rating: 0 | rate:
| |
|
Yay, now there's frrrresh crrrrunchies! THX! | |
| ID: 41239 | Rating: 0 | rate:
| |
|
It's working fine for me ATM, and I am d/ling new work as needed. | |
| ID: 41241 | Rating: 0 | rate:
| |
|
Arrrgghhhh! | |
| ID: 41258 | Rating: 0 | rate:
| |
I need wooooooooooooooooooooooooork! ;-)))))) Can anyone translate the text to English, please? And yes: there is no work this time, but DNETC@home is a good backup project. :-) | |
| ID: 41259 | Rating: 0 | rate:
| |
|
No wonder there are all these problems. This is the server-kicking staff ;) | |
| ID: 41260 | Rating: 0 | rate:
| |
I need wooooooooooooooooooooooooork! ;-)))))) Bundesagentur für Arbeit = Federal Employment Office of Germany | |
| ID: 41261 | Rating: 0 | rate:
| |
|
Kick the serveR | |
| ID: 41262 | Rating: 0 | rate:
| |
|
again... | |
| ID: 41263 | Rating: 0 | rate:
| |
|
Travis you selfish bas!ard, you snuck out of the building to have some sort of a life, didn't you? | |
| ID: 41265 | Rating: 0 | rate:
| |
|
Every time a staff member kicks the server, it breaks sooner. Maybe all it needs is love? Or a Xeon CPU it can make love to? So don't kick it, just love it. | |
| ID: 41267 | Rating: 0 | rate:
| |
|
Looks like the server(s) have been working and dishing out work for a few hours now. Just need to NNT Collatz and run a 5 hour cache down then MW resumes. | |
| ID: 41270 | Rating: 0 | rate:
| |
|
bundesagentur für arbeitspakete translates to: | |
| ID: 41271 | Rating: 0 | rate:
| |
|
I see the validator is building again | |
| ID: 41274 | Rating: 0 | rate:
| |
|
It was nearly Aaargh! again a couple of hours ago this morning. The validator seemed to be building up and the "Ready to send" read ZERO! | |
| ID: 41285 | Rating: 0 | rate:
| |
|
A usual hiccup ! | |
| ID: 41287 | Rating: 0 | rate:
| |
|
It looks like we are rapidly heading to a new Aaargh session according to the current server status. | |
| ID: 41340 | Rating: 0 | rate:
| |
|
Definitely moved to Aaarrrrggghhhh! | |
| ID: 41341 | Rating: 0 | rate:
| |
|
Still no work | |
| ID: 41343 | Rating: 0 | rate:
| |
|
I realize others have suggested this in the past, but it seems that either a bit of serious root cause identification/resolution effort ought to be done, or, alternatively, an automatic stop/restart process (perhaps once a week) to clear things out (or even daily off hours to ensure no more than single day outages) needs to be in place. Still no work ____________ | |
| ID: 41344 | Rating: 0 | rate:
| |
|
A preemptive server reboot daily or every 48hrs sounds like a good idea! | |
| ID: 41345 | Rating: 0 | rate:
| |
|
The best and fastest way to get someone to restart the validator is to send an email to astro@cs.rpi.edu because all 11 administrators receive it. I have already sent an email this time. | |
| ID: 41346 | Rating: 0 | rate:
| |
|
Work is up now. | |
| ID: 41347 | Rating: 0 | rate:
| |
|
Sighing with relief, and letting the raw patch at the back of my throat time to get better. | |
| ID: 41348 | Rating: 0 | rate:
| |
|
Feeder is not running | |
| ID: 41353 | Rating: 0 | rate:
| |
|
I've sent an email to the staff that the servers are down. | |
| ID: 41354 | Rating: 0 | rate:
| |
|
So did I, and the feeder works again! | |
| ID: 41355 | Rating: 0 | rate:
| |
|
I presume a third request to the admins for a server reboot will not go amiss. The validator has a backlog of 34,273 workunits waiting for validation, so the work shortage must have started a while ago. | |
| ID: 41356 | Rating: 0 | rate:
| |
|
.. well, no new Wu's again. just finished my batch now... | |
| ID: 41357 | Rating: 0 | rate:
| |
|
Since I live on the other side of the planet, I think the project admin should give me a big RED button to push whenever this happens, so the servers reset. | |
| ID: 41359 | Rating: 0 | rate:
| |
.. well, no new Wu's again. just finished my batch now... I don't know about other users, but I've switched to dnetc@home. I have a cache of 12 tasks waiting to run. I'll run them down once this project is running again, but not before. | |
| ID: 41360 | Rating: 0 | rate:
| |
|
I moved over to Collatz as the ATI HD3850 crunches DNETC incredibly slowly compared to either Milkyway (preferred) or Collatz (back up). | |
| ID: 41361 | Rating: 0 | rate:
| |
|
I'm already crunching elsewhere with backup projects! Micro management sucks though.... | |
| ID: 41362 | Rating: 0 | rate:
| |
|
The reliability of this project is almost getting as bad as SETI. I managed to get about 12 work units 30 minutes ago, then it stopped again. | |
| ID: 41365 | Rating: 0 | rate:
| |
The reliability of this project is almost getting as bad as SETI. I managed to get about 12 work units 30 minutes ago, then it stopped again. Have you ever tried orbit@home or lhc@home? That could change your mind! For your GPU Collatz Conjecture could be a backup project. Alexander | |
| ID: 41367 | Rating: 0 | rate:
| |
I haven't tried orbit@home, and I gave up on lhc@home awhile ago. As far as Collatz goes, after I installed my second ATI 5970 card, all Collatz will do for me is lock up my system. I wish I could get it to run. Evidently it doesn't like an i7 980x cpu, Win7 64bit, and 2 ATI 5970 cards. Mike.. | |
| ID: 41368 | Rating: 0 | rate:
| |
Mike, collatz likes i7, win64 and 2 ATI-cards. As you can see, my mainsys is a similar configuration, except that I do not have 2 5970 but one 5830 and one 4870 and 'only' 8 threads. And collatz works fine. But when I take a look onto your computers, I cannot find one with ATI-GPU's. There are two listed with nVidia. Maybe you have a more basic problem? Alexander | |
| ID: 41369 | Rating: 0 | rate:
| |
That's strange, when I look at my computers the first one listed at the top is the system I am talking about with the 2 ATI 5970 cards. I see 10 systems when I go to my list of computers. Mike... | |
| ID: 41371 | Rating: 0 | rate:
| |
|
An automatic pre-emptive stop/start of the server (or server processes) is something of a brute force *work-around* which doesn't deal with what appears to be a root cause problem that could use some analysis and resolution efforts. | |
| ID: 41372 | Rating: 0 | rate:
| |
|
Well not quite -- I mean the approach these days at SETI is a weekly *three day* outage -- preceded by 12 to 24 hour traffic jam and then followed by a post outage traffic jam of 12 to 24 hours. I believe the idea was to improve reliability when the outage wasn't going on -- it hasn't yet done that. The reliability of this project is almost getting as bad as SETI. I managed to get about 12 work units 30 minutes ago, then it stopped again. ____________ | |
| ID: 41373 | Rating: 0 | rate:
| |
|
For me, Milkyway dropped to my second project simply because I have a flock of GPUs that MW doesn't support (i.e. non-double-precision cards). I moved over to Collatz as the ATI HD3850 crunches DNETC incredibly slowly compared to either Milkyway (preferred) or Collatz (back up). ____________ | |
| ID: 41374 | Rating: 0 | rate:
| |
Well not quite -- I mean the approach these days at SETI is a weekly *three day* outage -- preceded by 12 to 24 hour traffic jam and then followed by a post outage traffic jam of 12 to 24 hours. I believe the idea was to improve reliability when the outage wasn't going on -- it hasn't yet done that. As I understand it, the 3 day outage is to let Nitpicker run on 10 years worth of results to sift for likely candidates to re-examine. When they tried it in real time it zonked the servers and the database out. You can't upload or download work for 3 days, but the message boards are only out for 9-12 hours as they were before. I suspect a large part of the problem here is that to a fair degree, the now *Doctor* Travis has moved on (as is to be expected) and there no longer is the motivational force behind this project. He did say that he would be around but not have as much involvement as before, so you are about right in what you say. The point is that there is DNETC which gives about 90% of credits you get here, and also Collatz which gives about 60%. That is of course running GPU's. Talking of GPU's I have said over and over again, that the basic Boinc infrastructure used by the majority of projects was just not designed for the high levels of data throughput that the onslaught of GPU crunching has unleashed. Servers were scoped out to deal with CPU work and it is not surprising to me at all that all the popular projects are struggling. If you couple that with a general slow down of the www/Internet due to the world population approaching 7 billion, and the fact that China has nearly 20% of that, and is expanding its web presence at an exponential rate, everything is creaking at the seams. Will DC survive ?? ____________ Don't drink water, that's the stuff that rusts pipes | |
| ID: 41375 | Rating: 0 | rate:
| |
BarryAZ, I'm afraid you're right. My hope is that someone follows in his footsteps. As I've learned from GPUGRID, CAL is an 'outdated' programming tool. They are working on OpenCL apps, which should be a step towards independence from the type of card, as long as it supports OpenCL. Two or three months ago somebody posted here in this forum that other projects are working on GPU apps. Let's see if I can find that again. Alexander | |
| ID: 41377 | Rating: 0 | rate:
| |
|
Yes, I got that -- if that is the 'permanent' plan for SETI, then the folks most affected by that are the 'SETI uber alles' crowd -- for that group, the self-imposed 'only one project' makes them dependent on a project that, aside from the newsgroups, runs only 50% of the time and is pretty well stressed with folks crowding into the 'half-space' for uploads and downloads.
____________ | |
| ID: 41379 | Rating: 0 | rate:
| |
|
I run a GTX260 24/7 on Seti. it is no big deal. I needed to adjust my cache to 4 days to assure constant work, but that's no big deal. | |
| ID: 41380 | Rating: 0 | rate:
| |
|
hmm, i got some WU's, assimilation seems to be up and running. have to wait though. AFAICS, there are a small bunch ready to send. | |
| ID: 41381 | Rating: 0 | rate:
| |
|
Work is available again | |
| ID: 41382 | Rating: 0 | rate:
| |
|
no validation though | |
| ID: 41383 | Rating: 0 | rate:
| |
no validation though It may have still been catching up with the backlog when you looked. The Validator backlog now reads 6,287 WUs ____________ Go away, I was asleep | |
| ID: 41384 | Rating: 0 | rate:
| |
|
Looks like the root cause is either winning or being looked into -- web pages are on -- sort of in a SETI emulation mode we are at the moment. | |
| ID: 41385 | Rating: 0 | rate:
| |
|
Back to Collatz, which has good uptime since Comcast repaired/replaced parts of the BB line. | |
| ID: 41387 | Rating: 0 | rate:
| |
|
I just set up Collatz to be my primary project and Milky Way to be my backup. I really like Milky Way, but when you average in the downtime and the almost nonexistent cache, I am getting much better credit returns at Collatz. | |
| ID: 41390 | Rating: 0 | rate:
| |
|
OK -- so could we PLEASE get some feedback from folks at the project as to what is going (or not going) on? | |
| ID: 41407 | Rating: 0 | rate:
| |
OK -- so could we PLEASE get some feedback from folks at the project as to what is going (or not going) on? Erm..front page. | |
| ID: 41410 | Rating: 0 | rate:
| |
|
Now the Server Status page is beginning to look ominous regarding another validator backing up problem. | |
| ID: 41529 | Rating: 0 | rate:
| |
|
Knock knock! | |
| ID: 41625 | Rating: 0 | rate:
| |
|
Sending scheduler request: To fetch work. | |
| ID: 41626 | Rating: 0 | rate:
| |
Sending scheduler request: To fetch work. I was getting the same thing, so I reset the project. It didn't help. The server page shows workunits available, but I am unable to get any of them. They could be CPU N-Body workunits. But, the validator is backed up and looks like it is about to crash, also. | |
| ID: 41627 | Rating: 0 | rate:
| |
|
No panic boys, it is weekend.... | |
| ID: 41628 | Rating: 0 | rate:
| |
|
OK -- so this problem -- which has shown up every week or so remains unresolved -- frankly, not really a surprise. Sending scheduler request: To fetch work. ____________ | |
| ID: 41629 | Rating: 0 | rate:
| |
|
It's the validator again - it has backed up with 59K worth of work to knock off. | |
| ID: 41630 | Rating: 0 | rate:
| |
|
Are the servers dishing up nBody work ATM? | |
| ID: 41637 | Rating: 0 | rate:
| |
|
Looks like it. I am getting Nbody only at the moment. With about | |
| ID: 41638 | Rating: 0 | rate:
| |
|
I've gone back to Collatz until the normal Milkyway WUs are being dished out. | |
| ID: 41641 | Rating: 0 | rate:
| |
|
Nope. CPU only. Apparently GPU to come later. | |
| ID: 41642 | Rating: 0 | rate:
| |
|
Did the project stop using the old workunits in favor of the N-Body workunits? For about the last day, I can only get one or two workunits every now and then. And, the ones I do get are resends because of invalid results from another user. It looks like there are no new workunits being sent out. | |
| ID: 41678 | Rating: 0 | rate:
| |
|
And about every hour I lose an hour or so of completed work! How does that happen? | |
| ID: 41680 | Rating: 0 | rate:
| |
And about every hour I lose an hour or so of completed work! How does that happen? It happens because the work units are validated! Monday started early.... ____________ | |
| ID: 41685 | Rating: 0 | rate:
| |
| ID: 41692 | Rating: 0 | rate:
| |
|
I can download normal MW GPU WUs now, so the system is feeding both Nbody and normal again (I hope) | |
| ID: 41696 | Rating: 0 | rate:
| |
|
Aaah...the smell of GPU's working again. | |
| ID: 41700 | Rating: 0 | rate:
| |
|
It's Friday, the server has been working well for a few days, time for it to fall over... | |
| ID: 41767 | Rating: 0 | rate:
| |
|
Then you'll head for Collatz as DNETC is in the middle of a server upgrade/change. | |
| ID: 41771 | Rating: 0 | rate:
| |
|
BOOM!!! | |
| ID: 41774 | Rating: 0 | rate:
| |
|
Indeed, absent dealing with the underlying problem (which is growing something of a beard), lots of regular server stop/starts just might keep things going. With the weekend in view, it seems that on every Friday a single or double server reboot should be done automatically. It's Friday, the server has been working well for a few days, time for it to fall over... ____________ | |
| ID: 41775 | Rating: 0 | rate:
| |
|
Then you'll head for Collatz as DNETC is in the middle of a server upgrade/change. ____________ | |
| ID: 41776 | Rating: 0 | rate:
| |
|
Ouch -- things might just have gotten worse -- can't connect to Collatz either. | |
| ID: 41777 | Rating: 0 | rate:
| |
|
So it is clear that a number of us poor fools are (and have been) aware of MW being in dead state for several hours, do we have any wagers on the awareness of this in RPI land? | |
| ID: 41778 | Rating: 0 | rate:
| |
|
Aaaarrgghhh, my GPUs are cold! | |
| ID: 41780 | Rating: 0 | rate:
| |
|
Just to add insult to injury, Climate Prediction is having problems too!! I can't upload any trickles. As it turns out CPDN is making room on one of their disks so all I can do is wait. | |
| ID: 41782 | Rating: 0 | rate:
| |
|
Definitely! | |
| ID: 41783 | Rating: 0 | rate:
| |
|
feeder milkyway Not Running | |
| ID: 41787 | Rating: 0 | rate:
| |
Definitely! Rosetta has not had work either. I now have no tasks to do. Oh well. ____________ Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. | |
| ID: 41788 | Rating: 0 | rate:
| |
|
Collatz probably went down due to the mass switching of people from MW and DNETC. | |
| ID: 41789 | Rating: 0 | rate:
| |
|
I wonder if MW will be offline for the entire weekend. Is Travis the only person at RPI who can figure this one out? | |
| ID: 41790 | Rating: 0 | rate:
| |
|
Dnetc will be offline for at least another day -- during their software update/upgrade one of the HD's on the RAID failed -- they are in rebuild mode. | |
| ID: 41792 | Rating: 0 | rate:
| |
|
Wooohoo! I got a full MW cache. :) | |
| ID: 41795 | Rating: 0 | rate:
| |
|
Yes, mine are now filling up since I suspended Collatz. Now need to work off the Collatz cache between Milkyway sessions. | |
| ID: 41797 | Rating: 0 | rate:
| |
|
Validator needs a kick. It's only validating wu's that are paired. All single wu's are not being validated. | |
| ID: 41801 | Rating: 0 | rate:
| |
|
Well, it is zero again. The validator has "overheated". | |
| ID: 41846 | Rating: 0 | rate:
| |
|
Dnetc may be back and running by the end of the week. The dreaded 'software upgrade' -- stress tested their server and they had a 'mid upgrade' RAID drive failure as well as a memory module failure. They are pretty much in a full rebuild mode for now. | |
| ID: 41847 | Rating: 0 | rate:
| |
|
f.ck, again... I've got used to a shutdown every weekend, but it's only Tuesday. Come on guys, you must be kidding me - one day of work and then one day off. | |
| ID: 41848 | Rating: 0 | rate:
| |
Dnetc may be back and running by the end of the week. The dreaded 'software upgrade' -- stress tested their server and they had a 'mid upgrade' RAID drive failure as well as a memory module failure. They are pretty much in a full rebuild mode for now. That is good to hear :-) This place is just unreliable so what we need is more Boinc ATI projects!!!! ____________ Don't drink water, that's the stuff that rusts pipes | |
| ID: 41853 | Rating: 0 | rate:
| |
|
One thing that seems odd to me. The problem appears fairly straightforward (at least the symptoms are pretty obvious and *repetitive*). The workaround resolution (either stop/start processes or a full server stop/restart) also seems reasonably straightforward. | |
| ID: 41855 | Rating: 0 | rate:
| |
|
That sure would be nice, as there are still very few projects using ATI GPUs. | |
| ID: 41856 | Rating: 0 | rate:
| |
One thing that seems odd to me. The problem appears fairly straightforward (at least the symptoms are pretty obvious and *repetitive*). The workaround resolution (either stop/start processes or a full server stop/restart) also seems reasonably straightforward. It looks like there is something going on at RPI. Can you remember the posting 'Screensaver coming soon'? Or can you remember the project DNA@HOME? Milkyway3? It should not be a problem to detect that the validator stops validating. And of course, they do detect that, because they stop producing WUs. But we all miss the next step, which should be a rework of the validator, be it hardware, software, setup or whatever else. Or at least a quick restart. It looks like nobody is responsible there. This is the best way to kill not only a project but the whole idea of distributed computing. Those responsible for a project should be serious in their handling of the project's issues. Alexander | |
| ID: 41857 | Rating: 0 | rate:
| |
|
Dnetc -- when running, also supports ATI GPUs -- they hope to be back up and running again later this week (they encountered something of a worst-case scenario -- memory and hard drive failure while in the middle of a software upgrade). They are in recovery mode for now. That sure would be nice, as there are still very few projects using ATI GPUs. ____________ | |
| ID: 41858 | Rating: 0 | rate:
| |
|
Barry | |
| ID: 41861 | Rating: 0 | rate:
| |
|
It really should not be so hard to automatically restart what is failing every day or two. Most MMOs that I have played over the years also have a daily downtime to prevent stuff like this (a totally different application, but I guess the problem is sort of the same). | |
| ID: 41862 | Rating: 0 | rate:
| |
|
A memory leak kills the system every 2 to 3 days; therefore reboot every 1 to 2 days until the source of the memory leak is found. Simple, really. | |
| ID: 41864 | Rating: 0 | rate:
| |
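The periodic-restart idea from the two posts above can be sketched as a small watchdog. This is only an illustration under stated assumptions: the health check and the restart command are placeholders supplied by the caller, not anything the project is known to run, and `restart_if_stalled` and `watch` are hypothetical names.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def restart_if_stalled(
    is_healthy: Callable[[], bool],
    restart_cmd: Sequence[str],
    run=subprocess.run,
) -> bool:
    """Issue restart_cmd when the health check fails.

    Returns True if a restart was issued, False if the service looked healthy.
    Both is_healthy and restart_cmd are caller-supplied assumptions.
    """
    if is_healthy():
        return False
    run(list(restart_cmd), check=False)
    return True


def watch(
    is_healthy: Callable[[], bool],
    restart_cmd: Sequence[str],
    interval_s: float = 300.0,
    max_checks: Optional[int] = None,
    sleep=time.sleep,
    run=subprocess.run,
) -> int:
    """Poll the health check every interval_s seconds.

    Stops after max_checks polls (or never, if max_checks is None) and
    returns the number of restarts that were issued.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if restart_if_stalled(is_healthy, restart_cmd, run=run):
            restarts += 1
        checks += 1
        sleep(interval_s)
    return restarts
```

A health check could be as simple as "the validation backlog has not grown since the last poll"; the restart command is whatever the admins already run by hand. A restart on a schedule only hides the leak, so it is a stopgap until the real cause is found.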
|
Work available again | |
| ID: 41865 | Rating: 0 | rate:
| |
|
Looks like one of the work creator servers is down, and work available is falling rapidly. | |
| ID: 41963 | Rating: 0 | rate:
| |
|
Told you so, there is no work to dish out. Collatz coming soon to a screen near you. | |
| ID: 41971 | Rating: 0 | rate:
| |
|
Got it wrong - all is sweetness and light again! | |
| ID: 41980 | Rating: 0 | rate:
| |
|
I ran out long enough that my backup project kicked in and got me something to crunch. | |
| ID: 41986 | Rating: 0 | rate:
| |
|
Servers need kicking again, or we are outta work | |
| ID: 42065 | Rating: 0 | rate:
| |
|
There are still some units to crunch but I am not getting one anymore. It is the validator again. (And I just thought this is the second weekend with good crunching...) | |
| ID: 42070 | Rating: 0 | rate:
| |
|
Well, at least there is some consistency in the problem -- if we didn't have 'no new work' and 'validator not validating' runs, we'd not know this was MilkyWay. | |
| ID: 42082 | Rating: 0 | rate:
| |
|
Arrggghhhh!!!! | |
| ID: 42317 | Rating: 0 | rate:
| |
|
Just starting to get messages, in BM, saying no work sent. | |
| ID: 42319 | Rating: 0 | rate:
| |
|
I have got DNETC set as my backup GPU project and 6.11.7 is working perfectly on that front, it requests 1 unit at a time and keeps after MW for work. | |
| ID: 42322 | Rating: 0 | rate:
| |
|
Aaaarrgh! | |
| ID: 42516 | Rating: 0 | rate:
| |
|
Maybe, you are going too fast, I see incredible RAC's here....... ;^) | |
| ID: 42521 | Rating: 0 | rate:
| |
|
I got a few "Message from server: No work sent" msgs and my cache was dwindling. | |
| ID: 42532 | Rating: 0 | rate:
| |
|
Reeee-boooot.... :) | |
| ID: 42534 | Rating: 0 | rate:
| |
|
Looks like I ran out long enough to crunch 2 DNETC wu today. | |
| ID: 42537 | Rating: 0 | rate:
| |
|
5/10/2010 7:57:18 PM | Milkyway@home | Message from Milkyway@home: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation. | |
| ID: 42577 | Rating: 0 | rate:
| |
5/10/2010 7:57:18 PM | Milkyway@home | Message from Milkyway@home: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation. That's what I got a few hours back. But I haven't been able to find out what exactly is going on. I expected a message on the front page about the new N-body simulation and where to find it, so that I can download this file and use it. ____________ Knight Who says Ni | |
| ID: 42581 | Rating: 0 | rate:
| |
5/10/2010 7:57:18 PM | Milkyway@home | Message from Milkyway@home: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation. I've been getting that same message all morning. I too have no idea what I am supposed to do about it. Can somebody in the know please enlighten us? Thanks. ____________ | |
| ID: 42583 | Rating: 0 | rate:
| |
5/10/2010 7:57:18 PM | Milkyway@home | Message from Milkyway@home: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation. AFAIK those are WUs that so far only run on CPUs. There doesn't seem to be any GPU WUs available. List of available apps: http://milkyway.cs.rpi.edu/milkyway/apps.php | |
| ID: 42585 | Rating: 0 | rate:
| |
5/10/2010 7:57:18 PM | Milkyway@home | Message from Milkyway@home: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation. But I don't have a GPU. ____________ | |
| ID: 42586 | Rating: 0 | rate:
| |
|
Hello Wes, | |
| ID: 42587 | Rating: 0 | rate:
| |
5/10/2010 7:57:18 PM | Milkyway@home | Message from Milkyway@home: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation. Looks like you're running Windows with an optimized client. You won't get the nbody work unless you add the app to your app_info.xml or go back to the automatically downloaded client. You have several days worth of work still in your queue anyway, so why worry? http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=91971 | |
| ID: 42588 | Rating: 0 | rate:
| |
|
*facepalm* | |
| ID: 42589 | Rating: 0 | rate:
| |
|
Thanks, Matt :-) | |
| ID: 42591 | Rating: 0 | rate:
| |
*facepalm* No problem, things happen sometimes. Would it be possible to split the server-status that we can see which app has work and which does not? Alexander | |
| ID: 42592 | Rating: 0 | rate:
| |
Einstein at home does a full breakout on their server status page, so it's definitely doable. Not sure how much of their stuff is custom though. http://einstein.phys.uwm.edu/server_status.html | |
| ID: 42598 | Rating: 0 | rate:
| |
|
Validator seems to be going out again. | |
| ID: 42736 | Rating: 0 | rate:
| |
Validator seems to be going out again. I wish you hadn't posted this, now it has happened... "it" is out. ____________ Greetings from, TJ | |
| ID: 42744 | Rating: 0 | rate:
| |
|
No work sent and the validator is now up to 48K. The last time this happened it was cleared relatively quickly. Hope this is the case again!! | |
| ID: 42745 | Rating: 0 | rate:
| |
|
Up to 21k waiting for validation. | |
| ID: 42842 | Rating: 0 | rate:
| |
|
over 30k now... | |
| ID: 42843 | Rating: 0 | rate:
| |
|
I just finished my last wu, all they have on the server is Nbody work now. | |
| ID: 42846 | Rating: 0 | rate:
| |
|
back to work :-) | |
| ID: 42848 | Rating: 0 | rate:
| |
|
I came here | |
| ID: 42857 | Rating: 0 | rate:
| |
|
Feeder not running | |
| ID: 42886 | Rating: 0 | rate:
| |
Feeder not running data-driven web pages milkyway Running upload/download server milkyway Running scheduler milkyway Running feeder milkyway Not Running transitioner milkyway Not Running milkyway_purge milkyway Not Running file_deleter milkyway Not Running nbody_assimilator milkyway Not Running separation_assimilator milkyway Not Running ____________ | |
| ID: 42887 | Rating: 0 | rate:
| |
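A status listing like the one quoted above can be turned into something a script can act on. The sketch below assumes the flattened "name host state" layout exactly as it was pasted in the post, with `milkyway` as the host column; `parse_daemon_status` and `stopped` are hypothetical helper names, not part of any BOINC tool.

```python
import re
from typing import Dict, List


def parse_daemon_status(text: str, host: str = "milkyway") -> Dict[str, bool]:
    """Map each daemon name to True (Running) or False (Not Running).

    Assumes entries of the form '<name> <host> Running' or
    '<name> <host> Not Running', separated only by whitespace.
    """
    pattern = re.compile(
        r"\s*(.+?)\s+" + re.escape(host) + r"\s+(Not Running|Running)"
    )
    return {name: state == "Running" for name, state in pattern.findall(text)}


def stopped(status: Dict[str, bool]) -> List[str]:
    """Return the names of daemons that are not running, sorted."""
    return sorted(name for name, up in status.items() if not up)


if __name__ == "__main__":
    pasted = (
        "data-driven web pages milkyway Running "
        "upload/download server milkyway Running "
        "scheduler milkyway Running "
        "feeder milkyway Not Running "
        "transitioner milkyway Not Running"
    )
    print(stopped(parse_daemon_status(pasted)))
```

Reporting only the stopped daemons keeps the output short when most of the server is healthy.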
|
Still no work, and DNETC is down. Just leaves Collatz to crunch | |
| ID: 42890 | Rating: 0 | rate:
| |
|
It's back, but we lost over 7 crunching hours. | |
| ID: 42895 | Rating: 0 | rate:
| |
|
Validator crapped again. | |
| ID: 42897 | Rating: 0 | rate:
| |
|
I am still getting work even with the validator up that high. | |
| ID: 42902 | Rating: 0 | rate:
| |
|
So am I, and the RAC count on the rig in question is slowly rising. Does this mean a way has been found to keep the system going with the validator so high or are we looking at imminent collapse? | |
| ID: 42906 | Rating: 0 | rate:
| |
|
Well, i currently have 44.26 pending credit which actually equals to (44.26/0.05)*213.76=189220 granted credit. Hope it validates soon. | |
| ID: 42910 | Rating: 0 | rate:
| |
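The arithmetic in the post above can be checked in a couple of lines. The reading assumed here is that each pending task is displayed as 0.05 credit but will be granted 213.76 once validated; both figures come from the post, and `expected_granted` is a hypothetical helper name.

```python
def expected_granted(pending_total: float, shown_per_task: float, real_per_task: float) -> float:
    """Estimate granted credit from the pending total.

    pending_total / shown_per_task gives the number of pending tasks;
    each of those is then assumed to be worth real_per_task on validation.
    """
    if shown_per_task <= 0:
        raise ValueError("shown_per_task must be positive")
    tasks = pending_total / shown_per_task
    return tasks * real_per_task


if __name__ == "__main__":
    # Figures quoted in the post: 44.26 pending, 0.05 shown and 213.76 real per task.
    print(round(expected_granted(44.26, 0.05, 213.76)))
```

With the numbers from the post this works out to about 885 pending tasks, which matches the poster's figure of roughly 189,220 credits.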
|
The CPU work units have been fixed, now if only we could download them! | |
| ID: 42922 | Rating: 0 | rate:
| |
|
Looking at the server status now, all seems to be up and running as expected. Even the validator is back to the accustomed 6.3K WUs. | |
| ID: 42936 | Rating: 0 | rate:
| |
|
Someone want to kick the server? This is what I'm seeing on all my machines: | |
| ID: 43038 | Rating: 0 | rate:
| |
|
Workunits waiting for validation 38,275 | |
| ID: 43039 | Rating: 0 | rate:
| |
|
Well, it looks like the situation has gone from bad to worse. | |
| ID: 43041 | Rating: 0 | rate:
| |
|
Should be working now. | |
| ID: 43042 | Rating: 0 | rate:
| |
|
Thank you, Matthew. | |
| ID: 43043 | Rating: 0 | rate:
| |
|
Then again maybe not! | |
| ID: 43044 | Rating: 0 | rate:
| |
|
Ok, now we have gone from worse to bad, again. Hopefully Matthew will double check the server before he calls it a night. | |
| ID: 43045 | Rating: 0 | rate:
| |
|
oh geez, off to Collatz we go! | |
| ID: 43046 | Rating: 0 | rate:
| |
|
I worked my magic, hopefully things will be smooth again in a few minutes. I'll give Travis a head's up just in case. | |
| ID: 43047 | Rating: 0 | rate:
| |
|
It failed again. It looks like it doesn't like the new work units. | |
| ID: 43048 | Rating: 0 | rate:
| |
|
Well, we are completely out of work, now. | |
| ID: 43049 | Rating: 0 | rate:
| |
|
<GRIN> and out of work as well. | |
| ID: 43051 | Rating: 0 | rate:
| |
|
Ohh.. | |
| ID: 43056 | Rating: 0 | rate:
| |
|
At least it is back to normal with the continual problems. | |
| ID: 43059 | Rating: 0 | rate:
| |
|
Well, we are completely out of work, again. | |
| ID: 43142 | Rating: 0 | rate:
| |
|
Get the same here, out of work. | |
| ID: 43145 | Rating: 0 | rate:
| |
|
Is the MW server out of GPU WU's again? (I will be out of WU's to process in about 10min.) BOINC shows MW server declining to send out any GPU work when asked. | |
| ID: 43218 | Rating: 0 | rate:
| |
|
There are 2,403 WUs ready to send, but most of the other servers are currently down. | |
| ID: 43224 | Rating: 0 | rate:
| |
|
Validator not working as 63586 wus waiting for validation. Also, what is the mystery of 6300 wus that validator has never been validating? I have never seen it below that value. | |
| ID: 43228 | Rating: 0 | rate:
| |
Validator not working as 63586 wus waiting for validation. Also, what is the mystery of 6300 wus that validator has never been validating? I have never seen it below that value. Once the validator gets back to 'normal' check your pending and any inconclusive results.... | |
| ID: 43229 | Rating: 0 | rate:
| |
Also, what is the mystery of 6300 wus that validator has never been validating? I have never seen it below that value. I've wondered about this, also. It might be WUs like this one. After sending it out five times, it could not be validated. http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=167143584 | |
| ID: 43230 | Rating: 0 | rate:
| |
|
Could it be the wus that don't disappear from months ago? | |
| ID: 43231 | Rating: 0 | rate:
| |
|
Aaaahhhaaarrrgh! | |
| ID: 43234 | Rating: 0 | rate:
| |
Aaaahhhaaarrrgh! It is still going up. Workunits waiting for validation 170,249 I am surprised that we are still getting WUs with it being that high. I sent an email to the admins requesting that it be restarted.... | |
| ID: 43236 | Rating: 0 | rate:
| |
|
Now at 225,581 and counting. We should start a betting pool to see who can predict how high it will get... | |
| ID: 43237 | Rating: 0 | rate:
| |
|
Are the usual validator rules not broke? | |
| ID: 43238 | Rating: 0 | rate:
| |
Returning them at a rate of 50,000 every 3 hrs! | |
| ID: 43240 | Rating: 0 | rate:
| |
|
At that rate we will be back to normal quite quickly. | |
| ID: 43252 | Rating: 0 | rate:
| |
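To put a rough number on "quite quickly": the estimate below assumes the 50,000 per 3 hours quoted above is the rate at which the backlog is actually shrinking, that the backlog started from the 225,581 reported earlier, and that "normal" means the roughly 6,300 WUs people in this thread usually see waiting. All three are readings of the posts, not confirmed figures, and `hours_to_drain` is a hypothetical helper name.

```python
def hours_to_drain(current: int, baseline: int, cleared_per_period: float, period_hours: float) -> float:
    """Hours until the backlog falls from current to baseline at a steady clearing rate."""
    if cleared_per_period <= 0 or period_hours <= 0:
        raise ValueError("rate and period must be positive")
    excess = current - baseline
    if excess <= 0:
        return 0.0
    return excess * period_hours / cleared_per_period


if __name__ == "__main__":
    # 225,581 waiting, 6,300 assumed normal, 50,000 cleared every 3 hours.
    print(f"{hours_to_drain(225_581, 6_300, 50_000, 3):.1f} hours")
```

Under those assumptions the backlog would take a little over 13 hours to clear, provided no new results pile up faster than they are validated.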
|
I ran out for a little bit, and also saw a message that the project was down for maintenance once. | |
| ID: 43270 | Rating: 0 | rate:
| |
|
Validator shows more than 48k wu pending. | |
| ID: 43382 | Rating: 0 | rate:
| |
|
The server status shows the majority of them red, showing someone is working on them. Let us hope the 24-WU cache lasts long enough for them to come back online and dish out new work <cross fingers> | |
| ID: 43386 | Rating: 0 | rate:
| |
|
Fresh out of work here.....a pair of GTX295s makes pretty short work out of that small cache. | |
| ID: 43387 | Rating: 0 | rate:
| |
|
Looks like we are back underway...... | |
| ID: 43389 | Rating: 0 | rate:
| |
|
Or not: Looks like we are back underway...... ____________ | |
| ID: 43392 | Rating: 0 | rate:
| |
|
Server down.. | |
| ID: 43428 | Rating: 0 | rate:
| |
|
..and up. | |
| ID: 43429 | Rating: 0 | rate:
| |
|
Still no work being distributed though. Running on reducing cache. | |
| ID: 43430 | Rating: 0 | rate:
| |
|
I keep getting one machine running dry, and not being fed. The other is OK, albeit it gets short at times. It's my dual 5970 machine that is not getting fed quickly enough; it keeps getting "zero sent" whilst there are WUs available. At literally the same instant in time, my other machine with a 5850 gets WUs..... | |
| ID: 43431 | Rating: 0 | rate:
| |
I keep getting one machine running dry, and not being fed. The other is OK, albeit it gets short at times. It's my dual 5970 machine that is not getting fed quickly enough; it keeps getting "zero sent" whilst there are WUs available. At literally the same instant in time, my other machine with a 5850 gets WUs..... B A C K U P P R O J E C T ! | |
| ID: 43432 | Rating: 0 | rate:
| |
|
We are in "code red" at the moment. | |
| ID: 43454 | Rating: 0 | rate:
| |
|
We're working on it. | |
| ID: 43456 | Rating: 0 | rate:
| |
B A C K U P P R O J E C T ! This has probably been hammered to a pulp, but I am not interested in scouring this whopper of a thread. I have a feeling that a backup project does not work as envisioned in the current ATI environment, since Collatz thrives on memory clock and MW does not. Changes have to be made if you want to keep the GPUs set up optimally for each project, and that requires manual intervention. | |
| ID: 43457 | Rating: 0 | rate:
| |
|
I agree, I run three ATI GPU projects, Dnetc and Collatz run at least moderately well together. When I add MW into the mix, the only way MW will process work is when I temporarily suspend the other two projects. B A C K U P P R O J E C T ! ____________ | |
| ID: 43461 | Rating: 0 | rate:
| |
We're working on it. Thanks for info! ____________ Best regards! | |
| ID: 43462 | Rating: 0 | rate:
| |
|
Though a bit more info might be a good thing (ie outage for hours, days)... We're working on it. ____________ | |
| ID: 43463 | Rating: 0 | rate:
| |
|
I always get the message "won't finish in time". | |
| ID: 43561 | Rating: 0 | rate:
| |
|
Mike | |
| ID: 43563 | Rating: 0 | rate:
| |
Mike At least I don't get work. I'm running 6 cores @ 3.8 GHz atm but have nothing to crunch. ____________ | |
| ID: 43564 | Rating: 0 | rate:
| |
|
Validator up to 50K ATM, so trouble is here. | |
| ID: 43584 | Rating: 0 | rate:
| |
|
Validator choking on returned work. | |
| ID: 43753 | Rating: 0 | rate:
| |
|
11/14/2010 6:14:52 AM Milkyway@home Sending scheduler request: To fetch work. | |
| ID: 43822 | Rating: 0 | rate:
| |
|
The validator is choking again. An hour ago it was about par, 6,500 waiting for validation. Over the last hour it has steadily risen, and it's now well over 50,000 and still rising. When it gets like that it tries to balance WU production vs. supply vs. validation, and you end up with bursts of available WUs etc., but it's an inexorable move towards falling over. We'll see what happens; lately it has survived with over 200,000 waiting for validation, but at present the runes are not good, and backup projects need checking and winding up. It would be no surprise if it fell over totally within an hour from now. | |
| ID: 43823 | Rating: 0 | rate:
| |
|
And still rising - now to 72,490 | |
| ID: 43825 | Rating: 0 | rate:
| |
|
Not gotten squat the past few days. GPU's sitting there doing nothing. | |
| ID: 43827 | Rating: 0 | rate:
| |
|
Supply was fine up to earlier this morning. There must be another reason connected with your setup - what is your cache level, held WUs for different Projects, and BOINC stated run time for each of your GPU Project WUs? | |
| ID: 43828 | Rating: 0 | rate:
| |
|
The 3300 is the one showing me, in red and on multiple lines, no work received. I switched it to Collatz. It's been doing this for a few days. I should be over or near 5 mill by now. | |
| ID: 43830 | Rating: 0 | rate:
| |
|
The reason for my question was that the 3300 crunched over 60 this morning up to the time the servers fell over, yet the others show no crunching. Those 60 have not yet been validated due to this morning's problems, but they were downloaded and crunched, so there must be a difference in the setups somewhere. Can't do much anyway until the servers are back up properly, as I can't test anything. | |
| ID: 43831 | Rating: 0 | rate:
| |
|
Thanks for the replies. | |
| ID: 43832 | Rating: 0 | rate:
| |
|
Aaargh! | |
| ID: 43864 | Rating: 0 | rate:
| |
|
i'm not getting WUs... | |
| ID: 43866 | Rating: 0 | rate:
| |
|
Ditto. | |
| ID: 43867 | Rating: 0 | rate:
| |
|
now i'm getting WUs, let's crunch :-) | |
| ID: 43868 | Rating: 0 | rate:
| |
|
Won't be for long, I think. A few hours ago the six servers from the bottom up were red. And the validator is rising. I have a lot of pages with "waiting for validation". But yes, I am also getting WUs, though there is a small delay at times. | |
| ID: 43869 | Rating: 0 | rate:
| |
|
Still getting work OK, but the validator is now up to 90,699. It cannot be long before the servers need kicking to make things work smoothly again. | |
| ID: 43874 | Rating: 0 | rate:
| |
|
Not getting new work for the last 4 hours. | |
| ID: 43876 | Rating: 0 | rate:
| |
|
Waiting for validation just keeps rising, so I have pulled the plug until it starts to go down. | |
| ID: 43877 | Rating: 0 | rate:
| |
|
Validator still high, but has dropped to 50,732 since I last posted. | |
| ID: 43879 | Rating: 0 | rate:
| |
|
Aaargh! | |
| ID: 43914 | Rating: 0 | rate:
| |
|
No new work for me. | |
| ID: 43918 | Rating: 0 | rate:
| |
|
Back to normal again, if you can define normal? | |
| ID: 43929 | Rating: 0 | rate:
| |
|
Normal it is! | |
| ID: 43966 | Rating: 0 | rate:
| |
|
And, so it begins, yet again: | |
| ID: 44023 | Rating: 0 | rate:
| |
|
Heading for normal again - | |
| ID: 44025 | Rating: 0 | rate:
| |
Heading for normal again - Interesting thing is: The server reduced the number of workunits waiting to be validated by just marking all of them as invalid. The validator is continuing to mark most of the workunits returned as invalid. I hope this gets fixed soon. I sent an email to the admins about this problem. -Mike | |
| ID: 44027 | Rating: 0 | rate:
| |
Heading for normal again - Good thing it works great! ____________ Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. | |
| ID: 44033 | Rating: 0 | rate:
| |
|
Kaboom!! | |
| ID: 44058 | Rating: 0 | rate:
| |
|
bugger... | |
| ID: 44059 | Rating: 0 | rate:
| |
|
Aaargh! Down again. | |
| ID: 44061 | Rating: 0 | rate:
| |
|
I'm getting "feeder not running" error? | |
| ID: 44069 | Rating: 0 | rate:
| |
Aaargh! Down again. That sounds like a plan. Good luck and share some times. I am still waiting for the 58xx prices to drop so I can snag a few. Sadly, I may be waiting awhile. | |
| ID: 44070 | Rating: 0 | rate:
| |
|
The good news is that "something" is happening as opposed to having the server contemplating its navel :) | |
| ID: 44073 | Rating: 0 | rate:
| |
The good news is that "something" is happening as opposed to having the server contemplating its navel :) Well, we are not really sure anything is happening, other than the server is down. I haven't seen any posts from the admins about the problem. I think they think that "posts" means, post something post-problem. Like after it has been fixed. -Mike Edited to add: I hope they are not on Thanksgiving break until November 29. | |
| ID: 44074 | Rating: 0 | rate:
| |
bugger... Couldn't have said it better. | |
| ID: 44075 | Rating: 0 | rate:
| |
I'm getting "feeder not running" error? Me too, Can't report any work, video card near starvation... ____________ 12/21/2012? Bah Humbug! | |
| ID: 44077 | Rating: 0 | rate:
| |
The good news is that "something" is happening as opposed to having the server contemplating its navel :) Past downtimes have shown the entire server up and ready, but just not doing anything. Now part of the server is down and not doing anything. I'm guessing this is better :) I'm processing Collatz and DNETC one task at a time as a backup (resource share 0) until everything is copacetic again with MW. | |
| ID: 44078 | Rating: 0 | rate:
| |
That's a good possibility, I would think the school is only closed on Thanksgiving day. Unless it could be restarted/fixed away from campus. ____________ Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. | |
| ID: 44079 | Rating: 0 | rate:
| |
The good news is that "something" is happening as opposed to having the server contemplating its navel :) Yeah, I am back on Collatz as well. Seems like MW starts going good then it craps out. Is the server(s) run from the team owners location or from "HQ"? | |
| ID: 44080 | Rating: 0 | rate:
| |
The college that I work at is closed all this week so the students can head home for the holiday. We closed Saturday and will not reopen until the 29th. I am sure they probably are doing the same thing. -Mike | |
| ID: 44081 | Rating: 0 | rate:
| |
|
I'll be around this week making sure things are running, but our lab staff won't, so if there's a serious problem it won't be fixed until after the break. However, it seems our serious problem for the holidays already happened over this weekend (the corrupted disk) and it was fixed today, so hopefully it will be smooth sailing from here on out. | |
| ID: 44094 | Rating: 0 | rate:
| |
|
Thanks, Travis!! | |
| ID: 44105 | Rating: 0 | rate:
| |
|
Workunits waiting for validation 59,155 Workunits waiting for assimilation 147 Workunits waiting for deletion 73 Results waiting for deletion 445 Transitioner backlog (hours) 0 Well, it looks like the validator is backed up. It needs a good flushing.... | |
| ID: 44107 | Rating: 0 | rate:
| |
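The backlog figure people keep quoting from the status page is easy to pull out and compare against the usual level. A minimal sketch, assuming the status text contains the label "Workunits waiting for validation" followed by a comma-grouped number, as in the post above. The baseline of 6,300 is the level others in this thread describe as normal, the factor of 3 is an arbitrary illustrative threshold, and both function names are hypothetical.

```python
import re
from typing import Optional


def waiting_for_validation(status_text: str) -> Optional[int]:
    """Extract the 'Workunits waiting for validation' count, or None if absent."""
    match = re.search(r"Workunits waiting for validation\s+([\d,]+)", status_text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def validator_backed_up(count: int, baseline: int = 6300, factor: float = 3.0) -> bool:
    """Flag the validator as backed up when the count exceeds factor * baseline.

    baseline and factor are assumptions chosen for illustration.
    """
    return count > baseline * factor


if __name__ == "__main__":
    pasted = (
        "Workunits waiting for validation 59,155 "
        "Workunits waiting for assimilation 147 "
        "Workunits waiting for deletion 73"
    )
    count = waiting_for_validation(pasted)
    if count is not None:
        print(count, validator_backed_up(count))
```

A check like this only describes the symptom; it says nothing about why the validator stalls.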
|
update | |
| ID: 44111 | Rating: 0 | rate:
| |
|
Wasn't it kicked couple of hours ago? This server is dying:( | |
| ID: 44116 | Rating: 0 | rate:
| |
|
server is down for a while. and now BOINC challenge ongoing :( | |
| ID: 44119 | Rating: 0 | rate:
| |
|
Perhaps Milkyway needs a fund drive for a new server? Over at Seti@home, we the users bought the project two new HP servers, named Carolyn and Oscar. Oscar is the name of msattler's deceased cat, but then he started both of the fundraisers. The second one has raised a little over $16,000 so far (lots of small donations are happening; S@H is having its weekly outage right now for a few hours, though). Carolyn is now online as the BOINC replica database server, and by next week Carolyn will be swapped with Jocelyn to become the master. Oscar will take about another week or so to come online, and then Oscar will replace Thumper, 'cause that wabbit is just about fricasseed. | |
| ID: 44120 | Rating: 0 | rate:
| |
server is down for a while. and now BOINC challenge ongoing :( Server is back up CTAPbIi and the challenge continues ... (until the server is down again) ____________ | |
| ID: 44124 | Rating: 0 | rate:
| |
|
Just another hitch building - Workunits waiting for validation 174,591 | |
| ID: 44128 | Rating: 0 | rate:
| |
Just another hitch building - Workunits waiting for validation 174,591 I sure wish that the validator would start validating something! It is up to 231,744. | |
| ID: 44138 | Rating: 0 | rate:
| |
|
And still rising - Workunits waiting for validation 258,318 | |
| ID: 44139 | Rating: 0 | rate:
| |
And still rising - Workunits waiting for validation 258,318 I am willing to bet that it will fail before it goes over 300,000. -Mike | |
| ID: 44142 | Rating: 0 | rate:
| |
And still rising - Workunits waiting for validation 258,318 Workunits waiting for validation 280,041. ____________ Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. | |
| ID: 44144 | Rating: 0 | rate:
| |
|
Down | |
| ID: 44145 | Rating: 0 | rate:
| |
And still rising - Workunits waiting for validation 258,318 Well, the validator made it to 282,697 before it either failed or was stopped. feeder milkyway Not Running transitioner milkyway Not Running milkyway_purge milkyway Not Running file_deleter milkyway Not Running nbody_assimilator milkyway Not Running separation_assimilator milkyway Not Running -Mike | |
| ID: 44146 | Rating: 0 | rate:
| |
|
Up | |
| ID: 44147 | Rating: 0 | rate:
| |
|
284,000 | |
| ID: 44150 | Rating: 0 | rate:
| |
Up Rather by DNETC@HOME :D 2010-11-24 07:25:01 Milkyway@home Message from server: No work available Workunits waiting for validation 200,807 ____________ A proud member of the Polish National Team COME VISIT US at Polish National Team FORUM | |
| ID: 44155 | Rating: 0 | rate:
| |
|
They stopped work generation because there were duplicate assimilators running, which was messing up credit. The backlog is quite large though...and I've been getting work consistently (although I do have a huge buffer) | |
| ID: 44158 | Rating: 0 | rate:
| |
|
Falling! | |
| ID: 44160 | Rating: 0 | rate:
| |
They stopped work generation because there were duplicate assimilators running, which was messing up credit. Yes, that was so. But since then they have reported a corrupt disk (which may have caused all the credit problems, so they say), and now they have stopped work generation to try and get the assimilator/validator to catch up. ____________ | |
| ID: 44161 | Rating: 0 | rate:
| |
|
This is very bad timing with the Retupmoc Milky Way Challenge going on this week. | |
| ID: 44172 | Rating: 0 | rate:
| |
|
Well, I'm crunching for 3 days now with my new GPU. | |
| ID: 44173 | Rating: 0 | rate:
| |
This is very bad timing with the Retupmoc Milky Way Challenge going on this week. Let me know if it gets going again. It's a little strange to be sitting around in the middle of a race. ____________ | |
| ID: 44179 | Rating: 0 | rate:
| |
|
I just now got an allocation of 24 tasks....... | |
| ID: 44182 | Rating: 0 | rate:
| |
|
And now nothing available again. | |
| ID: 44184 | Rating: 0 | rate:
| |
|
The rig I am currently dedicating to Milkyway GPU crunching seems to have been running OK with its minimal cache of 23 WUs (and one being crunched). | |
| ID: 44195 | Rating: 0 | rate:
| |
|
All my ATI cards are now getting 12 WUs and crunching two at a time, in hope that the Retupmoc competition continues. ...Until it all goes pear shaped again at which time they'll all be switching over to pear@shape again. | |
| ID: 44196 | Rating: 0 | rate:
| |
|
Once again it's rising ... | |
| ID: 44231 | Rating: 0 | rate:
| |
|
Now, isn't that strange? | |
| ID: 44251 | Rating: 0 | rate:
| |
Now, isn't that strange? After you read this post, it won't seem as strange. http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2080 -Mike ____________ | |
| ID: 44252 | Rating: 0 | rate:
| |
Now, isn't that strange? It dawns, but slowly!! ____________ Go away, I was asleep | |
| ID: 44262 | Rating: 0 | rate:
| |
|
I don't want to break things by saying it, but it seems like we might make it through the holiday without too many problems. :) | |
| ID: 44271 | Rating: 0 | rate:
| |
|
I take it you mean the American holiday yesterday with some Americans taking a day of vacation today (Friday) to make a four-day weekend ? | |
| ID: 44279 | Rating: 0 | rate:
| |
|
Just the Thanksgiving holiday. I'm sure we'll have all kinds of things break before the end of the year. Fate doesn't like me sleeping too well. | |
| ID: 44284 | Rating: 0 | rate:
| |
|
Separation_assimilator not running. This is probably due to high load? It appears very often these days. | |
| ID: 44382 | Rating: 0 | rate:
| |
|
Back to normal, again, I think? | |
| ID: 44387 | Rating: 0 | rate:
| |
|
feeder milkyway Not Running | |
| ID: 44422 | Rating: 0 | rate:
| |
feeder milkyway Not Running Yes, back to normal I think ____________ | |
| ID: 44424 | Rating: 0 | rate:
| |
|
SNAFU | |
| ID: 44425 | Rating: 0 | rate:
| |
|
Should be back up now. | |
| ID: 44426 | Rating: 0 | rate:
| |
|
Still no work. | |
| ID: 44441 | Rating: 0 | rate:
| |
|
Work flowing now, but I needed Collatz overnight. | |
| ID: 44449 | Rating: 0 | rate:
| |
|
128,000 waiting for validation | |
| ID: 44450 | Rating: 0 | rate:
| |
|
I'm not getting work again. | |
| ID: 44652 | Rating: 0 | rate:
| |
|
Back to normal again already - Workunits waiting for validation: 15 | |
| ID: 44653 | Rating: 0 | rate:
| |
|
Still not getting work. | |
| ID: 44658 | Rating: 0 | rate:
| |
|
Mike | |
| ID: 44661 | Rating: 0 | rate:
| |
Still not getting work. Check again.... Server status is all go right now. I have my 24 tasks in cache, and am getting new ones as they complete to top off. ____________ I am the Kittyman. Please visit and give a Click for Seti City. | |
| ID: 44662 | Rating: 0 | rate:
| |
|
Crash and burn here... | |
| ID: 44672 | Rating: 0 | rate:
| |
|
Same here. | |
| ID: 44674 | Rating: 0 | rate:
| |
|
Going..Going....Going........ | |
| ID: 44861 | Rating: 0 | rate:
| |
|
Kaa-blam-oh! | |
| ID: 44864 | Rating: 0 | rate:
| |
|
I knew this was going to happen. The credit dispenser was handing out credits twice as fast as it should have been. Now, it has run out of credits to dispense. Would someone please refill it with credits so we can start collecting again. | |
| ID: 44868 | Rating: 0 | rate:
| |
|
Once again, no more work. Plan B: Collatz. | |
| ID: 44869 | Rating: 0 | rate:
| |
|
I am also crunching SETI on my HD5830 using an OpenCL app from Lunatics. | |
| ID: 44871 | Rating: 0 | rate:
| |
|
*KICK* | |
| ID: 44877 | Rating: 0 | rate:
| |
*KICK* Thank you! Did you refill the credit dispenser, also? ____________ | |
| ID: 44878 | Rating: 0 | rate:
| |
|
Then it's NNT time for Collatz just now, | |
| ID: 44881 | Rating: 0 | rate:
| |
|
Is the validator fracked again? | |
| ID: 44984 | Rating: 0 | rate:
| |
|
Now we just have to wait and see what got nuked this time. | |
| ID: 44993 | Rating: 0 | rate:
| |
|
That wish should soon be granted. | |
| ID: 44994 | Rating: 0 | rate:
| |
|
Travis has been notified and is working on the issue | |
| ID: 44995 | Rating: 0 | rate:
| |
|
Workunits waiting for validation building up quickly again? | |
| ID: 45022 | Rating: 0 | rate:
| |
|
Yeah, I've got a bunch waiting to be validated, dunno what's up | |
| ID: 45029 | Rating: 0 | rate:
| |
|
Servers down for maintenance (I think) and validator building again. | |
| ID: 45073 | Rating: 0 | rate:
| |
|
There is not much running at the moment. Pity, as my computer goes on holiday for almost a week. | |
| ID: 45084 | Rating: 0 | rate:
| |
|
In a few hours my MW GPU (HD3850) will turn its attention from Collatz back to MW. | |
| ID: 45088 | Rating: 0 | rate:
| |
|
After updating my drivers from 10.11 APP to 10.12 APP I went back to DnetC and went through a few of those yesterday. | |
| ID: 45092 | Rating: 0 | rate:
| |
|
Validator again = 42k+ | |
| ID: 45154 | Rating: 0 | rate:
| |
|
No New Work is send AAAAAAARGH | |
| ID: 45157 | Rating: 0 | rate:
| |
No New Work is send AAAAAAARGH Admins have been notified. ____________ | |
| ID: 45163 | Rating: 0 | rate:
| |
|
[As of 22 Dec 2010 16:21:48 UTC] | |
| ID: 45206 | Rating: 0 | rate:
| |
|
I suggest a new thread is started (hint hint John :) ), and this one closed. Call the new one - say - "Aaargh! Server out of new work!(2)" as this one is getting very big and slow(ish) loading even for high end PCs/Cards. For those with low-medium cards it's probably getting a pain to wait for it to load up and type into. | |
| ID: 45211 | Rating: 0 | rate:
| |
I suggest a new thread is started (hint hint John :) ), and this one closed. Call the new one - say - "Aaargh! Server out of new work!(2)" as this one is getting very big and slow(ish) loading even for high end PCs/Cards. For those with low-medium cards it's probably getting a pain to wait for it to load up and type into. Actually, if you set your MW preferences you can limit the number of posts so it loads faster. If a thread contains more than this number of posts ____________ Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. | |
| ID: 45213 | Rating: 0 | rate:
| |
|
Nice one - good save - hadn't noticed that before, works a treat :) | |
| ID: 45214 | Rating: 0 | rate:
| |
|
This is good -- did they indicate whether (several hours later now) they would be able to help out on this? No New Work is send AAAAAAARGH ____________ | |
| ID: 45217 | Rating: 0 | rate:
| |
This is good -- did they indicate whether (several hours later now) they would be able to help out on this? Actually, the message you quoted was posted yesterday in response to the server problems yesterday. The server came back up right after Blurf posted that. Hopefully they are working on the current problem. ____________ | |
| ID: 45218 | Rating: 0 | rate:
| |
|
Now setting up a new replacement thread | |
| ID: 45223 | Rating: 0 | rate:
| |