Message boards : Number crunching : Down for maintenance?
Joined: 20 Mar 08 Posts: 46 Credit: 69,382,802 RAC: 0
Collatz swamped? Yep. Normal is about 300 concurrent users; this a.m. there were over 700. It seems to have settled back down now, but the feeder is still having trouble keeping up. Once everyone's caches are full, it should be able to handle it. The "return results immediately" setting that many have turned on, trying to keep their MW cache full at MW's 6 WUs per core, doesn't help any. Contacting the server only when multiple WUs are completed, versus after each one, would reduce the Collatz server load considerably (70% or more!). If a machine has 4 GPUs and each can do a WU in 10 minutes, then with results returned immediately it contacts the server every 2.5 minutes instead of once every couple of hours (the cache on Collatz holds over a hundred WUs). While the user gets instant gratification, it really pounds on the server, and eventually the server gets overwhelmed.
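To put rough numbers on that claim, here is a back-of-the-envelope sketch in Python. The 4-GPU host, 10-minute WUs, and ~100-WU cache are the illustrative figures from the post, not measured server data:

```python
# Rough comparison of scheduler contact rates for the hypothetical
# 4-GPU host described above (one WU per GPU every 10 minutes).

GPUS = 4
WU_MINUTES = 10    # minutes per WU on each GPU
CACHE_WUS = 100    # approximate Collatz cache size mentioned in the post

# "Return results immediately": one scheduler contact per completed WU.
rri_interval = WU_MINUTES / GPUS                  # 2.5 minutes between contacts
rri_per_day = 24 * 60 / rri_interval              # 576 contacts/day

# Batched reporting: roughly one contact each time the cache is drained.
batch_interval = CACHE_WUS * WU_MINUTES / GPUS    # 250 minutes between contacts
batch_per_day = 24 * 60 / batch_interval          # ~5.8 contacts/day

print(f"RRI:     a contact every {rri_interval} min, ~{rri_per_day:.0f}/day")
print(f"Batched: a contact every {batch_interval:.0f} min, ~{batch_per_day:.1f}/day")
```

Even this crude model has batched reporting cutting scheduler contacts by roughly two orders of magnitude, comfortably past the 70% figure above.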
Joined: 20 Mar 08 Posts: 46 Credit: 69,382,802 RAC: 0
"90% of my GPU time should be used for MW, so I often just stop Collatz from fetching new work (YES, I know, not the best way ;) ). So yesterday I wasted about 2 hours of GPU time. Idle sucks :/"
If a MW WU takes 1:40 and you have it limited to GPU crunching on a quad core, you can get 6 * 4 = 24 WUs @ 1:40 ≈ 40 minutes of work. So set your additional network resources to allow only a 0.03-day cache and it won't fill several days' worth of Collatz when MW is down. Since BOINC processes GPU work in first-in, first-out order, it will only have to process about 7 Collatz WUs before it switches back to MW when MW comes back online. At least, in theory, that is how BOINC should work. ;-)
Joined: 8 Sep 09 Posts: 62 Credit: 61,330,584 RAC: 0
A good idea to go down to a .00# cache; I have not tried a cache limit that small. However, I am at a .25 cache and have tried resource sharing at 1 for Collatz and 99 for MW, and Collatz still hogged the GPU. I know this is out of place; if I continue to struggle with Collatz domination, I will start another thread. I like Collatz, just trying to get at least a 50-50 split going at first, to establish some control over this sharing. Great support. Thanks.
Joined: 31 Oct 09 Posts: 20 Credit: 12,074,198 RAC: 0
I only have an E8400, but @ 3.8 GHz ;), so I can only hold work for 20 min. That's not enough. I always want a bit more work in the cache, so if there are any problems I have something to do, and I also want to crunch most of the time for MW. Your proposal doesn't fix this problem. I would be very happy if I could cache 100+ WUs; with a 5870 it's no problem to finish them in time.
Joined: 2 Jan 08 Posts: 79 Credit: 365,471,675 RAC: 0
And still no reply from the project.
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0
And still no reply from the project.
I wouldn't expect one for a bit yet. Over the last year it seems to take longer and longer to get any reply, most times.
Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it.
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0
Has anyone PM'd the admins?
Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0
A good idea to go down to a .00# cache; I have not tried a cache limit that small. However, I am at a .25 cache and have tried resource sharing at 1 for Collatz and 99 for MW, and Collatz still hogged the GPU. I know this is out of place; if I continue to struggle with Collatz domination, I will start another thread. I like Collatz, just trying to get at least a 50-50 split going at first, to establish some control over this sharing. Great support. Thanks.
The problem is due to:
1) BOINC's FIFO method of processing GPU work.
2) MilkyWay's limit of 6 WUs per CPU core, which is also applied to machines with GPUs.
To make scheduling work correctly, one or both of the above policies has to change. IMO, both policies are poor decisions.
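A minimal sketch of point 1, the FIFO behaviour (Python; the queue contents and counts are illustrative, not BOINC internals):

```python
from collections import deque

# The client runs GPU tasks in arrival order, so Collatz WUs fetched
# during a MilkyWay outage all run before any newly fetched MW work,
# regardless of resource shares.

gpu_queue = deque()
gpu_queue.extend(["collatz"] * 7)     # backlog cached while MW was down
gpu_queue.extend(["milkyway"] * 24)   # 6 per core * 4 cores, fetched on MW's return

# FIFO: all 7 Collatz tasks finish before the first MW task starts.
print(list(gpu_queue)[:8])   # 7 x 'collatz', then 'milkyway'
```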
Joined: 12 Apr 09 Posts: 15 Credit: 278,731,391 RAC: 0
It's Spring Break! Maybe no one is at Castle Greyskull; all in FLA.
Joined: 4 Oct 08 Posts: 1734 Credit: 64,228,409 RAC: 0
Although Collatz is working, it's very slow, and it was down for some time earlier. I seem to be finding Einstein is also down or having problems pulling down new work, with downloads fetching files at 10.5 kbps. Malaria is between runs, so all my projects are pooping out ATM.
Go away, I was asleep
Joined: 18 Nov 07 Posts: 280 Credit: 2,442,757 RAC: 0
I seem to be finding Einstein is also down or having problems pulling down new work, with downloads fetching files at 10.5 kbps.
Yep, their new run is starting and it looks like all their mirrors are overloaded; leaving BOINC at it seems to be the best solution, as downloads do work on occasion. Just 4 files left (down from, I don't know, 50?) and I'll be able to get back to crunching for them again ...
Joined: 19 Feb 10 Posts: 4 Credit: 1,050,320 RAC: 0
Anyone know how much longer it's gonna be down for MAINTENANCE? I've been trying for the best part of 2 days to upload results. No downloads either!
Joined: 4 Oct 08 Posts: 1734 Credit: 64,228,409 RAC: 0
Your guess is as good as anyone's, George. If it's not sorted Friday, then we will be down for the weekend as well.
Go away, I was asleep
Joined: 1 Sep 08 Posts: 520 Credit: 302,528,262 RAC: 263
Collatz swamped? Yep. Normal is about 300 concurrent users; this a.m. there were over 700. It seems to have settled back down now, but the feeder is still having trouble keeping up. Once everyone's caches are full, it should be able to handle it.
Thanks for coming over here to MW land to provide an explanation of the Collatz server 'response' -- now if you could answer for the unanswering MW folks, that would be REALLY impressive <smile>.
Joined: 1 Sep 08 Posts: 520 Credit: 302,528,262 RAC: 263
One of the things that has shown up more frequently of late for MW is that when things go bump in the night, there seems to be a lack of folks 'on the job' to either notice that there is a problem or provide information about it. Perhaps this is some form of blowback on the project -- it demonstrates 'black hole' syndrome. Might be due to the focus of the research... That being said, the lack of project comment on problems that sometimes extend for more than a day or two is rather troublesome and tiresome.
Joined: 31 Aug 07 Posts: 21 Credit: 21,004,179 RAC: 0
If it's still Spring Break there, nowt may happen till Monday.
Joined: 19 Feb 10 Posts: 4 Credit: 1,050,320 RAC: 0
Thanks for the reply, just have to wait 'n' see; having problems with other projects as well. C'EST LA VIE!
Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0
Collatz swamped? Yep. Normal is about 300 concurrent users; this a.m. there were over 700. It seems to have settled back down now, but the feeder is still having trouble keeping up. Once everyone's caches are full, it should be able to handle it.
This is another of those devices that DA has made all-or-nothing when we have asked several times that it be made per-project. Some projects want/need results as fast as possible (GPU Grid is the best example), and RRI is good for them. Other projects, as you have noted, would be better served by a more batch-like mode. There is one more reason to use RRI, and that is to avoid the issue of systems "running dry" of work, a problem that I had noted on the alpha list, with logs included... but of course I am on UCB's ignore list, so ...
One side note: I am even having trouble getting work from GPU Grid because of the unexpected downtime of MW and the slowness of Collatz ... I have one GPU core idle as I write this, as I cannot get work ... well, it does make the room quieter ... :)
Joined: 8 Apr 08 Posts: 45 Credit: 161,943,995 RAC: 0
Up and running!!
Joined: 27 Aug 07 Posts: 647 Credit: 27,592,547 RAC: 0
And now who took the last WUs without generating new ones? *LOL*
Lovely greetings, Cori