Welcome to MilkyWay@home

Aaargh! Servers are out of new work! (2)

scottishwebcamslive.com
Joined: 10 Oct 07
Posts: 79
Credit: 69,337,972
RAC: 0
Message 45614 - Posted: 18 Jan 2011, 18:45:34 UTC
Last modified: 18 Jan 2011, 19:24:21 UTC

hello,

The problem of running out of work can be cured in two ways:

1. You up the number of WUs per core that we can download (which for some reason you're desperate not to do).

2. Hold a far bigger cache of work units on your own server, so that even if it jams for an hour or two we can still download work until it starts producing new work again (although that won't work either... just now, for instance, the server page says there are 5000 WUs waiting to be worked on, and yet there is nothing for my two machines :( )


GPU technology is advancing in leaps and bounds (twin dual-GPU 5970s here), and having 10 or 15 minutes of work when there's a server glitch that can last hours just isn't cutting it anymore.

best regards
Ian
....Please Join team Scotland HERE
JAMC

Joined: 9 Sep 08
Posts: 96
Credit: 336,443,946
RAC: 0
Message 45615 - Posted: 18 Jan 2011, 19:32:23 UTC

What happened to the longer-running 17s WUs??
Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 45617 - Posted: 18 Jan 2011, 20:07:03 UTC - in response to Message 45613.  

Thanks, I gave it a kick and took some old runs down. It looks like things are calming down... I'll keep an eye on it today.

-Matthew

Hi Matthew,

I know queue size has been discussed many times before, but now that we have the N-Body WUs for CPUs, how about dedicating the CPUs to them and leaving the old WUs for the GPUs? That way the GPU cache size could be increased, and turnaround time would still be faster than it is now for both types of WUs.
Werkstatt
Joined: 19 Feb 08
Posts: 350
Credit: 141,284,369
RAC: 0
Message 45618 - Posted: 18 Jan 2011, 20:32:09 UTC - in response to Message 45617.  


I know queue size has been discussed many times before, but now that we have the N-Body WUs for CPUs, how about dedicating the CPUs to them and leaving the old WUs for the GPUs? That way the GPU cache size could be increased, and turnaround time would still be faster than it is now for both types of WUs.


I agree; this would make app_info files unnecessary, at least for non-OpenCL users.
Matthew
Volunteer moderator
Project developer
Project scientist

Joined: 6 May 09
Posts: 217
Credit: 6,856,375
RAC: 0
Message 45620 - Posted: 18 Jan 2011, 22:38:34 UTC

The server seems to be running smoothly now - let me know if it isn't working well on your end.

I'll discuss some of your ideas with the rest of the group; I will let you know what happens.

-Matthew
John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 45622 - Posted: 18 Jan 2011, 23:22:50 UTC

Thanks Matthew, all running normally here
Go away, I was asleep


The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 45625 - Posted: 19 Jan 2011, 4:52:56 UTC - in response to Message 45620.  

The server seems to be running smoothly now - let me know if it isn't working well on your end.

I'll discuss some of your ideas with the rest of the group; I will let you know what happens.

-Matthew

I'm not getting double credits now...bugger!
Berserk_Tux
Joined: 2 Jan 08
Posts: 79
Credit: 365,471,675
RAC: 0
Message 45630 - Posted: 19 Jan 2011, 13:35:58 UTC - in response to Message 45625.  

Here we go again!! No work.
scottishwebcamslive.com
Joined: 10 Oct 07
Posts: 79
Credit: 69,337,972
RAC: 0
Message 45631 - Posted: 19 Jan 2011, 13:37:43 UTC

hello,

There is always a third way of doing this, which I suspect would involve a lot more work on your part: make the WUs much longer than the 90 seconds they take on some of our machines (say 10 or 15 minutes each, or longer), and let us choose to run those WUs on our faster machines if you don't want to figure out some way of sending them to faster machines automatically.

just a thought
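
To put rough numbers on the idea of longer WUs: the time a host can keep crunching through a server outage is roughly the number of cached tasks times the task runtime, divided by how many tasks run at once. The sketch below only illustrates that arithmetic. The function name `buffer_minutes` and the cap of 10 cached tasks are made-up assumptions for illustration, not project settings; only the 90-second and 15-minute runtimes come from this thread.

```python
def buffer_minutes(cached_tasks: int, task_seconds: float, concurrent_tasks: int = 1) -> float:
    """Return how many minutes of work a host holds in its local cache.

    cached_tasks     -- number of work units downloaded and waiting (hypothetical cap)
    task_seconds     -- runtime of one work unit on this host
    concurrent_tasks -- how many work units the host crunches at the same time
    """
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return cached_tasks * task_seconds / concurrent_tasks / 60.0


# Assumed cap of 10 cached tasks on a single GPU (illustrative, not a project figure).
CACHED_TASKS = 10

short_buffer = buffer_minutes(CACHED_TASKS, 90)       # 90-second WUs  -> 15.0 minutes
long_buffer = buffer_minutes(CACHED_TASKS, 15 * 60)   # 15-minute WUs -> 150.0 minutes

print(f"90 s WUs:   {short_buffer:.0f} minutes of buffered work")
print(f"15 min WUs: {long_buffer:.0f} minutes of buffered work")
```

With the same cap, ten times the runtime gives ten times the buffer, which is why longer WUs would ride out a glitch of an hour or two without raising the per-core limit.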

Ian
....Please Join team Scotland HERE
John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 45632 - Posted: 19 Jan 2011, 16:11:31 UTC

Matthew

The servers are going funny again, like they did a few days ago.

Database/file status
State                                          #
Results ready to send                      2,876
Results in progress                      181,275
Workunits waiting for validation          56,749
Workunits waiting for assimilation           176
Workunits waiting for deletion                 3
Results waiting for deletion                 269
Transitioner backlog (hours)                   0


Can you give the servers a kick again, please?
Go away, I was asleep


cncguru
Joined: 11 Jun 10
Posts: 329
Credit: 1,166,222,661
RAC: 0
Message 45688 - Posted: 22 Jan 2011, 14:45:17 UTC
Last modified: 22 Jan 2011, 15:21:15 UTC

This is becoming so commonplace that we aren't even bothering to post about it anymore!!
Why can't this be fixed?
Is it so astronomically difficult that it just cannot be solved?
C'mon guys, I need my BILLION credits before I can go home, LOL!
Yes, they are still sending work, but the validator is choking and the RACs are moving backwards even as we crunch!
banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 45690 - Posted: 22 Jan 2011, 15:27:36 UTC

Twice yesterday and once so far today is too often. Whatever is being done only lasts hours, not even days.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
TJ

Joined: 12 Aug 09
Posts: 262
Credit: 92,631,041
RAC: 0
Message 45700 - Posted: 22 Jan 2011, 18:48:29 UTC
Last modified: 22 Jan 2011, 18:49:32 UTC

Well, I haven't received any new work in the last 2 hours, even though there are WUs available.
The validator is in "stress mode", as the server status page shows: Workunits waiting for validation 167,508, of which 411 are mine.

Now, just when posting this at 18:49 UTC, I got 2 WUs.
Greetings from,
TJ
loeakaodas

Joined: 2 Jan 09
Posts: 34
Credit: 93,631,891
RAC: 0
Message 45737 - Posted: 24 Jan 2011, 16:49:26 UTC

Looks like the server needs a kick or two.

data-driven web pages    milkyway  Running
upload/download server   milkyway  Running
scheduler                milkyway  Running
feeder                   milkyway  Not Running
transitioner             milkyway  Not Running
milkyway_purge           milkyway  Not Running
file_deleter             milkyway  Not Running
nbody_assimilator        milkyway  Not Running
separation_assimilator   milkyway  Not Running
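
For anyone who wants to spot stopped daemons without reading the listing by eye, a status listing like the one above can be checked with a few lines of Python. This is only a sketch that works on text pasted from the status page: the function name `parse_daemon_status` is made up for illustration, it assumes each line has the form "program host state" with the state being either "Running" or "Not Running", and it does not fetch anything from the server.

```python
# Status lines as pasted from the server status page in this thread.
STATUS_TEXT = """\
data-driven web pages milkyway Running
upload/download server milkyway Running
scheduler milkyway Running
feeder milkyway Not Running
transitioner milkyway Not Running
milkyway_purge milkyway Not Running
file_deleter milkyway Not Running
nbody_assimilator milkyway Not Running
separation_assimilator milkyway Not Running
"""


def parse_daemon_status(text: str, host: str = "milkyway") -> dict:
    """Map each program name to its state ("Running" or "Not Running").

    Assumes every non-empty line ends with the state, preceded by the host name.
    """
    statuses = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Check "Not Running" first, because it also ends with "Running".
        for state in ("Not Running", "Running"):
            if line.endswith(state):
                name = line[: -len(state)].strip()
                # Drop the trailing host column if it is present.
                if name.endswith(host):
                    name = name[: -len(host)].strip()
                statuses[name] = state
                break
    return statuses


statuses = parse_daemon_status(STATUS_TEXT)
stopped = [name for name, state in statuses.items() if state != "Running"]
print("Not running:", ", ".join(stopped))
```

Because each line is stripped and matched from the right, the same function also handles a listing whose columns are padded with extra spaces.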

Matthew
Volunteer moderator
Project developer
Project scientist

Joined: 6 May 09
Posts: 217
Credit: 6,856,375
RAC: 0
Message 45741 - Posted: 24 Jan 2011, 20:02:37 UTC
Last modified: 24 Jan 2011, 20:03:09 UTC

Server-side, things look like they should be stable; Travis has asked users to stop certain 'iffy' behaviors (http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2179#45739) as they may be responsible for overloading the server.

Once that behavior stops (either by choice or by a ban from us admins), we should see things start to calm down.

Cheers,
Matthew
Berserk_Tux
Joined: 2 Jan 08
Posts: 79
Credit: 365,471,675
RAC: 0
Message 45762 - Posted: 25 Jan 2011, 14:20:55 UTC - in response to Message 45741.  

No work again.
kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 45768 - Posted: 25 Jan 2011, 16:21:24 UTC

No teams also.
John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 45853 - Posted: 28 Jan 2011, 11:06:34 UTC

The validator retention is growing again, and the servers are not sending me work.
Go away, I was asleep


vandiesel

Joined: 10 May 10
Posts: 27
Credit: 43,104,187
RAC: 0
Message 45854 - Posted: 28 Jan 2011, 11:37:34 UTC

ditto
banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 45855 - Posted: 28 Jan 2011, 12:54:43 UTC

Assimilator is offline. 37k results pending.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

©2024 Astroinformatics Group