Welcome to MilkyWay@home

20 workunit limit

Message boards : Number crunching : 20 workunit limit

Previous · 1 . . . 4 · 5 · 6 · 7 · 8 · Next

AuthorMessage
John Clark

Send message
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 6956 - Posted: 29 Nov 2008, 21:33:53 UTC

I don't think the WUs are going to take longer to crunch, but Travis has set the number to 5-per-core and not 5-per-computer. So, if you have a Quaddie, then the new limit will let 20 WUs accumulate.

I think the limit should be raised to 10 to give fast hosts a chance to keep up. If the script starts to extend the server contact time then the work will run out very quickly.

I think the server recontact time should max out at 10 minutes for a chance to collect work.
ID: 6956 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Logan
Avatar

Send message
Joined: 15 Aug 08
Posts: 163
Credit: 3,876,869
RAC: 0
Message 6957 - Posted: 29 Nov 2008, 21:36:18 UTC

I'm beginning to think that the objective of this project is to test how far the patience of the volunteers will stretch ....
Logan.

BOINC FAQ Service (Ahora, también disponible en Español/Now available in Spanish)
ID: 6957 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile KWSN imcrazynow
Avatar

Send message
Joined: 22 Nov 08
Posts: 136
Credit: 319,414,799
RAC: 0
Message 6960 - Posted: 29 Nov 2008, 21:38:26 UTC - in response to Message 6957.  

I'm beginning to think that the objective of this project is to test how far the patience of the volunteers will stretch ....

Agreed!

4870 GPU
4870 GPU
ID: 6960 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile banditwolf
Avatar

Send message
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 6961 - Posted: 29 Nov 2008, 21:42:30 UTC - in response to Message 6960.  

I'm beginning to think that the objective of this project is to test how far the patience of the volunteers will stretch ....

Agreed!


3rd.

Doesn't this put more strain on the server since everyones computer will need to contact more often?
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
ID: 6961 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Alinator

Send message
Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 6962 - Posted: 29 Nov 2008, 21:43:08 UTC

Well, let's not forget the fact that the work they send out is not designed to let participants burn through it as fast as they feel like.

Since the future work to be sent out for a search depends on what comes back from the current work in the field, the big gun battleships may just have to go idle on MW for a while to ensure the search doesn't wander off in the wrong direction.

IOWs we are here to help the project run their science; they are not here to help heat your computer room by loading your hosts up with work which may or may not be giving them the outcome they are seeking. You get what you get at the time your host asks, and that's all there is.

Alinator
ID: 6962 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
BarryAZ

Send message
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 6963 - Posted: 29 Nov 2008, 21:46:44 UTC - in response to Message 6956.  

The previous limit was also per core (ie 20 per core). If the new WU's are only 5 minutes long as well, and the limit remains at 5, then it will push off folks to other applications for sure as a 25 minute maximum CACHE will simply not work for folks. I REALLY hope the 5 WU limit is because the plan is for something like 2 hour long WU's. Frankly, a cache limit of less than 10 hours results in a lot of wasted manual effort. Of course WU's of 5 minutes have got to cause problems on the server end.

Spinhenge has a 10 WU per download limit with 30 minute WU's. But not only do they always have the WU's for the download, they also have a higher cache limit (in their case 75 WU's per workstation -- so you don't have all that much work cached on a quad). But the thing is, those are 30 minute WU's -- so each download is 5 hours of work and the max cache is over 36 hours (or 9 hours for a quad). Here with a 5 minute WU and a 5 WU per CPU cache, we're talking a max cache of 25 minutes per CPU -- nope, that definitely does not work.
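[Editor's note: the cache arithmetic in the post above is easy to check: wall-clock cache depth is just WU runtime times the number of WUs a host may hold. A minimal sketch using the figures quoted in the post; the function name is only for illustration.]

```python
# Cache depth in wall-clock time = WU runtime x WUs allowed.
# The figures below are the ones quoted in the post above.

def cache_minutes(wu_minutes: float, wu_limit: int) -> float:
    """Minutes of work a host can hold for a given WU length and limit."""
    return wu_minutes * wu_limit

# MilkyWay: 5-minute WUs, 5 per CPU -> 25 minutes of work per CPU
print(cache_minutes(5, 5))             # 25

# Spinhenge: 30-minute WUs, 75 per workstation -> 37.5 hours total...
print(cache_minutes(30, 75) / 60)      # 37.5
# ...or about 9.4 hours per core on a quad, matching the "9 hours" above
print(cache_minutes(30, 75) / 60 / 4)
```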



I don't think the WUs are going to take longer to crunch, but Travis has set the number to 5-per-core and not 5-per-computer. So, if you have a Quaddie, then the new limit will let 20 WUs accumulate.

I think the limit should be raised to 10 to give fast hosts a chance to keep up. If the script starts to extend the server contact time then the work will run out very quickly.

I think the server recontact time should max out at 10 minutes for a chance to collect work.


ID: 6963 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
BarryAZ

Send message
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 6964 - Posted: 29 Nov 2008, 21:53:05 UTC - in response to Message 6962.  

True enough -- at one point I was really doing most of my CPU cycles for MW -- and, because of the higher credit per CPU cycle, I was running something over 80% of my credits to MW. Now, due to the available work constraints and the higher effort level required to get work, that 80%+ credit share number of a month ago has dropped to under 40% and with the new lower cache limit will likely drop down to 25%. In my project shares the winners of the cuts over here are SETI (so Travis has made Dave Anderson happy after all <smile>), Spinhenge, Climate and POEM. I've trickled off a bit more work back to Malaria and Rosetta as well.


Well, let's not forget the fact that the work they send out is not designed to let participants burn through it as fast as they feel like.

Since the future work to be sent out for a search depends on what comes back from the current work in the field, the big gun battleships may just have to go idle on MW for a while to ensure the search doesn't wander off in the wrong direction.

IOWs we are here to help the project run their science; they are not here to help heat your computer room by loading your hosts up with work which may or may not be giving them the outcome they are seeking. You get what you get at the time your host asks, and that's all there is.

Alinator


ID: 6964 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
BarryAZ

Send message
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 6965 - Posted: 29 Nov 2008, 22:04:59 UTC - in response to Message 6962.  

Well, with the new limit, I suppose one test will be to see if the workstations reporting off the chart high RAC numbers also reduce (I have a sneaking suspicion that the methods employed on those workstations will succeed with the new limits as well......)


Well, let's not forget the fact that the work they send out is not designed to let participants burn through it as fast as they feel like.

IOWs we are here to help the project run their science; they are not here to help heat your computer room by loading your hosts up with work which may or may not be giving them the outcome they are seeking. You get what you get at the time your host asks, and that's all there is.

Alinator


ID: 6965 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile KWSN imcrazynow
Avatar

Send message
Joined: 22 Nov 08
Posts: 136
Credit: 319,414,799
RAC: 0
Message 6966 - Posted: 29 Nov 2008, 22:08:54 UTC - in response to Message 6964.  
Last modified: 29 Nov 2008, 22:09:55 UTC

True enough -- at one point I was really doing most of my CPU cycles for MW -- and, because of the higher credit per CPU cycle, I was running something over 80% of my credits to MW. Now, due to the available work constraints and the higher effort level required to get work, that 80%+ credit share number of a month ago has dropped to under 40% and with the new lower cache limit will likely drop down to 25%. In my project shares the winners of the cuts over here are SETI (so Travis has made Dave Anderson happy after all <smile>), Spinhenge, Climate and POEM. I've trickled off a bit more work back to Malaria and Rosetta as well.

Looks like I'll have to reduce resources as well.

the big gun battleships may just have to go idle on MW for a while.

Nuff said.

4870 GPU
4870 GPU
ID: 6966 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile banditwolf
Avatar

Send message
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 6967 - Posted: 29 Nov 2008, 22:19:06 UTC

I used to get 10 days of Rosetta to do and connect when they were done. But with the temperamental MW setup I can't get days of work to do (not even 1 or 2), only minutes or a few hours.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
ID: 6967 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile KWSN imcrazynow
Avatar

Send message
Joined: 22 Nov 08
Posts: 136
Credit: 319,414,799
RAC: 0
Message 6970 - Posted: 29 Nov 2008, 22:32:40 UTC - in response to Message 6967.  

I used to get 10 days of Rosetta to do and connect when they were done. But with the temperamental MW setup I can't get days of work to do (not even 1 or 2), only minutes or a few hours.

Maybe this project isn't well suited to BOINC, as earlier suggested. If the science and the time constraints for the information to be useful to the project are so tight as to only allow 5 WUs per CPU, maybe it should use another platform instead.
I would hate to see it leave, but if that is what's necessary then that is what should be done.


4870 GPU
4870 GPU
ID: 6970 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile KWSN imcrazynow
Avatar

Send message
Joined: 22 Nov 08
Posts: 136
Credit: 319,414,799
RAC: 0
Message 6986 - Posted: 29 Nov 2008, 23:27:49 UTC
Last modified: 29 Nov 2008, 23:34:09 UTC

OK, seems that the limit per CPU has been raised to 8. That's still only 10 minutes worth of work on my quad, giving you 50% of CPU time. If you really want useful science quicker, that should still be raised some more. Your 75% reduction per core earlier was WAY too much if you want to keep as many CPUs running your app as possible. Even though some of the slower CPUs may take longer per task, the faster ones supporting your project should outweigh the others, giving you the useful science you need.

4870 GPU
4870 GPU
ID: 6986 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile banditwolf
Avatar

Send message
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 6987 - Posted: 29 Nov 2008, 23:32:24 UTC - in response to Message 6986.  

This project seems to go to extremes when something changes: 10-minute WUs to 10-hour ones (the old 260-credit WUs).
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
ID: 6987 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 6993 - Posted: 29 Nov 2008, 23:44:53 UTC - in response to Message 6987.  

The problem is really dealing with the un-optimized and optimized versions of the old application. For people using the non-optimized version, WUs are extremely long, while for the rest they aren't. This makes it pretty hard to increase their size.

At any rate, we should be swapping over to the new application exclusively within the next week; and be able to increase the length of WUs without things getting overly weird (as if they aren't really weird as it is). Having a computation time that isn't extremely bipolar will really help us tweak things to make everything run smoother.
ID: 6993 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
John Clark

Send message
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 6998 - Posted: 29 Nov 2008, 23:59:44 UTC - in response to Message 6986.  
Last modified: 30 Nov 2008, 0:01:59 UTC

OK, seems that the limit per CPU has been raised to 8. That's still only 10 minutes worth of work on my quad, giving you 50% of CPU time. If you really want useful science quicker, that should still be raised some more. Your 75% reduction per core earlier was WAY too much if you want to keep as many CPUs running your app as possible. Even though some of the slower CPUs may take longer per task, the faster ones supporting your project should outweigh the others, giving you the useful science you need.



I was looking at your Quad, and the WUs are averaging 299 seconds each. This is a little faster than my old Quad (QX6700 @ 3.0 GHz). This one of mine takes 312 seconds, while my Penny takes 205 seconds.

With a limit of 8 WUs-per-core, my fastest Quad will crunch those 8 WUs over a total time of 1,640 seconds (or 27.33 minutes) per core (assuming each core has 8 WUs and all start and finish together).

That means -

(a) Provided there are sufficient WUs always available on the server, and
(b) deferring communications for xx minutes (see the BOINC Manager Messages tab) does not exceed 20 minutes;

- then there should be no problems running MW on any rig, including a Penryn class Quad (@4.0GHz on air).
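[Editor's note: the per-core timing above can be reproduced with the same back-of-the-envelope arithmetic; a quick sketch, assuming each core starts with a full queue of identical WUs (the helper name is hypothetical).]

```python
# Per-core queue drain time: how long a full per-core cache lasts
# before the host must fetch more work. Figures from the post above.

def drain_seconds(wu_seconds: float, per_core_limit: int) -> float:
    """Seconds until a core that starts with a full queue runs dry."""
    return wu_seconds * per_core_limit

secs = drain_seconds(205, 8)    # 205 s per WU on the Penryn, 8 WUs per core
print(secs)                     # 1640
print(round(secs / 60, 2))      # 27.33 (minutes)
# So as long as the server backoff stays under ~20 minutes, the queue
# refills before it empties and the core never sits idle.
```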
ID: 6998 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Alinator

Send message
Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 7000 - Posted: 30 Nov 2008, 0:02:56 UTC - in response to Message 6970.  

I used to get 10 days of Rosetta to do and connect when they were done. But with the temperamental MW setup I can't get days of work to do (not even 1 or 2), only minutes or a few hours.

Maybe this project isn't well suited to BOINC, as earlier suggested. If the science and the time constraints for the information to be useful to the project are so tight as to only allow 5 WUs per CPU, maybe it should use another platform instead.
I would hate to see it leave, but if that is what's necessary then that is what should be done.


If you read through some of the publications you'll see that the overall project team has already run and tested the GMLE simulations on tightly coupled homogeneous clusters and supercomputers.

The whole point of MWAH is to see if it can be made to play reasonably well on the loosely coupled, heterogeneous platform BOINC offers.

That being said, I think they have demonstrated that even though there are some difficult issues which have to be addressed in order to get the most out of BOINC, it appears it can do the job at least from a preliminary viewpoint.

So the need to limit outstanding work queues and have tight deadlines is just a function of the science they are doing, and BOINC was designed to accommodate projects like that. The only problem I see is that some participants seem to want to put their requirements ahead of what the project's requirements are.

It basically comes down to being able to live with the work flow pattern MW will impart to your hosts when you run it. If one doesn't like that, then you either have to change your priorities about it or look for a different project which dovetails with your objectives better.

The one thing which is already clear is that MW is not an intermittent-network-connection or large-reserve-work-cache friendly project. This is neither 'good' nor 'bad'; it's just a fact.

Alinator


ID: 7000 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
John Clark

Send message
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 7001 - Posted: 30 Nov 2008, 0:08:09 UTC

I tend to agree with you, Alinator. Certainly my Quads, and my older PCs, are living happily with the current WU-per-core limit (assuming the "deferring communications for xx minutes" does not get above about 20 minutes).
ID: 7001 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Nicolas
Avatar

Send message
Joined: 19 Nov 07
Posts: 29
Credit: 3,353,124
RAC: 0
Message 7004 - Posted: 30 Nov 2008, 0:16:36 UTC - in response to Message 3004.  
Last modified: 30 Nov 2008, 0:17:38 UTC

[should read whole thread before posting]
Please use "Reply" or "Quote" buttons on posts, instead of "reply to this thread". Keep the posts linked together ("X is a reply to Y").
ID: 7004 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
John Clark

Send message
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 7007 - Posted: 30 Nov 2008, 0:46:34 UTC

The dreaded deferring communications for xx minutes has struck, with a deferred communication of 1 hour 55 minutes, and when forced there seemed to be no WUs ready (despite the server status saying there were over 1,500).

The current limits are OK if there is work (track record is an issue here) and the deferring communications for .... do not go out of sight.
ID: 7007 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Gavin Shaw
Avatar

Send message
Joined: 16 Jan 08
Posts: 98
Credit: 1,371,299
RAC: 0
Message 7020 - Posted: 30 Nov 2008, 2:36:26 UTC - in response to Message 6966.  

the big gun battleships may just have to go idle on MW for a while.


Well, I don't have a big battleship here. Just 3 medium-heavy cruisers and an older destroyer (my wife's laptop). But the fleet keeps chugging along.

Haven't spotted any U-boats yet. :)

Never surrender and never give up. In the darkest hour there is always hope.

ID: 7020 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote

©2024 Astroinformatics Group