Quads waiting on the Server

Message boards : Number crunching : Quads waiting on the Server

voltron
Joined: 30 Mar 08
Posts: 50
Credit: 11,593,755
RAC: 0
Message 3114 - Posted: 12 Apr 2008, 2:21:26 UTC

I just attached a new build to the project. It's a Q6600 running at 3.6 GHz. While I appreciate the project's quota of 20 WUs, this rig rips through the 20 in less time than the scheduler allows before refilling my cache. Is there a machine-to-machine learning curve in progress, or can I expect this dead air between the 20-packs? My other option is to throttle back to pace with the server.

What's the skinny?

Voltron
ID: 3114
Dave Przybylo
Joined: 5 Feb 08
Posts: 236
Credit: 49,648
RAC: 0
Message 3115 - Posted: 12 Apr 2008, 2:37:37 UTC - in response to Message 3114.  
Last modified: 12 Apr 2008, 2:39:28 UTC

I just attached a new build to the project. It's a Q6600 running at 3.6 GHz. While I appreciate the project's quota of 20 WUs, this rig rips through the 20 in less time than the scheduler allows before refilling my cache. Is there a machine-to-machine learning curve in progress, or can I expect this dead air between the 20-packs? My other option is to throttle back to pace with the server.

What's the skinny?

Voltron


Well, we're trying to build a scheduler that allows for a WU limit per core. This was supposed to be done in the upgrade; however, some files did not get upgraded properly and reverted to the old version. We'll keep you updated on when we get it working. Shouldn't be too long. Until then, you're free to throttle back and place your resources on other projects where they will be used.
Dave Przybylo
MilkyWay@home Developer
Department of Computer Science
Rensselaer Polytechnic Institute
ID: 3115
Jayargh
Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0
Message 3116 - Posted: 12 Apr 2008, 2:39:12 UTC - in response to Message 3114.  

I just attached a new build to the project. It's a Q6600 running at 3.6 GHz. While I appreciate the project's quota of 20 WUs, this rig rips through the 20 in less time than the scheduler allows before refilling my cache. Is there a machine-to-machine learning curve in progress, or can I expect this dead air between the 20-packs? My other option is to throttle back to pace with the server.

What's the skinny?

Voltron


Dead air... consider the behaviour of an 8- or 16-core machine here. Most people actually run a 2nd project so there is no dead air, but how efficiently you tweak your BOINC manager determines how little that 2nd project crunches.

ID: 3116
Webmaster Yoda
Joined: 21 Dec 07
Posts: 69
Credit: 7,048,412
RAC: 0
Message 3117 - Posted: 12 Apr 2008, 3:33:51 UTC
Last modified: 12 Apr 2008, 3:44:54 UTC

This question keeps rearing its head.

But if you can't set the server up to issue 10, 20 or whatever number of work units per CORE, why not reduce (or eliminate) the "deferring communication for 20 minutes" delay (as I mentioned in a post of 20 January)?

Set it to something like 5 minutes perhaps, rather than the current 20 minutes.

It's a setting in the server's BOINC configuration.

See also this post (quoted below)

I'm no expert on the server-side options of BOINC, but a search of the BOINC site shows the following seemingly relevant config options (at http://boinc.berkeley.edu/trac/wiki/ProjectOptions):

<max_wus_in_progress> N </max_wus_in_progress>
<min_sendwork_interval> N </min_sendwork_interval>

Here's what it says about that last option:

min_sendwork_interval
Minimum number of seconds to wait after sending results to a given host, before new results are sent to the same host. Helps prevent hosts with download or application problems from trashing lots of results by returning lots of error results. But don't set it to be so long that a host goes idle after completing its work, before getting new work.


What we are seeing on fast hosts, particularly with short (2-credit) work units, is exactly as described - they run out of work before they are allowed to connect again... This may become even more prevalent if the new applications are faster.

Perhaps if that were set to 5 or 10 minutes, things would run more smoothly.
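
For illustration, here is a minimal sketch of how those two options might sit in a project's config.xml. The values shown (a 5-minute interval and a cap of 20 results per host) are assumptions for the sake of the example, not MilkyWay@home's actual settings; the ProjectOptions page linked above is the authoritative reference.

    <boinc>
        <config>
            <!-- minimum seconds between sending results to the same host;
                 300 = 5 minutes, instead of the current 20-minute deferral -->
            <min_sendwork_interval>300</min_sendwork_interval>
            <!-- cap on unfinished results a single host may hold at once
                 (at this point this appears to be a per-host total, not per core) -->
            <max_wus_in_progress>20</max_wus_in_progress>
        </config>
    </boinc>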

Join the #1 Aussie Alliance on MilkyWay!
ID: 3117
Jayargh
Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0
Message 3118 - Posted: 12 Apr 2008, 3:48:38 UTC

Yoda - If I remember right, Travis tried that change on the old server version and said in a post (not going to look) that it didn't work or change anything... so either the old code or something else was overriding it. Might be worth trying again :)
ID: 3118
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 3125 - Posted: 12 Apr 2008, 12:40:15 UTC - in response to Message 3118.  

Yoda - If I remember right, Travis tried that change on the old server version and said in a post (not going to look) that it didn't work or change anything... so either the old code or something else was overriding it. Might be worth trying again :)


The new server SHOULD have fixed the communication deferral problem. I take it that it hasn't :( Hopefully we're going to be swapping to a WU-per-core limit when everything is updated. When we do that, I think we'll actually drop the limit down to maybe 5-10 per core (which should be enough to keep machines full), and then we'll get better search results because most results will have a faster turnaround.
ID: 3125
Jayargh
Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0
Message 3126 - Posted: 12 Apr 2008, 15:30:52 UTC - in response to Message 3125.  
Last modified: 12 Apr 2008, 15:46:29 UTC

Yoda - If I remember right, Travis tried that change on the old server version and said in a post (not going to look) that it didn't work or change anything... so either the old code or something else was overriding it. Might be worth trying again :)


The new server SHOULD have fixed the communication deferral problem. I take it that it hasn't :( Hopefully we're going to be swapping to a WU-per-core limit when everything is updated. When we do that, I think we'll actually drop the limit down to maybe 5-10 per core (which should be enough to keep machines full), and then we'll get better search results because most results will have a faster turnaround.


My opinion, from the last few months' discussion, is that 5 per core and a 10-minute RPC interval would be ideal. No new host built should run out of work that way with the current WU length, and if one did, the settings could be slightly tweaked, but this is probably about optimal for both project and user :)
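
As a rough sanity check of that pairing (the numbers here are illustrative assumptions, not measured MilkyWay@home figures): if a fast quad clears its 20-pack in under 20 minutes, each core averages roughly 4 minutes or less per work unit, so 5 cached work units per core is on the order of 20 minutes of work per core, comfortably more than a 10-minute deferral. In config.xml terms the interval half of the suggestion might look like the fragment below; the 5-per-core cap itself would need the planned per-core scheduler change, since max_wus_in_progress appears to have been a per-host total at the time.

    <!-- back-of-envelope: ~4 min per WU per core x 5 WUs per core
         = ~20 min of cached work per core, which covers a 10-minute deferral -->
    <min_sendwork_interval>600</min_sendwork_interval>  <!-- 10 minutes -->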

How soon until we start working longer units?
ID: 3126
voltron
Joined: 30 Mar 08
Posts: 50
Credit: 11,593,755
RAC: 0
Message 3129 - Posted: 13 Apr 2008, 3:44:50 UTC

Thanks for the feedback. Sounds like I eat it (the dead air) until RPI diddles the code. There is some compensation (cold): I have an E4500 on a pathetic Biostar board and they do not play nice together, so that rig runs cold. Time for a new (refurb) mobo. I appreciate your posts.

Voltron
ID: 3129
voltron
Joined: 30 Mar 08
Posts: 50
Credit: 11,593,755
RAC: 0
Message 3288 - Posted: 23 Apr 2008, 1:53:29 UTC - in response to Message 3129.  

Thanks for the feedback. Sounds like I eat it (the dead air) until RPI diddles the code. There is some compensation (cold): I have an E4500 on a pathetic Biostar board and they do not play nice together, so that rig runs cold. Time for a new (refurb) mobo. I appreciate your posts.

Voltron


I throttled the Q6600 back to 2.8 GHz, and that roughly matches the server's pace at handing out 20-packs. This is the processor involved in the electrical fire incident. Luckily, the only component damaged was the motherboard. I am still in the process of finding a new motherboard. The most recent replacement (non-incendiary) was DOA, so I switched the proc into one of my dual-core rigs. It was a DFI 965P refurb from NE; it would spin up, but no POST. The BIOS was not ready for the show.

Voltron
ID: 3288
