
Guidance from Project Team Requested

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4212 - Posted: 17 Jul 2008, 17:41:38 UTC
Last modified: 17 Jul 2008, 17:43:44 UTC

In some of the other threads here, strategies have been laid out for dealing with the new work as it stands and working around the operational issues it causes for hosts.

However, they all have their good points and bad points, and how long it will be before new parameters are set by the project makes a difference in which way to go, depending on individual circumstances. So an idea of when the adjusted searches will start would be handy for those of us who have tweaked MW on our hosts to accommodate other projects better, so we can make sure this won't cause other problems when they arrive.

If memory serves, once a work set is generated it's not an easy matter to make changes to it without summarily canceling it and starting over. So I assume that's not an option and we are going to continue running the current sets in the field before we see any changes.

Also, since we currently have 'short, medium, and long' ones to work on, it would be handy to know which one you are leaning towards going forward (medium, long, or something else).

Also, info on the range of FPOPs you're thinking about would be helpful, as well as what the new deadlines might be like. The latter will be important in evaluating how well MW will play with other projects at a given CI/work-cache setting.
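
To make that last point concrete, here is the rough feasibility arithmetic I have in mind. Every number below is made up purely for illustration:

    # Rough deadline-tightness check (all numbers hypothetical):
    runtime_h      = 12.0   # assumed per-WU runtime on a given host
    deadline_d     = 7.0    # assumed deadline, in days
    on_frac        = 0.5    # fraction of each day the host is on and crunching
    resource_share = 0.25   # fraction of crunch time MW gets on this host

    usable_h  = deadline_d * 24 * on_frac * resource_share
    tightness = runtime_h / usable_h
    print("tightness: %.2f" % tightness)   # near or above 1.0, the host risks missing the deadline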

TIA,

Alinator

voltron

Joined: 30 Mar 08
Posts: 50
Credit: 11,593,755
RAC: 0
Message 4213 - Posted: 17 Jul 2008, 18:08:10 UTC - in response to Message 4212.  

In some of the other threads here, strategies have been laid out for dealing with the new work as it stands and working around the operational issues it causes for hosts.

However, they all have their good points and bad points, and how long it will be before new parameters are set by the project makes a difference in which way to go, depending on individual circumstances. So an idea of when the adjusted searches will start would be handy for those of us who have tweaked MW on our hosts to accommodate other projects better, so we can make sure this won't cause other problems when they arrive.

If memory serves, once a work set is generated it's not an easy matter to make changes to it without summarily canceling it and starting over. So I assume that's not an option and we are going to continue running the current sets in the field before we see any changes.

Also, since we currently have 'short, medium, and long' ones to work on, it would be handy to know which one you are leaning towards going forward (medium, long, or something else).

Also, info on the range of FPOPs you're thinking about would be helpful, as well as what the new deadlines might be like. The latter will be important in evaluating how well MW will play with other projects at a given CI/work-cache setting.

TIA,

Alinator



And your question is................?

Eh?

Thunder

Joined: 9 Jul 08
Posts: 85
Credit: 44,842,651
RAC: 0
Message 4215 - Posted: 17 Jul 2008, 18:09:26 UTC

Thanks for posting this, Alinator! I agree these are pretty much the crucial questions/issues that need to be addressed.

I have 3 more machines that should be fairly high-productivity for MW, but since the recent changes caused 11 of 12 machines to go into 'panic mode' and run MW exclusively because they think they won't meet deadline, I'm holding off attaching them. :(

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4216 - Posted: 17 Jul 2008, 18:19:16 UTC - in response to Message 4213.  
Last modified: 17 Jul 2008, 19:08:05 UTC



And your question is................?

Eh?


LOL...

There were no questions; there was a request for four separate bits of info in paragraphs 2, 4, and 5. So...

1.) When might we see parameter-adjusted tasks?

2.) Will the 'official' new work going forward be one of the three types we have seen so far, or something else?

3.) A ballpark figure for the FPOPs estimate of the new work going forward.

4.) Thoughts about the new deadlines, so we can figure out what this means in terms of project tightness factor.

So there, now they are questions. I didn't know we had to post in 'Jeopardy' format! :-D

Alinator

Nathan
Project scientist

Joined: 4 Oct 07
Posts: 43
Credit: 53,898
RAC: 0
Message 4217 - Posted: 17 Jul 2008, 18:22:30 UTC

After seeing some of the times, I'm thinking that the "medium" length ones will probably work out the best for everyone, so figure on timings roughly around what you're getting with the 373... series.

As for the rest, I got my times wrong and Travis is flying back today, so tonight or tomorrow we should have some of the timing issues fixed. I need to talk to him and confirm things, but what I've been thinking is upping the deadline to about a week (this way we get results back on a reasonable schedule) and reducing the number of max work units (so you can actually finish all that you get). If this sounds unreasonable in any way, please let us know. Also, if you can suggest a good number of WUs given the new runtimes associated with the 373 series, it would be very helpful.
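
For anyone sanity-checking a suggestion, the back-of-the-envelope I'm picturing looks like this. The runtime and core count here are placeholders, not measurements:

    # Back-of-the-envelope WU cap check (placeholder numbers):
    runtime_h  = 3.0    # assumed runtime of a medium (373...) WU on one core
    deadline_d = 7.0    # the proposed one-week deadline
    cores      = 4

    wus_per_day = cores * 24 / runtime_h      # what the host can finish per day
    capacity    = wus_per_day * deadline_d    # ceiling before deadlines slip
    print(wus_per_day, capacity)              # a sane cap sits well below this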

~Nate~

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4219 - Posted: 17 Jul 2008, 19:19:23 UTC - in response to Message 4217.  
Last modified: 17 Jul 2008, 19:41:25 UTC

After seeing some of the times, I'm thinking that the "medium" length ones will probably work out the best for everyone, so figure on timings roughly around what you're getting with the 373... series.

As for the rest, I got my times wrong and Travis is flying back today, so tonight or tomorrow we should have some of the timing issues fixed. I need to talk to him and confirm things, but what I've been thinking is upping the deadline to about a week (this way we get results back on a reasonable schedule) and reducing the number of max work units (so you can actually finish all that you get). If this sounds unreasonable in any way, please let us know. Also, if you can suggest a good number of WUs given the new runtimes associated with the 373 series, it would be very helpful.



Thanks for the preliminary info.

Taking a first pass at the new proposed deadlines, I would think you might be able to limit the incremental work-request allotment to ten as a test. My understanding is this would help make the project more efficient from a science POV.

My thinking is that increasing the deadline by only two days is nowhere near proportional to the thirtyfold increase in runtime, so MW becomes much more of a tight-deadline project; the BOINC debt system would still be able to handle resource allocation on multi-project hosts, and it should still slow needless pestering of the project for work from MW-primary hosts.

Also, you might want to think about cutting the max quota some. As it is with the medium-length work, I doubt there's a computer on Earth that can burn through 700 per core per day! ;-)

This will also help limit the 'damage' rogue hosts can cause.

<edit> BTW, based on the proposal of medium-length work as the new 'gold standard', I'd say that bumping your current TDCF (task duration correction factor) by a factor of fifty is the best compromise for a hands-off 'kludge' fix until the new run parameters are set by the project. The beauty of that is you won't have to do anything to 'back out' of the fix. The only side effect is that you might carry somewhat less in the cache for a while as the TDCF corrects for the new parameters.
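
To illustrate what that bump does, here's a sketch of the idea only; the client's actual correction rule differs in detail:

    # TDCF sketch (illustrative smoothing rule, not the actual BOINC client code):
    est_runtime_s    = 500.0     # stale estimate based on the old FPOPs value
    actual_runtime_s = 25000.0   # hypothetical real runtime of a medium WU (~50x)

    dcf = 50.0                   # the suggested one-time manual bump
    print(est_runtime_s * dcf)   # 25000 s: the estimate matches reality at once

    dcf = 1.0                    # left alone, the DCF only creeps toward ~50
    for _ in range(5):           # one step per completed result
        ratio = actual_runtime_s / (est_runtime_s * dcf)
        dcf *= 0.9 + 0.1 * ratio
        print("DCF now %.1f" % dcf)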

Alinator

Idefix

Joined: 19 Apr 08
Posts: 7
Credit: 3,067
RAC: 0
Message 4221 - Posted: 17 Jul 2008, 19:26:17 UTC - in response to Message 4217.  

Hi,

but what I've been thinking is upping the deadline to about a week

My old P3 laptop will be forced into retirement, because it won't be able to finish the work units in time (4 hrs uptime per day, sometimes less, 50% CPU usage).

Regards,
Carsten

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4222 - Posted: 17 Jul 2008, 19:32:46 UTC - in response to Message 4221.  
Last modified: 17 Jul 2008, 19:33:20 UTC

Hi,

but what I've been thinking is upping the deadline to about a week

My old P3 laptop will be forced into retirement, because it won't be able to finish the work units in time (4 hrs uptime per day, sometimes less, 50% CPU usage).

Regards,
Carsten


Agreed; old-timers will have to run pretty much 24/7, flat out (except for 'casual' other usage), to be able to participate at 7 days.

OTOH, MW has somewhat special time requirements compared to most other projects, but we all already knew that. ;-)

Alinator

Thunder

Joined: 9 Jul 08
Posts: 85
Credit: 44,842,651
RAC: 0
Message 4224 - Posted: 17 Jul 2008, 19:37:00 UTC - in response to Message 4217.  

...what I've been thinking is upping the deadline to about a week (this way we get results back on a reasonable schedule) and reducing the number of max work units (so you can actually finish all that you get). If this sounds unreasonable in any way, please let us know. Also, if you can suggest a good number of WUs given the new runtimes associated with the 373 series, it would be very helpful.


I think the week deadline sounds fine.

The number of max work units should really be determined by the science needs and nothing else. I've seen it mentioned that the type of 'genetic' algorithm you're using means that if a machine downloads a huge list of WUs, even if it completes them relatively quickly, the parameters for new WUs risk going off in a completely different direction before they work their way down the list. I think Travis mentioned that 16 would be better for the science, so go with that.

If you've set (in BOINC) a reasonably accurate estimated time for the WUs (someone help me here, but I believe it's actually estimated FLOPS) for the type of work units you're sending out, then the clients should only be getting what they need (for probably 90% of the clients out there, that means they'll download only what they can finish in about 1 day). If they can do 5 WUs in a day, they'll get 5... if they can do 2, they get 2, etc.
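
Loosely, my understanding of the work fetch is something like this (a sketch of the behavior, not the client's actual code):

    # Loose sketch of BOINC-style work fetch (my understanding only):
    def wus_fetched(cache_days, est_runtime_h, cores=2, on_frac=1.0):
        wanted_s = cache_days * 86400 * on_frac * cores   # buffer the host asks for
        return int(wanted_s // (est_runtime_h * 3600))    # whole WUs that fit

    print(wus_fetched(1.0, 5.0))   # 1-day cache of 5-hour WUs on 2 cores -> 9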

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4226 - Posted: 17 Jul 2008, 19:44:37 UTC

Don't know if you saw it in the other thread.

The runtime estimate is calculated by the core client from the FPOPs estimate divided by the floating-point benchmark (IIRC).
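
In code form it's something like the following; the numbers are hypothetical, and IIRC the relevant BOINC fields are named rsc_fpops_est and p_fpops:

    # The estimate described above, as a one-liner (sketch):
    def estimated_runtime_s(rsc_fpops_est, p_fpops):
        return rsc_fpops_est / p_fpops   # claimed FPOPs / measured FLOPS benchmark

    print(estimated_runtime_s(1.0e12, 2.0e9))   # -> 500.0 seconds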

JohnMD

Joined: 11 Jul 08
Posts: 13
Credit: 10,015,444
RAC: 0
Message 4229 - Posted: 17 Jul 2008, 21:42:59 UTC - in response to Message 4217.  

After seeing some of the times, I'm thinking that the "medium" length ones will probably work out the best for everyone, so figure on timings roughly around what you're getting with the 373... series.

As for the rest, I got my times wrong and Travis is flying back today, so tonight or tomorrow we should have some of the timing issues fixed.


It sounds like you know pretty well how many cycles are required for each unit. Whether they last 12 minutes or 12 hours on my P4 doesn't bother me, as long as those cycles get translated into realistic (say, within a factor of 2?) runtime estimates.

One thing that does bother me is that the project can apparently choose unit sizes quite arbitrarily - are they getting 60 times the science from 12-hour units compared with 12-minute ones?

Thunder

Joined: 9 Jul 08
Posts: 85
Credit: 44,842,651
RAC: 0
Message 4230 - Posted: 17 Jul 2008, 21:49:12 UTC - in response to Message 4229.  

One thing that does bother me is that the project can apparently choose unit sizes quite arbitrarily - are they getting 60 times the science from 12-hour units compared with 12-minute ones?


They're able to increase the accuracy of the results with the increased run time.

To quote Nathan, one of the project scientists:

I increased the time by increasing the accuracy with which we do the integral calculation over the wedge volume.


If I understand right, the client is performing more iterations of the same calculation, which results in greater accuracy.

So yes, the extra time spent has a definite value. :)
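
By way of analogy only (this toy is mine, not the project's integrator), tightening a numerical integral means more sample points, hence more runtime and more correct digits:

    # Toy example: midpoint-rule integration of sin(x) over [0, pi] (exact value 2).
    import math

    def midpoint_integral(f, a, b, n):
        h = (b - a) / n
        return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

    for n in (10, 100, 1000):
        err = abs(midpoint_integral(math.sin, 0.0, math.pi, n) - 2.0)
        print(n, err)   # each 10x in points buys ~100x accuracy for this rule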

Bill & Patsy

Joined: 7 Jul 08
Posts: 47
Credit: 13,629,944
RAC: 0
Message 4243 - Posted: 19 Jul 2008, 5:57:41 UTC - in response to Message 4230.  
Last modified: 19 Jul 2008, 6:22:28 UTC

...So yes, the extra time spent has a definite value. :)

Oh really! What is that "definite value"? If the computation time is increased by a factor of 30 or 60 (for example), is the "definite value" increased by 30 or 60? And if the computation time is increased by a factor of 30 or 60, the available computing resources (in terms of throughput) for Milky Way have been reduced by a factor of 30 or 60. What is the "definite value" of reducing the computing resources by 30 or 60?

Is the increased accuracy 30 or 60 times more valuable? It seems highly unlikely, or the project scientists would have increased the accuracy long ago. That they didn't do so before shows that your so-called "definite value" is small indeed.

A modest increase in accuracy could have been credible. But be clear about what's actually happening here. It's not much about the science. Rather, resources are now being largely wasted and accuracy is being "tuned" in order to relieve pressure on the server. There's little "definite value" in that.
--Bill


Jayargh

Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0
Message 4244 - Posted: 19 Jul 2008, 6:56:21 UTC - in response to Message 4243.  
Last modified: 19 Jul 2008, 7:46:28 UTC

...So yes, the extra time spent has a definite value. :)

Oh really! What is that "definite value"? If the computation time is increased by a factor of 30 or 60 (for example), is the "definite value" increased by 30 or 60? And if the computation time is increased by a factor of 30 or 60, the available computing resources (in terms of throughput) for Milky Way have been reduced by a factor of 30 or 60. What is the "definite value" of reducing the computing resources by 30 or 60?

Is the increased accuracy 30 or 60 times more valuable? It seems highly unlikely, or the project scientists would have increased the accuracy long ago. That they didn't do so before shows that your so-called "definite value" is small indeed.

A modest increase in accuracy could have been credible. But be clear about what's actually happening here. It's not much about the science. Rather, resources are now being largely wasted and accuracy is being "tuned" in order to relieve pressure on the server. There's little "definite value" in that.



Ok, Bill & Patsy...you are obviously new here, judging by that post...and your criticism.

For about 9 months now we have been running genetic searches built by Travis to show that the application works, and papers have been written about the process of the searches. These have been the short units, where fine-tuning wasn't needed and a coarse representation was good enough to prove the process.

Nathan has explained in the science section what the new searches are...basically taking real data and using the application's search pattern to come up with populations that fit...he started short also, but with the server overload went to a more finely tuned approach.

Because this is all new, cutting-edge stuff, he can go to whatever level of fine-tuning he desires, because there currently is no data at any level of resolution. I am sure he can go back at any time to check various resolution levels...and this is just the 1st step, matching population searches to currently known data...once that is proved, the goal of predicting somewhat-known, unknown, and unmapped regions is in view...this IS OF DEFINITE VALUE!!!

Your naive attempts to attack the science and the decision as somehow 'not credible' and a 'waste' escape me. If they want to increase the resolution by a factor of 1000...fine with me. It's the learning process that is important here, not the 'factors'.

Your statement that computing power is being reduced by a factor of 30 or 60 is false...the computing power is the same...the number of results is reduced to find the 'sweet spot'.

I for one am very excited about the research done here at Milkyway@home, due to the science we are getting that helps explain how our galaxy is evolving...that, as well as the team's responsiveness to all issues, earns them my top resource share at the moment!

In the future, please read the science section and understand the subject matter before you post such drivel!

STE\/E

Joined: 29 Aug 07
Posts: 486
Credit: 576,523,040
RAC: 34,701
Message 4245 - Posted: 19 Jul 2008, 13:16:25 UTC - in response to Message 4216.  



And your question is................?

Eh?


LOL...

There were no questions; there was a request for four separate bits of info in paragraphs 2, 4, and 5. So...

1.) When might we see parameter-adjusted tasks?

2.) Will the 'official' new work going forward be one of the three types we have seen so far, or something else?

3.) A ballpark figure for the FPOPs estimate of the new work going forward.

4.) Thoughts about the new deadlines, so we can figure out what this means in terms of project tightness factor.

So there, now they are questions. I didn't know we had to post in 'Jeopardy' format! :-D

Alinator


LOL ... That's way more information than I need to know. Whatever happened to BOINC's 'set it and forget it' format? ... ;)

Personally, I attach my boxes to a project & go with the flow. If the WUs change & I had a box or boxes that couldn't keep up, then I'd move them to a project they could keep up on & not expect the project or projects to issue tailor-made WUs for my boxes' specific needs ... :)

Bill & Patsy

Joined: 7 Jul 08
Posts: 47
Credit: 13,629,944
RAC: 0
Message 4250 - Posted: 19 Jul 2008, 18:40:32 UTC - in response to Message 4244.  

Ok, Bill & Patsy...you are obviously new here, judging by that post...and your criticism.

--< snip >--

In the future, please read the science section and understand the subject matter before you post such drivel!

Thanks, Jeff, for your spirited defense of MW. That's nicely encouraging. Yes, I'm a newbie. And yes, my two postings to date have both been intentionally mildly provocative - in response to postings I found troubling. In shopping for a worthy project (even with a modest resource like mine, which in no way compares to your giant resources), it doesn't hurt to "test the waters". Both times I've received a spirited defense of MW. This is what I'm looking for, and it confirms that this project and its crunchers may actually be as good as they seem. Accordingly, MW has 3/4 of my modest resource committed to it.

Some points, fwiw.

I am a physicist, so I can perhaps assess things at least a little. So, OK, the project is in a validation stage. Your explanation of Travis' approach shows a solid method - start with a coarse check and then tune up going forward. But your explanation of what Nathan just did fits with the recent postings and my criticism ("drivel"?) - he certainly appears to have jumped the gun, going from a new coarse protocol directly to a very finely tuned protocol. You are correct that I'm not in a position to know whether that's scientifically driven and therefore cost-justified. But the circumstantial evidence based upon recent postings is that the decision was driven more by server considerations than science considerations, because a "scientific method" approach would be expected to have been a step-wise, incremental one (viz. Travis' approach), not a sudden jump at the beginning. Yes, he might have wanted to go ahead and probe the high resolution, but the postings suggest a different motivation, and to that extent it is fair to question if/why the science appears to have taken a back seat to the server.

'nuff said about the project's execution.

What I do appreciate, as I indicated, is your (in the plural) spirited support for MW, and your commitment to trust the project scientists and to stick with them. I accordingly plan to do likewise. It will be fun to follow the project's success.

Thanks.
--Bill


Odd-Rod

Joined: 7 Sep 07
Posts: 444
Credit: 5,712,451
RAC: 0
Message 4251 - Posted: 19 Jul 2008, 18:54:45 UTC - in response to Message 4250.  

But the circumstantial evidence based upon recent postings is that the decision was driven more by server considerations than science considerations, because a "scientific method" approach would be expected to have been a step-wise, incremental one (viz. Travis' approach), not a sudden jump at the beginning. Yes, he might have wanted to go ahead and probe the high resolution, but the postings suggest a different motivation, and to that extent it is fair to question if/why the science appears to have taken a back seat to the server.


I think we all agree (at least to some degree) that the jump certainly was too big. However, I must ask if one can truly say that the science is taking a back seat to the server. If the server problems are preventing the science from being done, or limiting how much is being done, then something does need to be done to improve that.

Let's hope a happy medium can be found.

Oh, and welcome to this project, Bill!

Regards
Rod

Nathan
Project scientist

Joined: 4 Oct 07
Posts: 43
Credit: 53,898
RAC: 0
Message 4421 - Posted: 23 Jul 2008, 16:14:15 UTC

I can attest that the science has not taken a back seat to the server; however, the science cannot get done without the server performing well. So without catering to the needs of the community and the server (at least with our current hardware), there would be no science getting done.

Yes, I will fully admit that I increased the times by too much with the first jump. However, I wasn't sure how much was a good amount. I chose the "go big and then scale back" approach as opposed to the "scale up slowly" approach. Was it correct? Maybe not; however, it instantly fixed a lot of the server issues. Thus, it was not all for naught.

As to the question of whether we get 30-60 times the science by increasing the length by 30-60: that is an entirely subjective question, and the only real answer I can give is that, from what we've seen, a factor-of-8 increase gives approximately one added digit of precision, so the initial WU increase (the 372... series) would give us about 2 more digits of precision in all the calculations.
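
Spelling that rule of thumb out: if each factor of 8 in runtime buys one decimal digit, then digits added = log(runtime factor) / log(8). The rule is ours; the rest is just log arithmetic:

    # Digits of precision gained for a given runtime increase (factor-of-8 rule):
    import math

    def digits_added(runtime_factor, per_digit_factor=8.0):
        return math.log(runtime_factor) / math.log(per_digit_factor)

    print(digits_added(64))   # ~64x longer WUs -> about 2.0 extra digits
    print(digits_added(30))   # a 30x increase  -> about 1.6 extra digits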

Is this worth it? I don't know yet from the science aspect, but as with any instrument it takes a lot of calibration and fine-tuning to get it working the way you need it to. Currently, I'm still doing a lot of this calibration, so the science can get done.
~Nate~

Thunder

Joined: 9 Jul 08
Posts: 85
Credit: 44,842,651
RAC: 0
Message 4593 - Posted: 31 Jul 2008, 20:18:10 UTC

Have we seen any adjustment to the estimated FPOPs value for new tasks?

I'd like to join a few more machines, but I'm not going to have them download a bunch of WU's that take hours when the estimated length is still a few minutes. :P

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 4695 - Posted: 14 Aug 2008, 18:30:06 UTC - in response to Message 4593.  

I'm not Nate so I can't talk too much about the science behind the increase -- but the whole point of the project is to be able to get a 3-dimensional model of the Milky Way galaxy, and we want that model to be as accurate as possible. The new WUs are just making the integral as accurate as needed to match the accuracy of the data. They're also running over new areas of the sky, which can have different sizes (which will also increase the amount of calculation being done).

The changes to WU length really had nothing to do with server load; we pretty much had those problems under control before Nate started doing the new length WUs.
