
Guidance from Project Team Requested

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4212 - Posted: 17 Jul 2008, 17:41:38 UTC
Last modified: 17 Jul 2008, 17:43:44 UTC

In some of the other threads here, strategies have been laid out for dealing with the new work as it stands and working around the operational issues it causes for hosts.

However, they all have their good points and bad points, and how long it will be before new parameters are set by the project makes a difference in which way to go, depending on individual circumstances. So an idea of when the adjusted searches will start would be handy for those of us who have tweaked MW on our hosts to accommodate other projects better, so we can make sure this won't cause other problems when they arrive.

If memory serves, once a work set is generated it's not an easy matter to make changes to it without summarily canceling it and starting over. So I assume that's not an option and we are going to continue running the current sets in the field before we see any changes.

Also, since we currently have 'short, medium, and long' ones to work on, it would be handy to know which one you are leaning towards going forward (medium, long, or something else).

Also, info on the range of FPOPs you're thinking about would be helpful, as well as what the new deadlines might be like. The latter will be important in evaluating how well MW will play with other projects at a given CI/work-cache setting.
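
To make that last point concrete, here is the rough feasibility arithmetic I have in mind. Every number below is made up purely for illustration:

    # Rough deadline-tightness check (all numbers hypothetical):
    runtime_h      = 12.0   # assumed per-WU runtime on a given host
    deadline_d     = 7.0    # assumed deadline, in days
    on_frac        = 0.5    # fraction of each day the host is on and crunching
    resource_share = 0.25   # fraction of crunch time MW gets on this host

    usable_h  = deadline_d * 24 * on_frac * resource_share
    tightness = runtime_h / usable_h
    print("tightness: %.2f" % tightness)   # near or above 1.0, the host risks missing the deadline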

TIA,

Alinator

voltron

Joined: 30 Mar 08
Posts: 50
Credit: 11,593,755
RAC: 0
Message 4213 - Posted: 17 Jul 2008, 18:08:10 UTC - in response to Message 4212.  

In some of the other threads here, strategies have been laid out for dealing with the new work as it stands and working around the operational issues it causes for hosts.

However, they all have their good points and bad points, and how long it will be before new parameters are set by the project makes a difference in which way to go, depending on individual circumstances. So an idea of when the adjusted searches will start would be handy for those of us who have tweaked MW on our hosts to accommodate other projects better, so we can make sure this won't cause other problems when they arrive.

If memory serves, once a work set is generated it's not an easy matter to make changes to it without summarily canceling it and starting over. So I assume that's not an option and we are going to continue running the current sets in the field before we see any changes.

Also, since we currently have 'short, medium, and long' ones to work on, it would be handy to know which one you are leaning towards going forward (medium, long, or something else).

Also, info on the range of FPOPs you're thinking about would be helpful, as well as what the new deadlines might be like. The latter will be important in evaluating how well MW will play with other projects at a given CI/work-cache setting.

TIA,

Alinator



And your question is................?

Eh?

Thunder

Joined: 9 Jul 08
Posts: 85
Credit: 44,842,651
RAC: 0
Message 4215 - Posted: 17 Jul 2008, 18:09:26 UTC

Thanks for posting this, Alinator! I agree these are pretty much the crucial questions/issues that need to be addressed.

I have 3 more machines that should be fairly high-productivity for MW, but since the recent changes caused 11 of 12 machines to go into 'panic mode' and run MW exclusively because they think they won't meet deadline, I'm holding off attaching them. :(

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4216 - Posted: 17 Jul 2008, 18:19:16 UTC - in response to Message 4213.  
Last modified: 17 Jul 2008, 19:08:05 UTC



And your question is................?

Eh?


LOL...

There were no questions; there was a request for four separate bits of info in paragraphs 2, 4, and 5. So...

1.) When might we see parameter-adjusted tasks?

2.) Will the 'official' new work going forward be one of the three types we have seen so far, or something else?

3.) A ballpark figure for the FPOPs estimate of the new work going forward.

4.) Thoughts about the new deadlines, so we can figure out what this means in terms of project tightness factor.

So there, now they are questions. I didn't know we had to post in 'Jeopardy' format! :-D

Alinator

Nathan
Project scientist

Joined: 4 Oct 07
Posts: 43
Credit: 53,898
RAC: 0
Message 4217 - Posted: 17 Jul 2008, 18:22:30 UTC

After seeing some of the times, I'm thinking that the "medium" length ones will probably work out the best for everyone, so figure on timings roughly around what you're getting with the 373... series.

As for the rest, I got my times wrong and Travis is flying back today, so tonight or tomorrow we should have some of the timing issues fixed. I need to talk to him and confirm things, but what I've been thinking is upping the deadline to about a week (this way we get results back on a reasonable schedule) and reducing the number of max work units (so you can actually finish all that you get). If this sounds unreasonable in any way, please let us know. Also, if you can suggest a good number of WUs given the new runtimes associated with the 373 series, it would be very helpful.
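
For anyone sanity-checking a suggestion, the back-of-the-envelope I'm picturing looks like this. The runtime and core count here are placeholders, not measurements:

    # Back-of-the-envelope WU cap check (placeholder numbers):
    runtime_h  = 3.0    # assumed runtime of a medium (373...) WU on one core
    deadline_d = 7.0    # the proposed one-week deadline
    cores      = 4

    wus_per_day = cores * 24 / runtime_h      # what the host can finish per day
    capacity    = wus_per_day * deadline_d    # ceiling before deadlines slip
    print(wus_per_day, capacity)              # a sane cap sits well below this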

~Nate~

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4219 - Posted: 17 Jul 2008, 19:19:23 UTC - in response to Message 4217.  
Last modified: 17 Jul 2008, 19:41:25 UTC

After seeing some of the times, I'm thinking that the "medium" length ones will probably work out the best for everyone, so figure on timings roughly around what you're getting with the 373... series.

As for the rest, I got my times wrong and Travis is flying back today, so tonight or tomorrow we should have some of the timing issues fixed. I need to talk to him and confirm things, but what I've been thinking is upping the deadline to about a week (this way we get results back on a reasonable schedule) and reducing the number of max work units (so you can actually finish all that you get). If this sounds unreasonable in any way, please let us know. Also, if you can suggest a good number of WUs given the new runtimes associated with the 373 series, it would be very helpful.



Thanks for the preliminary info.

Taking a first pass at the new proposed deadlines, I would think you might be able to limit the incremental work-request allotment to ten as a test. My understanding is this would help make the project more efficient from a science POV.

My thinking is that increasing the deadline by only two days is nowhere near proportional to the thirtyfold increase in runtime, so MW becomes much more of a tight-deadline project; the BOINC debt system would still be able to handle resource allocation on multi-project hosts, and it should still slow needless pestering of the project for work from MW-primary hosts.

Also, you might want to think about cutting the max quota some. As it is with the medium-length work, I doubt there's a computer on Earth that can burn through 700 per core per day! ;-)

This will also help limit the 'damage' rogue hosts can cause.

<edit> BTW, based on the proposal of medium-length work as the new 'gold standard', I'd say that bumping your current TDCF (task duration correction factor) by a factor of fifty is the best compromise for a hands-off 'kludge' fix until the new run parameters are set by the project. The beauty of that is you won't have to do anything to 'back out' of the fix. The only side effect is that you might carry somewhat less in the cache for a while as the TDCF corrects for the new parameters.
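
To illustrate what that bump does, here's a sketch of the idea only; the client's actual correction rule differs in detail:

    # TDCF sketch (illustrative smoothing rule, not the actual BOINC client code):
    est_runtime_s    = 500.0     # stale estimate based on the old FPOPs value
    actual_runtime_s = 25000.0   # hypothetical real runtime of a medium WU (~50x)

    dcf = 50.0                   # the suggested one-time manual bump
    print(est_runtime_s * dcf)   # 25000 s: the estimate matches reality at once

    dcf = 1.0                    # left alone, the DCF only creeps toward ~50
    for _ in range(5):           # one step per completed result
        ratio = actual_runtime_s / (est_runtime_s * dcf)
        dcf *= 0.9 + 0.1 * ratio
        print("DCF now %.1f" % dcf)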

Alinator

Idefix

Joined: 19 Apr 08
Posts: 7
Credit: 3,067
RAC: 0
Message 4221 - Posted: 17 Jul 2008, 19:26:17 UTC - in response to Message 4217.  

Hi,

but what I've been thinking is upping the deadline to about a week

My old P3 laptop will be forced into retirement, because it won't be able to finish the work units in time (4 hrs uptime per day, sometimes less, 50% CPU usage).

Regards,
Carsten

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4222 - Posted: 17 Jul 2008, 19:32:46 UTC - in response to Message 4221.  
Last modified: 17 Jul 2008, 19:33:20 UTC

Hi,

but what I've been thinking is upping the deadline to about a week

My old P3 laptop will be forced into retirement, because it won't be able to finish the work units in time (4 hrs uptime per day, sometimes less, 50% CPU usage).

Regards,
Carsten


Agreed; old-timers will have to run pretty much 24/7, flat out (except for 'casual' other usage), to be able to participate at 7 days.

OTOH, MW has somewhat special time requirements compared to most other projects, but we all already knew that. ;-)

Alinator

Thunder

Joined: 9 Jul 08
Posts: 85
Credit: 44,842,651
RAC: 0
Message 4224 - Posted: 17 Jul 2008, 19:37:00 UTC - in response to Message 4217.  

...what I've been thinking is upping the deadline to about a week (this way we get results back on a reasonable schedule) and reducing the number of max work units (so you can actually finish all that you get). If this sounds unreasonable in any way, please let us know. Also, if you can suggest a good number of WUs given the new runtimes associated with the 373 series, it would be very helpful.


I think the week deadline sounds fine.

The number of max work units should really be determined by the science needs and nothing else. I've seen it mentioned that the type of 'genetic' algorithm you're using means that if a machine downloads a huge list of WUs, even if it completes them relatively quickly, the parameters for new WUs risk going off in a completely different direction before they work their way down the list. I think Travis mentioned that 16 would be better for the science, so go with that.

If you've set (in BOINC) a reasonably accurate estimated time for the WUs (someone help me here, but I believe it's actually estimated FLOPS) for the type of work units you're sending out, then the clients should only be getting what they need (for probably 90% of the clients out there, that means they'll download only what they can finish in about 1 day). If they can do 5 WUs in a day, they'll get 5... if they can do 2, they get 2, etc.
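
Loosely, my understanding of the work fetch is something like this (a sketch of the behavior, not the client's actual code):

    # Loose sketch of BOINC-style work fetch (my understanding only):
    def wus_fetched(cache_days, est_runtime_h, cores=2, on_frac=1.0):
        wanted_s = cache_days * 86400 * on_frac * cores   # buffer the host asks for
        return int(wanted_s // (est_runtime_h * 3600))    # whole WUs that fit

    print(wus_fetched(1.0, 5.0))   # 1-day cache of 5-hour WUs on 2 cores -> 9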

Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 4226 - Posted: 17 Jul 2008, 19:44:37 UTC

Don't know if you saw it in the other thread.

The runtime estimate is calculated by the core client from the FPOPs estimate divided by the floating-point benchmark (IIRC).
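
In code form it's something like the following; the numbers are hypothetical, and IIRC the relevant BOINC fields are named rsc_fpops_est and p_fpops:

    # The estimate described above, as a one-liner (sketch):
    def estimated_runtime_s(rsc_fpops_est, p_fpops):
        return rsc_fpops_est / p_fpops   # claimed FPOPs / measured FLOPS benchmark

    print(estimated_runtime_s(1.0e12, 2.0e9))   # -> 500.0 seconds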

JohnMD

Joined: 11 Jul 08
Posts: 13
Credit: 10,015,444
RAC: 0
Message 4229 - Posted: 17 Jul 2008, 21:42:59 UTC - in response to Message 4217.  

After seeing some of the times, I'm thinking that the "medium" length ones will probably work out the best for everyone, so figure on timings roughly around what you're getting with the 373... series.

As for the rest, I got my times wrong and Travis is flying back today, so tonight or tomorrow we should have some of the timing issues fixed.


It sounds like you know pretty well how many cycles are required for each unit. Whether they last 12 minutes or 12 hours on my P4 doesn't bother me, as long as those cycles get translated into realistic (say, within a factor of 2?) runtime estimates.

One thing that does bother me is that the project can apparently choose unit sizes quite arbitrarily - are they getting 60 times the science from 12-hour units compared with 12-minute ones?

Thunder

Joined: 9 Jul 08
Posts: 85
Credit: 44,842,651
RAC: 0
Message 4230 - Posted: 17 Jul 2008, 21:49:12 UTC - in response to Message 4229.  

One thing that does bother me is that the project can apparently choose unit sizes quite arbitrarily - are they getting 60 times the science from 12-hour units compared with 12-minute ones?


They're able to increase the accuracy of the results with the increased run time.

To quote Nathan, one of the project scientists:

I increased the time by increasing the accuracy with which we do the integral calculation over the wedge volume.


If I understand right, the client is performing more iterations of the same calculation, which results in greater accuracy.

So yes, the extra time spent has a definite value. :)
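
By way of analogy only (this toy is mine, not the project's integrator), tightening a numerical integral means more sample points, hence more runtime and more correct digits:

    # Toy example: midpoint-rule integration of sin(x) over [0, pi] (exact value 2).
    import math

    def midpoint_integral(f, a, b, n):
        h = (b - a) / n
        return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

    for n in (10, 100, 1000):
        err = abs(midpoint_integral(math.sin, 0.0, math.pi, n) - 2.0)
        print(n, err)   # each 10x in points buys ~100x accuracy for this rule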

Bill & Patsy

Joined: 7 Jul 08
Posts: 47
Credit: 13,629,944
RAC: 0
Message 4243 - Posted: 19 Jul 2008, 5:57:41 UTC - in response to Message 4230.  
Last modified: 19 Jul 2008, 6:22:28 UTC

...So yes, the extra time spent has a definite value. :)

Oh really! What is that "definite value"? If the computation time is increased by a factor of 30 or 60 (for example), is the "definite value" increased by 30 or 60? And if the computation time is increased by a factor of 30 or 60, the available computing resources (in terms of throughput) for Milky Way have been reduced by a factor of 30 or 60. What is the "definite value" of reducing the computing resources by 30 or 60?

Is the increased accuracy 30 or 60 times more valuable? It seems highly unlikely, or the project scientists would have increased the accuracy long ago. That they didn't do so before shows that your so-called "definite value" is small indeed.

A modest increase in accuracy could have been credible. But be clear about what's actually happening here. It's not much about the science. Rather, resources are now being largely wasted and accuracy is being "tuned" in order to relieve pressure on the server. There's little "definite value" in that.
--Bill


Jayargh

Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0
Message 4244 - Posted: 19 Jul 2008, 6:56:21 UTC - in response to Message 4243.  
Last modified: 19 Jul 2008, 7:46:28 UTC

...So yes, the extra time spent has a definite value. :)

Oh really! What is that "definite value"? If the computation time is increased by a factor of 30 or 60 (for example), is the "definite value" increased by 30 or 60? And if the computation time is increased by a factor of 30 or 60, the available computing resources (in terms of throughput) for Milky Way have been reduced by a factor of 30 or 60. What is the "definite value" of reducing the computing resources by 30 or 60?

Is the increased accuracy 30 or 60 times more valuable? It seems highly unlikely, or the project scientists would have increased the accuracy long ago. That they didn't do so before shows that your so-called "definite value" is small indeed.

A modest increase in accuracy could have been credible. But be clear about what's actually happening here. It's not much about the science. Rather, resources are now being largely wasted and accuracy is being "tuned" in order to relieve pressure on the server. There's little "definite value" in that.



Ok, Bill & Patsy...you are obviously new here, judging by that post...and your criticism.

For about 9 months now we have been running genetic searches built by Travis to show that the application works, and papers have been written about the process of the searches. These have been the short units, where fine-tuning wasn't needed and a coarse representation was good enough to prove the process.

Nathan has explained in the science section what the new searches are...basically taking real data and using the application's search pattern to come up with populations that fit...he started short also, but with the server overload went to a more finely tuned approach.

Because this is all new, cutting-edge stuff, he can go to whatever level of fine-tuning he desires, because there currently is no data at any level of resolution. I am sure he can go back at any time to check various resolution levels...and this is just the 1st step, matching population searches to currently known data...once that is proved, the goal of predicting somewhat-known, unknown, and unmapped regions is in view...this IS OF DEFINITE VALUE!!!

Your naive attempts to attack the science and the decision as somehow 'not credible' and a 'waste' escape me. If they want to increase the resolution by a factor of 1000...fine with me. It's the learning process that is important here, not the 'factors'.

Your statement that computing power is being reduced by a factor of 30 or 60 is false...the computing power is the same...the number of results is reduced to find the 'sweet spot'.

I for one am very excited about the research done here at Milkyway@home, due to the science we are getting that helps explain how our galaxy is evolving...that, as well as the team's responsiveness to all issues, earns them my top resource share at the moment!

In the future, please read the science section and understand the subject matter before you post such drivel!

STE\/E

Joined: 29 Aug 07
Posts: 486
Credit: 576,523,040
RAC: 34,701
Message 4245 - Posted: 19 Jul 2008, 13:16:25 UTC - in response to Message 4216.  



And your question is................?

Eh?


LOL...

There were no questions; there was a request for four separate bits of info in paragraphs 2, 4, and 5. So...

1.) When might we see parameter-adjusted tasks?

2.) Will the 'official' new work going forward be one of the three types we have seen so far, or something else?

3.) A ballpark figure for the FPOPs estimate of the new work going forward.

4.) Thoughts about the new deadlines, so we can figure out what this means in terms of project tightness factor.

So there, now they are questions. I didn't know we had to post in 'Jeopardy' format! :-D

Alinator


LOL ... That's way more information than I need to know. Whatever happened to BOINC's 'set it and forget it' format? ... ;)

Personally, I attach my boxes to a project & go with the flow. If the WUs change & I had a box or boxes that couldn't keep up, then I'd move them to a project they could keep up on & not expect the project or projects to issue tailor-made WUs for my boxes' specific needs ... :)

Bill & Patsy

Joined: 7 Jul 08
Posts: 47
Credit: 13,629,944
RAC: 0
Message 4250 - Posted: 19 Jul 2008, 18:40:32 UTC - in response to Message 4244.  

Ok, Bill & Patsy...you are obviously new here, judging by that post...and your criticism.

--< snip >--

In the future, please read the science section and understand the subject matter before you post such drivel!

Thanks, Jeff, for your spirited defense of MW. That's nicely encouraging. Yes, I'm a newbie. And yes, my two postings to date have both been intentionally mildly provocative - in response to postings I found troubling. In shopping for a worthy project (even with a modest resource like mine, which in no way compares to your giant resources), it doesn't hurt to "test the waters". Both times I've received a spirited defense of MW. This is what I'm looking for, and it confirms that this project and its crunchers may actually be as good as they seem. Accordingly, MW has 3/4 of my modest resource committed to it.

Some points, fwiw.

I am a physicist, so I can perhaps assess things at least a little. So, OK, the project is in a validation stage. Your explanation of Travis' approach shows a solid method - start with a coarse check and then tune up going forward. But your explanation of what Nathan just did fits with the recent postings and my criticism ("drivel"?) - he certainly appears to have jumped the gun, going from a new coarse protocol directly to a very finely tuned protocol. You are correct that I'm not in a position to know whether that's scientifically driven and therefore cost-justified. But the circumstantial evidence based upon recent postings is that the decision was driven more by server considerations than science considerations, because a "scientific method" approach would be expected to have been a step-wise, incremental one (viz. Travis' approach), not a sudden jump at the beginning. Yes, he might have wanted to go ahead and probe the high resolution, but the postings suggest a different motivation, and to that extent it is fair to question if/why the science appears to have taken a back seat to the server.

'nuff said about the project's execution.

What I do appreciate, as I indicated, is your (in the plural) spirited support for MW, and your commitment to trust the project scientists and to stick with them. I accordingly plan to do likewise. It will be fun to follow the project's success.

Thanks.
--Bill


Odd-Rod

Joined: 7 Sep 07
Posts: 444
Credit: 5,712,451
RAC: 0
Message 4251 - Posted: 19 Jul 2008, 18:54:45 UTC - in response to Message 4250.  

But the circumstantial evidence based upon recent postings is that the decision was driven more by server considerations than science considerations, because a "scientific method" approach would be expected to have been a step-wise, incremental one (viz. Travis' approach), not a sudden jump at the beginning. Yes, he might have wanted to go ahead and probe the high resolution, but the postings suggest a different motivation, and to that extent it is fair to question if/why the science appears to have taken a back seat to the server.


I think we all agree (at least to some degree) that the jump certainly was too big. However, I must ask if one can truly say that the science is taking a back seat to the server. If the server problems are preventing the science from being done, or limiting how much is being done, then something does need to be done to improve that.

Let's hope a happy medium can be found.

Oh, and welcome to this project, Bill!

Regards
Rod

Nathan
Project scientist

Joined: 4 Oct 07
Posts: 43
Credit: 53,898
RAC: 0
Message 4421 - Posted: 23 Jul 2008, 16:14:15 UTC

I can attest that the science has not taken a back seat to the server; however, the science cannot get done without the server performing well. So without catering to the needs of the community and the server (at least with our current hardware), there would be no science getting done.

Yes, I will fully admit that I increased the times by too much with the first jump. However, I wasn't sure how much was a good amount. I chose the "go big and then scale back" approach as opposed to the "scale up slowly" approach. Was it correct? Maybe not; however, it instantly fixed a lot of the server issues. Thus, it was not all for naught.

As to the question of whether we get 30-60 times the science by increasing the length by 30-60: that is an entirely subjective question, and the only real answer I can give is that, from what we've seen, a factor-of-8 increase gives approximately one added digit of precision, so the initial WU increase (the 372... series) would give us about 2 more digits of precision in all the calculations.
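
Spelling that rule of thumb out: if each factor of 8 in runtime buys one decimal digit, then digits added = log(runtime factor) / log(8). The rule is ours; the rest is just log arithmetic:

    # Digits of precision gained for a given runtime increase (factor-of-8 rule):
    import math

    def digits_added(runtime_factor, per_digit_factor=8.0):
        return math.log(runtime_factor) / math.log(per_digit_factor)

    print(digits_added(64))   # ~64x longer WUs -> about 2.0 extra digits
    print(digits_added(30))   # a 30x increase  -> about 1.6 extra digits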

Is this worth it? I don't know yet from the science aspect, but as with any instrument it takes a lot of calibration and fine-tuning to get it working the way you need it to. Currently, I'm still doing a lot of this calibration, so the science can get done.
~Nate~

Thunder

Joined: 9 Jul 08
Posts: 85
Credit: 44,842,651
RAC: 0
Message 4593 - Posted: 31 Jul 2008, 20:18:10 UTC

Have we seen any adjustment to the estimated FPOPs value for new tasks?

I'd like to join a few more machines, but I'm not going to have them download a bunch of WU's that take hours when the estimated length is still a few minutes. :P

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 4695 - Posted: 14 Aug 2008, 18:30:06 UTC - in response to Message 4593.  

I'm not Nate so I can't talk too much about the science behind the increase -- but the whole point of the project is to be able to get a 3-dimensional model of the Milky Way galaxy, and we want that model to be as accurate as possible. The new WUs are just making the integral as accurate as needed to match the accuracy of the data. They're also running over new areas of the sky, which can have different sizes (which will also increase the amount of calculation being done).

The changes to WU length really had nothing to do with server load; we pretty much had those problems under control before Nate started doing the new length WUs.
