Welcome to MilkyWay@home

N-Body tune initial replication value

computezrmle

Joined: 12 Dec 11
Posts: 8
Credit: 291,797,907
RAC: 6,069
Message 76107 - Posted: 1 Jul 2023, 7:54:39 UTC

N-Body simulations seem to require at least 2 results being returned to get validated.
Most of the WU details I checked show that the initial replication is set to 1 and the 2nd result is not sent out before the 1st result has been reported back.

Wouldn't it be more efficient to set the initial replication to "2"?
This would allow 2 wingmen to process a WU concurrently.
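
A rough sketch of the time saving in question, assuming a quorum of 2 and purely hypothetical turnaround times (the function names and the numbers are illustrative and not taken from the project):

```python
def time_to_quorum_sequential(turnaround_days, quorum=2):
    """Initial replication 1: each copy is only sent out after the
    previous result has been reported, so the waits add up."""
    return sum(turnaround_days[:quorum])


def time_to_quorum_concurrent(turnaround_days, quorum=2):
    """Initial replication 2: all copies go out at once, so validation
    only waits for the slowest of them."""
    return max(turnaround_days[:quorum])


# Hypothetical turnaround times (in days) of two wingmen.
turnarounds = [1.0, 2.0]
print(time_to_quorum_sequential(turnarounds))
print(time_to_quorum_concurrent(turnarounds))
```

The concurrent case only pays off if a second result is really needed for every WU. If a single result were ever accepted on its own, the second copy would be redundant work.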
mikey

Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76108 - Posted: 1 Jul 2023, 9:35:48 UTC - in response to Message 76107.  

N-Body simulations seem to require at least 2 results being returned to get validated.
Most of the WU details I checked show that the initial replication is set to 1 and the 2nd result is not sent out before the 1st result has been reported back.

Wouldn't it be more efficient to set the initial replication to "2"?
This would allow 2 wingmen to process a WU concurrently.


Once your PC returns about 10 valid tasks in a row, it won't need a wingman anymore, because the project will then treat its results as trustworthy. Every once in a while you will still get a task that requires a wingman, and if that one fails you start over until you have 10 valid tasks in a row again. Each task also has a guesstimated range for the values they are looking for. No project gives out those details precisely, but as long as your result is within the range it will be considered valid under the conditions listed above; if it's outside, it's back to square one. In the end, I think most tasks do not need a wingman as long as your PC isn't pushing the limits.
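
The behaviour described above could be sketched roughly like this. It is only an illustration of the description in this post, not BOINC's actual code: the threshold of 10 comes from the "about 10" above, while the class name, the `spot_check_rate` parameter, and its default value are assumptions.

```python
import random

TRUST_THRESHOLD = 10  # "about 10 valid tasks in a row" (approximate)


class HostTrust:
    """Tracks consecutive valid results for one host (hypothetical helper)."""

    def __init__(self):
        self.consecutive_valid = 0

    def record(self, valid):
        # A valid result extends the streak; an invalid one resets it.
        self.consecutive_valid = self.consecutive_valid + 1 if valid else 0

    def needs_wingman(self, spot_check_rate=0.1, rng=random.random):
        # Untrusted hosts always get a wingman.
        if self.consecutive_valid < TRUST_THRESHOLD:
            return True
        # Trusted hosts are still checked once in a while.
        # spot_check_rate is an assumed value, not a project setting.
        return rng() < spot_check_rate


host = HostTrust()
for _ in range(TRUST_THRESHOLD):
    host.record(True)
print(host.needs_wingman())  # usually False once the host is trusted
```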
computezrmle

Joined: 12 Dec 11
Posts: 8
Credit: 291,797,907
RAC: 6,069
Message 76109 - Posted: 1 Jul 2023, 10:26:29 UTC - in response to Message 76108.  

I have not yet found a single valid N-Body WU with fewer than 2 results.
Could anybody post a link to an example with only 1 result?


From my 8-core i7 (tweaked to 2 cores) https://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=994185:
Application details for host 994185
Milkyway@home N-Body Simulation 1.82 x86_64-pc-linux-gnu (mt)
Number of tasks completed: 23
Max tasks per day: 20023
Number of tasks today: 2
Consecutive valid tasks: 23
Average processing rate: 1.72 GFLOPS
Average turnaround time: 0.29 days

It looks like the "Max tasks per day" value was optimized for Separation and is now far too high for N-Body.
alanb1951

Joined: 16 Mar 10
Posts: 213
Credit: 108,362,353
RAC: 4,513
Message 76122 - Posted: 2 Jul 2023, 8:41:17 UTC
Last modified: 2 Jul 2023, 8:42:54 UTC

Mikey described BOINC's built-in Adaptive Replication mechanism, which appears to be in use for N-Body and was in use for Separation. Around the time of the big server crash, both N-Body and Separation had difficulties with Adaptive Replication (possibly caused by delays in initial results coming back?). I don't think I have seen an N-Body task pass the adaptive replication test since (if they ever did before!), not that I've been checking every task :-)

Typical requirements for a project to use Adaptive Replication are either that there is a multitude of tasks with almost exactly the same parameters (so statistical variations are allowed for) or that a result doesn't need extremely high precision anyway. Some WCG projects may fit the latter, and I suspect MilkyWay projects use the former approach!

MilkyWay projects use a back-end toolkit called [I think] Toolkit for Asynchronous Optimization. If I understand the little of the code I looked at, the customized BOINC validator insists on a wingman (even in AR-permitted conditions) if a result is for a parameter grouping that hasn't already validated a work unit -- I'm guessing that there are situations in which every workunit ends up needing a wingman anyway, and it may be that N-Body is in that situation.
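
To illustrate that guess, here is a minimal sketch of such a rule. It is not the project's validator code: the function names and the idea of a "parameter group" key are hypothetical stand-ins for whatever the real validator uses.

```python
def wingman_required(host_trusted, param_group, validated_groups):
    """Return True if a second result is needed under the guessed rule."""
    if not host_trusted:
        return True  # ordinary adaptive replication: untrusted host
    # Even a trusted host needs a wingman for a not-yet-validated group.
    return param_group not in validated_groups


def count_wingmen(param_groups, host_trusted=True):
    """Count how many workunits in a sequence would need a wingman."""
    validated = set()
    needed = 0
    for group in param_groups:
        if wingman_required(host_trusted, group, validated):
            needed += 1
        validated.add(group)  # assume the workunit eventually validates
    return needed


print(count_wingmen(["a", "a", "a"]))  # shared parameter group
print(count_wingmen(["a", "b", "c"]))  # every workunit in its own group
```

If every workunit falls into its own parameter group, as in the second call, every one of them needs a wingman no matter how trusted the host is.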

If that really is the situation, it might be better if they altered the project configuration to not use Adaptive Replication (if that doesn't break the custom validator!)

Hope this is of interest - Al.


©2024 Astroinformatics Group