n-body WU heading for a 7-day deadline. Bug in n-body app regarding the <ncpus>N</ncpus> config setting?
Posted 17 Jan 2016 by marmot
Reprinted from the n-body release thread, where it received no responses.
I have a 6-core n-body WU that has run for 3 days and 10 hours and is only at 48% completion, which extrapolates to roughly 7 days in total. The other 6-core n-body WUs in the queue report estimates of 38 to 58 minutes.
The questions I'm seeking answers to again are:
1) Is a 7-day run time possible?
2) Will this 7-day WU be granted an appropriate amount of credit?
3) Should I abort this WU?
Message boards :
Nbody Release 1.54
Posted 16 Jan 2016 by marmot
I have a 6-core n-body WU that has run for 3 days and 10 hours and is only at 48% completion. The other 6-core n-body WUs in the queue report estimates of 38 to 58 minutes. Is a 7-day run time possible? Will this 7-day WU be granted an appropriate amount of credit? Should I abort this WU?
I can't determine what is normal because your server deletes my results so quickly. It would be much appreciated if you would keep 2 weeks of results in our account history so we can get an idea of when packets failed, which machines are under-performing, or whether a WU app is behaving badly.
Besides this extremely long calculation time, I've noticed that sometimes an 8-core n-body WU will be running alongside another 2 WUs from other projects, even though BOINC thinks this machine has only 8 cores. It seems the n-body WU doesn't suspend when BOINC does a WU switch-over every 30 minutes.
My configuration is probably rare. Many of my machines have the cc_config.xml option <ncpus>N</ncpus> set, where N is 2 higher than the actual number of system cores. It's the only solution that actually fixes the work fetch anomaly where BOINC's debt/work-fetch algorithm idles a core (or 2) so that a high-resource-share project with no current WUs has a core ready to go. I see this work fetch problem on many of my machines that have Citizen Grid (or a few other intermittent projects) set to a resource share of 99 while the 6 or 8 other projects are set to 20 or less. With this setting, all the real cores are kept working 24/7, and when an intermittent high-priority project actually gets work, the extra BOINC virtual cores take that WU and the OS deals with the extra thread sharing.
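For reference, here is a minimal sketch of the kind of cc_config.xml I mean. The value 8 is only an illustration, assuming a machine with 6 real cores; each machine would use its own core count plus 2.

```xml
<!-- cc_config.xml in the BOINC data directory (illustrative sketch) -->
<cc_config>
  <options>
    <!-- Assumed example: 6 real cores + 2 extra, so BOINC schedules as if 8 CPUs exist -->
    <ncpus>8</ncpus>
  </options>
</cc_config>
```

The client has to re-read its config files (or be restarted) before the new core count takes effect.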
Is setting <ncpus>N</ncpus> higher than the real number of CPUs an issue for the n-body app?