New Nbody Version 1.50
Author | Message |
---|---|
Joined: 8 Oct 08 Posts: 4 Credit: 125,962 RAC: 0 |
|
Joined: 22 Jun 11 Posts: 32 Credit: 41,852,496 RAC: 0 |
Hey Death, Not sure if you know this, but you are in the News message board, in an application release thread. If you have a problem that is not related, please post it elsewhere, such as the Number Crunching message board. Thanks, Jacob |
Joined: 4 Sep 12 Posts: 219 Credit: 456,474 RAC: 0 |
I'm also noticing the same behavior, on: I second Jacob's call for an 'expected' (developer's viewpoint) description of the runtime profile, in terms of CPU usage over time. But I would also urge users to monitor this new application with additional tools, not just BOINC Manager. I've just got back home after a few days away, and I won't even attempt to run one of these tasks (it will be under Windows) until later in the weekend. But what I've just read from several different users is a perfect description of what BOINC v7.4.xx is designed to display when no actual work at all is being reported by the science application. That might be because of an error in the progress reporting or checkpointing functions, or it might mean that nothing is being done. That gradual approach, getting closer and closer to 99.999999% done, but never quite reaching 100%, is exactly what you should see if an application stalls at startup and goes nowhere. |
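To illustrate the shape described above: a progress figure derived only from elapsed time, with no report from the application, can rise quickly at first and then creep toward 100% without getting there. The curve in this sketch, 1 - exp(-elapsed/estimate), is an assumed stand-in chosen for its shape. It is not taken from the BOINC source, and the function name and the one-hour estimate are made up for the example.

```python
import math


def synthetic_fraction_done(elapsed_s: float, estimated_s: float) -> float:
    """Illustrative time-based progress estimate that creeps toward 1.

    Assumption: this exponential form is a stand-in for the behaviour
    described in the post, not BOINC's actual formula.
    """
    return 1.0 - math.exp(-elapsed_s / estimated_s)


estimated = 3600.0  # hypothetical one-hour runtime estimate
for hours in (0.5, 1, 2, 5, 10):
    fraction = synthetic_fraction_done(hours * 3600, estimated)
    print(f"{hours:>4} h elapsed -> {fraction:.6%} done")
```

With a curve like this, the displayed percentage says nothing about real work: it keeps climbing even if the application never reports progress or writes a checkpoint.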
Joined: 22 Jun 11 Posts: 32 Credit: 41,852,496 RAC: 0 |
My task has now hit 24 hours, at 100%. Should I let it continue to run? And why or why not? Frustrated. |
Joined: 4 Sep 12 Posts: 219 Credit: 456,474 RAC: 0 |
"My task has now hit 24 hours, at 100%." What does Process Explorer say about what it's doing? |
Joined: 22 Jun 11 Posts: 32 Credit: 41,852,496 RAC: 0 |
Same as it has been, for a long time now: using 1 core, despite BOINC allocating 4. Task properties: Resources: 4 CPUs; CPU time at last checkpoint: --- (never any checkpoint); CPU time: 25:12:13; Elapsed time: 26:38:51; Estimated time remaining: ---; Fraction done: 100.000%. The times show that it has actually run single-threaded for the majority of the time, thus wasting 3 of my CPUs, not allowing BOINC to allocate them for other tasks. Admins: Should I let it continue to run? And why or why not? Frustrated. |
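The single-threaded conclusion follows from comparing the two times in the task properties: CPU time divided by elapsed time gives the average number of cores kept busy. A minimal sketch of that arithmetic with the figures quoted above; the function names are illustrative and not part of BOINC.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_cores_busy(cpu_time: str, elapsed_time: str) -> float:
    """Average number of cores kept busy over the whole run."""
    return to_seconds(cpu_time) / to_seconds(elapsed_time)


# Figures from the task properties in the post above.
cpu_time, elapsed_time, allocated = "25:12:13", "26:38:51", 4

busy = average_cores_busy(cpu_time, elapsed_time)
print(f"average cores busy: {busy:.2f} of {allocated}")
print(f"share of allocated CPU actually used: {busy / allocated:.1%}")
```

For these figures the result is a little under one core, which matches a task that ran single-threaded while four CPUs were reserved for it.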
Joined: 13 Mar 08 Posts: 804 Credit: 26,380,161 RAC: 0 |
I am sorry for the delay people are experiencing in getting assistance. I am working with Professor Newberg to improve communications. Here's something I received from Sidd: "It seems that it is only occurring for people with Windows. The other platforms seem to be running OK. We had some trouble getting the binaries for Windows in order to release. I am thinking something went wrong there, as none of the Windows runs are passing. I took down the Windows version for now until I figure it all out." |
Joined: 22 Jun 11 Posts: 32 Credit: 41,852,496 RAC: 0 |
Thank you, Blurf. We look forward to any more details that you or the admins can provide. I'm sorry that we sound so "complainy". We're frustrated. But in the end, we're here to help solve the problems we're seeing. If there's anything we can do to help, let us know. Thanks, Jacob |
Joined: 19 May 14 Posts: 73 Credit: 356,131 RAC: 0 |
Hey All, I apologize for the silence. I am looking into the problem. We had issues with getting the Windows version ready for release, and I believe something went wrong in the process. I ran those binaries on Windows and they seemed to work properly at the time. However, it seems that no Windows runs are working. The other versions of the code are OK so far and I am getting results on my end. If you have an Nbody task running and stalled, abort it for now. I deprecated the Windows version of the app, but I believe some tasks may still be sent out before it is completely down. This was an unexpected problem with the binaries, as they worked on our end, and I apologize for the issues you are having and the frustration incurred. Please be patient, as we are working on the issues. Thanks, Sidd |
Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
Just to provide some extra information on troubleshooting the N-body WUs... I'm running Linux Mint 17.1 (Rebecca) on a 5 core AMD with ATI 7850 card. Unlike previously, the N-body WUs are completing; however, I just checked and they are all "validation inconclusive." I also notice that it only processes 1 N-body WU at a time and says it is using 4 cores. This seems to apply ONLY to the MW WUs, since the Einstein GPU WUs are processing multiple WUs simultaneously and concurrently with the MW and say they're using the equivalent of 2 cores! Using 6 cores simultaneously on a 5 core CPU! Ya gotta love the math!! :) Regards, Steve |
Joined: 10 Nov 07 Posts: 96 Credit: 29,931,027 RAC: 0 |
I’m seeing similar behaviour on my MacBook Pro as has been reported on other platforms: Nbody claims to be using both CPUs, so my other tasks are Waiting, but Activity Monitor shows much less CPU usage than typical. I actually noticed this from abnormally low temperatures: I keep a close eye on this system because it has a dodgy fan, and it’s reading in the high 50s C instead of the usual low 60s. Progress so far—still pretty early—is consistent with CPU times. |
Joined: 10 Nov 07 Posts: 96 Credit: 29,931,027 RAC: 0 |
The task in question finished much sooner than projected (2.3 h* instead of about 10), and is now awaiting validation. Computer is back to normal, crunching (for other projects) on both CPUs. * Comparing the reported run time of 2:17 to the CPU time of 2:23:35 seems to imply an efficiency of 52.5%, or that only 5% of the second CPU’s capacity was used. |
Joined: 22 Jun 11 Posts: 32 Credit: 41,852,496 RAC: 0 |
On one of my machines, an N-Body 1.50 x64 task ran for 145 hours, single-threaded (despite saying 8-CPUs and wasting them), and without a single checkpoint... before I killed it. I hope that the server has been set to not resend these N-Body tasks to us Windows users. But, because of the lack of communication and transparency, I will have to turn it off in my web preferences. I'll probably forget to turn it back on. It's a shame that the user has to micromanage these. I just wanted to get that feedback out there. Frustrating. Jacob |
Joined: 11 Apr 15 Posts: 58 Credit: 63,291,127 RAC: 0 |
"Doesn't work correctly on my end. It runs fast in the beginning of the simulation and gets slower as it gets to the end and then just stalls at 100%, at which point it has to be aborted because it just hangs. All my other computers have the same issue with this simulation." Same issue here: it starts very sporty and then begins to lag more and more. Additionally, when BOINC is restarted, the application starts again at 0%. What a bummer! Aloha, Uli |
Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
I'm running Linux Mint. After an initial run of WUs that completed but got "validation inconclusive", I started having the old problem of asymptotic WUs (approaching infinitely close but never completing). I ended up aborting all of the WUs before I noticed that some were 1.50 and some were 1.50 (mt). I don't know if there was any pattern to the ones that just keep running and the ones that run but give the "validation inconclusive" message. I've delisted the N-body Simulation from my machines for now. Regards, Steve |
Joined: 20 Oct 13 Posts: 1 Credit: 296,017 RAC: 0 |
Strange that I received 12 of these "bad" N-body tasks when I already had the N-body tasks removed from my job preference list. Perhaps there was no other work available from the other applications. In any case, the offending N-body tasks on my Windows 7 (64-bit) have been aborted. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 1 |
"Strange that I received 12 of these "bad" N-body tasks when I already had the N-body tasks removed from my job preference list. Perhaps there was no other work available from the other applications. In any case, the offending N-body tasks on my Windows 7 (64-bit) have been aborted." Yes, that is another check box you must uncheck for right now. It is at the bottom of the same list where you unchecked getting the N-body units. |
Joined: 6 Oct 09 Posts: 3 Credit: 2,757,339 RAC: 0 |
This doesn't seem to be a Windows-only problem. I've currently got one on my Mac Mini that should have finished within an hour or so, but has now been running for over 10 hours, with the last 3 hours indicating 100% complete. This is the second time it has tried to run: since it isn't saving a checkpoint, the work unit restarted from 0% the last time the computer was shut down. It was running over 24 hours the first time. Also, looking at Activity Monitor, it doesn't look like it uses more than one processor when it claims it is using all 8. The ones I've watched that do work also seem strange: they slow down the closer they are to 100%, then go back near 0% and rapidly count up to completion and finish normally. I'll see how long I can keep this one going to see if it finishes, but if it goes too long I'll have to abort it since no other work units can run. |
Joined: 6 Oct 09 Posts: 3 Credit: 2,757,339 RAC: 0 |
The work unit from the previous message finally finished. It took 50.5 hours (about 50 hours longer than the original estimate), but it is finished. It has to be some sort of bug. It looks like they will finish eventually, but with no checkpoints you still run the risk of them starting over if the computer is restarted for any reason. |
Joined: 27 Oct 14 Posts: 9 Credit: 10,527,532 RAC: 0 |
Currently a Milkyway@Home project says Running (4 CPUs), but when viewing the system load only one core is running at 100% while the other three are idle. The core with 100% use switches every several seconds (load balancing or something, I'm assuming). Why is this process claiming it's using 4 CPUs but only using one? No other tasks can run while this one is, since it is claiming that it is going to be using all four cores. The tasks do complete successfully in about 25 minutes, though. BOINC Client: 7.4.23 BOINC Manager: 7.4.23 (x64) Operating System: Ubuntu 15.04 64-bit |
©2024 Astroinformatics Group