Welcome to MilkyWay@home

Multi-threaded N-body is back


Eric Mendelsohn
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 21 Aug 18
Posts: 31
Credit: 4,400,789
RAC: 2,975
Message 69049 - Posted: 17 Sep 2019, 19:56:29 UTC

Hello everyone,

We decided to once again provide support for the milkyway_nbody multi-threaded application. If you discover any issues with the new application, please do not hesitate to contact us so that we may expeditiously resolve them. Thank you all for your continued support.

-Eric
Jim1348

Joined: 9 Jul 17
Posts: 61
Credit: 9,671,226
RAC: 6,498
Message 69054 - Posted: 18 Sep 2019, 19:40:07 UTC - in response to Message 69049.  
Last modified: 18 Sep 2019, 19:40:16 UTC

I request that you do one or the other, but not both, for the reasons previously discussed concerning the BOINC scheduler.
(I don't see any way to choose anything yet.)
Phoenix

Joined: 5 Feb 11
Posts: 3
Credit: 1,508,462
RAC: 1,621
Message 69107 - Posted: 22 Sep 2019, 19:52:01 UTC

I have tried two of the new jobs, and both went off the rails.
I am trying a third one; then I will give up and do other projects for a while.
Eric Mendelsohn
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 21 Aug 18
Posts: 31
Credit: 4,400,789
RAC: 2,975
Message 69119 - Posted: 24 Sep 2019, 17:53:44 UTC - in response to Message 69107.  

Could you please clarify what you meant by "went off the rails"? Are they spitting out errors? Is the run time too long?
Bam

Joined: 2 Dec 16
Posts: 2
Credit: 36,533,379
RAC: 20,865
Message 69178 - Posted: 19 Oct 2019, 5:06:16 UTC - in response to Message 69119.  

Run time for me appears to be never-ending. After a while the percentage done stops advancing and the estimated time remaining starts climbing. Runs that were supposed to finish in 8 hours on 8 CPUs are still running after 2 days with over 1 day estimated time to completion. I've aborted anything estimated at over 5 hours to see if the shorter ones will complete.
Bam

Joined: 2 Dec 16
Posts: 2
Credit: 36,533,379
RAC: 20,865
Message 69179 - Posted: 20 Oct 2019, 2:43:01 UTC - in response to Message 69178.  

One work unit estimated at under 4 hours is still running after 7:17 hours and stuck at 19.729% completed, estimated 1d 05:31 to completion (and climbing).

As far as I can tell my system has not completed a single non-GPU work unit since I restarted processing a week ago.
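The "climbing" estimate is what one would expect from any estimator that extrapolates from the fraction done. Below is a minimal sketch, assuming a plain linear extrapolation (remaining = elapsed × (1 − f) / f); the function name `naive_remaining_hours` is hypothetical, and BOINC's real estimator may blend in other inputs. With the figures above (7:17 elapsed, 19.729% done) it gives roughly 29.6 hours, close to the 1d 05:31 reported, and the estimate grows for every hour that passes without progress.

```python
def naive_remaining_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation: remaining = elapsed * (1 - f) / f.

    Assumption: progress is treated as linear in time. This is an
    illustrative estimator, not BOINC's actual code.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1.0 - fraction_done) / fraction_done


elapsed = 7 + 17 / 60   # 7:17 elapsed, taken from the post
stuck_at = 0.19729      # 19.729% complete, taken from the post

now = naive_remaining_hours(elapsed, stuck_at)
# One more hour of wall time with the fraction done unchanged (a stalled task).
later = naive_remaining_hours(elapsed + 1.0, stuck_at)

print(f"remaining now:   {now:.1f} h")
print(f"remaining later: {later:.1f} h")
```

So a rising time-to-completion by itself mostly says that the fraction done has stopped advancing; it does not show whether the task is still computing.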
earthbilly

Joined: 1 Dec 18
Posts: 4
Credit: 125,792,217
RAC: 217,775
Message 69180 - Posted: 21 Oct 2019, 21:22:18 UTC

Every CPU task finishes without problems for me. CPU tasks seem to switch between multiple single-processor tasks and one multi-processor task without any problems at all on all five workstations as they progress down the task list.
Ken Penland

Joined: 18 Jun 19
Posts: 1
Credit: 11,564,378
RAC: 47,626
Message 69284 - Posted: 22 Nov 2019, 12:28:49 UTC

For me, the multi-threaded jobs say they will take 1-3 hours with 7 CPUs. However, they have a really hard time finishing. The current job has been running for 3:44 with an estimated 3:26 to go, but the estimated time is counting up instead of down. I don't have anything else running on my computer except for this browser. I have had to abort a ton of jobs as they pass their deadline. Lots of wasted CPU, it seems.
Eric Mendelsohn
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 21 Aug 18
Posts: 31
Credit: 4,400,789
RAC: 2,975
Message 69310 - Posted: 26 Nov 2019, 2:18:26 UTC - in response to Message 69284.  

From what I'm seeing in my BOINC manager, the multi-threaded nbody application takes about 4 hours to complete on 8 CPUs. These simulations take so long for two reasons:
- We are using 40,000 bodies instead of 20,000. This is the minimum number of bodies we require to ensure the random seed does not drastically affect the final state of the nbody simulation. Our N-body algorithm is O(n log n), so doubling the body count makes runs take about 2.14 times longer than before.
- MilkyWay@home is optimizing to ultradense cores about 50% of the time. To accurately run a dense collection of bodies in an N-body simulation, you need a smaller timestep; otherwise, the collection of bodies explodes outwards. The denser the galaxy, the smaller the timestep needs to be. These dense progenitors take about 4 times longer to run than normal, and when MilkyWay@home optimizes to a dense progenitor, we end up with a population of parameters that each take several hours to compute.
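
As a rough check of the numbers in the two points above, here is a minimal sketch. The first function reproduces the 2.14 factor directly from the O(n log n) cost model. The second is only an illustration under a stated assumption: it supposes the timestep is a fixed fraction of the dynamical time, which scales like 1/sqrt(density). That is a common rule of thumb and not necessarily how milkyway_nbody picks its timestep. Both function names are hypothetical.

```python
import math


def nlogn_cost_ratio(n_old: int, n_new: int) -> float:
    """Ratio of O(n log n) work when the body count goes from n_old to n_new.

    The base of the logarithm cancels in the ratio.
    """
    return (n_new * math.log(n_new)) / (n_old * math.log(n_old))


def relative_step_count(density_ratio: float) -> float:
    """Relative number of timesteps for a system density_ratio times denser.

    Assumption (not taken from the project's code): dt ~ 1/sqrt(density),
    so the step count over a fixed simulated time grows like sqrt(density).
    """
    return math.sqrt(density_ratio)


ratio = nlogn_cost_ratio(20_000, 40_000)
print(f"20,000 -> 40,000 bodies: {ratio:.2f}x the work")  # 2.14x

# Under the assumed scaling, a progenitor about 16 times denser would need
# about 4 times as many steps, in line with the "about 4 times longer" figure.
print(f"16x denser: {relative_step_count(16.0):.1f}x the steps")
```

Under these assumptions, a dense run at 40,000 bodies would cost on the order of 2.14 × 4 ≈ 8.6 times an old normal run at 20,000 bodies.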

While we cannot reduce the number of bodies, we are working on removing runs that converge to these heavily cored progenitors. We apologize for the inconvenience. Thank you for your patience.

-Eric
Steven Case

Joined: 15 Dec 10
Posts: 3
Credit: 127,020,002
RAC: 4,353
Message 69315 - Posted: 28 Nov 2019, 14:22:16 UTC - in response to Message 69049.  

I am running Windows 10. If the multi-thread tasks are the ones using 6 CPUs, I am having trouble. The tasks run very slowly: 3% after two days. They also lock up the BOINC software; other tasks do not run or download. When I abort all the 6-CPU tasks, the other projects immediately download and run normally.
adrianxw

Joined: 25 May 14
Posts: 25
Credit: 45,458,959
RAC: 94
Message 69317 - Posted: 29 Nov 2019, 9:11:39 UTC

I've just aborted a work unit: 6+ hours of CPU time, only 8% complete, deadline later today. Your work units normally execute in a few minutes on here (4 GHz i7 + GPU). I also aborted all the other units before they started and set No New Tasks (NNT). Something is wrong now.
NumCrunch

Joined: 26 Jun 09
Posts: 1
Credit: 6,582,300
RAC: 2,217
Message 69333 - Posted: 11 Dec 2019, 16:06:50 UTC

Eric,
My multi-thread work unit runs slowly, like most of the cases listed here. I noticed in the Windows Task Manager (press Ctrl-Alt-Del), on the Performance tab, that the work units are using less than 20% of the CPU time. It is not a setting in my app. All of the other projects use 80-100% of the CPU time.
Michael
Jim1348

Joined: 9 Jul 17
Posts: 61
Credit: 9,671,226
RAC: 6,498
Message 69334 - Posted: 11 Dec 2019, 18:25:28 UTC - in response to Message 69315.  
Last modified: 11 Dec 2019, 18:29:47 UTC

Steven Case wrote: "I am running Windows 10. If the multi-thread tasks are the ones using 6 CPUs, I am having trouble. The tasks run very slowly, 3% after two days."

I am running 6 cores on Win7 64-bit. On an i7-4771 they are taking about 4 to 6 hours now (one is estimating 8 hours), but completing OK at this point.
Steven Case

Joined: 15 Dec 10
Posts: 3
Credit: 127,020,002
RAC: 4,353
Message 69335 - Posted: 11 Dec 2019, 21:55:28 UTC

Is there a fix for the 6CPU task problem? I just aborted 19 tasks so the other projects could run. I may have to drop MW and look for another project. Is there a way to download just the single CPU tasks?
Jim1348

Joined: 9 Jul 17
Posts: 61
Credit: 9,671,226
RAC: 6,498
Message 69336 - Posted: 11 Dec 2019, 22:47:11 UTC - in response to Message 69335.  

Steven Case wrote: "Is there a fix for the 6CPU task problem?"

They work for me.
https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=737912&offset=0&show_names=0&state=4&appid=
Are you sure it is not your antivirus (AV) interfering?
Scott

Joined: 26 May 19
Posts: 1
Credit: 13,081,623
RAC: 63,505
Message 69337 - Posted: 15 Dec 2019, 3:28:18 UTC

I'm in the same boat. It crunches the numbers, then gets stuck at anywhere from 32.882% to 99.577%; it stops there and the time remaining starts climbing. I have Windows 10, so I do a restart when I see this, and when BOINC comes back the stuck N-body item is no longer there and a new one has started.
Eric Mendelsohn
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 21 Aug 18
Posts: 31
Credit: 4,400,789
RAC: 2,975
Message 69350 - Posted: 16 Dec 2019, 20:00:08 UTC - in response to Message 69337.  

When you look up the work unit that gets stuck and disappears, does it say it was completed or does it log it as a computation error?
Link

Joined: 19 Jul 10
Posts: 357
Credit: 16,319,220
RAC: 7
Message 69400 - Posted: 24 Dec 2019, 10:44:49 UTC

It seems to be incompatible with using less than 100% of CPU time: in that case it stops processing after a short time (around 1 minute). I need that setting to keep my laptop at a reasonable noise level.
earthbilly

Joined: 1 Dec 18
Posts: 4
Credit: 125,792,217
RAC: 217,775
Message 69429 - Posted: 10 Jan 2020, 23:06:23 UTC

Hi Eric and everyone,

We had some bad experiences joining gridpool, with issues getting paid our due coins daily. We kept losing all daily earnings for days at a time, so I'm back to MilkyWay (off gridpool). MilkyWay is the BEST!!!

Due to low light and low solar production through the short days, I have turned off your GPU tasks and selected only nbody CPU tasks to do for now.

Nbody I run at 76% CPU. I have noticed nbody tasks are touchy with automatic Windows 10 updates and GPU driver updates, which cause tasks to slow to a crawl or stop altogether. I selected 'metered internet connection' within internet properties so I have a chance to suspend MilkyWay tasks before any updating. (No automatic updates with a metered Windows 10 internet connection.) Windows asks permission before downloading and updating over a metered connection, giving me the chance to suspend tasks before updates. That seems to stop all problems with nbody tasks for me.

I hope this makes sense. My brain hurts right now from so much pain in my face. Not thinking as clearly as I'd like.

Sunny regards,
100% POWERED BY SOLAR, SUNNY REGARDS
earthbilly

Joined: 1 Dec 18
Posts: 4
Credit: 125,792,217
RAC: 217,775
Message 69430 - Posted: 12 Jan 2020, 13:51:56 UTC

I also think it's absolutely essential that we "terminate" every application within Windows 10 that is not essential to BOINC and our GPU.

There are so many trash apps now in Windows 10, and many will reload even when uninstalled. I now leave them in but terminate them so there is less going on in the background. I also go through every app's permissions in PRIVACY and turn them off.

I like to use the Resource Monitor within Windows Administrative Tools to monitor CPU usage so I can see what is running and if a heartbeat is getting out of hand. Great tool I keep pinned to my taskbar. I've seen heartbeats crash nbody tasks. Some temperature monitor apps really spike the CPU usage every second, leaving what looks like a heartbeat in the CPU graph. Bad for MilkyWay tasks when it gets extreme.

I am sure you old timers know all this. It's just for any newbie like me who has problems. It really helps. Reduce background apps!!! Your CPU will like you ;-)
100% POWERED BY SOLAR, SUNNY REGARDS


©2020 Astroinformatics Group