Welcome to MilkyWay@home

Something is wrong with N-Body Simulation: it counts units of 3538 gigaflops endlessly without progress

Eliovich Alexander & Yan

Joined: 7 Jan 24
Posts: 6
Credit: 2,554,615
RAC: 5,721
Message 76746 - Posted: 8 Jan 2024, 21:35:47 UTC
Last modified: 8 Jan 2024, 22:27:28 UTC

I think something is seriously wrong with N-Body Simulation 1.83 (mt) when it processes units of 3538 gigaflops.
The program freezes at a certain percentage of completed work (7.583% in my case) and makes no progress for dozens of hours, while CPU usage is only 0.1%.
BOINC reports that 1.5 hours of CPU time were spent, but the total elapsed time is 14 hours or more!
It feels like the program is stuck in some kind of endless loop.
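
A rough way to read those two figures side by side is sketched below. It assumes that the CPU time BOINC reports for a multithreaded task is summed over all of its threads, and that the task was given 10 threads (86% of 12, rounded down); both are assumptions, and the function name is purely illustrative.

```python
def cpu_utilization(cpu_hours: float, elapsed_hours: float, threads: int = 1) -> float:
    """Fraction of the available CPU capacity that was actually used.

    Assumption: cpu_hours is the total over all threads of the task.
    """
    if elapsed_hours <= 0 or threads < 1:
        raise ValueError("elapsed_hours must be positive and threads at least 1")
    return cpu_hours / (elapsed_hours * threads)


# Figures from the post: 1.5 h of CPU time over 14 h of elapsed time.
single = cpu_utilization(1.5, 14.0)             # as if only one thread were available
multi = cpu_utilization(1.5, 14.0, threads=10)  # hypothetical 10-thread allocation

print(f"one thread: {single:.1%}")
print(f"ten threads: {multi:.1%}")
```

Either way the task was idle for most of the elapsed time, which fits a stalled task better than one that is merely slow.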

This whole freezing situation persists even when the processor is doing nothing else!

At the same time, the program calculates units of several tens of gigaflops quite cheerfully.

I have an Intel Core i7-1255U with 16 GB of DDR4-3200 (1600 MHz), running Windows 10 Pro.
The same thing happens on an Intel Pentium J3710 (Braswell) with 8 GB of DDR3-1600, also running Windows 10 Pro
(the freezing percentage is different there, about 6%).

I set BOINC to use at most 86% of the threads of my i7-1255U (10 cores, 12 threads).

Maybe MilkyWay@Home is hampered by the fact that Einstein@Home tasks are being processed on the graphics card in parallel?
(Einstein uses the video card and half of one CPU thread.)

P.S. One more thing: if you close BOINC, stop all tasks, and start BOINC again after a few minutes, the freeze goes away. For a while or forever -- I don't know yet.
Eliovich Alexander & Yan

Joined: 7 Jan 24
Posts: 6
Credit: 2,554,615
RAC: 5,721
Message 76747 - Posted: 8 Jan 2024, 22:49:51 UTC - in response to Message 76746.  

I already know the answer to the last question:
if you close BOINC, stop all tasks, and restart BOINC later, the freeze goes away, but alas only for a while.
Eliovich Alexander & Yan

Joined: 7 Jan 24
Posts: 6
Credit: 2,554,615
RAC: 5,721
Message 76748 - Posted: 9 Jan 2024, 4:03:08 UTC
Last modified: 9 Jan 2024, 4:47:32 UTC

And I got an answer to the question of whether Einstein@Home has an influence.

I suspended Einstein, and the N-Body unit progressed relatively quickly.
But a few hours later I ran Mozilla for just 30 minutes.
That was enough for the unit to freeze solid again!

Something is very wrong with MilkyWay N-Body!!

P.S. I'll emphasize once again that I haven't seen any problems with tasks of 65 thousand gigaflops yet: they are processed very quickly.
It's strange that they are calculated much faster than the 3 thousand gigaflops tasks (which also freeze)!

P.P.S. And the site is frustratingly slow...
Link
Joined: 19 Jul 10
Posts: 605
Credit: 19,164,631
RAC: 4,549
Message 76749 - Posted: 9 Jan 2024, 9:34:56 UTC - in response to Message 76746.  
Last modified: 9 Jan 2024, 10:10:36 UTC

I set BOINC to use at most 86% of the threads of my i7-1255U (10 cores, 12 threads).
Do you allow the usage of 100% of CPU time? If not, set it to 100% and the issue with tasks getting stuck should be gone, unless the N-Body app also gets confused by the different types of cores your CPU has, but so far there have been no complaints about that here.

P.S. I'll emphasize once again that I haven't seen any problems with tasks of 65 thousand gigaflops yet: they are processed very quickly.
It's strange that they are calculated much faster than the 3 thousand gigaflops tasks (which also freeze)!
Don't worry about that, the estimates here are completely wrong: tasks that are supposed to run 9 days (!) on my computer finish in two hours, and tasks that are supposed to run ~8 hours need 16+ hours.
Eliovich Alexander & Yan

Joined: 7 Jan 24
Posts: 6
Credit: 2,554,615
RAC: 5,721
Message 76750 - Posted: 10 Jan 2024, 2:57:07 UTC - in response to Message 76749.  

1. The different core types have nothing to do with it -- exactly the same problem occurs on my Ryzen 3700X (also Windows 10).

2. I'll try setting the CPU time to 100% and report the results.
But this is not a very good situation -- I set the time to 85% to prevent the processor from overheating.

Thank you so much
Eliovich Alexander & Yan

Joined: 7 Jan 24
Posts: 6
Credit: 2,554,615
RAC: 5,721
Message 76751 - Posted: 10 Jan 2024, 19:38:24 UTC - in response to Message 76750.  
Last modified: 10 Jan 2024, 19:48:11 UTC

As far as I can see, the program has really started to run stably. Apparently, it is the frequent stops that drive it into endless loops.
Thanks for the advice!

But in general, it is a big drawback of the program that its operation is not stable enough and that it cannot cope with measures to prevent processor overheating.
mikey
Joined: 8 May 09
Posts: 3328
Credit: 523,337,118
RAC: 107,731
Message 76752 - Posted: 11 Jan 2024, 13:20:42 UTC - in response to Message 76751.  

As far as I can see, the program has really started to run stably. Apparently, it is the frequent stops that drive it into endless loops.
Thanks for the advice!

But in general, it is a big drawback of the program that its operation is not stable enough and that it cannot cope with measures to prevent processor overheating.


If it's a purchased system, then the CPU fan is not the best of the best, which kept the price where it was when you bought it. If you don't do things like that yourself, talk to a local computer shop about which fan would be a better choice for your system. If you do do things like that yourself, then a good all-in-one water cooler should drop the temps by 5 degrees with no problem; if you need to go further, then a full-blown water-cooled system can drop them by 10 degrees with no problem.

An easy answer to the heat is to take off the side of the PC and let the heat out, or you can even blow a fan into the now open side. But I know that's not always possible with life going on around us.
Link
Joined: 19 Jul 10
Posts: 605
Credit: 19,164,631
RAC: 4,549
Message 76753 - Posted: 11 Jan 2024, 13:58:24 UTC - in response to Message 76750.  

But this is not a very good situation -- I set the time to 85% to prevent the processor from overheating.
Use fewer cores instead.
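
As a sketch of what that trade could look like with the numbers from this thread (12 threads, 86% of the CPUs, 85% of CPU time): the Python below estimates how many threads at 100% CPU time give roughly the same average load. It assumes BOINC truncates the CPU percentage to a whole number of threads, which is an assumption about the client's rounding, and all function names are hypothetical.

```python
import math


def threads_in_use(total_threads: int, ncpus_pct: float) -> int:
    """Threads BOINC may use, assuming the percentage is truncated (minimum 1)."""
    return max(1, int(total_threads * ncpus_pct / 100))


def equivalent_full_speed_threads(total_threads: int, ncpus_pct: float,
                                  cpu_time_pct: float) -> int:
    """Threads at 100% CPU time with about the same average load, rounded down."""
    average_load = threads_in_use(total_threads, ncpus_pct) * cpu_time_pct / 100
    return max(1, math.floor(average_load))


def pct_for_threads(total_threads: int, wanted_threads: int) -> int:
    """Smallest whole percentage giving wanted_threads under the same assumption."""
    return math.ceil(wanted_threads * 100 / total_threads)


current = threads_in_use(12, 86)                       # threads allowed today
target = equivalent_full_speed_threads(12, 86, 85)     # threads at 100% CPU time
setting = pct_for_threads(12, target)                  # value for "% of the CPUs"

print(current, target, setting)
```

The resulting percentage would go into "Use at most ...% of the CPUs", with "Use at most ...% of CPU time" left at 100. Whether that keeps temperatures where you want them still has to be checked on the actual machine.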
Link
Joined: 19 Jul 10
Posts: 605
Credit: 19,164,631
RAC: 4,549
Message 76754 - Posted: 11 Jan 2024, 14:26:07 UTC - in response to Message 76752.  

If it's a purchased system, then the CPU fan is not the best of the best, which kept the price where it was when you bought it. If you don't do things like that yourself, talk to a local computer shop about which fan would be a better choice for your system. If you do do things like that yourself, then a good all-in-one water cooler should drop the temps by 5 degrees with no problem; if you need to go further, then a full-blown water-cooled system can drop them by 10 degrees with no problem.

An easy answer to the heat is to take off the side of the PC and let the heat out, or you can even blow a fan into the now open side. But I know that's not always possible with life going on around us.
The Intel Core i7-1255U is a laptop CPU, so there's likely not much that can be done about the cooling.

Also, in most prebuilt PCs the CPU cooler isn't great and often not enough for running BOINC, but the main issue is usually the more or less complete lack of adequate airflow. Usually this can only be fixed with a new computer case, unless you are sure you have a good one that just doesn't have enough fans in it. This should always be the first step; even the best cooler won't help if the heat is accumulating inside the case. This is particularly important when using the PC for things like BOINC. While today's CPUs will simply throttle to protect themselves, many other parts run hot in a case with no airflow, and some of those parts have no way to protect themselves from overheating.
mikey
Joined: 8 May 09
Posts: 3328
Credit: 523,337,118
RAC: 107,731
Message 76755 - Posted: 12 Jan 2024, 12:01:45 UTC - in response to Message 76754.  

All 3 of my laptops have a cooling system underneath blowing air up into them to help more air flow through. And yes, the only way to handle a self-throttling laptop is to use fewer cores.
Eliovich Alexander & Yan

Joined: 7 Jan 24
Posts: 6
Credit: 2,554,615
RAC: 5,721
Message 76756 - Posted: 12 Jan 2024, 15:17:22 UTC
Last modified: 12 Jan 2024, 15:32:38 UTC

Thanks everyone for your valuable advice!

And I still hope that the N-Body program will be improved and will not collapse at the slightest breath of wind...
Link
Joined: 19 Jul 10
Posts: 605
Credit: 19,164,631
RAC: 4,549
Message 76757 - Posted: 12 Jan 2024, 16:14:49 UTC - in response to Message 76755.  

And yes, the only way to handle a self-throttling laptop is to use fewer cores.
The other one is undervolting the CPU, if possible. It worked great on my Pentium M laptop: I could undervolt the CPU by about 300 mV and (together with some small modifications to the case and new thermal grease) lower the temperature by about 30°C while the fan was running at the lowest instead of the highest possible speed. It is not possible at all on my newer Sandy Bridge laptop; the platform does not support any kind of voltage control. So it depends on the CPU generation (and the CPU itself, of course). AFAIK some of the newer generations should be undervoltable.


©2024 Astroinformatics Group