Some tasks seem never-ending.

Author	Message
Grzegorz Skoczylas Joined: 2 Feb 12 Posts: 3 Credit: 5,236,155 RAC: 3,470	Message 76388 - Posted: 20 Sep 2023, 17:55:51 UTC For some time now, tasks from this project have occasionally been behaving strangely. For example, at the start a task is estimated to take less than 2 hours to calculate. Yet after more than 2 hours of calculation, it turns out that more than 23 days are still needed to complete it, and this time keeps increasing! I understand that calculation algorithms can be complex. However, I believe that in a situation like this, where there is no chance of completing the calculation in a reasonable amount of time, the task should terminate on its own rather than waste my processor's power and my electricity on worthless results that nobody will need after such a long time. It would probably be possible to implement some kind of watchdog to force the abandonment of calculations in such situations. I have already had several such cases on two different computers. One such task I aborted after more than 24 hours of calculation, a few others after a few hours. One of those cases: https://drive.google.com/file/d/1-6UbrvuVOs91uJylDOo-u4ztubtPZvaH/view?usp=sharing

Link Joined: 19 Jul 10 Posts: 624 Credit: 19,297,971 RAC: 2,484	Message 76389 - Posted: 20 Sep 2023, 19:27:22 UTC - in response to Message 76388. Is the % completed increasing? Is the WU using any CPU time? If yes, just let them run; the estimate is just an estimate and, in the case of N-Body, often very wrong. If not, do you allow BOINC to use 100% of CPU time? If not, that's usually the reason why such a WU gets stuck. Set BOINC to 100% CPU time and restart it; the WU should then continue from the last checkpoint.
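
The check described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not part of BOINC: the `Sample` structure and the `diagnose` function are hypothetical names, and the two observations are assumed to be read by hand from the BOINC Manager task list some minutes apart. The third outcome (CPU time rising without progress) is not covered in the post and is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of a task (hypothetical structure, not a BOINC type)."""
    fraction_done: float  # 0.0 .. 1.0, i.e. "% completed" divided by 100
    cpu_seconds: float    # accumulated CPU time shown for the task


def diagnose(earlier: Sample, later: Sample) -> str:
    """Classify a task from two observations taken some minutes apart."""
    progressing = later.fraction_done > earlier.fraction_done
    using_cpu = later.cpu_seconds > earlier.cpu_seconds
    if progressing and using_cpu:
        # Progress and CPU time both rise: the remaining-time estimate is
        # simply unreliable, so the task should be left to finish.
        return "running"
    if not using_cpu:
        # No CPU time accumulates: the case where the CPU-time setting
        # should be checked and BOINC restarted.
        return "stalled"
    # Assumption beyond the post: CPU time rises but progress does not.
    return "no-progress"


if __name__ == "__main__":
    print(diagnose(Sample(0.10, 600.0), Sample(0.12, 900.0)))
    print(diagnose(Sample(0.10, 600.0), Sample(0.10, 600.0)))
```

A "stalled" result corresponds to the situation where setting BOINC to 100% CPU time and restarting is suggested.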

keputnam Joined: 22 Oct 10 Posts: 17 Credit: 144,544,832 RAC: 3,390	Message 76426 - Posted: 13 Oct 2023, 1:16:35 UTC - in response to Message 76388. N-Body seems to be broken like this. When you notice it, end BOINC and restart, and they should finish normally.

julianop Joined: 12 Oct 11 Posts: 7 Credit: 23,330,249 RAC: 7,587	Message 76983 - Posted: 25 Mar 2024, 2:57:03 UTC I'm getting a large number of never-ending work units (de_nbody_orbit_fitting_03-06-2024_v186_OCS_data_3_1709197135_xxxxx), at least five of them in the past five days. I've just rejoined MilkyWay@home after an absence from the project, and am immediately getting an alarming number of them: WUs that are estimated at a mere couple of minutes are taking over twenty-four hours, with the time remaining still increasing. I can't keep checking progress every now and again to abort bad WUs, and I don't want to waste processing time and energy. Is this a bad lot of WUs, or could there be something wrong with my setup? Other projects (Einstein@Home, Rosetta@Home) are behaving normally; even WCU, though WUs come down only sporadically.

Link Joined: 19 Jul 10 Posts: 624 Credit: 19,297,971 RAC: 2,484	Message 76984 - Posted: 25 Mar 2024, 13:35:30 UTC - in response to Message 76983. "or could there be something wrong with my setup?" If you don't allow computing 100% of the CPU time, then it's your setup. It's OK to limit the number of CPUs used, but for Milkyway you must allow 100% of CPU time.
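
For anyone who prefers to check the setting in a file rather than in the BOINC Manager, here is a rough Python sketch. It assumes the "use at most X% of CPU time" preference is stored in a `<cpu_usage_limit>` element and the CPU-count limit in `<max_ncpus_pct>`, as in the client's preferences XML; verify these names against your own installation. The function names and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET


def cpu_time_limit(prefs_xml: str) -> float:
    """Return the CPU-time percentage found in a preferences XML string.

    Assumes the value lives in a <cpu_usage_limit> element; a missing or
    empty element is treated as 100 (no throttling).
    """
    root = ET.fromstring(prefs_xml)
    node = root.find(".//cpu_usage_limit")
    if node is None or not (node.text or "").strip():
        return 100.0
    return float(node.text)


def is_throttled(prefs_xml: str) -> bool:
    """True when CPU time is limited below 100%."""
    return cpu_time_limit(prefs_xml) < 100.0


# Made-up sample: half of the CPUs allowed, but CPU time capped at 60%.
EXAMPLE = (
    "<global_preferences>"
    "<max_ncpus_pct>50</max_ncpus_pct>"
    "<cpu_usage_limit>60</cpu_usage_limit>"
    "</global_preferences>"
)

if __name__ == "__main__":
    print(is_throttled(EXAMPLE))
```

Only the CPU-time value is tested here, since limiting the number of CPUs is fine according to the post above.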

julianop Joined: 12 Oct 11 Posts: 7 Credit: 23,330,249 RAC: 7,587	Message 76986 - Posted: 25 Mar 2024, 21:52:47 UTC - in response to Message 76984. Thanks, glad it's something simple. I'll make that change.

julianop Joined: 12 Oct 11 Posts: 7 Credit: 23,330,249 RAC: 7,587	Message 76988 - Posted: 27 Mar 2024, 18:36:28 UTC - in response to Message 76986. Last modified: 27 Mar 2024, 18:36:51 UTC Quick note to confirm that setting processor time to 100% solved the problem. Thanks, Link :-)

Link Joined: 19 Jul 10 Posts: 624 Credit: 19,297,971 RAC: 2,484	Message 76989 - Posted: 27 Mar 2024, 19:25:30 UTC - in response to Message 76988. Thanks for confirming that it worked for you; there are many people with this issue.