Welcome to MilkyWay@home

Some tasks stalling

Questions and Answers : Windows : Some tasks stalling
Gerry Prich
Joined: 4 Dec 14
Posts: 8
Credit: 87,956,155
RAC: 4,571
Message 74929 - Posted: 24 Jan 2023, 0:27:06 UTC

Has anyone been reporting issues with tasks stalling? I've had to cancel several work units that basically stalled and were showing days left to complete after processing for days. They were holding up other work units. Normally work tasks complete within the day they are started. I feel bad about cancelling them, but they were causing other units to go past their due dates. This is something I have never had issues with before. Unfortunately, I did not write down the name of the tasks, but they disappeared after aborting.

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,038,236
RAC: 5,987
Message 74930 - Posted: 24 Jan 2023, 3:12:39 UTC - in response to Message 74929.  

Were they Separation or N-Body tasks?

Link
Joined: 19 Jul 10
Posts: 588
Credit: 18,923,561
RAC: 4,877
Message 74931 - Posted: 24 Jan 2023, 10:21:07 UTC - in response to Message 74929.  

Were they actually stalled, i.e. not using any CPU time? If yes, restarting the BOINC client should have fixed that.

Gerry Prich
Joined: 4 Dec 14
Posts: 8
Credit: 87,956,155
RAC: 4,571
Message 74932 - Posted: 24 Jan 2023, 18:09:48 UTC - in response to Message 74930.  

I have another one at the moment and it is an Nbody. It says it is 6.255% through the task and has been running for 10 hours and 42 minutes with 6 days 16 hours remaining. It appears to be progressing, but tomorrow it will show even more time remaining. The due date is today.

All other tasks appear to be running normally, as I have a few other Projects queued up as well.

G

Gerry Prich
Joined: 4 Dec 14
Posts: 8
Credit: 87,956,155
RAC: 4,571
Message 74933 - Posted: 24 Jan 2023, 18:11:48 UTC - in response to Message 74931.  

Stalled is just a description - it was still processing, but each time I looked at it, the time remaining had increased substantially. Other tasks were running normally.

G

Gerry Prich
Joined: 4 Dec 14
Posts: 8
Credit: 87,956,155
RAC: 4,571
Message 74934 - Posted: 24 Jan 2023, 18:18:49 UTC - in response to Message 74931.  

All -

I just did a restart of BOINC on the one I described (Nbody) and it did appear to change the numbers it was reporting from hours to just minutes again. I will check back on it later and report whether it finished or not. I do restart the machine about once a week out of habit, so it does occasionally get a reboot.

Just confused on why this is occurring, as I had not had an issue before.

G

Gerry Prich
Joined: 4 Dec 14
Posts: 8
Credit: 87,956,155
RAC: 4,571
Message 74935 - Posted: 24 Jan 2023, 20:29:03 UTC - in response to Message 74932.  

Thought the restart solved the issue, but no. It has run for 1.5 hours and is now reporting 2.5 hours left. Like I said, it will just keep churning but never finish. So is there something about the Nbody tasks that is causing the issue on my hardware (something not adequate to complete the task)? I am running a Dell machine with Windows 11 Pro (22H2), an NVIDIA GeForce GTX 1660 Ti for graphics, an i7 processor and 16 GB RAM. Task id is de_nbody_11_14_2022_v182_40k__data__4_1666898646_819123. I also have an older Win10 machine running, and I see the same phenomenon there with an Nbody task showing as running for days, but I don't think that is accurate. Many more tasks of this type are in the list waiting to start, so I may have the same issue with all of them. I have an NVIDIA card in that machine also; is that possibly the common thread?

Just trying to understand the cause. Like I said, hate to abort the tasks.

G

Link
Joined: 19 Jul 10
Posts: 588
Credit: 18,923,561
RAC: 4,877
Message 74936 - Posted: 25 Jan 2023, 11:20:54 UTC - in response to Message 74933.  
Last modified: 25 Jan 2023, 11:23:38 UTC

Stalled is just a description - it was still processing, but each time I looked at it, the time remaining had increased substantially.

Stalled means the task is shown in BOINC as running but is not using any CPU time, i.e. it is not actually processing. You can check that in the Windows Task Manager. BOINC simply does not check whether the application is really doing something; it starts the task and waits for the application to report when it is done. If it doesn't get progress updates from the application, it starts estimating progress by itself, which is why you see the remaining time increasing.

Task 642620787 was definitely stalled: lots of run time but no CPU time. If N-Body causes lots of issues on your computers, simply disable it in your project preferences; then you will not need to abort the tasks.
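
The "run time but no CPU time" check described above can be written down as a small rule. This is only a sketch in Python: the `TaskSample` structure, the `looks_stalled` name and the 5% threshold are assumptions made for illustration, not anything BOINC itself provides. The two numbers per sample would be read by hand from the task properties in BOINC Manager or from Task Manager.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    """One observation of a task (hypothetical structure for illustration)."""
    elapsed_s: float  # wall-clock run time shown for the task, in seconds
    cpu_s: float      # CPU time the science application has used, in seconds


def looks_stalled(earlier: TaskSample, later: TaskSample,
                  min_cpu_ratio: float = 0.05) -> bool:
    """Return True if almost no CPU time accrued between the two samples.

    min_cpu_ratio is an assumed threshold: below 5% of one core's worth of
    CPU time per second of run time, the task is treated as stalled.
    """
    wall = later.elapsed_s - earlier.elapsed_s
    cpu = later.cpu_s - earlier.cpu_s
    if wall <= 0:
        raise ValueError("samples must be taken at increasing run times")
    return (cpu / wall) < min_cpu_ratio


# One hour of extra run time but only one second of extra CPU time: stalled.
print(looks_stalled(TaskSample(3600, 3500), TaskSample(7200, 3501)))
# One hour of extra run time and nearly an hour of CPU time: still working.
print(looks_stalled(TaskSample(3600, 3500), TaskSample(7200, 7000)))
```

A task whose remaining time grows while it still accrues CPU time, as in the earlier posts, would not count as stalled under this rule; it is slow or mis-estimated rather than frozen.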

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,285,846
RAC: 19,754
Message 74941 - Posted: 26 Jan 2023, 11:54:49 UTC - in response to Message 74935.  

Thought the restart solved the issue, but no. It has run for 1.5 hours and is now reporting 2.5 hours left. Like I said, it will just keep churning but never finish. So is there something about the Nbody tasks that is causing the issue on my hardware (something not adequate to complete the task)? I am running a Dell machine with Windows 11 Pro (22H2), an NVIDIA GeForce GTX 1660 Ti for graphics, an i7 processor and 16 GB RAM. Task id is de_nbody_11_14_2022_v182_40k__data__4_1666898646_819123. I also have an older Win10 machine running, and I see the same phenomenon there with an Nbody task showing as running for days, but I don't think that is accurate. Many more tasks of this type are in the list waiting to start, so I may have the same issue with all of them. I have an NVIDIA card in that machine also; is that possibly the common thread?

Just trying to understand the cause. Like I said, hate to abort the tasks.

G


Are you limiting the n-body tasks to X number of cpu cores or are you letting MilkyWay decide how many to use?

Gerry Prich
Joined: 4 Dec 14
Posts: 8
Credit: 87,956,155
RAC: 4,571
Message 74942 - Posted: 26 Jan 2023, 21:24:01 UTC - in response to Message 74936.  

I will take a look at that - thanks for the suggestion.

G

Gerry Prich
Joined: 4 Dec 14
Posts: 8
Credit: 87,956,155
RAC: 4,571
Message 74943 - Posted: 26 Jan 2023, 21:28:07 UTC - in response to Message 74941.  

I was not aware that specific tasks could be limited. I had reduced my CPU access to 50% in BOINC, have now adjusted that up to 80% and will see how things run.

G

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,285,846
RAC: 19,754
Message 74944 - Posted: 26 Jan 2023, 21:57:21 UTC - in response to Message 74943.  

I was not aware that specific tasks could be limited. I had reduced my CPU access to 50% in BOINC, have now adjusted that up to 80% and will see how things run.

G


That's not what I meant. What I meant was: do you have an app_config.xml file like this in your MilkyWay folder?

<app_config>
<app>
<name>milkyway_nbody</name>
<max_concurrent>1</max_concurrent>
</app>
<app_version>
<app_name>milkyway_nbody</app_name>
<plan_class>mt</plan_class>
<avg_ncpus>2</avg_ncpus>
<cmdline>--nthreads 2</cmdline>
</app_version>
</app_config>

This particular file only uses 2 cpu cores per N-Body task AND only runs 1 task at a time.

Your problem could be that you are using all your CPU cores on a single task and your PC is locking up, which your change in settings could have just made worse. Link has better info than I do on this and on whether it applies to you.

Link also probably has a better file than I do, as I haven't run N-Body tasks for a while; I'm doing Separation tasks right now.
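
For anyone who would rather generate the file than hand-edit it, here is a minimal Python sketch that builds an app_config.xml for a chosen thread count. The function name `build_app_config` is made up for this example. It places `<max_concurrent>` under an `<app>` element, which is where the BOINC client configuration documentation defines it, and keeps the thread settings under `<app_version>`. The app name `milkyway_nbody` and the `mt` plan class are taken from this thread.

```python
import xml.etree.ElementTree as ET


def build_app_config(threads: int, max_concurrent: int = 1,
                     app_name: str = "milkyway_nbody") -> str:
    """Return app_config.xml text limiting each task to `threads` CPU cores."""
    if threads < 1 or max_concurrent < 1:
        raise ValueError("threads and max_concurrent must be at least 1")

    root = ET.Element("app_config")

    # Per-application limit: how many tasks of this app may run at once.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # Per-version settings: CPU budget and the thread count passed to the app.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = "mt"
    ET.SubElement(version, "avg_ncpus").text = str(threads)
    ET.SubElement(version, "cmdline").text = f"--nthreads {threads}"

    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print a 2-thread, 1-task configuration like the one discussed above.
    print(build_app_config(threads=2, max_concurrent=1))
```

The printed text would be saved as app_config.xml in the MilkyWay project folder and picked up after telling BOINC to re-read its config files.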

Link
Joined: 19 Jul 10
Posts: 588
Credit: 18,923,561
RAC: 4,877
Message 74945 - Posted: 27 Jan 2023, 15:05:20 UTC - in response to Message 74944.  

Link also probably has a better file than I do as I haven't run N-Body tasks for awhile as I'm doing Separation tasks right now.
In fact I have not run them for a while either. Your app_config looks right; for the 8-core machine I'd go with:

<app_config>
 <app>
  <name>milkyway_nbody</name>
  <max_concurrent>1</max_concurrent>
 </app>
 <app_version>
  <app_name>milkyway_nbody</app_name>
  <plan_class>mt</plan_class>
  <avg_ncpus>4</avg_ncpus>
  <cmdline>--nthreads 4</cmdline>
 </app_version>
</app_config>


Helldiver82
Joined: 26 Feb 23
Posts: 3
Credit: 12,724
RAC: 0
Message 75306 - Posted: 9 Apr 2023, 16:55:40 UTC

I have had the same problems since March, described in my thread "task is running infinite like in a loop", and now I see that somebody else reported exactly the same thing here on 24 January :-(
But I still don't have a solution.

I have an i5-760 with 4 cores and a GTX 1070, and I want to use all the capacity that is not needed by my own activities. What configuration would you recommend?

Ken
Joined: 18 Aug 20
Posts: 4
Credit: 40,954,603
RAC: 18,261
Message 75336 - Posted: 20 Apr 2023, 17:24:30 UTC

I have a question and possibly a problem with some tasks stalling

I am running an AMD Ryzen 7 5700U (16 cores supposedly) with Radeon Graphics CPU.

Question
Almost always, my system processes with a status of "Running (0.903 CPUs + 1 AMD/ATI GPU (device 0 or device 1))"
Does the Running 0.903 CPUs mean it is running 90.3% of all CPUs or 90.3% of one CPU?

Problem
At the same time, I have a task which reports 14.531% complete with the status "Running (12 CPUs)". However, this task is stalled and doesn't process (it is an N-Body Sim task). I have had several tasks like this over the past couple of weeks. The latest one had a deadline of 4/19, but as of today (4/20) the progress remains frozen at 14.531% and the status is still running.

When I looked for the N-Body setting to check or uncheck (I don't remember which), the setting didn't show up. I don't know why the "Running (12 CPUs)" task stalls. Any answers or suggestions would be greatly appreciated. Thank you.

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,285,846
RAC: 19,754
Message 75337 - Posted: 21 Apr 2023, 10:07:25 UTC - in response to Message 75336.  
Last modified: 21 Apr 2023, 10:10:15 UTC

I have a question and possibly a problem with some tasks stalling

I am running an AMD Ryzen 7 5700U (16 cores supposedly) with Radeon Graphics CPU.

Question
Almost always, my system processes with a status of "Running (0.903 CPUs + 1 AMD/ATI GPU (device 0 or device 1))"
Does the Running 0.903 CPUs mean it is running 90.3% of all CPUs or 90.3% of one CPU?


No, it means you are crunching GPU tasks on your Radeon graphics, with each task using 0.903 of one CPU core (about 90% of a single core, not of all cores), which you probably should not be doing on a laptop. GPU tasks create a lot of heat, and because a laptop is designed to be light and powerful, its airflow is more compromised than in a desktop PC.

Problem
At the same time, I have a task which reports 14.531% complete with the status "Running (12 CPUs)". However, this task is stalled and doesn't process (it is an N-Body Sim task). I have had several tasks like this over the past couple of weeks. The latest one had a deadline of 4/19, but as of today (4/20) the progress remains frozen at 14.531% and the status is still running.

When I looked for the N-Body setting to check or uncheck (I don't remember which), the setting didn't show up. I don't know why the "Running (12 CPUs)" task stalls. Any answers or suggestions would be greatly appreciated. Thank you.


Use Link's file from earlier in this thread: copy and paste it into Notepad, NOT a word processing program, save it as "app_config.xml" (without the quotes, of course), and place it in the MilkyWay folder C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway

After that, go back into the BOINC Manager, click on Options and then click on "Read config files", and BOINC will then run the N-Body tasks with fewer than 16 cores per task.

Link's file says to run a maximum of 1 task at a time (the max_concurrent line) and to use only 4 CPU cores per task. You can adjust that to fit what you want it to do, i.e. 2 tasks at a time each using 4 CPU cores, or even 3 tasks at a time each using 4 CPU cores, leaving 4 CPU cores available to you and whatever else you are doing on the PC.
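
The arithmetic behind those options is simply tasks times threads, subtracted from the core count. A small Python sketch, assuming the 16 logical CPUs reported earlier for this laptop; the helper name `core_budget` is invented for the example:

```python
def core_budget(total_cores: int, tasks: int, threads_per_task: int) -> int:
    """Return how many cores stay free, or raise if the plan oversubscribes."""
    if min(total_cores, tasks, threads_per_task) < 1:
        raise ValueError("all arguments must be at least 1")
    used = tasks * threads_per_task
    if used > total_cores:
        raise ValueError(
            f"{tasks} tasks x {threads_per_task} threads needs {used} cores, "
            f"but only {total_cores} are available"
        )
    return total_cores - used


# The three layouts mentioned above, on a machine BOINC sees as 16 CPUs.
for tasks in (1, 2, 3):
    free = core_budget(total_cores=16, tasks=tasks, threads_per_task=4)
    print(f"{tasks} task(s) x 4 threads -> {free} cores free")
```

With 4 threads per task, 1, 2 and 3 concurrent tasks leave 12, 8 and 4 cores free respectively, and a fifth task would no longer fit.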

Dexter
Joined: 16 Feb 13
Posts: 1
Credit: 52,872,147
RAC: 283
Message 76316 - Posted: 30 Jul 2023, 15:05:39 UTC

My recent average credit has taken a dive and I thought I would take a look at why. But even using the above XML settings, my computer still tries to do 4 tasks at a time.

I do work on two computing projects but this one is set as the priority.

I have the prefs set to 75% of CPUs and 15% CPU time. I have to set it this way because, even though I have a water cooler and a room at 20C, if left at 100% I end up in the 90C zone and then the CPU starts to throttle its speed.

Link
Joined: 19 Jul 10
Posts: 588
Credit: 18,923,561
RAC: 4,877
Message 76317 - Posted: 30 Jul 2023, 16:10:36 UTC - in response to Message 76316.  

I have the prefs set to 75% of CPUs and 15% CPU time. I have to set it this way because, even though I have a water cooler and a room at 20C, if left at 100% I end up in the 90C zone and then the CPU starts to throttle its speed.
If that is what you need to keep a water-cooled computer from throttling, there is either a serious issue with the cooling system (such as an improperly attached cooler or a failing pump), or you are overclocking too much. Not even an air-cooled system should need settings that low if it is built and working properly. Your RAC likely took a dive because there are no more Separation tasks, which paid a lot more than N-Body for the same amount of CPU time.

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,285,846
RAC: 19,754
Message 76318 - Posted: 31 Jul 2023, 2:52:18 UTC - in response to Message 76316.  

My recent average credit has taken a dive and I thought I would take a look at why. But even using the above XML settings, my computer still tries to do 4 tasks at a time.


This will continue until you have finished all the tasks in your cache that were downloaded under the old 4-at-a-time settings; after that, the new settings will apply.

Ken
Joined: 18 Aug 20
Posts: 4
Credit: 40,954,603
RAC: 18,261
Message 76319 - Posted: 2 Aug 2023, 16:54:38 UTC - in response to Message 76318.  

I am also experiencing stalling N-Body Simulation tasks. After I reboot my Win11 HP laptop (16 GB of memory, 1 TB SSD), a task will run for maybe 5 to 30 minutes and then appears to stall: the number in the Progress column never changes, even if I wait for hours. I can resolve the "Progress" issue temporarily by any of the following:
1) Restarting my computer
2) Closing and reopening the BOINC Manager application
3) Going to Activity on the menu bar and selecting "Suspend" for the CPU section. All tasks then report a status of "Suspended - user request (8 CPUs)". When I then select "Run always" or "Run based on preferences" under Activity, the numbers in the Progress column start counting up again. If I wait again, after anywhere from 5 to 30 minutes the N-Body Simulation task stalls and ceases to progress.

When the processing stops and I look at the BOINC Manager process in Windows Task Manager, its CPU use is zero or maybe 0.1%. Once I suspend the task and select "Run based on preferences" under Activity on the menu bar, the task starts counting up and the CPU usage for BOINC Manager returns to 40% (which is what I have my preferences set for).

It almost seems like, for whatever reason, 1) a setting needs to be changed somewhere, 2) there is some sort of memory leak in the application, or 3) the system runs out of resources and can't continue until a reboot, a BOINC Manager restart, or a suspend and resume of the task.

Does any of this make sense?

©2024 Astroinformatics Group