Welcome to MilkyWay@home

Huge 4 CPU task stuck at 17.188%, uses 0% of CPU, estimated time to finish 8 days?

Message boards : Number crunching : Huge 4 CPU task stuck at 17.188%, uses 0% of CPU, estimated time to finish 8 days?

Aberforth

Joined: 25 Jul 20
Posts: 3
Credit: 31,961
RAC: 0
Message 70012 - Posted: 4 Aug 2020, 7:19:33 UTC

Hi,

This 4-CPU task (https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=2000974954) no longer uses my CPU at all, even when I suspend all other tasks, and appears to be stuck.

But it somehow got to 17.188%. Is my CPU too old for the rest of the task?

Should I cancel it?
ID: 70012 · Rating: 0
mikey

Joined: 8 May 09
Posts: 3315
Credit: 519,950,216
RAC: 22,077
Message 70013 - Posted: 4 Aug 2020, 18:06:04 UTC - in response to Message 70012.  

Suspend every other task on the PC, and set all zero-resource-share projects to no new tasks too so you don't get tasks from them, then see if it starts up again. It could just need more memory than is free with the other things you are running. If that doesn't work, try suspending it too and then resuming it. Your CPU is fine.
ID: 70013 · Rating: 0
Aberforth

Joined: 25 Jul 20
Posts: 3
Credit: 31,961
RAC: 0
Message 70014 - Posted: 4 Aug 2020, 18:48:02 UTC - in response to Message 70013.  
Last modified: 4 Aug 2020, 19:02:27 UTC

Unfortunately, that did not work. I suspended all other tasks and set their projects to no new tasks. I also closed every other application that used significant amounts of RAM (browser, Spotify, etc.). The process milkyway_nbody_1.76_windows_x86_64__mt.exe still uses 0% CPU while supposedly running.

Here is the task and memory usage log of that try:
04.08.2020 20:41:14 | Milkyway@Home | task de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 resumed by user
04.08.2020 20:41:14 | Milkyway@Home | [task] task_state=EXECUTING for de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 from unsuspend
04.08.2020 20:42:20 |  | Re-reading cc_config.xml
04.08.2020 20:42:20 |  | log flags: file_xfer, sched_ops, task, http_debug, mem_usage_debug, task_debug
04.08.2020 20:42:21 |  | [mem_usage] enforce: available RAM 11385.52MB swap 14395.52MB
04.08.2020 20:42:24 |  | Re-reading cc_config.xml
04.08.2020 20:42:24 |  | log flags: file_xfer, sched_ops, task, http_debug, mem_usage_debug, task_debug
04.08.2020 20:42:24 |  | [mem_usage] enforce: available RAM 11385.52MB swap 14395.52MB
04.08.2020 20:42:28 | Milkyway@Home | task de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 suspended by user
04.08.2020 20:42:29 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.547, kernel CPU 71.594
04.08.2020 20:42:29 |  | [mem_usage] BOINC totals: WS 2008.83MB, smoothed 2004.52MB, swap 2670.34MB, 0.00 page faults/sec
04.08.2020 20:42:29 |  | [mem_usage] All others: WS 4481.16MB, swap 2952.17MB, user 27738.344s, kernel 43069.109s
04.08.2020 20:42:29 |  | [mem_usage] non-BOINC CPU usage: 8.45%
04.08.2020 20:42:29 |  | [mem_usage] enforce: available RAM 11385.52MB swap 14395.52MB
04.08.2020 20:42:29 | Milkyway@Home | [task] task_state=SUSPENDED for de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 from suspend
04.08.2020 20:42:30 | Milkyway@Home | task de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 resumed by user
04.08.2020 20:42:30 |  | [mem_usage] enforce: available RAM 11385.52MB swap 14395.52MB
04.08.2020 20:42:30 | Milkyway@Home | [task] task_state=EXECUTING for de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 from unsuspend
04.08.2020 20:42:39 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.594, kernel CPU 71.672
04.08.2020 20:42:39 |  | [mem_usage] BOINC totals: WS 2438.57MB, smoothed 2221.54MB, swap 3095.05MB, 0.00 page faults/sec
04.08.2020 20:42:39 |  | [mem_usage] All others: WS 4533.24MB, swap 2968.78MB, user 27739.672s, kernel 43070.578s
04.08.2020 20:42:39 |  | [mem_usage] non-BOINC CPU usage: 6.93%
04.08.2020 20:42:49 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.594, kernel CPU 71.672
04.08.2020 20:42:49 |  | [mem_usage] BOINC totals: WS 2438.57MB, smoothed 2330.06MB, swap 3095.05MB, 0.00 page faults/sec
04.08.2020 20:42:49 |  | [mem_usage] All others: WS 4532.83MB, swap 2969.05MB, user 27741.453s, kernel 43072.031s
04.08.2020 20:42:49 |  | [mem_usage] non-BOINC CPU usage: 8.06%
04.08.2020 20:42:59 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.594, kernel CPU 71.672
04.08.2020 20:42:59 |  | [mem_usage] BOINC totals: WS 2438.57MB, smoothed 2384.31MB, swap 3095.05MB, 0.00 page faults/sec
04.08.2020 20:42:59 |  | [mem_usage] All others: WS 4532.80MB, swap 2968.76MB, user 27743.094s, kernel 43073.344s
04.08.2020 20:42:59 |  | [mem_usage] non-BOINC CPU usage: 7.37%
04.08.2020 20:43:09 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.594, kernel CPU 71.672
04.08.2020 20:43:09 |  | [mem_usage] BOINC totals: WS 2438.57MB, smoothed 2411.44MB, swap 3095.05MB, 0.00 page faults/sec
04.08.2020 20:43:09 |  | [mem_usage] All others: WS 4495.30MB, swap 2954.97MB, user 27744.188s, kernel 43074.328s
04.08.2020 20:43:09 |  | [mem_usage] non-BOINC CPU usage: 5.15%

I have suspended the task for now since running it prevents other tasks from using my CPU.
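
The log above already shows the stall: the task's user and kernel CPU totals stay at 4931.594 and 71.672 across consecutive [mem_usage] samples. Below is a minimal Python sketch of that check. It assumes the line layout shown in this log, which may differ in other BOINC versions, and the function names are made up for illustration.

```python
import re

# Matches the per-task [mem_usage] lines in the log above (assumed layout).
# The "BOINC totals", "All others" and "enforce" lines do not match.
TASK_LINE = re.compile(
    r"\[mem_usage\] (?P<task>\S+): .*?"
    r"user CPU (?P<user>[\d.]+), kernel CPU (?P<kernel>[\d.]+)"
)


def cpu_samples(log_text):
    """Return {task_name: [user + kernel CPU seconds per sample, in log order]}."""
    samples = {}
    for line in log_text.splitlines():
        match = TASK_LINE.search(line)
        if match:
            total = float(match["user"]) + float(match["kernel"])
            samples.setdefault(match["task"], []).append(total)
    return samples


def stalled_tasks(log_text, min_samples=3):
    """Names of tasks whose last `min_samples` CPU readings are identical."""
    return [
        task
        for task, totals in cpu_samples(log_text).items()
        if len(totals) >= min_samples and len(set(totals[-min_samples:])) == 1
    ]
```

Pasting the Event Log text into stalled_tasks() would list this de_nbody task, since its CPU time does not advance between 20:42:39 and 20:43:09.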
ID: 70014 · Rating: 0
Aberforth

Joined: 25 Jul 20
Posts: 3
Credit: 31,961
RAC: 0
Message 70016 - Posted: 5 Aug 2020, 7:07:35 UTC

Never mind, I found the fix while browsing this forum: I just had to suspend the task, restart BOINC and unsuspend the task, as described in the thread "problem with de_nbody tasks never finishing". The remaining time is now at 2 hours. Next time I'll browse the forums more before asking.

"Have you tried turning it off and on again?" proven true once more ...
ID: 70016 · Rating: 0
Jackie Kudron

Joined: 8 Dec 20
Posts: 3
Credit: 867,984
RAC: 0
Message 70307 - Posted: 4 Jan 2021, 7:23:11 UTC - in response to Message 70012.  

I have a similar situation. Any task that requires multiple CPUs locks up mid-processing, and the only way to clear out the machine is to abort ALL multi-CPU tasks. Aborting them all is very annoying and time-consuming, not to mention the lost data that researchers are missing out on. My only current workaround is to set the project to no new tasks, change the computer settings to use only 10% of the processors, and then allow new tasks so that only single-CPU tasks download. Once they have downloaded, I change back to no new tasks and restore my CPU usage to 90%.
ID: 70307 · Rating: 0
Jackie Kudron

Joined: 8 Dec 20
Posts: 3
Credit: 867,984
RAC: 0
Message 70308 - Posted: 4 Jan 2021, 7:24:56 UTC - in response to Message 70016.  

Didn't work for my problem. Still stuck every time.
ID: 70308 · Rating: 0
mikey

Joined: 8 May 09
Posts: 3315
Credit: 519,950,216
RAC: 22,077
Message 70309 - Posted: 4 Jan 2021, 12:53:42 UTC - in response to Message 70307.  

Actually, aborting tasks just puts them back in the queue for someone else to download.

You could write an app_config.xml file to limit the CPU cores available to a particular project, i.e. MilkyWay, rather than going through all that.
ID: 70309 · Rating: 0
Jackie Kudron

Joined: 8 Dec 20
Posts: 3
Credit: 867,984
RAC: 0
Message 70333 - Posted: 10 Jan 2021, 22:09:43 UTC - in response to Message 70309.  

Write a what????
ID: 70333 · Rating: 0
mikey

Joined: 8 May 09
Posts: 3315
Credit: 519,950,216
RAC: 22,077
Message 70334 - Posted: 11 Jan 2021, 12:03:16 UTC - in response to Message 70333.  

An app_config.xml file tells BOINC how to run a particular project's applications. Some projects let you set this on their websites; others, like MilkyWay, do not, so app_config files can come in handy. This one is not new. I've had it for years, so it may not work, but it should:

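<app_config>
  <!-- GPU app: each task uses 0.5 GPU, so two tasks share one GPU -->
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
  <!-- N-body app: run at most one task at a time
       (max_concurrent belongs in an <app> block, not in <app_version>) -->
  <app>
    <name>milkyway_nbody</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <!-- Multithreaded (mt) N-body version: use 2 CPU cores per task -->
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>2</avg_ncpus>
    <cmdline>--nthreads 2</cmdline>
  </app_version>
</app_config>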

It tells BOINC to run 2 MilkyWay tasks at the same time on your GPU (graphics card) and to use only 2 CPU cores for the NBody tasks. It only uses the parts it needs, so if you are not running any GPU tasks it will pop up a small error saying 'unknown task type'. That just means there aren't any on the machine, so you can ignore it.

The NBody part tells BOINC to run only one task at a time and to use only 2 CPU cores to do it. Your tasks WILL take longer, so be VERY careful with your cache size: it's very easy to get way more tasks than you can finish before they expire. I suggest cutting back to no more than a 1 and 1 setting (1 day of work plus 1 additional day in the settings) and seeing how it goes from there.

You can change it to use 3 CPU cores, or even 1, by changing the number in both the <avg_ncpus> line and the '<cmdline>--nthreads' line and then saving the file.
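
Since the two numbers have to be changed together, here is a small Python sketch that updates both in one step. It is only an illustration: set_nbody_threads is a made-up helper name, and it assumes the file has the same element layout as the one above.

```python
import re
import xml.etree.ElementTree as ET


def set_nbody_threads(xml_text, threads, app_name="milkyway_nbody"):
    """Return app_config XML with <avg_ncpus> and --nthreads both set to `threads`."""
    root = ET.fromstring(xml_text)
    for version in root.iter("app_version"):
        # Only touch the <app_version> block for the requested application.
        if version.findtext("app_name", "").strip() != app_name:
            continue
        ncpus = version.find("avg_ncpus")
        if ncpus is not None:
            ncpus.text = str(threads)
        cmdline = version.find("cmdline")
        if cmdline is not None and cmdline.text:
            # Keep any other command-line options; replace only the thread count.
            cmdline.text = re.sub(
                r"--nthreads\s+\d+", f"--nthreads {threads}", cmdline.text
            )
    return ET.tostring(root, encoding="unicode")
```

For example, set_nbody_threads(text, 3) returns the same configuration with both values set to 3, which avoids the case where only one of the two lines gets edited.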

To use the file, copy and paste the lines above into Notepad in Windows or a text editor in Linux. In Windows, save the file into the folder
c:\program files\Boinc\projects\milkyway.cs.rpi.edu_milkyway (on newer installs the projects folder is usually under C:\ProgramData\BOINC instead). If you are a Linux user you will have to find the folder on your own. Your computers are hidden, so I can't see which OS you are using.

When saving the file, save it as a text-type file with the name app_config.xml, and be sure Windows doesn't tack a .txt onto the end. If it does, just delete the .txt after the .xml.

I don't think this will take effect until you get new tasks for MilkyWay. I think the existing tasks are what they are, but if you are really lucky the very next NBody task you run could pick up the changes.

To make sure BOINC sees the file, open BOINC Manager (down by the clock in Windows), click on the Options tab at the top and then 'Read config files'. This prompts BOINC to log a message, which you can read by going to the Tools tab and clicking on Event Log. It comes up with a very long list of stuff; you only care about the most recent entry, at the bottom of the list, which should say something like this, except for MilkyWay: "1/11/2021 7:00:34 AM | climateprediction.net | Found app_config.xml". Remember that yours could also show the unknown-type error, which is normal since you aren't running the GPU tasks.
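
The two most common slips here are a stray .txt on the file name and a copy-paste that breaks the XML. A rough Python sketch for checking both before asking BOINC to read the file follows. The function name check_app_config is hypothetical, and the folder you pass in is whatever your own project folder turns out to be.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_app_config(project_dir):
    """Return a short status string for app_config.xml in a project folder."""
    folder = Path(project_dir)
    target = folder / "app_config.xml"
    if not target.is_file():
        # Notepad may have saved the file with an extra .txt extension.
        if (folder / "app_config.xml.txt").is_file():
            return "rename app_config.xml.txt to app_config.xml"
        return "app_config.xml not found"
    try:
        root = ET.parse(str(target)).getroot()
    except ET.ParseError as err:
        return f"not well-formed XML: {err}"
    if root.tag != "app_config":
        return f"unexpected root element <{root.tag}>"
    return "ok"
```

This only checks the file name and that the XML parses. It does not tell you whether BOINC accepts every element, so the Event Log is still the place to confirm that.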

After that just wait and see what happens.
ID: 70334 · Rating: 0


©2024 Astroinformatics Group