Huge 4 CPU task stuck at 17.188%, uses 0% of CPU, estimated time to finish 8 days?
Author | Message |
---|---|
Joined: 25 Jul 20 Posts: 3 Credit: 31,961 RAC: 0 |
Hi, this task for 4 CPUs: https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=2000974954 does not use my CPU at all anymore, even when I suspend all other tasks, and appears stuck. But it somehow got to 17.188%. Is my CPU too old for the rest of the task? Should I cancel it? |
Joined: 8 May 09 Posts: 3315 Credit: 519,950,216 RAC: 22,077 |
Hi, suspend every other task on the PC, and set all zero-resource-share projects to no new tasks too so you don't get tasks from them, then see if it starts up again; it could just need more memory with the other things you are running. If that doesn't work, try suspending it too and then resuming it. Your CPU is fine. |
Joined: 25 Jul 20 Posts: 3 Credit: 31,961 RAC: 0 |
That did not work unfortunately: I suspended all other tasks and set them to no new tasks. I also closed every other application that used significant amounts of RAM (Browser, Spotify etc.). It (the process milkyway_nbody_1.76_windows_x86_64__mt.exe ) still uses 0% when supposedly running. Here is the task and memory usage log of that try: 04.08.2020 20:41:14 | Milkyway@Home | task de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 resumed by user 04.08.2020 20:41:14 | Milkyway@Home | [task] task_state=EXECUTING for de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 from unsuspend 04.08.2020 20:42:20 | | Re-reading cc_config.xml 04.08.2020 20:42:20 | | log flags: file_xfer, sched_ops, task, http_debug, mem_usage_debug, task_debug 04.08.2020 20:42:21 | | [mem_usage] enforce: available RAM 11385.52MB swap 14395.52MB 04.08.2020 20:42:24 | | Re-reading cc_config.xml 04.08.2020 20:42:24 | | log flags: file_xfer, sched_ops, task, http_debug, mem_usage_debug, task_debug 04.08.2020 20:42:24 | | [mem_usage] enforce: available RAM 11385.52MB swap 14395.52MB 04.08.2020 20:42:28 | Milkyway@Home | task de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 suspended by user 04.08.2020 20:42:29 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.547, kernel CPU 71.594 04.08.2020 20:42:29 | | [mem_usage] BOINC totals: WS 2008.83MB, smoothed 2004.52MB, swap 2670.34MB, 0.00 page faults/sec 04.08.2020 20:42:29 | | [mem_usage] All others: WS 4481.16MB, swap 2952.17MB, user 27738.344s, kernel 43069.109s 04.08.2020 20:42:29 | | [mem_usage] non-BOINC CPU usage: 8.45% 04.08.2020 20:42:29 | | [mem_usage] enforce: available RAM 11385.52MB swap 14395.52MB 04.08.2020 20:42:29 | Milkyway@Home | [task] task_state=SUSPENDED for de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 from suspend 04.08.2020 20:42:30 | Milkyway@Home | task 
de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 resumed by user 04.08.2020 20:42:30 | | [mem_usage] enforce: available RAM 11385.52MB swap 14395.52MB 04.08.2020 20:42:30 | Milkyway@Home | [task] task_state=EXECUTING for de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1 from unsuspend 04.08.2020 20:42:39 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.594, kernel CPU 71.672 04.08.2020 20:42:39 | | [mem_usage] BOINC totals: WS 2438.57MB, smoothed 2221.54MB, swap 3095.05MB, 0.00 page faults/sec 04.08.2020 20:42:39 | | [mem_usage] All others: WS 4533.24MB, swap 2968.78MB, user 27739.672s, kernel 43070.578s 04.08.2020 20:42:39 | | [mem_usage] non-BOINC CPU usage: 6.93% 04.08.2020 20:42:49 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.594, kernel CPU 71.672 04.08.2020 20:42:49 | | [mem_usage] BOINC totals: WS 2438.57MB, smoothed 2330.06MB, swap 3095.05MB, 0.00 page faults/sec 04.08.2020 20:42:49 | | [mem_usage] All others: WS 4532.83MB, swap 2969.05MB, user 27741.453s, kernel 43072.031s 04.08.2020 20:42:49 | | [mem_usage] non-BOINC CPU usage: 8.06% 04.08.2020 20:42:59 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 4931.594, kernel CPU 71.672 04.08.2020 20:42:59 | | [mem_usage] BOINC totals: WS 2438.57MB, smoothed 2384.31MB, swap 3095.05MB, 0.00 page faults/sec 04.08.2020 20:42:59 | | [mem_usage] All others: WS 4532.80MB, swap 2968.76MB, user 27743.094s, kernel 43073.344s 04.08.2020 20:42:59 | | [mem_usage] non-BOINC CPU usage: 7.37% 04.08.2020 20:43:09 | Milkyway@Home | [mem_usage] de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1: WS 5.89MB, smoothed 5.89MB, swap 11.84MB, 0.00 page faults/sec, user CPU 
4931.594, kernel CPU 71.672 04.08.2020 20:43:09 | | [mem_usage] BOINC totals: WS 2438.57MB, smoothed 2411.44MB, swap 3095.05MB, 0.00 page faults/sec 04.08.2020 20:43:09 | | [mem_usage] All others: WS 4495.30MB, swap 2954.97MB, user 27744.188s, kernel 43074.328s 04.08.2020 20:43:09 | | [mem_usage] non-BOINC CPU usage: 5.15% I have suspended the task for now since running it prevents other tasks from using my CPU. |
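The log above already shows the symptom in numbers: the task's `user CPU` and `kernel CPU` counters stop advancing between samples. As a rough sketch only, the check below pulls those counters out of the per-task `[mem_usage]` lines and flags a task whose last few readings are identical. The function names, the regular expression and the three-sample threshold are assumptions made for illustration; they are not part of BOINC.

```python
import re

# Per-task [mem_usage] lines, in the shape they appear in the log above.
# "BOINC totals" and "All others" lines do not match because their labels
# contain a space before the colon.
TASK_LINE = re.compile(
    r"\[mem_usage\]\s+(?P<task>\S+):\s+WS\b.*?"
    r"user CPU\s+(?P<user>\d+(?:\.\d+)?),\s+"
    r"kernel CPU\s+(?P<kernel>\d+(?:\.\d+)?)",
    re.S,
)


def cpu_samples(log_text, task_name):
    """Return (user, kernel) CPU-second pairs for task_name, in log order."""
    return [
        (float(m.group("user")), float(m.group("kernel")))
        for m in TASK_LINE.finditer(log_text)
        if m.group("task") == task_name
    ]


def looks_stalled(samples, min_samples=3):
    """True if the last min_samples readings are all identical.

    min_samples is a hypothetical threshold; tune it to the sampling interval.
    """
    if len(samples) < min_samples:
        return False
    tail = samples[-min_samples:]
    return all(sample == tail[0] for sample in tail)


if __name__ == "__main__":
    task = "de_nbody_07_29_2020_v176_40k__data__1_1596028202_53172_1"
    # Three readings copied from the log above.
    log = " ".join(
        f"| Milkyway@Home | [mem_usage] {task}: WS 5.89MB, smoothed 5.89MB, "
        f"swap 11.84MB, 0.00 page faults/sec, user CPU {user}, kernel CPU {kernel}"
        for user, kernel in [("4931.594", "71.672")] * 3
    )
    print(looks_stalled(cpu_samples(log, task)))
```

A task that is genuinely computing should show `user CPU` climbing by several seconds per sample on each core it uses, so identical readings over consecutive samples are a reasonable hint that it is hung rather than merely slow.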
Joined: 25 Jul 20 Posts: 3 Credit: 31,961 RAC: 0 |
Never mind, I found the fix when browsing this forum: I just had to suspend the task, restart BOINC and unsuspend the task, as described in "problem with de_nbody tasks never finishing". The remaining time is now at 2 hours. Next time I'll browse the forums more before asking. "Have you tried turning it off and on again?" proven true once more ... |
Joined: 8 Dec 20 Posts: 3 Credit: 867,984 RAC: 0 |
I have a similar situation. Any task that requires multiple CPUs gets locked up mid-processing, and the only way to clear out the machine is to abort ALL multi-CPU tasks. Aborting them all is very annoying and time consuming, not to mention the lost DATA that researchers are missing out on. My only current workaround is to toggle the no new tasks setting: I change the computer settings to use only 10% of the processors, then download only single-CPU tasks. Once they are downloaded, I change back to no new tasks and restore my CPU usage to 90%. |
Joined: 8 Dec 20 Posts: 3 Credit: 867,984 RAC: 0 |
Didn't work for my problem. Still stuck every time. |
Joined: 8 May 09 Posts: 3315 Credit: 519,950,216 RAC: 22,077 |
"I have a similar situation. Any task that requires multiple CPUs gets locked up mid-processing, and the only way to clear out the machine is to abort ALL multi-CPU tasks. Aborting them all is very annoying and time consuming, not to mention the lost DATA that researchers are missing out on. My only current workaround is to toggle the no new tasks setting: I change the computer settings to use only 10% of the processors, then download only single-CPU tasks. Once they are downloaded, I change back to no new tasks and restore my CPU usage to 90%." Actually, aborting tasks just puts them back in the queue for someone else to download. You could write an app_config.xml file to limit the CPU cores available to a particular project, i.e. MilkyWay, rather than go through all that. |
Joined: 8 Dec 20 Posts: 3 Credit: 867,984 RAC: 0 |
"You could write an app_config.xml file to limit the CPU cores available to a particular project, i.e. MilkyWay, rather than go through all that." Write a what???? |
Joined: 8 May 09 Posts: 3315 Credit: 519,950,216 RAC: 22,077 |
"You could write an app_config.xml file to limit the CPU cores available to a particular project, i.e. MilkyWay, rather than go through all that." An app_config.xml file tells BOINC what to do and how to do it for a particular project. Some projects let you set this on their websites; others, like MilkyWay, do not, so app_config files can come in handy. This one is not a new one, I've had it for years, and it may not work, but it should: <app_config> <app> <name>milkyway</name> <gpu_versions> <gpu_usage>0.5</gpu_usage> <cpu_usage>0.5</cpu_usage> </gpu_versions> </app> <app> <name>milkyway_nbody</name> <max_concurrent>1</max_concurrent> </app> <app_version> <app_name>milkyway_nbody</app_name> <plan_class>mt</plan_class> <avg_ncpus>2</avg_ncpus> <cmdline>--nthreads 2</cmdline> </app_version> </app_config> What it does is tell MilkyWay to run 2 tasks at the same time on your GPU (graphics card) and to only use 2 CPU cores to process the NBody tasks. It will only use the parts it needs, so if you are not running any GPU tasks it will pop up a small error saying 'unknown task type'. Duh, there aren't any on the machine, so you can just ignore it. What the NBody part tells BOINC is to only run one task at a time and to only use 2 CPU cores to do it. Your tasks WILL take longer, so be VERY careful of your cache size, as it's very easy to get waaaay more tasks than you can finish before they expire. I suggest cutting back to no more than a 1 and 1 setting and seeing how it goes from there; that's 1 day and 1 additional day in the settings. You can change it to use 3 CPU cores or even 1 CPU core just by changing the number in the <avg_ncpus> line and in the '<cmdline>--nthreads' line, then saving the file. To use the file, copy and paste the lines above into Notepad in Windows or a text editor in Linux. In Windows, save the file into the folder c:\program files\Boinc\projects\milkyway.cs.rpi.edu_milkyway (on newer BOINC installs the projects folder is usually under c:\ProgramData\BOINC instead). If you are a Linux user you will have to find it on your own.
Your computers are hidden, so I can't see them to know which OS you are using. When saving the file, save it as a text-type file with the name app_config.xml, and be sure Windows doesn't tack on a .txt when you save it. If it does, just delete the .txt after the .xml. I don't think this will take effect until you get new tasks for MilkyWay; I think the existing tasks are what they are, but if you are really lucky the very next NBody task you run could pick up the changes. To ensure BOINC sees the file, open BOINC Manager (down by the clock in Windows), click on the Options tab at the top and then 'Read config files'. This will prompt BOINC to give you a message, which you can read by going to the Tools tab and clicking on Event Log. It should come up with a very long list of stuff; you only care about the most recent entry, which is at the bottom of the list, saying something like this, except for MilkyWay: "1/11/2021 7:00:34 AM | climateprediction.net | Found app_config.xml". Remember, yours could give an unknown file type found, which is normal since you aren't running the GPU tasks. After that just wait and see what happens. |
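Since changing the core count means editing two numbers that must stay in step (`<avg_ncpus>` and `--nthreads`), a small generator can keep them in sync. The following is only a sketch: `build_app_config` is a hypothetical helper, not a BOINC tool, and it covers just the NBody part of the file. It places `<max_concurrent>` in an `<app>` block, which is where BOINC's documented app_config format lists that option, with the thread settings in `<app_version>`.

```python
import xml.etree.ElementTree as ET


def build_app_config(nthreads, max_concurrent=1):
    """Return app_config.xml text limiting milkyway_nbody to nthreads cores.

    Hypothetical helper for illustration; the app name and plan class are
    the ones used in the post above.
    """
    if nthreads < 1 or max_concurrent < 1:
        raise ValueError("nthreads and max_concurrent must be at least 1")

    root = ET.Element("app_config")

    # How many NBody tasks may run at once.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = "milkyway_nbody"
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # How many cores each multithreaded (mt) NBody task is given.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = "milkyway_nbody"
    ET.SubElement(version, "plan_class").text = "mt"
    ET.SubElement(version, "avg_ncpus").text = str(nthreads)
    ET.SubElement(version, "cmdline").text = f"--nthreads {nthreads}"

    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or later
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the text; save it yourself as app_config.xml in the project folder.
    print(build_app_config(nthreads=2))
```

Printing rather than writing the file is deliberate, because the project folder location differs between installs. After saving the output as app_config.xml, use 'Read config files' as described above so BOINC picks it up.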
©2024 Astroinformatics Group