Welcome to MilkyWay@home

Some tasks stalling

Questions and Answers : Windows : Some tasks stalling
Ken

Joined: 18 Aug 20
Posts: 4
Credit: 48,409,450
RAC: 48,097
Message 76320 - Posted: 2 Aug 2023, 17:07:13 UTC - in response to Message 76319.  

As a follow-up to my previous reply, here are the logs from the issues I have been experiencing today:

8/1/2023 12:52:10 PM | | - Store up to an additional 0.50 days of work
8/1/2023 12:52:10 PM | | - max disk usage: 321.13 GB
8/1/2023 12:52:10 PM | | - (to change preferences, visit a project web site or select Preferences in the Manager)
8/1/2023 1:41:44 PM | Milkyway@Home | Computation for task de_nbody_02_27_2023_v182_pal5__data__10_1688749648_608890_1 finished
8/1/2023 1:41:44 PM | Milkyway@Home | Starting task de_nbody_02_27_2023_v182_pal5__data__12_1688749648_617140_1
8/1/2023 2:42:23 PM | Milkyway@Home | Sending scheduler request: To report completed tasks.
8/1/2023 2:42:23 PM | Milkyway@Home | Reporting 1 completed tasks
8/1/2023 2:42:23 PM | Milkyway@Home | Requesting new tasks for CPU
8/1/2023 2:42:24 PM | Milkyway@Home | Scheduler request completed: got 4 new tasks
8/1/2023 2:42:24 PM | Milkyway@Home | Project requested delay of 91 seconds
8/2/2023 10:32:51 AM | | Suspending computation - user request
8/2/2023 10:32:54 AM | | Resuming computation
8/2/2023 10:38:36 AM | Milkyway@Home | Computation for task de_nbody_02_27_2023_v182_pal5__data__12_1688749648_617140_1 finished
8/2/2023 10:38:36 AM | Milkyway@Home | Starting task de_nbody_02_27_2023_v182_pal5__data__11_1688749648_606539_1
8/2/2023 11:39:11 AM | Milkyway@Home | Sending scheduler request: To report completed tasks.
8/2/2023 11:39:11 AM | Milkyway@Home | Reporting 1 completed tasks
8/2/2023 11:39:11 AM | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: )
8/2/2023 11:39:12 AM | Milkyway@Home | Scheduler request completed
8/2/2023 11:39:12 AM | Milkyway@Home | Project requested delay of 91 seconds
8/2/2023 12:23:20 PM | | Suspending computation - user request (At this point I noticed the processing stalled)
8/2/2023 12:23:29 PM | | Resuming computation (I re-enabled the processing per my preferences)
8/2/2023 12:32:47 PM | Milkyway@Home | Computation for task de_nbody_02_27_2023_v182_pal5__data__11_1688749648_606539_1 finished
8/2/2023 12:32:47 PM | Milkyway@Home | Starting task de_nbody_02_27_2023_v182_pal5__data__12_1688749648_617039_1
8/2/2023 12:36:40 PM | | Suspending computation - user request (At this point I noticed the processing stalled)
8/2/2023 12:39:16 PM | | Resuming computation (I re-enabled the processing per my preferences)
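One way to make the stall visible in a log like this is to measure the wall-clock time between each "Starting task" line and the matching "finished" line. The following is only a sketch: the function name `task_wall_times` is made up for illustration, the timestamp layout is assumed to match the lines pasted above, and the result is elapsed wall time (including any time the task sat stalled or suspended), not CPU time.

```python
from datetime import datetime

# Timestamp layout assumed from the log lines above, e.g. "8/1/2023 1:41:44 PM".
STAMP = "%m/%d/%Y %I:%M:%S %p"
START = "Starting task "
DONE_PREFIX = "Computation for task "
DONE_SUFFIX = " finished"


def task_wall_times(lines):
    """Return {task_name: seconds between its start line and its finished line}."""
    started, elapsed = {}, {}
    for line in lines:
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue  # not a "time | project | message" line
        try:
            when = datetime.strptime(parts[0], STAMP)
        except ValueError:
            continue  # first field is not a timestamp
        message = parts[2]
        if message.startswith(START):
            started[message[len(START):]] = when
        elif message.startswith(DONE_PREFIX) and message.endswith(DONE_SUFFIX):
            name = message[len(DONE_PREFIX):-len(DONE_SUFFIX)]
            if name in started:
                elapsed[name] = (when - started[name]).total_seconds()
    return elapsed


# Two lines copied from the log above.
log = [
    "8/1/2023 1:41:44 PM | Milkyway@Home | Starting task de_nbody_02_27_2023_v182_pal5__data__12_1688749648_617140_1",
    "8/2/2023 10:38:36 AM | Milkyway@Home | Computation for task de_nbody_02_27_2023_v182_pal5__data__12_1688749648_617140_1 finished",
]
for name, seconds in task_wall_times(log).items():
    print(f"{name}: {seconds / 3600:.1f} h wall time")
```

A task whose wall time is far longer than its neighbours, with a manual suspend/resume shortly before it finishes, matches the stall pattern described in this thread.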

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76321 - Posted: 2 Aug 2023, 21:23:34 UTC - in response to Message 76319.  

I am also experiencing the stalling of N-Body simulation tasks. When I reboot my Win11 HP laptop with 16 GB of memory and a 1TB SSD, it will run for maybe 5 to 30 minutes and then it appears to stall, and the number in the progress column never changes. I can wait hours and it never changes. I resolve the "Progress" issue temporarily by
1) Restarting my computer
2) Closing and opening the BOINC Manager application
OR
3) Going to Activity on the menu bar and selecting "Suspend" for the CPU section. All tasks then report a status of "Suspended - user request (8 CPUs)". When I then select "Run always" or "Run based on preferences" under Activity, the numbers in the progress column start counting up again. If I wait again, after anywhere from 5 to 30 minutes the N-Body simulation task stalls and ceases to progress.
4) When I go into Windows Task Manager and look at the BOINC Manager process when the processing stops, the CPU use is zero (0) or maybe 0.1%. Once I "Suspend" the task and select "Run based on preferences" under Activity on the menu bar, the task starts counting up and the CPU usage for BOINC Manager returns to 40% (which is what I have my preferences set to).

It almost seems like, for whatever reason, 1) A setting needs to be changed somewhere, 2) There is some sort of a memory leak in the application, or 3) The system runs out of resources and can't continue until a reboot, a BOINC Manager restart, or the task being suspended and re-enabled.

Does any of this make sense?


How many cpu cores is the task using when it's running?

Link
Joined: 19 Jul 10
Posts: 623
Credit: 19,258,826
RAC: 349
Message 76322 - Posted: 3 Aug 2023, 12:02:21 UTC - in response to Message 76319.  
Last modified: 3 Aug 2023, 12:04:30 UTC

1) A setting needs to be changed somewhere
Probably. As has been pointed out in this thread, N-Body tasks don't like getting less than 100% of CPU time, so if you use less, set it to 100% and see if that solves the issue. For temperature management, use fewer cores.

Ken
Joined: 18 Aug 20
Posts: 4
Credit: 48,409,450
RAC: 48,097
Message 76323 - Posted: 4 Aug 2023, 15:13:09 UTC - in response to Message 76321.  

Requesting 8 CPUs.
Number of CPUs: 1 physical CPU(s); 8 physical cores, 16 logical cores
Name: Ryzen 7 5700U with Radeon Graphics @ 1800 MHz
Clock frequency: 1800 MHz

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76324 - Posted: 4 Aug 2023, 20:43:34 UTC - in response to Message 76323.  

The question is: is it trying to use all 16 CPU cores for a single MilkyWay task?
Look in the BOINC Manager at the center column named Status and tell us what it says, please.

The second thing Link brought up is again in the BOINC Manager, but this time at the top under the Options tab, then Computing Preferences: tell us what the "% of CPU time" is set to for both 'when computer is in use' and 'when computer is not in use'. The last question, for now, is whether you have your settings set to not crunch 'when the computer is in use'.

As an example, my laptop is set to use 'at most 25% of the CPUs' and 'at most 90% of CPU time' both when the computer is in use and when it is not, and I have set it to keep crunching when the computer is in use, as those boxes are unchecked.

MishraMirabai
Joined: 23 Jan 21
Posts: 10
Credit: 277,217
RAC: 0
Message 76365 - Posted: 1 Sep 2023, 21:45:58 UTC - in response to Message 76322.  

This right here was what I needed to hear! I've been having the same problem since last year and actually gave up running BOINC for a while there. I had gotten good advice on how to make an app_config file in my post on the same problem last year, https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4833&postid=71688#71688

Limiting CPUs to 50% helped, but setting CPU time to 100% has entirely solved the stalling N-Body problem.
I suppose it just has to be that way with MilkyWay tasks. Now I'm going to try and figure out if there's a way to set that in a config file so I can bring the global CPU % back down, or vice versa, and use config files to limit other projects.
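For the "config file" route mentioned above, a per-project app_config.xml can cap how many threads one multithreaded application uses while the global CPU time setting stays at 100%. This is a sketch only: the `app_version` element names come from the BOINC client configuration documentation, but the application name `milkyway_nbody`, the plan class `mt`, and the `--nthreads` command-line flag are assumptions based on examples commonly posted for this project, so check them against what the client reports in its event log. The helper name `nbody_app_config` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def nbody_app_config(threads, app_name="milkyway_nbody", plan_class="mt"):
    """Build app_config.xml text that limits one multithreaded app to `threads` CPUs.

    app_name, plan_class and the --nthreads flag are assumptions to verify.
    """
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    # Tell the client how many CPUs to budget for each task of this app...
    ET.SubElement(version, "avg_ncpus").text = str(threads)
    # ...and ask the application itself to start that many threads.
    ET.SubElement(version, "cmdline").text = f"--nthreads {threads}"
    return ET.tostring(root, encoding="unicode")


print(nbody_app_config(4))
```

The printed text would be saved as app_config.xml in the project's folder inside the BOINC data directory. The CPU time percentage itself is a global preference and cannot be set per project this way.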

Anyway, small thing, but your recommendation totally fixed the same problem I was having. I can now leave BOINC to run, and get back into the project.

Sealee
Joined: 15 Jul 20
Posts: 19
Credit: 144,844
RAC: 0
Message 76414 - Posted: 6 Oct 2023, 5:31:21 UTC - in response to Message 76365.  
Last modified: 6 Oct 2023, 6:10:46 UTC

Same here! I've suspended the Universe tasks and restarted BOINC several times since Wednesday. It runs for a bit and then stalls and stops, and it's also extremely slow at doing tasks, despite my giving it 50% CPU and Memory to work with, which still lets me use my computer at the same time. I have a gaming laptop with 16 GB of memory, so it can handle it. With other projects that's enough to get tasks done pretty quickly, well before the deadline. I noticed how slow it was because I just started doing BOINC again.

It's definitely the N-Body tasks doing it the most, but other tasks aren't going as fast as they should be either. I've finished a few of these N-Body tasks so far and it wasn't THIS slow at first, so something on their end is holding it up hugely. I got the others done in less than 24 hours, split across two days, but I think this one is going to take me a few days to complete with how slow it has become.

Edit: It just started going a bit faster after I put free disk space up to 20 GB, and it is downloading slightly more chunks than before, so perhaps the N-Body tasks need more than other tasks. Normally I can get the others done at 5 GB, which was what others recommended last time I checked; tasks from other projects get done really fast with that. It's still not as fast as it should be, though.

Link
Joined: 19 Jul 10
Posts: 623
Credit: 19,258,826
RAC: 349
Message 76415 - Posted: 6 Oct 2023, 8:49:39 UTC - in response to Message 76414.  

Despite giving it 50% CPU
50% of CPUs or 50% of CPU time? The latter is a well-known issue; change to 50% of CPUs and 100% of CPU time. If that doesn't help completely, check also this.
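The two settings being distinguished here are separate local preferences. As a rough sketch, they can be written to a global_prefs_override.xml file in the BOINC data directory, where `max_ncpus_pct` is the percentage of CPUs and `cpu_usage_limit` is the percentage of CPU time. The helper name `prefs_override` is made up for illustration, and changing the same two values in the Manager's Computing preferences dialog should have the same effect.

```python
import xml.etree.ElementTree as ET


def prefs_override(cpus_pct, cpu_time_pct):
    """Build global_prefs_override.xml text with the two CPU-related limits."""
    for value in (cpus_pct, cpu_time_pct):
        if not 0 < value <= 100:
            raise ValueError("percentages must be greater than 0 and at most 100")
    root = ET.Element("global_preferences")
    # "Use at most N% of the CPUs": how many cores the client may occupy.
    ET.SubElement(root, "max_ncpus_pct").text = str(cpus_pct)
    # "Use at most N% of CPU time": throttling by pausing and resuming tasks.
    ET.SubElement(root, "cpu_usage_limit").text = str(cpu_time_pct)
    return ET.tostring(root, encoding="unicode")


# The combination suggested above: half the CPUs, no throttling.
print(prefs_override(50, 100))
```

Keeping `cpu_usage_limit` at 100 avoids the pause/resume throttling that the N-Body tasks reportedly do not tolerate, while `max_ncpus_pct` still leaves cores free for other work.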

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76416 - Posted: 6 Oct 2023, 10:24:34 UTC - in response to Message 76414.  

The BOINC project programmers here are few and far between, and things don't get updated as often as at other projects.

Sealee
Joined: 15 Jul 20
Posts: 19
Credit: 144,844
RAC: 0
Message 76417 - Posted: 6 Oct 2023, 21:22:26 UTC - in response to Message 76415.  
Last modified: 6 Oct 2023, 22:19:58 UTC

I'd never be able to use my computer if I did that. I don't run them when I'm not using it, otherwise I'd have to keep my charger plugged in all the time. Suddenly the speed is back up again to between 20% and just over 21% per hour *shrug*, so I can now get one done in about 5 to 6 hours, or 3 hours one day and 2 or 3 hours the next.

I think what I've been seeing is that it'll speed up and then the speed will fall right back down again. Perhaps it's an issue with BOINC, I don't know, but it may or may not work how it should. I think today is going to be a speed day :) although it still just stalled as I'm typing this. I guess I'll just have to keep suspending it.

Edit: Setting it to 100% CPU Time for just a couple of minutes gave it a boost, and then I put it down to 51% CPU Time just to give it a teeny bit more, and the task, now nearing the end, is suddenly going faster. But I guarantee that when speeds drop again for some reason it won't make a difference, because it'll just drop.

Sealee
Joined: 15 Jul 20
Posts: 19
Credit: 144,844
RAC: 0
Message 76418 - Posted: 8 Oct 2023, 21:58:40 UTC - in response to Message 76417.  
Last modified: 8 Oct 2023, 22:08:57 UTC

So I fixed the stalling by reducing the number of CPU Cores below 50%; I have it set to 35%. I have 8 cores in my 4 CPUs, but it just doesn't seem to like you using even half. I also set CPU Time to 51%, which is keeping it going a little more consistently.

It's just super glitchy, though. None of the settings work how they should, and the speed will continually go up then back down again until you get about 70% of the way through on the N-Body tasks, and then all of a sudden I'm consistently doing 20% to over 21% per hour. So anyone having these problems, just play around with it for your type of system, but keep CPU Cores under 50%. I'd like to give a bit more resources, but it's not accepting it.

Have you guys at Milkyway ever thought of splitting the N-Body tasks in half into two smaller tasks? I think that would help get these done faster if it doesn't have so much data to get through.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76419 - Posted: 9 Oct 2023, 11:32:10 UTC - in response to Message 76418.  

I'm glad it's working for you, and yes, giving it less than 100% of the CPU cores helps with a lot of the problems here!

Have you guys at Milkyway ever thought of splitting the N-Body tasks in half into two smaller tasks? I think that would help get these done faster if it doesn't have so much data to get through.


The admins here pretty much just keep things going, as they are part-time, and the people helping them are grad students who get their degrees and move on. Then they get a new grad student who has to figure out how everything works before anything gets done if something breaks. They haven't updated many things here in a while.

Sealee
Joined: 15 Jul 20
Posts: 19
Credit: 144,844
RAC: 0
Message 76420 - Posted: 10 Oct 2023, 22:36:34 UTC - in response to Message 76419.  
Last modified: 10 Oct 2023, 22:39:18 UTC

:) Milkyway seems to be really sensitive to CPU Time fluctuations too; it's more easily suspended automatically or stalled. It's still going to stall once or twice because of it, but you'll get through the rest without any stalling, even when you're running it while using your computer and giving it all the resources you never use while just doing things on the internet.

I would say it's time for an update, if anyone can get around to it *pray hands*: make it less sensitive to CPU Time fluctuations, if that's not just how it has to be. And if the simulation doesn't need to be done all in one go, maybe they could split the tasks up; that would help computers a bit less powerful than mine do them, and would help on slower internet days, which happen often lately.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76421 - Posted: 11 Oct 2023, 10:42:38 UTC - in response to Message 76420.  

I agree with you, but they have no money to do any upgrades at all. Unfortunately, upgrades would mean someone sitting down with the code who knows both the code itself and a lot of the intricacies of how BOINC works, and no one at MilkyWay seems to have those skills anymore.

Link
Joined: 19 Jul 10
Posts: 623
Credit: 19,258,826
RAC: 349
Message 76423 - Posted: 11 Oct 2023, 18:25:38 UTC - in response to Message 76420.  
Last modified: 11 Oct 2023, 18:27:27 UTC

and maybe if they can split the tasks up if the simulation doesn't need to be done all in one go that would help even a bit less powerful computers than mine do them

No idea what you are using, since your computers are hidden, but my Core 2 Duo didn't have any issues with N-Body, and I don't think there are significant numbers of computers slower than that. The longest task I got so far needed 153269.9 CPU seconds; running on both cores, it completed in less than a day of run time. Not too bad for a ~14-year-old dinosaur. ;-)
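The "less than a day" figure can be checked with a quick calculation. This is a sketch that assumes both cores were busy for the whole run and shared the work evenly, which is an idealisation; the function name `ideal_wall_hours` is made up for illustration.

```python
def ideal_wall_hours(cpu_seconds, cores):
    """Wall-clock hours if the CPU time is spread evenly over `cores` busy cores."""
    return cpu_seconds / cores / 3600


hours = ideal_wall_hours(153269.9, 2)
print(f"{hours:.1f} h")  # a little over 21 hours, so under one day
```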

But yes, they should fix this issue with getting stuck when not getting 100% CPU time; apparently many people with modern computers need this for some reason.

Sealee
Joined: 15 Jul 20
Posts: 19
Credit: 144,844
RAC: 0
Message 76425 - Posted: 12 Oct 2023, 22:41:59 UTC - in response to Message 76421.  
Last modified: 12 Oct 2023, 22:47:51 UTC

That I do know is true, sadly. I guess that's the best we can get it, then: 35% Cores, 51% CPU Time, and no more than 50% Memory. It's much easier running any tasks while you aren't playing games or putting the computer under any heavier usage. Some people choose to run it overnight, but I find I get more tasks done running it while using my computer, and you can keep an eye on it in case it stalls. Although I can tell when it stalls because my fan stops going lol.

I set it to use 80% Memory while my computer isn't in use too, which will give it a short boost while I'm away. You could set your energy saving settings to turn the screen off faster so it knows faster when the computer isn't in use.

Sealee
Joined: 15 Jul 20
Posts: 19
Credit: 144,844
RAC: 0
Message 76428 - Posted: 14 Oct 2023, 6:54:10 UTC

Thank you Mikey for the app_config advice btw. I created two files, for Milkyway and Universe, and that helped quite a bit, so I can now set the preferences to 30% Cores and 100% CPU Time, so it uses only 100% CPU Time of 2 Cores, which is fast enough for doing any tasks. For NFS@home and Rosetta it splits 50% of my CPU between two tasks, and with Milkyway and Universe you can only do 1 task at a time, so it'll make it go along faster than it was.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76429 - Posted: 14 Oct 2023, 11:28:53 UTC - in response to Message 76428.  

You are very welcome, I'm glad it's making things better for you.

Sealee
Joined: 15 Jul 20
Posts: 19
Credit: 144,844
RAC: 0
Message 76430 - Posted: 14 Oct 2023, 19:58:57 UTC - in response to Message 76429.  
Last modified: 14 Oct 2023, 20:13:36 UTC

:D 20% Cores is 1 core, 30% Cores is 2 cores, and 40% Cores is 3 cores, so that means 50% Cores is 4 cores. So the lowest we can go is 20%, which you can do for NFS along with 80% CPU Time because they're smaller tasks. That's why it's better setting it under 50%, because 50% would be using 4 Cores per task.
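The percentages listed above are what you get on a machine that reports 8 logical CPUs if the client multiplies the CPU count by the percentage, drops the fraction, and never goes below one. Both the 8-CPU figure and the rounding rule are assumptions made here to reproduce those numbers, not a statement of how every client version behaves; `usable_cpus` is a made-up name.

```python
def usable_cpus(logical_cpus, percent):
    """CPUs available at a given 'use at most N% of the CPUs' setting.

    Assumes the product is truncated and at least one CPU is always allowed.
    """
    return max(1, int(logical_cpus * percent // 100))


# Reproduce the mapping described above for an assumed 8 logical CPUs.
for percent in (20, 30, 35, 40, 50):
    print(f"{percent}% of 8 CPUs -> {usable_cpus(8, percent)}")
```

Under the same assumption, the 35% setting mentioned earlier in the thread would also give 2 CPUs.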

I'm not sure I've got the Universe name correct, though. I put universe_bh2 in the app_config file but I'm not sure that's correct. Do you have a list of task names for all the projects? I might create one for NFS too, but for both Rosetta and NFS the preferences on the websites adjust almost correctly to what you set, apart from spiking a bit higher than where it's supposed to be, so if it spikes too much I'll probably create config files for those too.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76431 - Posted: 15 Oct 2023, 11:11:07 UTC - in response to Message 76430.  

If you put the wrong name in the app_config file, the Project will send a message listing the allowed names you can use, so you can use your MilkyWay one at Universe and it will come back with the acceptable names it's looking for.

For a 4-core PC, the numbers should be: 25% and below is 1 CPU core, 26% to 50% is 2 cores, 51% to 99% is 3 cores and 100% is 4 cores, so the range for 3 CPU cores runs from 51% all the way to 99%.

I have PCs with 8, 16 and even 32 cores, and I use an app_config.xml file to limit the number of tasks that can run at one time instead of limiting the total number of CPU cores in the BOINC Manager or on the website. It lets me keep my setting at 99% in the BOINC Manager or on the website (I always keep 1 CPU core free for GPU usage, and so it doesn't take forever to do anything when I check on how the PC is doing), and then I fine-tune BOINC itself through the app_config files I put in each Project folder. This lets me run 1 CPU task from this Project, 4 CPU tasks from that Project and all the rest of my CPU cores on a different Project. I do this because some tasks take a lot of memory when they run, e.g. the Yoyo ECMp2 tasks take 8 GB of RAM each, so an 8-core PC with only 16 GB of RAM can't run 7 of those and a GPU project on the standalone GPU in the box all at the same time. So I limit Yoyo to 1 task at a time and then run 6 CPU tasks from another Project that do not take a lot of RAM each.
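The per-project limits described above can be sketched as an app_config.xml that caps concurrent tasks per application, or for the whole project. The `max_concurrent` and `project_max_concurrent` elements are from the BOINC client configuration documentation; the application name `example_app` below is a placeholder, not a real project's application name, and `concurrency_app_config` is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def concurrency_app_config(per_app_limits, project_limit=None):
    """Build app_config.xml text limiting how many tasks run at the same time."""
    root = ET.Element("app_config")
    for name, limit in per_app_limits.items():
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        ET.SubElement(app, "max_concurrent").text = str(limit)
    if project_limit is not None:
        # Optional cap across every application of this one project.
        ET.SubElement(root, "project_max_concurrent").text = str(project_limit)
    return ET.tostring(root, encoding="unicode")


# "example_app" is a placeholder; use the name the project actually accepts.
print(concurrency_app_config({"example_app": 1}))
```

One such file goes in each project's folder, so a memory-hungry project can be held to one task while another project fills the remaining cores.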

If you go into the BOINC Manager and click once on a running task to highlight it, then go over to the left and click the button labelled 'Properties', that will tell you how much RAM that task is using. Most tasks of the same kind from the same Project will take about the same amount of RAM to run. When deciding how many tasks you can run at one time, use the larger of the two RAM figures listed.
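Once the per-task RAM figure is known, the number of tasks that fit is a simple division. This is a sketch: the function name `tasks_that_fit` and the 4 GB held back for the operating system and other programs are illustrative assumptions, not figures from this thread.

```python
def tasks_that_fit(total_ram_gb, per_task_gb, reserved_gb=0.0):
    """How many tasks of a given size fit in RAM after holding some back."""
    usable = total_ram_gb - reserved_gb
    if usable <= 0 or per_task_gb <= 0:
        return 0
    return int(usable // per_task_gb)


# 8 GB tasks on a 16 GB machine, as in the example above.
print(tasks_that_fit(16, 8))                 # nothing held back
print(tasks_that_fit(16, 8, reserved_gb=4))  # assumed 4 GB kept for the system
```

With memory held back for the system, only one 8 GB task fits on a 16 GB machine, which is consistent with limiting such a project to one task at a time.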

©2024 Astroinformatics Group