Questions and Answers :
Windows :
Some tasks stalling
Author | Message |
---|---|
Joined: 18 Aug 20 Posts: 4 Credit: 48,409,450 RAC: 48,097 |
As a follow-up to my previous reply, here are the logs from the issues I have been experiencing today:
8/1/2023 12:52:10 PM | | - Store up to an additional 0.50 days of work
8/1/2023 12:52:10 PM | | - max disk usage: 321.13 GB
8/1/2023 12:52:10 PM | | - (to change preferences, visit a project web site or select Preferences in the Manager)
8/1/2023 1:41:44 PM | Milkyway@Home | Computation for task de_nbody_02_27_2023_v182_pal5__data__10_1688749648_608890_1 finished
8/1/2023 1:41:44 PM | Milkyway@Home | Starting task de_nbody_02_27_2023_v182_pal5__data__12_1688749648_617140_1
8/1/2023 2:42:23 PM | Milkyway@Home | Sending scheduler request: To report completed tasks.
8/1/2023 2:42:23 PM | Milkyway@Home | Reporting 1 completed tasks
8/1/2023 2:42:23 PM | Milkyway@Home | Requesting new tasks for CPU
8/1/2023 2:42:24 PM | Milkyway@Home | Scheduler request completed: got 4 new tasks
8/1/2023 2:42:24 PM | Milkyway@Home | Project requested delay of 91 seconds
8/2/2023 10:32:51 AM | | Suspending computation - user request
8/2/2023 10:32:54 AM | | Resuming computation
8/2/2023 10:38:36 AM | Milkyway@Home | Computation for task de_nbody_02_27_2023_v182_pal5__data__12_1688749648_617140_1 finished
8/2/2023 10:38:36 AM | Milkyway@Home | Starting task de_nbody_02_27_2023_v182_pal5__data__11_1688749648_606539_1
8/2/2023 11:39:11 AM | Milkyway@Home | Sending scheduler request: To report completed tasks.
8/2/2023 11:39:11 AM | Milkyway@Home | Reporting 1 completed tasks
8/2/2023 11:39:11 AM | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: )
8/2/2023 11:39:12 AM | Milkyway@Home | Scheduler request completed
8/2/2023 11:39:12 AM | Milkyway@Home | Project requested delay of 91 seconds
8/2/2023 12:23:20 PM | | Suspending computation - user request (At this point I noticed the processing stalled)
8/2/2023 12:23:29 PM | | Resuming computation (I re-enabled the processing per my preferences)
8/2/2023 12:32:47 PM | Milkyway@Home | Computation for task de_nbody_02_27_2023_v182_pal5__data__11_1688749648_606539_1 finished
8/2/2023 12:32:47 PM | Milkyway@Home | Starting task de_nbody_02_27_2023_v182_pal5__data__12_1688749648_617039_1
8/2/2023 12:36:40 PM | | Suspending computation - user request (I am stating at this point I noticed the processing stalled)
8/2/2023 12:39:16 PM | | Resuming computation (I re-enabled the processing per my preferences) |
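For anyone who wants to check their own Event Log for unusually long tasks, here is a minimal Python sketch. It assumes the log lines have the same `timestamp | project | message` shape as the ones pasted above, with US-style timestamps; the function names `parse_line` and `task_durations` are made up for this example and are not part of BOINC.

```python
from datetime import datetime, timedelta

# Timestamp layout as it appears in the pasted log, e.g. "8/2/2023 10:38:36 AM".
STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"
START_PREFIX = "Starting task "
DONE_PREFIX = "Computation for task "
DONE_SUFFIX = " finished"


def parse_line(line):
    """Split one log line into (timestamp, project, message), or None if it does not fit."""
    parts = [part.strip() for part in line.split("|", 2)]
    if len(parts) != 3:
        return None
    try:
        stamp = datetime.strptime(parts[0], STAMP_FORMAT)
    except ValueError:
        return None
    return stamp, parts[1], parts[2]


def task_durations(lines):
    """Return {task_name: wall-clock time between its start and finish messages}."""
    started = {}
    durations = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, _project, message = parsed
        if message.startswith(START_PREFIX):
            started[message[len(START_PREFIX):]] = stamp
        elif message.startswith(DONE_PREFIX) and message.endswith(DONE_SUFFIX):
            name = message[len(DONE_PREFIX):-len(DONE_SUFFIX)]
            if name in started:
                durations[name] = stamp - started[name]
    return durations


if __name__ == "__main__":
    sample = [
        "8/2/2023 10:38:36 AM | Milkyway@Home | Starting task de_nbody_02_27_2023_v182_pal5__data__11_1688749648_606539_1",
        "8/2/2023 12:23:20 PM | | Suspending computation - user request",
        "8/2/2023 12:32:47 PM | Milkyway@Home | Computation for task de_nbody_02_27_2023_v182_pal5__data__11_1688749648_606539_1 finished",
    ]
    for name, elapsed in task_durations(sample).items():
        print(name, elapsed)
```

This only measures wall-clock time between the two messages, so a long gap can mean a stall or simply a suspended machine; it is a way to spot candidates, not a diagnosis.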
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
I am also experiencing the stalling of N-Body simulation tasks. When I reboot my Win11 HP laptop with 16M of memory and a 1TB SSD it will run for maybe 5 min - 30 min and then it appears to stall and the number in the progress column then never changes. I can wait hours and it never changes. I resolve the "Progress" issue temporarily by
How many CPU cores is the task using when it's running? |
Joined: 19 Jul 10 Posts: 623 Credit: 19,258,826 RAC: 349 |
1) A setting needs to be changed somewhere
Probably. As has been pointed out in this thread, N-Body tasks don't like using less than 100% of CPU time, so if you use less, set it to 100% and see if that solves the issue. For temperature management, use fewer cores. |
Joined: 18 Aug 20 Posts: 4 Credit: 48,409,450 RAC: 48,097 |
Requesting 8 CPUs.
Number of CPUs: 1 physical CPU(s); 8 physical cores, 16 logical cores
Name: Ryzen 7 5700U with Radeon Graphics @ 1800 MHz
Clock frequency: 1800 MHz |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Requesting 8 CPUs.
The question is: is it trying to use all 16 CPU cores for a single MilkyWay task? Look in the BOINC Manager, in the center column named Status, and tell us what it says, please. The second thing, which Link brought up, is also in the BOINC Manager: at the top, under the Options tab, open Computing Preferences and tell us what the "% of CPU time" is set to for both 'when computer is in use' and 'when computer is not in use'. The last question, for now: do you have your settings set to not crunch 'when the computer is in use'? As an example, my laptop is set to use 'at most 25% of the CPUs' and 'at most 90% of CPU time' both when the computer is in use and when it is not in use, and I have set it to keep crunching when the computer is in use, as those boxes are unchecked. |
Joined: 23 Jan 21 Posts: 10 Credit: 277,217 RAC: 0 |
This right here was what I needed to hear! I've been having the same problem since last year, and actually gave up running BOINC for a while there. I had gotten good advice on how to make an app_config file in my post on the same problem last year, https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4833&postid=71688#71688 . Limiting CPUs to 50% helped, but setting CPU time to 100% has entirely solved the stalling N-Body problem. I suppose it just has to be that way with Milkyway tasks. Now I'm going to try to figure out if there's a way to set that in a config file so I can bring the global CPU % back down, or vice versa, and use config files to limit other projects. Anyway, small thing, but your recommendation totally fixed the same problem I was having. I can now leave BOINC to run and get back into the project. |
Joined: 15 Jul 20 Posts: 19 Credit: 144,844 RAC: 0 |
Same here! I've suspended the Universe tasks and restarted BOINC several times since Wednesday; it runs for a bit and then stalls and stops, and it's also extremely slow in doing tasks. That's despite giving it 50% CPU and memory to work with, which still lets me use my computer at the same time; I have a gaming laptop with 16GB of memory, so it can handle it. With other projects that's enough to get tasks done pretty quickly, well before the deadline. I noticed how slow it was since I just started doing BOINC again. It's definitely the N-Body tasks doing it the most, but other tasks aren't going as fast as they should be either. I've gotten a few of these N-Body tasks done so far and it wasn't THIS slow at first, so something on their end is holding it up hugely. I got the others done in less than 24 hours, doing parts across two days, but I think this one is going to take me a few days to complete with how slow it's now gone.
Edit: It just started going a bit faster after I put free disk space up to 20GB, downloading a few more chunks than before, so perhaps the N-Body tasks need more than other tasks. Normally I can get the others done at 5GB, which was what others recommended last time I checked; that gets tasks done really fast with other projects. It's still not as fast as it should be, though. |
Joined: 19 Jul 10 Posts: 623 Credit: 19,258,826 RAC: 349 |
Despite giving it 50% CPU
50% of CPUs or 50% of CPU time? The latter is a well-known issue; change to 50% of CPUs and 100% of CPU time. If that doesn't help completely, also check this. |
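The "50% of CPUs, 100% of CPU time" advice can also be written to a local preferences file instead of being set in the Manager. The sketch below generates such a file with Python. It assumes the local override file is `global_prefs_override.xml` in the BOINC data directory and that the two settings are stored as `max_ncpus_pct` and `cpu_usage_limit`; the function name `build_prefs_override` is invented for this example. Check the result against a file written by your own BOINC Manager before relying on it.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(cpus_pct, cpu_time_pct):
    """Return the text of a local preferences override with the two CPU limits.

    cpus_pct     -> "use at most N% of the CPUs"   (assumed tag: max_ncpus_pct)
    cpu_time_pct -> "use at most N% of CPU time"   (assumed tag: cpu_usage_limit)
    """
    for label, value in (("cpus_pct", cpus_pct), ("cpu_time_pct", cpu_time_pct)):
        if not 0 < value <= 100:
            raise ValueError(f"{label} must be in (0, 100], got {value}")

    root = ET.Element("global_preferences")
    ET.SubElement(root, "max_ncpus_pct").text = f"{cpus_pct:.1f}"
    ET.SubElement(root, "cpu_usage_limit").text = f"{cpu_time_pct:.1f}"
    if hasattr(ET, "indent"):  # pretty-printing helper, available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Half of the CPUs, but never throttle CPU time (the setting N-Body reportedly dislikes).
    print(build_prefs_override(50, 100))
```

The script only prints the XML; saving it into the data directory and asking the client to re-read local preferences is left as a manual step.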
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Same here! I've suspended the Universe tasks and restarted BOINC several times since Wednesday; it runs for a bit and then stalls and stops, and it's also extremely slow in doing tasks. That's despite giving it 50% CPU and memory to work with, which still lets me use my computer at the same time; I have a gaming laptop with 16GB of memory, so it can handle it. With other projects that's enough to get tasks done pretty quickly, well before the deadline. I noticed how slow it was since I just started doing BOINC again.
The BOINC project programmers here are few and far between, and things don't get updated as often as at other projects. |
Joined: 15 Jul 20 Posts: 19 Credit: 144,844 RAC: 0 |
I'd never be able to use my computer if I did that, and I don't run them when I'm not using it, otherwise I'd have to keep my charger in all the time. Suddenly speed is back up again to 20% - over 21% per hour *shrug* so I can now get one done in like 5 - 6 hours, or 3 hours one day and 2 or 3 hours the next day. I think what I've been seeing is that it'll speed up and then the speed will fall right back down again. Perhaps it's an issue with BOINC, I don't know, but it may or may not work how it should. I think today is going to be a speed day :) although it still just stalled as I'm typing this. I guess I'll just have to keep suspending it.
Edit: Setting it to 100% CPU Time for just a couple of minutes gave it a boost, and then I put it down to 51% CPU Time just to give it a teeny bit more, and the task, now nearing the end, is suddenly going faster. But I guarantee that when speeds drop again for some reason it won't make a difference, because it'll just drop. |
Joined: 15 Jul 20 Posts: 19 Credit: 144,844 RAC: 0 |
So I fixed the stalling by reducing the number of CPU cores below 50%; I have it set to 35%. I have 8 cores in my 4 CPUs, but it just doesn't seem to like you using even half. I also set CPU Time to 51%, which is keeping it going a little more consistently. It's just super glitchy, though: none of the settings work how they should, and the speed will continually go up then back down again until you get 70% of the way through the N-Body tasks, and then all of a sudden I'm consistently doing 20% to over 21% per hour. So anyone having these problems, just play around with it for your type of system, but keep CPU cores under 50%. I'd like to give it a bit more resources but it's not accepting it. Have you guys at Milkyway ever thought of splitting the N-Body tasks in half into two smaller tasks? I think that would help get these done faster if it doesn't have so much data to get through. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
So I fixed the stalling by reducing the number of CPU cores below 50%; I have it set to 35%. I have 8 cores in my 4 CPUs, but it just doesn't seem to like you using even half. I also set CPU Time to 51%, which is keeping it going a little more consistently.
I'm glad it's working for you, and yes, giving it less than 100% of the CPU cores helps with A LOT of the problems here!!
Have you guys at Milkyway ever thought of splitting the N-Body tasks in half into two smaller tasks? I think that would help get these done faster if it doesn't have so much data to get through.
The admins here pretty much just keep things going, as they are part-time, and the people helping them are grad students who get their degrees and move on. Then they get a new grad student who has to figure out how everything works before anything gets done if something breaks. They haven't updated many things here in a while. |
Joined: 15 Jul 20 Posts: 19 Credit: 144,844 RAC: 0 |
:) Milkyway seems to be really sensitive to CPU Time fluctuations too; it's more easily automatically suspended or stalled. It's still going to stall once or twice because of it, but you'll get through the rest without any stalling, even when you're running it while using your computer and giving it all the resources you never use while just doing things on the internet. I would say it's time for an update if anyone can get around to it *pray hands*: make it less sensitive to CPU Time fluctuations, if that's not just how it has to be, and maybe split the tasks up, if the simulation doesn't need to be done all in one go. That would help even slightly less powerful computers than mine do them, or help during slower internet days, which happen often lately. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
:) Milkyway seems to be really sensitive to CPU Time fluctuations too; it's more easily automatically suspended or stalled. It's still going to stall once or twice because of it, but you'll get through the rest without any stalling, even when you're running it while using your computer and giving it all the resources you never use while just doing things on the internet.
I agree with you, but they have no money to do any upgrades at all. Unfortunately, upgrades would mean someone sitting down with the code who knows both the code itself and a lot of the intricacies of how BOINC works, and no one at MilkyWay seems to have those skills anymore. |
Joined: 19 Jul 10 Posts: 623 Credit: 19,258,826 RAC: 349 |
and maybe split the tasks up, if the simulation doesn't need to be done all in one go. That would help even slightly less powerful computers than mine do them
No idea what you are using since your computers are hidden, but my Core 2 Duo didn't have any issues with N-Body, and I don't think there are significant numbers of computers slower than that. The longest task I got so far needed 153269.9 CPU seconds; running on both cores, it completed in less than a day of run time. Not too bad for a ~14-year-old dinosaur. ;-) But yes, they should fix this issue with tasks getting stuck when not getting 100% CPU time; apparently many people with modern computers need this for some reason. |
Joined: 15 Jul 20 Posts: 19 Credit: 144,844 RAC: 0 |
That I do know is true, sadly. I guess that's the best we can get it then: 35% cores, 51% CPU Time, and no more than 50% memory. It's much easier running any tasks while you aren't playing any games or putting the computer under heavier usage. Some people choose to run it overnight, but I find I get more tasks done running it while using my computer, and you can keep an eye on it in case it stalls. Although I can tell when it stalls because my fan stops going lol. I set it to use 80% memory while my computer isn't in use too, which will give it a short boost while I'm away. You could set your energy saving settings to turn the screen off faster so it knows sooner when the computer isn't in use. |
Joined: 15 Jul 20 Posts: 19 Credit: 144,844 RAC: 0 |
Thank you Mikey for the app_config advice btw. I created two files, for Milkyway and Universe, and that helped quite a bit, so I can now set the preferences to 30% cores and 100% CPU Time so it uses only 100% CPU Time of 2 cores, which is fast enough for doing any tasks. For NFS@home and Rosetta it splits 50% of my CPU between two tasks, and for Milkyway and Universe you can only do 1 task at a time, so it'll make it go along faster than it was. |
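For readers who have not seen one, an app_config.xml that limits a project to one task at a time can be generated with a few lines of Python. This is only a sketch: the `app`, `name` and `max_concurrent` elements are the standard ones for this file, but the application name `milkyway_nbody` is an assumption, and `build_app_config` is a name invented for the example. Use the application names your own client reports.

```python
import xml.etree.ElementTree as ET


def build_app_config(limits):
    """Return app_config.xml text that caps concurrent tasks per application.

    limits maps an application name (as the project calls it) to the maximum
    number of its tasks allowed to run at the same time.
    """
    root = ET.Element("app_config")
    for app_name, max_tasks in limits.items():
        if max_tasks < 1:
            raise ValueError(f"max_concurrent for {app_name!r} must be at least 1")
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(max_tasks)
    if hasattr(ET, "indent"):  # pretty-printing helper, available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "milkyway_nbody" is an assumed application name, used here for illustration only.
    print(build_app_config({"milkyway_nbody": 1}))
```

Each project needs its own file in its own project folder, so a second project would get a separate call with that project's application names.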
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Thank you Mikey for the app_config advice btw. I created two files, for Milkyway and Universe, and that helped quite a bit, so I can now set the preferences to 30% cores and 100% CPU Time so it uses only 100% CPU Time of 2 cores, which is fast enough for doing any tasks. For NFS@home and Rosetta it splits 50% of my CPU between two tasks, and for Milkyway and Universe you can only do 1 task at a time, so it'll make it go along faster than it was.
You are very welcome, I'm glad it's making things better for you. |
Joined: 15 Jul 20 Posts: 19 Credit: 144,844 RAC: 0 |
:D 20% cores is 1 core, 30% cores is 2 cores, and 40% cores is 3 cores, so that means 50% cores is 4 cores. So the lowest we can go is 20%, which you can use for NFS along with 80% CPU Time because they're smaller tasks. So that's why it's better setting it under 50%, because 50% would be using 4 cores per task. I'm not sure I've got the Universe name correct though; I put universe_bh2 in the app_config file but I'm not sure that's correct. Do you have a list of task names for all the projects? I might create one for NFS too, but for both Rosetta and NFS the preferences on the websites adjust almost correctly to what you set, apart from spiking a bit higher than where it's supposed to be, so if it spikes too much I'll probs create config files for those too. |
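The percentages in this post line up with a simple rule on a machine that shows 8 logical CPUs: multiply by the percentage, drop the fraction, and never go below one. The Python sketch below just reproduces that arithmetic. The rounding rule is an assumption inferred from the numbers reported here, not something confirmed in this thread, and `usable_cpus` is a made-up name.

```python
def usable_cpus(logical_cpus, percent):
    """Estimate how many CPUs a "use at most N% of the CPUs" setting allows.

    Assumed rule: truncate logical_cpus * percent / 100, with a floor of 1.
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return max(1, int(logical_cpus * percent / 100))


if __name__ == "__main__":
    # 8 logical CPUs, matching the "8 cores" machine described in the post.
    for pct in (20, 30, 35, 40, 50, 100):
        print(f"{pct:>3}% of 8 CPUs -> {usable_cpus(8, pct)}")
```

If the boundaries on your own machine differ, trust what the Manager's task status shows over this estimate.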
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
:D 20% cores is 1 core, 30% cores is 2 cores, and 40% cores is 3 cores, so that means 50% cores is 4 cores. So the lowest we can go is 20%, which you can use for NFS along with 80% CPU Time because they're smaller tasks. So that's why it's better setting it under 50%, because 50% would be using 4 cores per task.
If you put the wrong name in the app_config file, you will get a message in the Event Log listing the allowed names, so you can use your MilkyWay file at Universe and it will come back with the acceptable names it's looking for. For a 4-core PC the numbers should be: 25% and below is 1 CPU core, 26% to 50% is 2 cores, 51% to 99% is 3 cores, and 100% is 4 cores, so the range for 3 CPU cores runs from 51% all the way to 99%.
I have PCs with 8, 16 and even 32 cores, and I use an app_config.xml file to limit the number of tasks that can run at one time instead of limiting the total number of CPU cores in the BOINC Manager or on the website. It lets me keep my setting at 99% in the BOINC Manager or on the website (I always keep 1 CPU core free for GPU usage, and so it doesn't take forever to do anything when I check on how the PC is doing), and then I fine-tune BOINC itself through the app_config.xml files I put in each project folder. This lets me run 1 CPU task from this project, 4 CPU tasks from that project, and all the rest of my CPU cores on a different project. I do this because some tasks take A LOT of memory when they run; e.g. the Yoyo ECMp2 tasks take 8GB of RAM for each task, so an 8-core PC with only 16GB of RAM in it can't run 7 of those and a GPU project on the standalone GPU in the box all at the same time. So I limit Yoyo to 1 task at a time and then run 6 CPU tasks from another project that do not take a lot of RAM for each task.
If you go into the BOINC Manager and click once on a running task to highlight it, then go over to the left and click the button labelled 'Properties', that will tell you how much RAM that task is using; most tasks of the same kind from the same project will take about the same amount of RAM to run. You want to use the larger of the two RAM figures listed there when deciding how many tasks you can run at one time. |
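The memory reasoning above is easy to turn into a quick calculation. The Python sketch below uses the figures from this post (16GB of RAM, 8GB per task, 7 cores available after keeping one free). The `ram_fraction` parameter is an assumption standing in for whatever share of memory you let BOINC use, and `max_concurrent_tasks` is a name invented for the example.

```python
import math


def max_concurrent_tasks(total_ram_gb, ram_per_task_gb, free_cores, ram_fraction=1.0):
    """Upper bound on simultaneous tasks, limited by both memory and cores.

    total_ram_gb    -- RAM installed in the machine
    ram_per_task_gb -- the larger RAM figure shown in the task's Properties
    free_cores      -- cores you are willing to give to this kind of task
    ram_fraction    -- share of RAM BOINC may use (assumed knob, 0 < f <= 1)
    """
    if ram_per_task_gb <= 0:
        raise ValueError("ram_per_task_gb must be positive")
    if not 0 < ram_fraction <= 1:
        raise ValueError("ram_fraction must be in (0, 1]")
    by_memory = math.floor(total_ram_gb * ram_fraction / ram_per_task_gb)
    return max(0, min(free_cores, by_memory))


if __name__ == "__main__":
    # 8GB tasks on a 16GB, 8-core PC with one core kept free.
    print(max_concurrent_tasks(16, 8, 7))                    # memory-bound, all RAM allowed
    print(max_concurrent_tasks(16, 8, 7, ram_fraction=0.5))  # only half the RAM allowed
    print(max_concurrent_tasks(16, 0.5, 7))                  # small tasks: core-bound
```

Treat the result as a ceiling rather than a target, since the operating system and anything else running also need memory.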
©2024 Astroinformatics Group