Milkyway@home N-Body Simulation v1.82 (mt) windows_x86_64 stuck
Author | Message |
---|---|
Joined: 25 May 20 Posts: 2 Credit: 1,657,784 RAC: 728 |
These tasks seem to get "stuck". Time remaining counting up, CPU % going to nothing |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"These tasks seem to get 'stuck'. Time remaining counting up, CPU % going to nothing" Well, your PCs are hidden, so it's very hard to help you without more info. Here are some questions for you: 1: How many CPU cores are in your PC? 2: Did you restrict MilkyWay from using all of them, or did you use the defaults? 3: Are you using the CPU's graphics capabilities to crunch as well, or do you have a standalone GPU? 4: In your settings, is BOINC using 100% of the CPU time, or did you restrict it to less than that? |
Joined: 25 May 20 Posts: 2 Credit: 1,657,784 RAC: 728 |
129.84 1,376,814 7.20.2 GenuineIntel Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz [Family 6 Model 158 Stepping 10] (12 processors) INTEL Intel(R) UHD Graphics 630 (4862MB) OpenCL: 3.0 Microsoft Windows 11 Core x64 Edition, (10.00.22621.0 0). To keep the CPU temperature down (I'm in Phoenix), I usually limit BOINC to 2 CPUs and 50% CPU time. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"129.84 1,376,814 7.20.2 GenuineIntel" Okay, that's a start. Try upping that to 100% of the CPU time, and if you need to, drop it to a single CPU core to keep the temperature under control. Only future tasks will drop to a single core, not any existing tasks already on your PC. If you click on my name you can then click on my computers and see what's shared: nothing personal, just stats and lots of info about the tasks. You won't find any tasks for me because I'm not currently crunching MilkyWay tasks, but you will get the idea if you decide to share your PC(s). |
Joined: 22 Oct 10 Posts: 17 Credit: 144,999,780 RAC: 1,264 |
I have observed similar behavior. Win10 Pro, Core i7-11700K, 8 cores, hyper-threaded. Milkyway is limited to 3 threads to allow other projects to share the system. Most, but not all, of the new CPU N-Body jobs start fine, but after a while the time remaining starts climbing and Resource Monitor shows the job using .01% CPU. I've left several to run for quite a while, and none of them ever end. When looking at my returned tasks on this site, you can see that the processor time of the aborted tasks is nowhere near 3x wall clock time. |
Joined: 19 Jul 10 Posts: 649 Credit: 19,507,605 RAC: 1,544 |
And are you also not letting BOINC use 100% of CPU time? If so, there's the issue: N-Body doesn't like it. |
Joined: 22 Oct 10 Posts: 17 Credit: 144,999,780 RAC: 1,264 |
Yes, I have CPU set to 100%, but BOINC limited to 3 "processors". Two WUs have completed; most still "run away". |
Joined: 22 Oct 10 Posts: 17 Credit: 144,999,780 RAC: 1,264 |
And, of course, now that I've posted here, 6 of the last 7 WUs have completed successfully. |
Joined: 22 Oct 10 Posts: 17 Credit: 144,999,780 RAC: 1,264 |
So I am averaging about one bad WU in 4-5. But if I don't catch it and abort it, I will tie up 3 "CPUs" forever. Sorry, but this project isn't worth the hassle. I'll check back periodically to see if you've fixed it. |
Joined: 15 Feb 21 Posts: 3 Credit: 17,586,820 RAC: 2,537 |
My problem is that the N-Body task is taking up almost 100% of my computer along with an Einstein task. These two are blocking all my other programs. I have well over a dozen of each waiting in the queue, it's been over a week since any others have run. I've now suspended each one and now have 16 tasks over 6 programs. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"My problem is that the N-Body task is taking up almost 100% of my computer along with an Einstein task. These two are blocking all my other programs. I have well over a dozen of each waiting in the queue, it's been over a week since any others have run. I've now suspended each one and now have 16 tasks over 6 programs." It sounds like you need to use an app_config.xml file in each project's folder under C:\ProgramData\BOINC\projects to limit the total number of tasks each project can run at one time. I use one like this: `<app_config> <project_max_concurrent>1</project_max_concurrent> </app_config>` That says that only 1 task is allowed to run at a time from that project. You can of course change the number to reflect your own preferences, and if you use a zero then it will use all available CPUs and GPUs on the PC for that project. You use the file by copying and pasting the above into Notepad in Windows and then saving it in the folder for the project you want. I have one in every project folder and adjust them as my crunching needs change. Be sure to save the file as "app_config.xml" (no quotes), and when you are done go into the BOINC Manager and click on Options, Read config files so it takes effect right away. Just be aware that it means MAX_CONCURRENT for the whole project, not just one type of task, so if you are running both CPU and GPU tasks from Einstein you would need at least 2 in that line. It does not guarantee one of each kind of task, though; that requires a more involved set of parameters. For me it doesn't matter, as I almost never run CPU and GPU tasks from the same project at the same time: the cache settings don't split the CPU and GPU tasks, and sometimes I will get a cache full of CPU tasks and my GPU will sit idle. |
Joined: 22 Oct 10 Posts: 17 Credit: 144,999,780 RAC: 1,264 |
That will indeed limit Milkyway to 1 task, but it will STILL use all "CPUs". This is my app_config, which also limits Milkyway to three "CPUs": `<app_config> <app> <name>milkyway_nbody</name> <max_concurrent>1</max_concurrent> <report_results_immediately/> <fraction_done_exact/> <gpu_versions> <gpu_usage>1</gpu_usage> <cpu_usage>.5</cpu_usage> </gpu_versions> </app> <app_version> <app_name>milkyway_nbody</app_name> <plan_class>mt</plan_class> <avg_ncpus>3</avg_ncpus> <cmdline>--nthreads 3 </cmdline> </app_version> <report_results_immediately/> </app_config>` |
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
I had one that was stuck. I eventually shut BOINC down to perform an OS update and GPU driver update. After rebooting, the stuck task suddenly rushed towards completion and quickly finished. Maybe all you need to do is quit BOINC with the option to shut down all tasks with it, and then restart BOINC. |
Joined: 15 Feb 21 Posts: 3 Credit: 17,586,820 RAC: 2,537 |
The one MW task uses all my computing power. I'm not a computer geek that can do all the above. I'm just suspending MW until all my uncompleted tasks expire. I'll try later and if these N-Body tasks show up, I'll delete MW. |
Joined: 19 Jul 10 Posts: 649 Credit: 19,507,605 RAC: 1,544 |
"I had one that was stuck. I eventually shut BOINC down to perform an OS update and GPU driver update. After rebooting, the stuck task suddenly rushed towards completion and quickly finished. Maybe all you need to do is quit BOINC with the option to shut down all tasks with it, and then restart BOINC." Yes, restarting BOINC is the easiest way to get them running again. |
Joined: 19 Jul 10 Posts: 649 Credit: 19,507,605 RAC: 1,544 |
"The one MW task uses all my computing power." Unless you've configured it differently in cc_config.xml, the tasks are using only the computing power that you don't need for anything else. As long as you don't want to squeeze out the last bit of computing power, there's no need to configure anything; just let BOINC do its job. In the worst case you'll need to restart it sometimes, but that should not happen often if you simply let BOINC use 100% of CPU time. |
Joined: 22 Oct 10 Posts: 17 Credit: 144,999,780 RAC: 1,264 |
No, it is not. I've rebooted and the stuck WUs remain stuck. |
Joined: 24 Sep 23 Posts: 7 Credit: 198,130 RAC: 210 |
I'm new to this project and run several old laptops rather than one good processor. I've seen the problem on two Windows computers and zero Linux. Someone suggested allowing BOINC 100% of CPU cycles. This wasn't a complete fix. I also tried increasing the checkpoint time significantly. Not really conclusive either. What seems to help the most is bumping the Milkyway... task priority to Above Normal in the Task Manager Details tab. This seems better than giving BOINC 100% CPU cycles, because you can't get cycles if you're a low priority and nnntube's playing. This has worked best, but stalls aren't at zero if you use the computer for other purposes. It is near zero though. This also only works while the task is still using CPU. It won't recover a stalled task. You still have to restart BOINC Manager with Stop Running Tasks... checked when you exit. I haven't seen the problem lately, so I don't know if you can just Suspend and Resume the task. It didn't affect normal computer operations, but they're slow computers and I don't expect much. I manually changed the task priority. I don't know how to implement this automatically, but the tasks take so long even when working that I won't be overworked making manual tweaks. |
Joined: 19 Jul 10 Posts: 649 Credit: 19,507,605 RAC: 1,544 |
If higher priority helps, then this entry in cc_config.xml can be used for it: `<cc_config> <options> <process_priority>N</process_priority> </options> </cc_config>` Possible values are 0 (lowest priority, the default), 1 (below normal), 2 (normal), 3 (above normal), 4 (high) and 5 (real-time). |
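
For anyone who wants to try the priority setting, here is a filled-in sketch of that cc_config.xml entry. It assumes the value 3 corresponds to Above Normal, the level reported to help earlier in the thread, so check the value list for your client version before relying on it. The file goes in the BOINC data directory, and restarting the client is the safest way to make sure the setting is picked up.

```xml
<cc_config>
  <options>
    <!-- Sketch only: 3 is assumed to mean "Above Normal" here. -->
    <!-- 0 is the client's default (lowest priority). -->
    <process_priority>3</process_priority>
  </options>
</cc_config>
```

Raising the priority makes BOINC tasks compete harder with everything else on the machine, so it is worth trying one step at a time rather than jumping to the highest value.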
©2025 Astroinformatics Group