1)
Message boards :
Number crunching :
How much CPU cache milkyway@home cpu tasks occupy?
(Message 77206)
Posted 14 Aug 2024 by KeithBriggs Post: I can't see from HWiNFO 64 (as of yet) how much of the L1 and L2 caches is being utilized. Generally, tasks are high CPU and low memory, so I'm guessing that includes the cache memory. The most important change to make is to run one task per core and not MT.
2)
Message boards :
Number crunching :
Tasks stuck after a few minutes and run indefinitely
(Message 77205)
Posted 14 Aug 2024 by KeithBriggs Post: I'm a novice at all the details, but regarding the slot question, I believe it has to do with the directory slot on the drive and nothing to do with which core it's attached to. I only have 32 tasks running at any one time. I discovered that the frozen jobs have to do with a weakness in the suspend and restart feature. About 1% of restarts don't actually restart. I was starting 32 tasks, then deleting the tasks that give unfair credit and grabbing 32 more, enough to get through the night. The credit weakness revealed the suspend weakness.
3)
Message boards :
Number crunching :
Tasks stuck after a few minutes and run indefinitely
(Message 77173)
Posted 9 Jul 2024 by KeithBriggs Post: And another one:

Application: Milkyway@home N-Body Simulation with Orbit Fitting 1.87 (mt)
Name: de_nbody_07_01_2024_v186_pal5__data__13_1718187602_1773297
State: Running
Received: 7/7/2024 8:29:03 PM
Report deadline: 7/19/2024 8:29:01 PM
Estimated computation size: 59,585 GFLOPs
CPU time: 00:03:58
CPU time since checkpoint: 00:00:00
Elapsed time: 00:52:45
Estimated time remaining: 00:56:32
Fraction done: 2.777%
Virtual memory size: 15.51 MB
Working set size: 19.82 MB
Directory: slots/71
Process ID: 38312
Progress rate: 3.240% per hour
Executable: milkyway_nbody_orbit_fitting_1.87_windows_x86_64__mt.exe
4)
Message boards :
Number crunching :
Tasks stuck after a few minutes and run indefinitely
(Message 77172)
Posted 9 Jul 2024 by KeithBriggs Post: I've had about 20, and here's an example, shown at two timestamps. I've suspended and restarted it without any effect.

Application: Milkyway@home N-Body Simulation with Orbit Fitting 1.87 (mt)
Name: de_nbody_07_01_2024_v186_pal5__data__13_1718187602_1773294
State: Running
Received: 7/7/2024 8:29:03 PM
Report deadline: 7/19/2024 8:29:01 PM
Estimated computation size: 61,319 GFLOPs
CPU time: 00:04:01
CPU time since checkpoint: 00:00:00
Elapsed time: 00:08:51
Estimated time remaining: 00:58:04
Fraction done: 2.980%
Virtual memory size: 15.50 MB
Working set size: 19.73 MB
Directory: slots/78
Process ID: 33716
Progress rate: 22.680% per hour
Executable: milkyway_nbody_orbit_fitting_1.87_windows_x86_64__mt.exe

10 minutes later:

Application: Milkyway@home N-Body Simulation with Orbit Fitting 1.87 (mt)
Name: de_nbody_07_01_2024_v186_pal5__data__13_1718187602_1773294
State: Running
Received: 7/7/2024 8:29:03 PM
Report deadline: 7/19/2024 8:29:01 PM
Estimated computation size: 61,319 GFLOPs
CPU time: 00:04:01
CPU time since checkpoint: 00:00:00
Elapsed time: 00:18:49
Estimated time remaining: 00:58:04
Fraction done: 2.980%
Virtual memory size: 15.50 MB
Working set size: 19.73 MB
Directory: slots/78
Process ID: 33716
Progress rate: 10.080% per hour
Executable: milkyway_nbody_orbit_fitting_1.87_windows_x86_64__mt.exe
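The two readings in the post above show the pattern of a stuck task: elapsed time keeps growing while CPU time and fraction done stay the same. A minimal sketch of that check follows, assuming the values are copied by hand from the task properties; the names `Snapshot`, `parse_hms`, and `looks_stalled`, and the 300-second threshold, are hypothetical and not part of BOINC.

```python
from dataclasses import dataclass


def parse_hms(text: str) -> int:
    """Convert an HH:MM:SS string (as shown in the task properties) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass
class Snapshot:
    cpu_time: str          # "CPU time" field, HH:MM:SS
    elapsed_time: str      # "Elapsed time" field, HH:MM:SS
    fraction_done: float   # "Fraction done" field, in percent


def looks_stalled(earlier: Snapshot, later: Snapshot, min_wall_seconds: int = 300) -> bool:
    """Return True if wall-clock time advanced but CPU time and progress did not.

    min_wall_seconds is an arbitrary threshold (an assumption) so that two
    readings taken a few seconds apart are not flagged.
    """
    wall_delta = parse_hms(later.elapsed_time) - parse_hms(earlier.elapsed_time)
    cpu_delta = parse_hms(later.cpu_time) - parse_hms(earlier.cpu_time)
    progress_delta = later.fraction_done - earlier.fraction_done
    return wall_delta >= min_wall_seconds and cpu_delta <= 0 and progress_delta <= 0


# Values taken from the two readings of task ..._1773294 quoted in the post.
first = Snapshot(cpu_time="00:04:01", elapsed_time="00:08:51", fraction_done=2.980)
second = Snapshot(cpu_time="00:04:01", elapsed_time="00:18:49", fraction_done=2.980)

if __name__ == "__main__":
    print("stalled" if looks_stalled(first, second) else "making progress")
```

A task that is merely slow would still show CPU time increasing between readings, so it would not be flagged by this check.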
5)
Message boards :
Number crunching :
Tasks slow to start
(Message 77171)
Posted 3 Jul 2024 by KeithBriggs Post: I did switch from 2 cores to 1 core per work unit. Thanks again.
6)
Message boards :
Number crunching :
Run Times Weird?????
(Message 77164)
Posted 27 Jun 2024 by KeithBriggs Post: I run one task per core instead of multicore. My run times are about 9% greater than my CPU times. I wasn't exactly clear on your question, though.
7)
Message boards :
Number crunching :
Tasks slow to start
(Message 77148)
Posted 29 May 2024 by KeithBriggs Post: Great catch, btw. I switched from 16 cores to 2 cores per WU. Odd that the delay went from ~30 sec to ~60 sec, so the other cores must have been doing something "during idle" when 16 were processing each WU. With 32 cores, I have 16 WUs running now. Maybe I'll try one core per WU later, but this already seems to be much more efficient according to Task Manager.
8)
Message boards :
News :
New Separation Runs
(Message 60402)
Posted 16 Nov 2013 by KeithBriggs Post: My HD7870 cards are not running nearly as well as a month ago. I see % utilization on the cards consistently dropping below 100%. Something in the main work algorithm is quite off. They also do not allow the cards to run in overdrive like Seti does.
9)
Message boards :
News :
New MilkyWay Separation Modified Fit Runs
(Message 60158)
Posted 17 Oct 2013 by KeithBriggs Post: When you run 4 simultaneous tasks per GPU, it kind of makes more sense based on credit and time to complete:
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=517443&offset=0&show_names=0&state=4&appid=
But when you run just 1 task per GPU, I'm seeing the same discrepancy:
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=518108&offset=0&show_names=0&state=4&appid=
Different machines, but both have a pair of Sapphire HD 7870s.
10)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60060)
Posted 30 Sep 2013 by KeithBriggs Post: The key is to not count "Abort by User" or "Error While Computing" zero-second runs toward the error count.
11)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60040)
Posted 28 Sep 2013 by KeithBriggs Post: Probably the CAL driver message. Just disregard it. On the main page is Statistics, and under that is the GPU list: http://milkyway.cs.rpi.edu/milkyway/gpu_list.php
12)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60035)
Posted 27 Sep 2013 by KeithBriggs Post: Hey AMueller91, glad you figured it out. I have not seen any benefit beyond 4 tasks per GPU. The key is no down time, and the chance that 4 tasks finish at the same time is minimal. If they are running in tandem, just pause one, then start it back up. All you need for CPUs is to set logical cores = physical cores plus 1.
13)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60032)
Posted 27 Sep 2013 by KeithBriggs Post: The BOINC Manager won't delete WUs you've already received. Watch the newly downloaded ones and see if it is working correctly. If your computer is listed as "school" or "home", you'll have to change the acceptable apps for each class of computers you have.
14)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60024)
Posted 27 Sep 2013 by KeithBriggs Post: First go to my account, then under Milkyway preferences you'll see:

Use CPU (enforced by version 6.10+): yes
Use ATI GPU (enforced by version 6.10+): yes
Use NVIDIA GPU (enforced by version 6.10+): yes

A few more lines down you'll see:

Run only the selected applications
MilkyWay@Home: yes
MilkyWay@Home N-Body Simulation: no
Milkyway@Home Separation: no
Milkyway@Home Separation (Modified Fit): no
15)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60013)
Posted 27 Sep 2013 by KeithBriggs Post: Yes, that's right. Good point. Here's my app_config:

<app_config>
  <app>
    <name>milkyway_nbody</name>
    <max_concurrent>0</max_concurrent>
    <gpu_versions>
      <gpu_usage>.1</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>milkyway</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>.11</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>milkyway_separation__modified_fit</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>.12</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

and here's my cc_config:

<cc_config>
  <log_flags>
  </log_flags>
  <options>
    <ncpus>4</ncpus>
    <max_file_xfers>30</max_file_xfers>
    <max_file_xfers_per_project>30</max_file_xfers_per_project>
    <http_transfer_timeout>30</http_transfer_timeout>
    <rec_half_life_days>10</rec_half_life_days>
    <report_results_immediately>0</report_results_immediately>
  </options>
</cc_config>

So one GPU is running 4 WUs at a time, with no down time. This particular machine has two CPU cores, but I have 4 virtual cores. Again, no CPU down time. That's 4 CPU WUs and 4 GPU WUs, which is about 10% more work than letting them cycle down. I also get constant fan speeds and more stable temperatures. Holler if anyone wants my 2-GPU XML files.
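The "4 WU at a time" figure follows from the gpu_usage value: each task claims 0.25 of a GPU, so four fit on one card. A minimal sketch of that arithmetic follows, using two of the app entries from the post. The function name `tasks_per_gpu` is hypothetical, and the sketch assumes gpu_usage is the fraction of one GPU reserved per task; it ignores max_concurrent and any CPU limits.

```python
import xml.etree.ElementTree as ET

# Two <app> entries copied from the app_config in the post above.
APP_CONFIG = """
<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>.11</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>milkyway_separation__modified_fit</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>.12</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def tasks_per_gpu(app_config_xml: str) -> dict[str, int]:
    """Map each app name to how many of its tasks fit on one GPU.

    Assumes gpu_usage is the fraction of a single GPU each task reserves,
    so the count is floor(1 / gpu_usage).
    """
    root = ET.fromstring(app_config_xml)
    counts = {}
    for app in root.findall("app"):
        name = app.findtext("name")
        gpu_usage = float(app.findtext("gpu_versions/gpu_usage"))
        # Small epsilon guards against floating-point results like 3.9999999.
        counts[name] = int(1 / gpu_usage + 1e-9)
    return counts


if __name__ == "__main__":
    for name, count in tasks_per_gpu(APP_CONFIG).items():
        print(f"{name}: {count} tasks per GPU")
```

With two GPUs in the machine, the same gpu_usage of .25 would allow 8 GPU tasks in total, which matches the max_concurrent value of 8 in the post.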
16)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60009)
Posted 26 Sep 2013 by KeithBriggs Post: I also use app_config, but it's easiest to just do it in the preferences.
17)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60006)
Posted 26 Sep 2013 by KeithBriggs Post: Here are some major aborters:

http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=104692 2900 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=529892 8800 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=322721 4300 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=520641 15000 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=529525 3400 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=366486 2800 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=485608 5000 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=484725 1600 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=452569 3700 aborts
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=532562 3400 aborts
18)
Message boards :
News :
Users Auto-Aborting Work Units
(Message 60005)
Posted 26 Sep 2013 by KeithBriggs Post: Maybe new users should have to opt into beta projects. http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=437270 has about 5373 aborted WUs and counting.
19)
Message boards :
News :
Separation Modified Fit v1.28 Release
(Message 60000)
Posted 26 Sep 2013 by KeithBriggs Post: I agree with Conan. When there's a known fault with an app depending on execution source, the least you can do is adjust the max errors. I think I've mentioned it before, but aborting by a user should not count toward errors either.
20)
Message boards :
News :
Separation Modified Fit v1.28 Release
(Message 59941)
Posted 20 Sep 2013 by KeithBriggs Post: Invalids are down considerably. Thanks. I have one CPU 1.26 still showing in history. 8150 secs and about 100 secs more for the new 1.28 (CPU). Intel.
©2024 Astroinformatics Group