Users Auto-Aborting Work Units
Author | Message |
---|---|
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hello all, It has come to our attention that some users have been setting their BOINC clients to auto-abort work units from specific applications. Doing this sends error results back to our server, which then causes some work units to be unable to validate. Essentially, it prevents some of our hard-working crunchers from getting their due credits. The proper way to prevent yourself from getting work units from specific applications, such as our beta applications N-Body or Modified Fit, is to go to your account page on our website (http://milkyway.cs.rpi.edu/milkyway/home.php). Under the Preferences section, please select the link for your preferences for this project. There will then be a link to edit these preferences on this page. Halfway down your preferences, there will be some check boxes in the "Run only the selected applications" section. You will only receive work units for the applications you have check marks next to. For reference: Milkyway@home is our flagship application and is considered stable and in its final released state; Milkyway@home N-body Simulation is our beta version N-body simulation and orbit fit program; Milkyway@home Separation is an, as of now, unused application; Milkyway@home Separation (Modified Fit) is our beta version separation code testing new models for both streams and background in the Milky Way Halo. As usual, if you have any issues with this method or questions about it, please post them here. We appreciate your cooperation and understanding in this. Thank you, Jake W. TL;DR: If you are auto-aborting work units, please stop and use the method above to prevent users from losing credits and to prevent problems in our algorithms. |
Send message Joined: 28 Apr 11 Posts: 36 Credit: 283,587,354 RAC: 4 |
Maybe new users should have to opt into beta projects. http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=437270 has about 5373 aborted WU's and counting. |
Send message Joined: 28 Apr 11 Posts: 36 Credit: 283,587,354 RAC: 4 |
Here's some major aborters: http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=104692 2900 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=529892 8800 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=322721 4300 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=520641 15000 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=529525 3400 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=366486 2800 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=485608 5000 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=484725 1600 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=452569 3700 aborts http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=532562 3400 aborts |
Send message Joined: 17 Aug 13 Posts: 3 Credit: 336,920,753 RAC: 0 |
Folks: I would like to just do GPU work units for Milkyway@home, and to that end I have been using an app_info.xml to make my FirePro do two workunits at a time. However, I do often get messages that state Message from server: Your app_info.xml file doesn't have a usable version of Milkyway@Home Separation (Modified Fit). I sure hope I'm not causing any problems. Which check boxes should I clear if I only want to do GPU processing? I didn't even know this 'Preferences' page existed for this project - good news for me. Thanks |
Send message Joined: 18 Jul 09 Posts: 300 Credit: 303,565,482 RAC: 0 |
Check the MilkyWay@home box and the MilkyWay@home Separation (Modified fit) boxes. Stop using the app_info file and use an app_config file instead. This one works well for me: <app_config> <app> <name>milkyway</name> <gpu_versions> <gpu_usage>0.5</gpu_usage> <cpu_usage>0.05</cpu_usage> </gpu_versions> </app> </app_config> |
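For anyone wondering why `<gpu_usage>0.5</gpu_usage>` gives two work units at a time: each task reserves that fraction of a GPU, so roughly 1/gpu_usage tasks fit on one card. Here is a minimal Python sketch of that arithmetic. `tasks_per_gpu` is a made-up helper name, and this is a simplified model of the setting, not the BOINC client's actual scheduling code.

```python
import math


def tasks_per_gpu(gpu_usage: float) -> int:
    """Estimate how many tasks share one GPU for a given <gpu_usage> value.

    Simplified model (an assumption): tasks are packed onto a card until
    the reserved fractions would exceed 1.0.
    """
    if gpu_usage <= 0:
        raise ValueError("gpu_usage must be positive")
    if gpu_usage >= 1:
        # A task that reserves a whole GPU (or more) runs alone on it.
        return 1
    # Small epsilon guards against floating-point results like 3.9999999.
    return math.floor(1.0 / gpu_usage + 1e-9)


if __name__ == "__main__":
    for usage in (0.5, 0.25, 0.15):
        print(usage, "->", tasks_per_gpu(usage), "tasks per GPU")
```

With the values used in this thread, 0.5 gives two tasks per card, .25 gives four and .15 gives six.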
Send message Joined: 28 Apr 11 Posts: 36 Credit: 283,587,354 RAC: 4 |
I also use app_config, but it's easiest to just do it in the preferences. |
Send message Joined: 18 Jul 09 Posts: 300 Credit: 303,565,482 RAC: 0 |
He wants to crunch two at a time; he'll need the config file for that. |
Send message Joined: 28 Apr 11 Posts: 36 Credit: 283,587,354 RAC: 4 |
Yes, that's right. Good point. Here's my app_config: <app_config> <app> <name>milkyway_nbody</name> <max_concurrent>0</max_concurrent> <gpu_versions> <gpu_usage>.1</gpu_usage> <cpu_usage>1</cpu_usage> </gpu_versions> </app> <app> <name>milkyway</name> <max_concurrent>8</max_concurrent> <gpu_versions> <gpu_usage>.25</gpu_usage> <cpu_usage>.11</cpu_usage> </gpu_versions> </app> <app> <name>milkyway_separation__modified_fit</name> <max_concurrent>8</max_concurrent> <gpu_versions> <gpu_usage>.25</gpu_usage> <cpu_usage>.12</cpu_usage> </gpu_versions> </app> </app_config> and here's my cc_config: <cc_config> <log_flags> </log_flags> <options> <ncpus>4</ncpus> <max_file_xfers>30</max_file_xfers> <max_file_xfers_per_project>30</max_file_xfers_per_project> <http_transfer_timeout>30</http_transfer_timeout> <rec_half_life_days>10</rec_half_life_days> <report_results_immediately>0</report_results_immediately> </options> </cc_config> So one GPU is running 4 WUs at a time, with no down time. This particular machine has two CPU cores, but I have 4 virtual cores, so again there is no CPU down time. That's 4 CPU WUs and 4 GPU WUs, which is about 10% more work than letting them cycle down. I also get constant fan speeds and more stable temperatures. Holler if anyone wants my 2-GPU XML files. |
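If you want to double-check what an app_config.xml like the one above actually asks for before restarting BOINC, a short script can read it back. This is only a sketch using Python's standard `xml.etree.ElementTree`. The embedded XML is a trimmed copy of two of the `<app>` entries from the post, and `summarize_app_config` is a hypothetical helper, not part of BOINC.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of two <app> entries from the post above, for illustration.
APP_CONFIG = """
<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>.11</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>milkyway_separation__modified_fit</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>.25</gpu_usage>
      <cpu_usage>.12</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def summarize_app_config(xml_text: str) -> dict:
    """Return {app name: settings} for every <app> element in the file."""
    root = ET.fromstring(xml_text.strip())
    summary = {}
    for app in root.findall("app"):
        name = app.findtext("name")
        gpu = app.find("gpu_versions")
        summary[name] = {
            # A missing element is reported as None rather than guessed.
            "max_concurrent": _to_number(app.findtext("max_concurrent"), int),
            "gpu_usage": _to_number(gpu.findtext("gpu_usage") if gpu is not None else None, float),
            "cpu_usage": _to_number(gpu.findtext("cpu_usage") if gpu is not None else None, float),
        }
    return summary


def _to_number(text, cast):
    """Convert element text to a number, or None if the element is absent."""
    return cast(text.strip()) if text is not None else None


if __name__ == "__main__":
    for app_name, settings in summarize_app_config(APP_CONFIG).items():
        print(app_name, settings)
```

A malformed file raises `xml.etree.ElementTree.ParseError` here, which makes it a quick well-formedness check too.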
Send message Joined: 18 Aug 09 Posts: 123 Credit: 21,154,396 RAC: 2,270 |
Any news as to when 1.38 will be out? I have opted out of modfit 1.28 because it just crashes on my system. |
Send message Joined: 28 Apr 11 Posts: 36 Credit: 283,587,354 RAC: 4 |
First go to my account, then under MilkywayPreferences you'll see Use CPU Enforced by version 6.10+ yes Use ATI GPU Enforced by version 6.10+ yes Use NVIDIA GPU Enforced by version 6.10+ yes A few more lines down you'll see: Run only the selected applications MilkyWay@Home: yes MilkyWay@Home N-Body Simulation: no Milkyway@Home Separation: no Milkyway@Home Separation (Modified Fit): no |
Send message Joined: 6 Jul 12 Posts: 4 Credit: 12,385,544 RAC: 0 |
Yes, that's right. Good point. Hey, this is exactly what I want, but it doesn't work. I have an i7-3930K CPU with 6 cores and 12 threads, and two GTX Titans. If I let MW@Home run without any app_config or anything similar, all 12 threads and both GPUs are working. If I add the app_config, only the GPUs work correctly (2 WUs per GPU), but the CPU does nothing... BUT if I drag and drop the app_config.xml file out of my MW@home folder and restart BOINC, everything works fine (12 CPU WUs and 2 GPU WUs on each card) for around 10 minutes! After those 10 minutes, the GPUs automatically stop the additional WUs and keep processing one WU on each card. How can I get all, or even 8 to 10, threads working while using the multiple-WU app_config? Sorry for my bad English, I'm from Fondue-Switzerland :) PS: My app_config.xml: <app_config> <app> <name>milkyway</name> <max_concurrent>4</max_concurrent> <gpu_versions> <gpu_usage>.500</gpu_usage> <cpu_usage>0.25</cpu_usage> </gpu_versions> </app> </app_config> |
Send message Joined: 6 Jul 12 Posts: 4 Credit: 12,385,544 RAC: 0 |
OK... after searching for a solution for 5 hours, it works now. After changing <max_concurrent>4</max_concurrent> to <max_concurrent>14</max_concurrent> inside app_config.xml and <ncpus>4</ncpus> to <ncpus>14</ncpus> inside cc_config.xml, it works fine. 4 GPU WUs (2 WUs/GPU @ 0.25 CPU/GPU-WU) and 10 CPU WUs are active. I hope it will hold longer than 10 minutes :D Edit: With an optimized app_config and cc_config, I can run 23 WUs at the same time: 12 on GPU (6 per card with Double Precision enabled) and 11 WUs on CPU. Each GPU WU takes around 2 minutes to complete; CPU WUs run between 1 and 2 hours. <app_config> <app> <name>milkyway</name> <max_concurrent>23</max_concurrent> <gpu_versions> <gpu_usage>.15</gpu_usage> <cpu_usage>0.05</cpu_usage> </gpu_versions> </app> </app_config> <cc_config> <log_flags> </log_flags> <options> <ncpus>23</ncpus> <max_file_xfers>30</max_file_xfers> <max_file_xfers_per_project>30</max_file_xfers_per_project> <http_transfer_timeout>30</http_transfer_timeout> <rec_half_life_days>10</rec_half_life_days> <report_results_immediately>0</report_results_immediately> </options> </cc_config> |
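The numbers in the last two posts look consistent with a simple reading of the settings: `<max_concurrent>` caps all tasks of the `milkyway` app, CPU and GPU together, so with a cap of 4 the four GPU tasks would use every slot and leave none for the CPU. The Python sketch below works through that arithmetic. It is a rough model under that assumption, with a hypothetical function name, and not a description of how the BOINC client really schedules.

```python
import math


def cpu_task_slots(max_concurrent: int, ncpus: int,
                   gpu_tasks: int, cpu_per_gpu_task: float) -> int:
    """Estimate how many CPU tasks can run next to `gpu_tasks` GPU tasks.

    Assumed model: GPU tasks start first; CPU tasks are then limited by
    (a) the slots left under <max_concurrent> and
    (b) the CPUs left after each GPU task reserves <cpu_usage>.
    """
    left_under_app_limit = max_concurrent - gpu_tasks
    # Epsilon avoids rounding down on float results like 12.999999.
    left_in_cpu_budget = math.floor(ncpus - gpu_tasks * cpu_per_gpu_task + 1e-9)
    return max(0, min(left_under_app_limit, left_in_cpu_budget))


if __name__ == "__main__":
    # First attempt: max_concurrent 4, ncpus 4, 2 GPUs x 2 tasks, 0.25 CPU each.
    print(cpu_task_slots(4, 4, 4, 0.25))     # no CPU slots left
    # After the fix: max_concurrent 14, ncpus 14.
    print(cpu_task_slots(14, 14, 4, 0.25))   # matches the 10 CPU WUs reported
    # Final setup: max_concurrent 23, ncpus 23, 12 GPU tasks at 0.05 CPU each.
    print(cpu_task_slots(23, 23, 12, 0.05))  # matches the 11 CPU WUs reported
```

Under this model the three setups give 0, 10 and 11 CPU tasks, the same counts reported in the posts.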
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey all, Thank you for posting examples of good configuration options. Jake W |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
I know how to edit my preferences but how do I stop the "de_separation_DR_8_rev_3_1_2" units!! It is NOT labeled that way on my list and EVERY SINGLE ONE is failing!!! These are my choices: Run only the selected applications MilkyWay@Home: yes MilkyWay@Home N-Body Simulation: yes Milkyway@Home Separation: yes Milkyway@Home Separation (Modified Fit): no It seems to me the project has a problem and we users are being blamed for it, and the project is NOT helping to solve the problem!! Label the choices as to the units you are sending out and I WILL uncheck them!!! Until then deal with the problem, just like I am!!! |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey there, Any runs named _separation_ are coming from Milkyway@Home and runs named _modfit_ are coming from Milkyway@home Separation (Modified Fit). This run may have a slightly more complicated data set, so it might actually just take longer to run them. Those are Jeff's runs and I am meeting with him in 10 minutes. I will let him know about your problem and see what he thinks is going on. Sorry, Jake W |
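The naming rule in this reply can be written down as a small lookup, which may help when matching a work unit name against the preference check boxes. This Python sketch covers only the two patterns mentioned above. `app_for_task` is a hypothetical helper, and the `_modfit_` task name in the example is made up for illustration.

```python
from typing import Optional


def app_for_task(task_name: str) -> Optional[str]:
    """Map a work unit name to the application it belongs to.

    Only the two patterns stated in the thread are handled; any other
    name returns None instead of a guess.
    """
    if "_modfit_" in task_name:
        return "Milkyway@home Separation (Modified Fit)"
    if "_separation_" in task_name:
        return "Milkyway@Home"
    return None


if __name__ == "__main__":
    # Name quoted earlier in the thread.
    print(app_for_task("de_separation_DR_8_rev_3_1_2"))
    # Made-up name, used only to exercise the second pattern.
    print(app_for_task("de_modfit_example_run"))
```

By this rule the "de_separation_DR_8_rev_3_1_2" units belong to the MilkyWay@Home check box, not to the Separation one.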
Send message Joined: 28 Apr 11 Posts: 36 Credit: 283,587,354 RAC: 4 |
The BOINC Manager won't delete WUs you've already received. Watch the newly downloaded ones and see if it is working correctly. If your computer is listed as "school" or "home", you'll have to change the acceptable apps for each class of computers you have. |
Send message Joined: 28 Apr 11 Posts: 36 Credit: 283,587,354 RAC: 4 |
Hey AMueller91, glad you figured it out. I have not seen any benefit beyond 4 tasks per GPU. The key is no down time, and the chance that 4 tasks finish at the same time is minimal. If they are running in tandem, just pause one, then start it back up. All you need for CPUs is to set logical cores = physical cores plus 1. |
Send message Joined: 6 Jul 12 Posts: 4 Credit: 12,385,544 RAC: 0 |
Hey AMueller91, Exactly :) After it started working fine with 6 tasks per GPU, I tested the maximum number of WUs on my Titan cards. Without Double Precision, they can only handle a maximum of 3 WUs per card to get a GPU load of 99%. But with Double Precision enabled, I get a maximum of 8 WUs per card (16 GPU tasks simultaneously) at a 99% GPU load. I let it run for around 5 minutes and finished nearly 30 tasks, but the cards also heated up to 90°C. So I'm fine with 6 WUs/card. It runs stable, without errors, with temps around 85°C. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Hey there, I really don't see the difference, they BOTH say Milkyway@home!! Are you trying to say you are getting units from a 3rd party supplier, putting the MilkyWay@home name on them, and are still not responsible if they are bad or don't work? Today I got a message from MW saying the driver I am using, the AMD 13.10 Beta, is not supported here. Okay that's fine, but I can't find a list of which ones ARE supported here? Is this a trial and error thing until I stop getting the message, or am I just not seeing the list of approved drivers somewhere? |
Send message Joined: 28 Apr 11 Posts: 36 Credit: 283,587,354 RAC: 4 |
Probably the CAL driver message. Just disregard. On the main page is Statistics and under that is the GPU list. http://milkyway.cs.rpi.edu/milkyway/gpu_list.php. |
©2024 Astroinformatics Group