Message boards : News : New Modfit Runs
Author | Message |
---|---|
Send message Joined: 31 Mar 09 Posts: 1 Credit: 101,045,077 RAC: 0 |
The file has to be named app_config.xml, not app-config.xml. |
Send message Joined: 6 Apr 13 Posts: 8 Credit: 215,367,305 RAC: 0 |
I'm running multiple projects, and it seems to me that having a non-zero minimum work buffer leads to erratic scheduling. |
Send message Joined: 13 Feb 11 Posts: 31 Credit: 1,403,524,537 RAC: 0 |
I found I needed 0.5 CPUs per 0.5 GPUs to maintain a decent workload. I dropped CPU usage to 88% in the BOINC Manager; at 100% each WU took 50% longer to process. HTH dunx |
Send message Joined: 31 Aug 11 Posts: 20 Credit: 529,335,116 RAC: 0 |
the file has to be named app_config.xml not app-config.xml

Cha-ching! Yep, now it's running 2 WUs per GPU. Thanks. And I don't smell smoke... yet. :) Just curious, instead of .5 for double work units, what would one put for triple WUs? EDIT: Oh jeez, I get it. It's MATH. :P So that doesn't need answering anymore.

Hans: 0 should maximize the work being buffered, so it's odd that you're running out of work, UNLESS you are only running out of work for the GPU. In that case I agree we need more computationally intensive work units and less network activity. I've noticed that if I set the project to "no new work" I run out of GPU tasks in 10-20 minutes but still have a day's worth of CPU processing to do. I couldn't hazard a guess as to whether that is a BOINC issue (not differentiating between CPU and GPU work, or BOINC having a limit on the maximum number of tasks) or MilkyWay simply not sending as much as possible for the GPU.

Hans, if you do have "0" (unlimited) set for the work buffer, try something smaller such as .5 for half a day's work. I've noticed when running multiple projects that sometimes one of the projects will refuse to get new work because it sees that the other projects have already filled the buffer. Not sure a smaller buffer will help that, but anything is possible. |
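For reference, the 1/N math above maps onto a minimal app_config.xml sketch along these lines (the app name "milkyway" is an assumption; the exact name for your installed application can be read from client_state.xml):

    <app_config>
      <app>
        <name>milkyway</name>              <!-- assumed app name; confirm in client_state.xml -->
        <gpu_versions>
          <gpu_usage>0.5</gpu_usage>       <!-- 1/N of a GPU per task: 0.5 = 2 concurrent WUs, 0.333 = 3 -->
          <cpu_usage>0.5</cpu_usage>       <!-- fraction of a CPU core reserved to feed each task -->
        </gpu_versions>
      </app>
    </app_config>

The file goes in the project's data directory, and the client picks it up after a "Read config files" from the Manager or a restart.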
Send message Joined: 18 Jul 09 Posts: 300 Credit: 303,571,619 RAC: 479 |
the file has to be named app_config.xml not app-config.xml

Oops, glad it's running for you though! |
Send message Joined: 2 Dec 10 Posts: 12 Credit: 107,787,192 RAC: 0 |
@swiftmallard At the risk of hijacking the thread, I tried experimenting with your app_config entries without success. I have an AMD R9 290X, so I have to use an app_info file just to get the applications running in the first place. I was unable to get any more than the one WU running at a time regardless of my efforts. I also have another project running on the Intel iGPU as well as a third project running on the CPU. I am trying to optimise the overall output from the machine as a whole. What would you recommend to set for CPU usage for the GPU projects? Do they each really need a whole CPU core, or can the GPUs share a single core between them? |
Send message Joined: 18 Jul 09 Posts: 300 Credit: 303,571,619 RAC: 479 |
What would you recommend to set for CPU usage for the GPU projects? Do they each really need a whole CPU core, or can the GPUs share a single core between them?

My experience is that leaving a core free to feed the GPU adds so little output that it's not worth doing; I prefer to run a single CPU project on all cores and a single GPU project on the card. I run as few projects as possible at any one time to eliminate any issues with BOINC wanting to switch between them. KISS at work.

It seems to me that your measure of optimizing the overall output includes receiving the maximum credit from a given project, a nice objective goal. But many major crunchers will tell you that the easiest way to achieve this is to:

A. keep it simple
B. set it and forget it

You've done what you can. Let it run for a year and come back to see where you are. I have no doubt you will be well pleased. |
Send message Joined: 2 Dec 10 Posts: 12 Credit: 107,787,192 RAC: 0 |
Thanks for the swift reply. The reason for fiddling is that I am seeing way less than 100% utilization in GPU-Z and was looking to improve efficiency. The AMD card is currently crunching through modfit-fast WUs in about 13 seconds, but appears to reach 100% progress after about 10 seconds. My guess is that this overrun is a small amount of CPU tidy-up, which is why the GPU usage drops to 0%. I was hoping that if I could double up the WU crunching, I could paper over this gap. When the CPU is fully occupied crunching another project, the MW WUs run to about 25 seconds - I haven't compared the WUs on the iGPU yet. I was wondering if a single CPU core would be enough to mother 2 WUs on the AMD card (MW@H) and one on the iGPU. Of course, things would be much easier to test if the sysadmins would sort out native support of the AMD card instead of me having to populate an app_info. |
Send message Joined: 2 Dec 10 Posts: 12 Credit: 107,787,192 RAC: 0 |
After some more tinkering, I have discovered another parameter in the app_info that seems to have done the trick. Specifically, I found I had to adjust <count>x</count>, where x is the number or fraction of a GPU used by each task. I set this to 0.333 and the client happily crunched 3 concurrent tasks in about 30 seconds each, instead of the 3 x 13 seconds for the equivalent individual tasks. I then set it to 0.25 and the client crunches 4 concurrent tasks in about 39 seconds.

While I was there, I tried adding an entry for <flops>, but while this changed the "Remaining: estimated" time, it was not a dynamic change, i.e. it gave the same estimated time regardless of whether the client was running 1 or 4 concurrent WUs. It would be useful if there were a fully documented list of app_info parameters and their likely effects - a definitive list rather than the piecemeal guesswork I have stumbled across in various forums. Oh well... |
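For context, that <count> element lives inside a <coproc> block within the <app_version> section of app_info.xml. A minimal fragment along these lines (the app name, version number, and flops value are placeholders, not the real MilkyWay entries) would run three concurrent tasks on an AMD GPU:

    <app_version>
      <app_name>milkyway</app_name>        <!-- placeholder; must match the <name> in the <app> block -->
      <version_num>138</version_num>       <!-- placeholder version number -->
      <avg_ncpus>0.5</avg_ncpus>           <!-- fraction of a CPU core budgeted per task -->
      <coproc>
        <type>ATI</type>                   <!-- AMD/ATI GPU -->
        <count>0.333</count>               <!-- 1/3 of the GPU per task = 3 concurrent tasks -->
      </coproc>
      <flops>1.0e11</flops>                <!-- placeholder; only feeds the runtime estimate -->
    </app_version>

The rest of the app_info.xml (the <app> and <file_info>/<file_ref> blocks) stays as it was; the BOINC Anonymous platform wiki page linked a couple of posts below documents the full layout.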
Send message Joined: 4 Sep 12 Posts: 219 Credit: 456,474 RAC: 0 |
It would be useful if there were a fully documented list of app_info parameters and their likely effect - a definitive list rather than the piecemeal guesswork I have stumbled across in various forums. Oh well...

Start with http://boinc.berkeley.edu/wiki/Anonymous_platform |
Send message Joined: 31 Aug 11 Posts: 20 Credit: 529,335,116 RAC: 0 |
...My experience is that leaving a core free to feed the GPU adds so little output that it's not worth doing, ...

I would have to very, very strongly disagree with that as a general rule: my HD4850's crunch time on a full-size (not fast) WU went from 6 minutes to 2 minutes on a dual-core machine by leaving one core free. And I notice tens of seconds of difference on my 8-thread machines with 280Xs if I give them a bit of CPU power. |
Send message Joined: 31 Aug 11 Posts: 20 Credit: 529,335,116 RAC: 0 |
///Begin playful yet serious rant to Jake///

Jake, I appreciate what you guys are up to and am honored to be able to contribute to the cause. BUT (there's always a 'butt' in every crowd), we'd like to hear from you. You opened this thread telling us to let you know any problems, etc. Let us know you're listening, that you're either considering our ideas/problems, or that you're just ignoring them. :) It's OK, this is part computer science after all, and if NOT utilizing the system efficiently is part of the research, just let us know.

Many have complained that the fast units are TOO fast, and they complete in almost as little time as it takes the CPU to send them back to you and the server to send a replacement. Some run out, they are so fast. The solution to problem A, the idle time while the CPU packages the result, is to run multiple tasks per GPU. But you really have to do something about the quantity of GPU tasks sent at once, or at least tell us NO, we're not going to address that at this time. SETI gladly sends half a day's worth at a time, or however many BOINC is set to receive, so we know it's possible. I assume you're catering to those poor Nvidia folks whose double precision scores, frankly, are embarrassing (except the mighty Titan series, respect due there). I suppose 10 WUs would meet the goal of the standard .5 day's worth of work for those with Nvidia cards.

But here's a screenshot of my (new) primary MW@H cruncher. It's running about what, 30-40 minutes an hour, maybe? And I've not even added the final GPU yet, while waiting for the new PSU. http://www.facebook.com/photo.php?fbid=10153140531602074&l=8c7c80e012 Shameful waste! (Actually I'm just crying because this was supposed to put me over the top and enable me to overcome/outscore some teammates. :P )

///END playful yet serious rant to Jake///

Now for the rest of you: as you've helped tremendously in getting me manually set up to run multiple tasks per GPU, can anyone help me manually configure BOINC to update more frequently and/or get more GPU tasks per update? Maybe this is my fault, not Mr. Weiss's. Thanks. |
Send message Joined: 31 Aug 11 Posts: 20 Credit: 529,335,116 RAC: 0 |
Hmmmmm. Magically I now have tens of minutes of WU to crunch (vs 1-2 minutes). Perhaps the Weiss IS listening in? |
Send message Joined: 13 Feb 11 Posts: 31 Credit: 1,403,524,537 RAC: 0 |
Hmmmmm. Magically I now have tens of minutes of WU to crunch (vs 1-2 minutes). Perhaps the Weiss IS listening in?

Try:

<max_file_xfers>N</max_file_xfers> - maximum number of simultaneous file transfers (default 8).
<max_file_xfers_per_project>N</max_file_xfers_per_project> - maximum number of simultaneous file transfers per project (default 2).

dunx

P.S. Not sure I've found the right options TBH.... |
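For what it's worth, those two settings are BOINC client options rather than app_config entries: they go in the <options> section of cc_config.xml in the BOINC data directory, with N replaced by an actual number. A minimal sketch:

    <cc_config>
      <options>
        <max_file_xfers>8</max_file_xfers>                          <!-- total simultaneous file transfers (8 is the default) -->
        <max_file_xfers_per_project>4</max_file_xfers_per_project>  <!-- per-project limit (the default is 2) -->
      </options>
    </cc_config>

The client re-reads cc_config.xml after a "Read config files" from the Manager or a restart. Note that these options govern file transfers, not how many tasks the scheduler hands out per request.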
Send message Joined: 31 Aug 11 Posts: 20 Credit: 529,335,116 RAC: 0 |
Thanks, Dunx. I'm a GUI kinda guy, so bear with me. Does that go in the app_config file? Do I replace the "N" with a number, or does that stand for, ahem, "N"finite? :D

PS: Everything is working as expected today, lots of work units for all my machines. And they are even (correctly) getting priority over other projects as I've instructed. Woot woot! |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey everyone,

Sorry I have been extremely silent on the forums the last couple of months. Your concerns are being noted and will be addressed. In the fall I had much more time to work on the project, as I was not taking classes and was focused 100% on research. Starting at the end of January I began taking classes again, and as such my focus is a bit split.

In other news, we have been getting wonderful results off of the Modfit program, especially with the new work unit sizes. I know they may seem a bit inefficient on your end, but results that used to take 3 weeks to get back are now done in 1-2 weeks, with 2-3 times more work being done. It has made our results much more accurate and reliable.

Thank you guys for hanging in there,
Jake W. |