New Modfit Runs

Message boards : News : New Modfit Runs

Author Message
crunchin'xPU
Joined: 31 Mar 09
Posts: 1
Credit: 101,041,760
RAC: 1

Message 62942 - Posted: 3 Jan 2015, 23:20:09 UTC - in response to Message 62938.

the file has to be named app_config.xml not app-config.xml
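For anyone following along, a minimal sketch of an app_config.xml that runs two tasks per GPU might look like the fragment below. The app name milkyway is an assumption, not confirmed by this thread; check the <name> entries in client_state.xml for the project's actual application names.

```xml
<!-- app_config.xml — goes in the project's folder inside the BOINC data directory -->
<app_config>
  <app>
    <!-- assumed app name; verify against client_state.xml -->
    <name>milkyway</name>
    <gpu_versions>
      <!-- fraction of a GPU per task: 0.5 = 2 concurrent tasks per GPU -->
      <gpu_usage>0.5</gpu_usage>
      <!-- fraction of a CPU core reserved per GPU task -->
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

After saving the file, use Options > Read config files in BOINC Manager (or restart the client) for the change to take effect.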

hans dorn
Joined: 6 Apr 13
Posts: 8
Credit: 215,331,980
RAC: 198

Message 62945 - Posted: 4 Jan 2015, 7:54:18 UTC - in response to Message 62923.


You may need to update/increase the "Minimum Work Buffer" under the Tools/Computing Preferences/Network settings tab. Add 0.5 to 1 day to whatever it is currently set at.


I'm running multiple projects, and it seems to me that having a non-zero minimum work buffer leads to erratic scheduling.

Dunx
Joined: 13 Feb 11
Posts: 22
Credit: 321,978,570
RAC: 1

Message 62947 - Posted: 4 Jan 2015, 13:35:58 UTC
Last modified: 4 Jan 2015, 13:37:48 UTC

I found I needed 0.5 CPUs per 0.5 GPUs to maintain a decent workload.
Dropped CPU usage to 88% in BOINC Manager; at 100%, each WU took 50% longer to process.

HTH

dunx

10esseeTony
Joined: 31 Aug 11
Posts: 19
Credit: 367,946,597
RAC: 114,590

Message 62948 - Posted: 4 Jan 2015, 13:51:50 UTC - in response to Message 62942.
Last modified: 4 Jan 2015, 14:41:43 UTC

the file has to be named app_config.xml not app-config.xml



Cha-ching! Yep, now it's running 2 WU per GPU. Thanks. And I don't smell smoke...yet. :)

Just curious, instead of .5 for double work units, what would one put for triple WUs? EDIT: Oh jeez, I get it. It's MATH. :P So I don't need that answered anymore.




Hans: 0 should maximize the work being buffered; it's odd that you're running out of work. UNLESS you are only running out of work for the GPU, in which case I agree we need more computationally intensive work units and less network activity.

I've noticed that if I set the project to "no new work" I run out of GPU tasks in 10-20 minutes but still have a day's worth of CPU processing to do. I couldn't say whether that's a BOINC issue (not differentiating between CPU and GPU work), BOINC having a limit on the maximum number of tasks, or Milkyway simply not sending as much work as possible for the GPU.


Hans, if you do have "0" (unlimited) set for the work buffer, try something smaller such as 0.5 for half a day's work. I've noticed when running multiple projects that sometimes one of them will refuse to get new work because it sees that the other projects have already filled the buffer. Not sure a smaller buffer will help that, but anything is possible.

swiftmallard
Joined: 18 Jul 09
Posts: 289
Credit: 302,980,648
RAC: 0

Message 62954 - Posted: 4 Jan 2015, 23:06:36 UTC - in response to Message 62948.

the file has to be named app_config.xml not app-config.xml


Oops, glad it's running for you though!

Floyd
Joined: 2 Dec 10
Posts: 12
Credit: 107,787,192
RAC: 0

Message 63084 - Posted: 1 Feb 2015, 6:36:30 UTC - in response to Message 62954.

@swiftmallard

At the risk of hijacking the thread, I tried experimenting with your app_config entries without success.

I have an AMD R9 290X, so I have to use an app_info file just to get the applications running in the first place.

I was unable to get any more than the one WU running at a time regardless of my efforts.

I also have another project running on the Intel iGPU as well as a third project running on the CPU.

I am trying to optimise the overall output from the machine as a whole. What would you recommend to set for CPU usage for the GPU projects? Do they each really need a whole CPU core, or can the GPUs share a single core between them?

swiftmallard
Joined: 18 Jul 09
Posts: 289
Credit: 302,980,648
RAC: 0

Message 63085 - Posted: 1 Feb 2015, 15:05:40 UTC - in response to Message 63084.

What would you recommend to set for CPU usage for the GPU projects? Do they each really need a whole CPU core, or can the GPUs share a single core between them?

My experience is that leaving a core free to feed the GPU adds so little output that it's not worth doing. I prefer to run a single CPU project on all cores and a single GPU project on the card, and as few projects as possible at any one time, to eliminate any issues with BOINC wanting to switch between them. KISS at work.

It seems to me that your measure of optimizing the machine's overall output includes receiving the maximum credit from a given project, a nice objective goal. But many major crunchers will tell you that the easiest way to achieve this is to:
A. keep it simple
B. set it and forget it

You've done what you can. Let it run for a year and come back to see where you are. I have no doubt you will be well pleased.

Floyd
Joined: 2 Dec 10
Posts: 12
Credit: 107,787,192
RAC: 0

Message 63087 - Posted: 2 Feb 2015, 0:35:00 UTC - in response to Message 63085.

Thanks for the swift reply.

The reason for fiddling is that I am seeing way less than 100% utilization in GPU-Z, and I was looking to improve efficiency.

The AMD card currently crunches through modfit-fast WUs in about 13 seconds, but appears to hit 100% completion after about 10 seconds. I was guessing that this overrun is a small amount of CPU tidy-up, which is why the GPU usage drops to 0%. I was hoping that if I could double up the WU crunching, I could paper over this gap.

When the CPU is fully occupied with crunching another project, the MW WUs run to about 25 seconds - I haven't compared the WUs on the iGPU yet. I was wondering if a single CPU core would be enough to mother 2 WUs on the AMD card (MW@H) and one on the iGPU.

Of course things would be much easier to test if the sysadmins would sort out native support of the AMD card instead of me having to populate an app_info.

Floyd
Joined: 2 Dec 10
Posts: 12
Credit: 107,787,192
RAC: 0

Message 63090 - Posted: 2 Feb 2015, 9:13:07 UTC - in response to Message 63087.

After some more tinkering, I have discovered another parameter in the app_info that seems to have done the trick.

Specifically, I found I had to adjust <count>x</count>, where x is the number (or fraction) of GPUs each task uses.

I set this to 0.333 and the client happily crunched 3 concurrent tasks in about 30 seconds each instead of the 3 x 13 seconds for the equivalent individual tasks.

I then set it to 0.25 and the client crunches 4 concurrent tasks in about 39 seconds.

While I was there, I tried adding an entry for <flops>, but while this changed the "Remaining: estimated" time, it was not a dynamic change, i.e. it gave the same estimated time regardless of whether the client was running 1 or 4 concurrent WUs.

It would be useful if there were a fully documented list of app_info parameters and their likely effect - a definitive list rather than the piecemeal guesswork I have stumbled across in various forums. Oh well...
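For reference, the <count> element Floyd describes sits inside the <coproc> block of an <app_version> in app_info.xml. A sketch of just that fragment follows; the surrounding app/file entries are elided, and the <flops> value is a placeholder for illustration, not a measured figure:

```xml
<app_version>
  <!-- ...app name, version number, and file references go here... -->
  <coproc>
    <type>ATI</type>
    <!-- fraction of a GPU per task: 0.5 = 2 concurrent tasks, 0.333 = 3, 0.25 = 4 -->
    <count>0.25</count>
  </coproc>
  <!-- optional estimated speed; only affects the "Remaining: estimated" display -->
  <flops>1.0e11</flops>
</app_version>
```

This matches Floyd's observation that <flops> changes only the displayed estimate, not the scheduling of concurrent tasks.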

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 63091 - Posted: 2 Feb 2015, 10:36:14 UTC - in response to Message 63090.

It would be useful if there were a fully documented list of app_info parameters and their likely effect - a definitive list rather than the piecemeal guesswork I have stumbled across in various forums. Oh well...

Start with

http://boinc.berkeley.edu/wiki/Anonymous_platform

10esseeTony
Joined: 31 Aug 11
Posts: 19
Credit: 367,946,597
RAC: 114,590

Message 63107 - Posted: 6 Feb 2015, 10:49:46 UTC - in response to Message 63085.

...My experience is that leaving a core free to feed the GPU adds so little output that it's not worth doing, ...


I would have to very, very strongly disagree with that as a general rule: my HD4850's crunch time on a full-size (not fast) WU went from 6 minutes to 2 minutes on a dual-core machine by leaving one core free. And I notice tens of seconds' difference on my 8-thread machines with 280Xs if I give them a bit of CPU power.

10esseeTony
Joined: 31 Aug 11
Posts: 19
Credit: 367,946,597
RAC: 114,590

Message 63113 - Posted: 6 Feb 2015, 23:23:34 UTC
Last modified: 6 Feb 2015, 23:50:22 UTC

///Begin playful yet serious rant to Jake///

Jake, I appreciate what you guys are up to and am honored to be able to contribute to the cause.

BUT, (there's always a 'butt' in every crowd), we'd like to hear from you. You opened this thread telling us to let you know any problems, etc. Let us know you're listening, that you're either considering our ideas/problems, or that you're just ignoring them. :) It's ok, this is part computer science after all, and if NOT utilizing the system efficiently is part of the research, just let us know.

Many have complained that the fast units are TOO fast: they complete in almost as little time as it takes the CPU to send them back to you and the server to send a replacement. Some hosts run out of work, they are so fast.

Solution to problem A, the idle time while the CPU packages the result, is to run multiple tasks per GPU.

But you really have to do something about the quantity of GPU tasks sent at once. Or at least tell us NO, we're not going to address that at this time. SETI gladly sends half a day's worth at a time, or however many BOINC is set to receive, so we know it's possible.

I assume you're catering to those poor Nvidia folks whose double precision scores, frankly, are embarrassing (except the mighty Titan series, respect due there). I suppose 10 WUs would meet the goal of the standard 0.5 day's worth of work for those with Nvidia cards.

But here's a screenshot of my (new) primary MW@H cruncher. It's running about what, 30-40 minutes an hour, maybe? And I've not even added the final GPU yet, while waiting for the new PSU.

http://www.facebook.com/photo.php?fbid=10153140531602074&l=8c7c80e012

Shameful waste! (Actually I'm just crying because this was supposed to put me over the top and enable me to overcome/outscore some teammates. :P )


///END playful yet serious rant to Jake///

Now for the rest of you: as you've helped tremendously in getting me manually set up to run multiple tasks per GPU, can anyone help me manually configure the client to update more frequently and/or get more GPU tasks per update? Maybe this is my fault, not Mr. Weiss's.

Thanks.

10esseeTony
Joined: 31 Aug 11
Posts: 19
Credit: 367,946,597
RAC: 114,590

Message 63114 - Posted: 7 Feb 2015, 1:51:34 UTC
Last modified: 7 Feb 2015, 1:51:45 UTC

Hmmmmm. Magically I now have tens of minutes of WU to crunch (vs 1-2 minutes). Perhaps the Weiss IS listening in?

Dunx
Joined: 13 Feb 11
Posts: 22
Credit: 321,978,570
RAC: 1

Message 63115 - Posted: 7 Feb 2015, 8:43:09 UTC - in response to Message 63114.
Last modified: 7 Feb 2015, 8:55:53 UTC

Hmmmmm. Magically I now have tens of minutes of WU to crunch (vs 1-2 minutes). Perhaps the Weiss IS listening in?


Try :-

<max_file_xfers>N</max_file_xfers> Maximum number of simultaneous file transfers (default 8).
<max_file_xfers_per_project>N</max_file_xfers_per_project> Maximum number of simultaneous file transfers per project (default 2).

dunx

P.S. Not sure I've found the right options TBH....
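For what it's worth, those two are client-level options rather than app_config entries: they belong inside <options> in cc_config.xml, in the BOINC data directory, with N replaced by a plain number. A minimal sketch:

```xml
<!-- cc_config.xml — BOINC data directory; reread via Options > Read config files -->
<cc_config>
  <options>
    <!-- total simultaneous file transfers (client default is 8) -->
    <max_file_xfers>8</max_file_xfers>
    <!-- simultaneous transfers per project (client default is 2) -->
    <max_file_xfers_per_project>4</max_file_xfers_per_project>
  </options>
</cc_config>
```

Note these control download/upload concurrency, not how many tasks the scheduler sends, so they may not address the work-fetch issue discussed above.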

10esseeTony
Joined: 31 Aug 11
Posts: 19
Credit: 367,946,597
RAC: 114,590

Message 63121 - Posted: 7 Feb 2015, 23:12:42 UTC - in response to Message 63115.
Last modified: 7 Feb 2015, 23:14:17 UTC

Thanks Dunx.

I'm a GUI kinda guy, so bear with me.

Does that go in the app_config file?

Do I replace the "N" with a number, or does that stand for, ahem, "N"finite? :D


PS: Everything is working as expected today, lots of work units for all my machines. And they are even (correctly) getting priority over other projects as I've instructed. Woot woot!

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 441
Credit: 11,474,117
RAC: 240,486

Message 63312 - Posted: 2 Apr 2015, 1:21:36 UTC

Hey everyone,

Sorry I have been extremely silent on the forums the last couple of months. Your concerns are being noted and will be addressed. In the fall I had much more time to work on the project, as I was not taking classes and was focused 100% on research. Starting at the end of January I began taking classes again, and as such my focus is a bit split.

In other news, we have been getting wonderful results out of the Modfit program, especially with the new work unit sizes. I know they may seem a bit inefficient on your end, but results that used to take 3 weeks to get back are now done in 1-2 weeks, with 2-3 times more work being done. It has made our results much more accurate and reliable.

Thank you guys for hanging in there,

Jake W.



Copyright © 2017 AstroInformatics Group