
Posts by JerWA

41) Message boards : Number crunching : ATI GPU app 0.19f on Vista Ultimate 64 + 4870 x2 (Message 26750)
Posted 30 Jun 2009 by Profile JerWA
Post:
PoorBoy, you're running 6.5.0, which I haven't tested. The settings I'm using are all for 6.4.7. Have you tried that manager yet?
42) Message boards : Number crunching : ATI GPU app 0.19f on Vista Ultimate 64 + 4870 x2 (Message 26642)
Posted 28 Jun 2009 by Profile JerWA
Post:
You can tweak the GPU app by editing the app_info.xml file in the folder you extracted it to; the options go between the <cmdline></cmdline> tags.

n# is the max number of units (per GPU) actually getting GPU time at once; its default value is 3. You will see lots marked "Running," but only this many should actually be running. 6.6.36 is bugged in that all of the units marked running count time even when not getting any GPU time (they did for me, anyway). This was actually fixed in .19f, but 6.6.36 seems to have unfixed it.

I run n2; the manager will show up to 3 "Running" (and sometimes only 1, real or otherwise), but never more than 2 are actually getting GPU time. Your 4870X2 counts as 2 GPUs (from your task logs: Found 2 CAL devices), so at the default it would be running 6 at once, which should be OK on a 2GB card like yours (also from the task logs: Device 0: ATI Radeon HD 4800 (RV770) 1024 MB local RAM, and the other device also shows 1024).

w# sets the wait time on the CPU thread between GPU WUs; in effect it's the app trying to be nice and not tie up the CPU while doing GPU work. The default is 1.0. If you see too little GPU load, decrease this; if you see too much CPU load, increase it. In your case, if you're intentionally trying to slow it down a bit, you could try increasing this very slowly (i.e. add w1.1 and let it run a while to see if that unloads the GPUs any).

x# excludes a GPU from processing entirely. Numbering starts at 0, so yours are 0 and 1. While this could be used to drop all load from one of the GPUs, I'd actually recommend against it if you're trying to control heat: it pushes all the load onto just one GPU, concentrating the heat in one spot rather than across the whole card, so the fan profiles probably wouldn't spin up as much, leading to more heat, and so on. You'd probably be better off dropping the n# and increasing the w# so that both cores are loaded relatively evenly, just below 100%.
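For illustration only (not what I'd recommend here, per the above), excluding the second GPU and leaving everything else at defaults would look like this:

<cmdline>x1</cmdline>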

I'm a newbie here, so hopefully someone with more in-depth tweaking experience will chime in, but I'd think a safe place to start would be...

<cmdline>n1 w1.1</cmdline>

Then let it run a while and see what that does. On my 4870, n1 gets me 75-85% load; it may be lower on yours (Control Center will tell you), and the slightly increased wait should slow it down a bit too. With my fan locked at 55%, my load temps are below my stock idle temps. You may want to bump your fans up too if you haven't, as this whole family of cards is notorious for running hot (within design spec, but way too high for me to be happy). 55% is a bit extreme, and very loud (it's audible over my 133cfm 3k RPM 120mm CPU and exhaust fans), but on my card 40% was quite livable and dropped temps by over 20C.

For any change to the app_info.xml file you'll need to stop BOINC and exit it completely, make your change, then start BOINC back up.
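For reference, here's roughly where that tag sits in the file. This is just a sketch: the executable name below is a placeholder, so keep whatever names your downloaded app_info.xml already has and only touch the <cmdline> line.

<app_info>
    <app>
        <name>milkyway</name>
    </app>
    <file_info>
        <name>milkyway_0.19f_ATI_x64.exe</name> <!-- placeholder name, use yours -->
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway</app_name>
        <version_num>19</version_num>
        <cmdline>n1 w1.1</cmdline>
        <file_ref>
            <file_name>milkyway_0.19f_ATI_x64.exe</file_name> <!-- placeholder name, use yours -->
            <main_program/>
        </file_ref>
    </app_version>
</app_info>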

If you don't have one already, I'd highly recommend an app to monitor temps. GPU-Z works well with 9.5 (http://www.techpowerup.com/gpuz) but makes my system very cranky (long GPU pauses) while it's actually running, so it's a matter of opening it, checking temps and load, and closing it. I also use the Everest Ultimate Vista sidebar app to monitor temps (CPU, MB, case, and GPU), which doesn't cause the same issue as GPU-Z.

Hope this helps, I'm still learning it all myself. :-)
43) Message boards : Number crunching : ATI GPU app 0.19f on Vista Ultimate 64 + 4870 x2 (Message 26633)
Posted 28 Jun 2009 by Profile JerWA
Post:
I've had to weight the project (200 vs 100 for everything else), set "connect every" to 0, queue 2 days' worth of work, AND set up an automated task to reset project debt every hour, or it stops getting work.
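The hourly task itself just runs the debt reset from the BOINC install directory (same command I posted in the trip-cycle thread):

boinccmd --set_debts http://milkyway.cs.rpi.edu/milkyway 0 0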

How much it downloads each time seems to be based on debt and predicted run-times. Since both go totally haywire seemingly at random (most WUs finish under 90 seconds for me, but every once in a while I'll check and they're all predicting 10 minutes or something similarly outrageous), it's hard to keep it running smoothly. Likewise, even when it's properly predicting the 1:00-1:30 time-to-finish, "2 days" of queued work for the project is never more than 20 WUs, or just about 30 minutes. Go figure.

Mine right now is usually doing a cycle something like this:
20 WUs Waiting, Complete 4 WUs, report, Download 2 WUs.
16 WUs Waiting, Complete 8 WUs, report, Download 4 WUs.
10 WUs Waiting, Complete 4 WUs, report, Download 2 WUs.
6 WUs Waiting, Complete 2 WUs, report, Download *

Now, depending on when in this cycle my debt reset runs, it will either continue to shrink until it runs dry and then spend the rest of the hour downloading 2 every time 2 finish (not friendly to the server, for sure), or it will realize it's outrunning the pace at which it's requesting work and, with clear debts, happily reload up to 20 WUs or so.

Unfortunately this is my main system, and I use it for other things and suspend applications every now and then. Once you do that, all bets are off; the manager's response has yet to be predictable. Sometimes it comes back fine and everything goes back to what it was doing; sometimes the MW app gets stuck in a wait loop (i.e. tasks say running but aren't getting any GPU time) and I have to restart the manager.
44) Message boards : Number crunching : Milestones II (Message 26632)
Posted 28 Jun 2009 by Profile JerWA
Post:
I can hardly believe no Knights (pronounced Kah Nig Its!) are in this thread yet! While my own shrubbery contribution is paltry at best (but gaining speed), our team isn't doing too bad! Hitting over 1,000,000 credits/day pretty regularly now and should pass 50,000,000 in MW in a few days. Currently in 12th but 7th by RAC. Not too shabby for only 8 people on the GPU app. Guess I need some more graphics cards hehe.

45) Message boards : Number crunching : ATI GPU app 0.19f on Vista Ultimate 64 + 4870 x2 (Message 26630)
Posted 28 Jun 2009 by Profile JerWA
Post:
Actually, depending on how many "normal" (i.e. CPU-app) projects you're running, you'll probably have more issues keeping the scheduler running right than getting work from the server.

See my thread about tripping MW cycles for more info on all the stuff I've been doing trying to keep mine going. Can't wait for the manager to play nice with GPU apps that aren't CUDA based.
46) Message boards : Number crunching : ATI GPU app 0.19f on Vista Ultimate 64 + 4870 x2 (Message 26627)
Posted 28 Jun 2009 by Profile JerWA
Post:
The easiest way I know to check for protected mode is to open the Services console (Start, Control Panel, Administrative Tools, Services, or Start->Run->services.msc) and look for BOINC. If it's there, that's protected mode.
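If you'd rather check from a command prompt, something like this should also work, assuming the service is registered under the name BOINC like it was on my install:

sc query BOINC

If it comes back with service info instead of an error, you're in protected mode.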

The only way I know of to switch is to uninstall and reinstall, but you don't lose anything (projects or tasks) doing so. I've switched versions several times trying to get the MW GPU app to run happily with everything else.

You may also run into a lot of scheduler issues with 6.6.36; I did, at least. So far I'm having the best luck with 6.4.7, but even that's pretty touchy (I'm having to do a lot to keep it running predictably).

I did read from someone else on the forum here that the app should work with 9.6, it just crashes the video driver a lot. I don't think you're getting that far; the task reports don't show any output in stderr, and there should be some.

As for the black box, that's definitely not happening on mine. Are you putting the x86 or x64 app in?
47) Message boards : Number crunching : ATI GPU app 0.19f on Vista Ultimate 64 + 4870 x2 (Message 26624)
Posted 28 Jun 2009 by Profile JerWA
Post:
If that's the 920 listed under your computers, it looks like it's still running the CPU app.

Make sure to stop the BOINC manager (and all apps) completely and exit, then extract the GPU app into the Milky Way project folder (lemme know if you need help finding it). If you've done everything else and it's not even trying to run the app, then it's likely you're not exiting all the way (especially in protected mode, where you may have to stop the service manually).

Did you copy the DLLs into System32 like the readme says?
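For reference, that step is just a couple of copies into the system directory, something like the lines below. The exact DLL names depend on your Catalyst version, so use whichever ones your readme lists; these are just examples:

copy aticalrt.dll %windir%\System32\
copy aticalcl.dll %windir%\System32\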

If you're running in protected mode, that causes problems too. I had to reinstall the manager w/o protected mode and then force-run it as administrator to get it running on mine. But even without that it was at least trying to run the GPU app, just failing to. Your tasks, on the other hand, don't look like they're going through the GPU app at all.
48) Message boards : Number crunching : Is there a way to force "trip" the next MW cycle in .36? (Message 26621)
Posted 28 Jun 2009 by Profile JerWA
Post:
I presume you could use a script like this to reset the duration_correction_factor back to 0.000000 on a regular basis from the usual 100.000000?

Not sure; I see that value in client_state but not in boinccmd.
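For reference, the value I'm looking at sits in the <project> block of client_state.xml, roughly like this (other tags omitted; only edit it with the client fully stopped):

<project>
    <master_url>http://milkyway.cs.rpi.edu/milkyway/</master_url>
    <duration_correction_factor>100.000000</duration_correction_factor>
</project>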

I've run into some odd behavior again today. The MW app got stuck in the manager: it showed 3 running, but none were getting any GPU time. Had to restart the manager to sort that out. Bumped up to n2, and set my debt reset to run every hour after watching it stop pulling work yet again today.

Also set it to debts of 100 100 instead of 0 0, just to see if it makes much difference in how long it takes before it runs into a problem again.
49) Message boards : Number crunching : Is there a way to force "trip" the next MW cycle in .36? (Message 26593)
Posted 27 Jun 2009 by Profile JerWA
Post:
I ran 3 days full out before it accrued enough negative debt to stop pulling work again. This is with manager 6.4.7, 200 resource share (everything else at 100), the only GPU app, with everything in MW at default except n1 (I saw no gain letting it run concurrent WUs; I think the 512MB of RAM on the card is the limiting factor, but I may revisit that).

Easily worked around (and now automated as a Windows scheduler task for me):

(Run from your BOINC install directory, as running it from any other path will make it prompt you for your BOINC client auth information)

boinccmd --set_debts http://milkyway.cs.rpi.edu/milkyway 0 0

Can be run while the client is active.

I was sitting at around -18000 debt and no longer downloading MW work (which = idle GPU). I reset debts for just MW (which does nudge other projects' debts around, so don't change it too drastically) and within 60 seconds it had contacted MW for more work, and it's been running since. I'll see what this does long-term and go from there.
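For anyone who wants to copy the automation, here's a rough sketch of the scheduled task I set up. The task name is arbitrary, the path assumes a default install, and the quoting may need tweaking; the cd is there because of the auth-file note above:

schtasks /create /tn "MW debt reset" /sc hourly /tr "cmd /c cd /d \"C:\Program Files\BOINC\" && boinccmd --set_debts http://milkyway.cs.rpi.edu/milkyway 0 0"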

Pushing ~60,000/day from a 512MB HD4870.
50) Message boards : Number crunching : MW's WUs crunched at lower priority. (Message 26392)
Posted 24 Jun 2009 by Profile JerWA
Post:
Since you can set debts using boinccmd, would a timed script event forcing MW's debts back to whatever they need to be work?

--set_debts URL1 STD1 LTD1 {URL2 STD2 LTD2 ...}
    Set the short- and long-term debts of one or more projects.
Note: if you adjust the debts of a project, the debts of other projects are changed, so if you want to set the debts of multiple projects, do it in a single command.
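So something like this, all in one command (the second URL is just an example of another attached project):

boinccmd --set_debts http://milkyway.cs.rpi.edu/milkyway 0 0 http://setiathome.berkeley.edu 0 0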


And is there a flavor of 6.2.x that's preferred? I'm going to try 6.2.19, the last documented release before 6.3.x, and see what it does.

Edit: 6.2.19 looks to be doing the exact same thing 6.6.36 is. If I cycle the client it will fire MW units alongside the 4 CPU apps, but as soon as it finishes them and downloads more it goes back to idling until it trips their "turn" again, even though they shouldn't be CPU-bound.
51) Message boards : Number crunching : Is there a way to force "trip" the next MW cycle in .36? (Message 26388)
Posted 24 Jun 2009 by Profile JerWA
Post:
I'm still bashing my setup here trying to get everything to play nicely together but not having much luck.

I was running the .21 manager and thought all was going well until I got up this morning to find 15 instances of MW trying to run at once (with absolutely no changes from last night, when it was only running between 1 and 3), causing the BOINC manager to display incorrect info (it showed ready-to-report units that were already gone) and crash when I tried to close it. I'm thinking that's a manager bug, not a MW one.

So I upgraded to .36 again, and it definitely behaves differently. On the plus side, it seems way more stable and predictable than .21 for getting work. Unfortunately it seems to request much less work, and each time it runs dry and requests the next batch it doesn't start it. It's like it thinks MW is a CPU app and makes it wait for one of the others to hit its timer, then it trips the first MW unit and goes back to running 4 CPU apps as it should. As long as it's got MW work queued it will keep it running alongside the other apps, starting a new WU every time it finishes one, but each download batch ends up waiting again.

So my question is: is there a way to "trip" that next cycle somehow? Or, is there a manager we like more than .21?

Update:

Tried 6.2.19 and it was even worse than 6.6.36, because it would run just 1 WU and then idle MW; it wouldn't even make it through the whole batch.

Trying 6.4.7 now and it seems like a happy medium so far. It's keeping MW running along with the 4 CPU projects and keeping the queue full. It does occasionally panic and run MW units at high priority, but nothing is competing for GPU time, so I don't care. I'll update after it's had some time to run and mess up the debts. It's also not incrementing the timer when a unit is "running" but not yet getting any GPU time (because I'm on n1). 6.6.21 was incrementing all the timers, so when it flipped out and had 13 WUs "running" they were showing 30-40 minutes to complete, which was messing up the prediction and queuing (because BOINC thought they were taking 30+ minutes each). It's also more "chatty" than 6.6.36 and 6.6.21 were, in that it's updating the project every 2-3 WUs and getting more, which I like, but that probably adds overhead compared to the way 6.6.x handled it, updating in large batches.
52) Questions and Answers : Windows : Help a Knight out please. (Message 26255)
Posted 22 Jun 2009 by Profile JerWA
Post:
Look, I get to reply to myself now. Thanks for all the help, Cluster. I don't think I would've stumbled onto the permissions problem without it, since the error wasn't even giving a hint.

