ATI application updated to 0.60

Author	Message
ExtraTerrestrial Apes Send message Joined: 1 Sep 08 Posts: 204 Credit: 219,354,537 RAC: 0	Message 47922 - Posted: 16 Apr 2011, 12:33:26 UTC Update to my strange ~30s screen update lag: it disappeared after a reboot.. doh MrS Scanning for our furry friends since Jan 2002 ID: 47922 · Rating: 0 · rate: / Reply Quote

[TA]Assimilator1 Send message Joined: 22 Jan 11 Posts: 377 Credit: 64,707,164 RAC: 0	Message 47925 - Posted: 16 Apr 2011, 17:21:59 UTC Last modified: 16 Apr 2011, 17:24:34 UTC So how do I set the polling mode &/or target frequency then?? (couldn't find an FAQ about it). My rig specs:- Q6600 @3.25 GHz (running F@H SMP mode) HD4830 (GPU @600MHz) Win XP SP3 32bit Was Cat 10.12 now on 11.3 Since recent MW updates I'm getting severe GUI lag making the machine un-useable :( (whereas before it was moderately laggy but useable if I didn't have to scroll much). Driver change made no difference. Hopefully the options mentioned above can sort this out :). ID: 47925 · Rating: 0 · rate: / Reply Quote

Len LE/GE Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0	Message 47930 - Posted: 16 Apr 2011, 20:45:05 UTC target frequency - into how many packets per second the data is split polling mode - how many milliseconds the cpu 'sleeps' before it is checking and waiting for data back from the gpu defaults are target frequency 30 and polling mode 1 There are still some problems with the settings and Matt said he will look into it coming week. If you want to reduce the gui lag you should increase the frequency, to reduce a lag of your system you need to increase the polling. In theory you can calculate 1000 / frequency = polling Take it as a rough calculation and add a little to frequency or polling, the higher you go in frequency the more you have to add. ID: 47930 · Rating: 0 · rate: / Reply Quote
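The rule of thumb above (1000 / frequency = polling, then add a little) can be tried out with a short Python sketch. The function name and the default 10% margin are illustrative assumptions, not values taken from the MilkyWay application; pick the margin that works on your own machine.

```python
def suggested_polling_ms(target_frequency: float, margin: float = 0.10) -> float:
    """Rough polling value in milliseconds for a given target frequency.

    Follows the rule of thumb 1000 / frequency = polling and then adds a
    margin on top. The 10% default margin is an assumption made for this
    sketch, not a project recommendation.
    """
    if target_frequency <= 0:
        raise ValueError("target_frequency must be positive")
    if margin < 0:
        raise ValueError("margin must not be negative")
    base_ms = 1000.0 / target_frequency  # time slice per packet in ms
    return base_ms * (1.0 + margin)


if __name__ == "__main__":
    for frequency in (30, 60, 120):
        polling = suggested_polling_ms(frequency)
        print(f"frequency {frequency:>3}: polling ~ {polling:.1f} ms")
```

Treat the result as a starting point only; as noted above, the settings still had known problems at the time of this post.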

kashi Send message Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0	Message 47949 - Posted: 17 Apr 2011, 4:29:28 UTC Last modified: 17 Apr 2011, 5:12:57 UTC Still getting "Completed, can't validate" invalids. This is due to wingmen exceeding the maximum number of errors of 3. Some of these errors are due to ATI GPUs using older Catalyst drivers that are not compatible with the new application. Others are from CPUs which either do not work with the new application or are using old optimised applications that will no longer work without a parameter file. These computers are trashing a lot of tasks and some of the owners are just letting them run. Here are 4 such computers from the last "Completed, can't validate" I had before it was rapidly cleared from the database: hostid=264221 hostid=200293 hostid=102667 hostid=211500 Hostid 102667 in particular currently has over 7,000 tasks listed. It is still using v0.18 speedimic_sse3_64 As I requested a week ago, could the maximum number of errors be increased from 3 please until the number of these computers producing errors decreases or they are blocked from receiving work if they return many invalid results. ID: 47949 · Rating: 0 · rate: / Reply Quote

Zydor Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0	Message 47950 - Posted: 17 Apr 2011, 7:03:07 UTC The problem can be virtually killed off at source by implementing the Server-Choke to stop runaways Regards Zy ID: 47950 · Rating: 0 · rate: / Reply Quote

ExtraTerrestrial Apes Send message Joined: 1 Sep 08 Posts: 204 Credit: 219,354,537 RAC: 0	Message 47953 - Posted: 17 Apr 2011, 12:13:18 UTC - in response to Message 47925.
@Assimilator1: it should get better if you:
- report all finished MW tasks (assuming you're running 0.62 already)
- stop your BOINC
- place a file called "app_info.xml" in your MW folder with the following content:

<app_info>
    <app>
        <name>milkyway</name>
    </app>
    <file_info>
        <name>milkyway_0.62_windows_intelx86__ati14.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway</app_name>
        <version_num>62</version_num>
        <plan_class>ati14</plan_class>
        <flops>1.0e11</flops>
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>1</max_ncpus>
        <coproc>
            <type>ATI</type>
            <count>0.5</count>
        </coproc>
        <cmdline>--gpu-target-frequency 60</cmdline>
        <file_ref>
            <file_name>milkyway_0.62_windows_intelx86__ati14.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>

This will increase the target screen refresh rate from 30 Hz to 60 Hz, which in my case is enough to make it almost smooth again. You can also adjust "<count>0.5</count>" to "<count>1</count>" if you want to run 1 WU at a time rather than 2 WUs at a time (as I'm currently doing to avoid the idle GPU time between WUs). I don't expect you'd need to adjust polling with this.
MrS Scanning for our furry friends since Jan 2002 ID: 47953 · Rating: 0 · rate: / Reply Quote
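Before restarting BOINC it can help to check that the file is well-formed and says what you meant. The following Python sketch parses a fragment of the app_info.xml from the post above using only the standard library; the helper name `summarize` is hypothetical, and reading "tasks per GPU" as 1 / count is based on the description above (count 0.5 runs 2 WUs at a time).

```python
import xml.etree.ElementTree as ET

# Fragment of the app_info.xml shown above (only the parts inspected here).
APP_INFO = """
<app_info>
    <app_version>
        <app_name>milkyway</app_name>
        <version_num>62</version_num>
        <coproc>
            <type>ATI</type>
            <count>0.5</count>
        </coproc>
        <cmdline>--gpu-target-frequency 60</cmdline>
    </app_version>
</app_info>
"""


def summarize(app_info_xml: str) -> dict:
    """Parse app_info XML and report GPU share and command-line flags."""
    root = ET.fromstring(app_info_xml)  # raises ParseError if malformed
    version = root.find("app_version")
    if version is None:
        raise ValueError("no <app_version> element found")
    count_text = version.findtext("coproc/count")
    if count_text is None:
        raise ValueError("no <coproc><count> element found")
    count = float(count_text)
    if count <= 0:
        raise ValueError("<count> must be positive")
    cmdline = version.findtext("cmdline", default="")
    return {"tasks_per_gpu": 1.0 / count, "cmdline": cmdline.split()}


if __name__ == "__main__":
    print(summarize(APP_INFO))
```

To check a real file, read it from disk and pass its contents to `summarize` instead of the embedded string.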

Beyond Send message Joined: 15 Jul 08 Posts: 384 Credit: 738,255,290 RAC: 32,715	Message 47954 - Posted: 17 Apr 2011, 12:27:42 UTC - in response to Message 47949. Still getting "Completed, can't validate" invalids. This is due to wingmen exceeding the maximum number of errors of 3. Some of these errors are due to ATI GPUs using older Catalyst drivers that are not compatible with the new application. Others are from CPUs which either do not work with the new application or are using old optimised applications that will no longer work without a parameter file. These computers are trashing a lot of tasks and some of the owners are just letting them run. Here are 4 such computers from the last "Completed, can't validate" I had before it was rapidly cleared from the database: hostid=264221 hostid=200293 hostid=102667 hostid=211500 Hostid 102667 in particular currently has over 7,000 tasks listed. It is still using v0.18 speedimic_sse3_64 As I requested a week ago, could the maximum number of errors be increased from 3 please until the number of these computers producing errors decreases or they are blocked from receiving work if they return many invalid results. I started a thread a couple days ago about this problem: http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2361 People have posted that their RAC has gone down about 20-25%. Mine's down about the same. The client speed loss accounts for only about 5% of this. The rest I believe is (as you point out) due to good results being marked invalid because of the multitude of clients spewing out bad results at about 1 per second, causing valid results to be discarded as the WUs are marked as bad due to too many invalids. Sure would be nice (and should be easy) to resolve this. ID: 47954 · Rating: 0 · rate: / Reply Quote

Starfire Send message Joined: 19 Feb 09 Posts: 32 Credit: 32,843,308 RAC: 0	Message 47958 - Posted: 17 Apr 2011, 12:53:13 UTC - in response to Message 47878. The latest version (0.62) is running very smooth for me (Win7_64 Catalyst 11.3) - thanks for getting it to work this fast :) Yesterday I noticed a validate error - unfortunately the task is already gone from the account overview. Run time for the task was within normal boundaries compared to my other tasks. In the result the stderr was empty. If I remember it looked something like this: <core_client_version>6.12.22</core_client_version> <![CDATA[ ]]> It wasn't much more than that - so the whole output was somehow missing. I only noticed it at this one task, but with completed tasks being removed very quickly it's hard to be sure. Yes I just noticed one invalid the same: Stderr output <core_client_version>6.12.22</core_client_version> <![CDATA[ <stderr_txt> </stderr_txt> ]]> Hope I haven't been getting many of these. They clear quickly so could have had a few. Win 7 64, HD 5970, Cat 11.4. Just noticed another one: 7320292 <core_client_version>6.12.22</core_client_version> <![CDATA[ <stderr_txt> </stderr_txt> ]]> Starfire ID: 47958 · Rating: 0 · rate: / Reply Quote

ExtraTerrestrial Apes Send message Joined: 1 Sep 08 Posts: 204 Credit: 219,354,537 RAC: 0	Message 47974 - Posted: 17 Apr 2011, 22:22:13 UTC - in response to Message 47922. Update to my strange ~30s screen update lag: it disappeared after a reboot.. doh MrS And re-appeared after approximately 1 day of run time. Switching to Win 7 classic lets me move the mouse during the break but everything else is still on hold. Switching to 1 WU at a time solves the problem. Waiting for that CPU time optimization ;) MrS Scanning for our furry friends since Jan 2002 ID: 47974 · Rating: 0 · rate: / Reply Quote

RadonPL Send message Joined: 10 Aug 10 Posts: 4 Credit: 57,345,700 RAC: 0	Message 48045 - Posted: 19 Apr 2011, 15:13:09 UTC I just received a bunch of 0.62 WUs for Linux, graphics are very smooth now and 1 WU finishes in 2 minutes flat. Thanks! OpenSUSE 11.4 AMD64 with 5850 ID: 48045 · Rating: 0 · rate: / Reply Quote

Len LE/GE Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0	Message 48085 - Posted: 20 Apr 2011, 21:48:19 UTC - in response to Message 47863.
It looks like I might need to do some work on the time estimates for preventing lag on different GPUs
I tried an analysis myself, based on different WUs and settings.
How it should work:
frequency 30 -> 1000ms / 30 blocks => max 33.3333ms per block.
gpu flops per second / 30 blocks => max gpu flops per block.
Round those max values down a little and you can calculate the size of the parts to send to the gpu and how long the cpu should sleep (a positive polling value can be used to add to this sleep time).
So what is not working here? After testing with different params on different WUs several times, there seem to be 2 patterns:
1) Based on the estimated iteration time, the number of chunks is calculated 1 too low. I think I found it in the code of deviceChunkEstimate. It looks like a simple mistake: the type conversion to integer cuts off the fractional part. Adding 1.0 should fix this (and saves the test for zero), since it is very unlikely that the division (estimated time / time allowed per chunk) will have no fractional part.
2) The estimated iteration time is always too low. The difference to the average iteration time increases with a higher number of chunks. I expect that high numbers of small chunks will kill the average through the many data transfers.
WUs with 152.839606 ms estimated iteration time:
1 chunk(s): 6.0 - 6.4 ms too low
2 chunk(s): 7.3 - 7.5 ms too low
4 chunk(s): 8.6 - 9.1 ms too low
8 chunk(s): 11.9 - 12.8 ms too low
This looks like 1 fixed time offset + 1 per chunk (along the lines of 5.5ms + 0.9ms * chunks, might be more complicated for an exact calculation for different gpus and clocks).
The shorter ones with 109.923835 ms estimated iteration time are off even more:
1 chunk: 9.62 - 10.1 ms (got too few WUs for enough tests on higher numbers of chunks)
(The difference to the average iteration time seems to increase with lower gpu clock too, but I haven't tested this one enough to prove it.)
Polling mode
The way the polling mode works right now makes it impossible to find a good value. Setting it to work best for WUs with high estimated time leads to low gpu utilization for WUs with low estimated time. Setting it for the faster WUs will lead to high polling and more overhead for the slower WUs.
Keep polling mode < 0 like it is. Polling mode >= 0 should add its value to a calculated sleep time slightly below the time per chunk. This way the fine tuning for a machine is far easier and independent of the WU size.
ID: 48085 · Rating: 0 · rate: / Reply Quote
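The two patterns in the analysis above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the function names are made up for illustration and are not the code of deviceChunkEstimate, and the overhead model uses the rough 5.5 ms + 0.9 ms per chunk figures quoted in the post.

```python
def chunks_truncated(estimated_ms: float, target_frequency: float = 30.0) -> int:
    """Chunk count when the division is simply truncated to an integer."""
    slice_ms = 1000.0 / target_frequency  # time allowed per chunk
    return int(estimated_ms / slice_ms)


def chunks_plus_one(estimated_ms: float, target_frequency: float = 30.0) -> int:
    """Chunk count with the suggested fix: truncate, then add one."""
    return chunks_truncated(estimated_ms, target_frequency) + 1


def modelled_shortfall_ms(chunks: int) -> float:
    """Rough model from the post: fixed offset plus a per-chunk cost."""
    return 5.5 + 0.9 * chunks


if __name__ == "__main__":
    slice_ms = 1000.0 / 30.0
    for estimated in (152.839606, 109.923835):
        low = chunks_truncated(estimated)
        fixed = chunks_plus_one(estimated)
        print(f"estimate {estimated:.1f} ms, slice {slice_ms:.1f} ms")
        print(f"  truncated: {low} chunks -> {estimated / low:.1f} ms each")
        print(f"  plus one:  {fixed} chunks -> {estimated / fixed:.1f} ms each")
    for chunks in (1, 2, 4, 8):
        print(f"{chunks} chunk(s): modelled shortfall {modelled_shortfall_ms(chunks):.1f} ms")
```

With truncation each chunk ends up longer than the allowed slice, while adding one brings it back under the limit; the linear model lands inside each of the measured ranges listed for the 152.839606 ms WUs.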

Mad Matt Send message Joined: 19 Sep 09 Posts: 16 Credit: 218,390,676 RAC: 0	Message 48090 - Posted: 21 Apr 2011, 18:25:55 UTC - in response to Message 47953.
@Assimilator1: it should get better if you:
- report all finished MW tasks (assuming you're running 0.62 already)
- stop your BOINC
- place a file called "app_info.xml" in your MW folder with the following content:

<app_info>
    <app>
        <name>milkyway</name>
    </app>
    <file_info>
        <name>milkyway_0.62_windows_intelx86__ati14.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway</app_name>
        <version_num>62</version_num>
        <plan_class>ati14</plan_class>
        <flops>1.0e11</flops>
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>1</max_ncpus>
        <coproc>
            <type>ATI</type>
            <count>0.5</count>
        </coproc>
        <cmdline>--gpu-target-frequency 60</cmdline>
        <file_ref>
            <file_name>milkyway_0.62_windows_intelx86__ati14.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>

This will increase the target screen refresh rate from 30 Hz to 60 Hz, which in my case is enough to make it almost smooth again. You can also adjust "<count>0.5</count>" to "<count>1</count>" if you want to run 1 WU at a time rather than 2 WUs at a time (as I'm currently doing to avoid the idle GPU time between WUs). I don't expect you'd need to adjust polling with this.
MrS
Cheers, ETA. I tried this app_info, but in my case it did not really help. I found some odd values for tolerable lag:
4770: 2 WUs at a time
5970: 3-4 WUs (does not make a huge difference)
5870: 4 WUs (2 and 3 WUs can show massive lag).
ID: 48090 · Rating: 0 · rate: / Reply Quote

[TA]Assimilator1 Send message Joined: 22 Jan 11 Posts: 377 Credit: 64,707,164 RAC: 0	Message 48091 - Posted: 21 Apr 2011, 18:27:45 UTC - in response to Message 47930. target frequency - into how many packets per second the data is split polling mode - how many milliseconds the cpu 'sleeps' before it is checking and waiting for data back from the gpu defaults are target frequency 30 and polling mode 1 There are still some problems with the settings and Matt said he will look into it coming week. If you want to reduce the gui lag you should increase the frequency, to reduce a lag of your system you need to increase the polling. In theory you can calculate 1000 / frequency = polling Take it as a rough calculation and add a little to frequency or polling, the higher you go in frequency the more you have to add. Err thanks Len but I still don't know where to change those options as I said earlier ;). ExtraTerrestrial Apes Thanks, that should sort things out :) ID: 48091 · Rating: 0 · rate: / Reply Quote

[TA]Assimilator1 Send message Joined: 22 Jan 11 Posts: 377 Credit: 64,707,164 RAC: 0	Message 48124 - Posted: 22 Apr 2011, 15:28:18 UTC Wot no editing?? ETA >>>You can also adjust "<count>0.5</count>" to "<count>1</count>" if you want to run 1 WU at a time rather than 2 WUs at a time (as I'm currently doing to avoid the idle GPU time between WUs). I don't expect you'd need to adjust polling with this.<<< My count setting was already on 1, why would you want to run 2 at a time? It switches from 1 WU to the next nearly instantly anytime I've watched it ...... ID: 48124 · Rating: 0 · rate: / Reply Quote

Sunny129 Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0	Message 48126 - Posted: 22 Apr 2011, 16:06:02 UTC - in response to Message 48124. actually, for many of us, there appears to be some dead time between tasks when running one at a time (it's not much, only on the order of seconds). but some folks feel that the extra few percentage points increase in production/crunching efficiency is worth running 2 tasks at a time, as it does seem to minimize that small (but still significant to some) pause between tasks. i know it's a bit of a pain at this point b/c this thread is so long, but the benefits of running 2 tasks at a time are discussed somewhere above. i know the app_info file i posted for you on the AT forums contains <count>1</count>, but i actually run 2 tasks at a time as well. ID: 48126 · Rating: 0 · rate: / Reply Quote

Mad Matt Send message Joined: 19 Sep 09 Posts: 16 Credit: 218,390,676 RAC: 0	Message 48207 - Posted: 24 Apr 2011, 19:47:59 UTC Not sure what was happening now, but my output has gone down the drain, most likely because of inconclusive validations and also a good number of invalid validations. Is this linked to optimized apps settings or are there stock users concerned as well? ID: 48207 · Rating: 0 · rate: / Reply Quote

Beyond Send message Joined: 15 Jul 08 Posts: 384 Credit: 738,255,290 RAC: 32,715	Message 48212 - Posted: 25 Apr 2011, 3:29:17 UTC - in response to Message 48207. Not sure what was happening now, but my output has gone down the drain, most likely because of inconclusive validations and also a good number of invalid validations. Is this linked to optimized apps settings or are there stock users concerned as well? Yes, many. See this thread: http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2361 ID: 48212 · Rating: 0 · rate: / Reply Quote

[TA]Assimilator1 Send message Joined: 22 Jan 11 Posts: 377 Credit: 64,707,164 RAC: 0	Message 48216 - Posted: 25 Apr 2011, 12:38:41 UTC - in response to Message 48212. I've had enough of the massive lag that MW causes me, & since I also found out that about the same time (ish,11/4) as the MW update my F@H output has nose dived I've stopped running MW, at least until I can find settings that counter those problems. There's not really a clear explanation of what either the GPU target freq does nor the GPU polling mode. Like wth does 'busy waiting' refer to??? Nevertheless I did try increasing the target freq to 40, which stopped it crunching!, 60,90 & 120 all of which had no significant impact, still huge GUI lag, although 120 helped very slightly. I'll try increasing the polling mode, but again there is no guidance on how much I should increase it, in whole numbers or tenths?? Sunny Got ya :), probably my HD 4830 isn't quick enough to worry about that ;), plus I get enough lag running just 1 WU. ID: 48216 · Rating: 0 · rate: / Reply Quote

JAMC Send message Joined: 9 Sep 08 Posts: 96 Credit: 336,443,946 RAC: 0	Message 48217 - Posted: 25 Apr 2011, 13:11:39 UTC - in response to Message 48207. Last modified: 25 Apr 2011, 13:13:51 UTC Not sure what was happening now, but my output has gone down the drain, most likely because of inconclusive validations and also a good number of invalid validations. Is this linked to optimized apps settings or are there stock users concerned as well? My theoretical daily RAC for a 5870 on W7_64 here is now ~202,243cr, which is near as makes no difference to its RAC at PG ~202,663 (coincidence?), and you can actually buffer an entire day of work there, so I am easing back to PG for now... ID: 48217 · Rating: 0 · rate: / Reply Quote

Matt Arsenault Volunteer moderator Project developer Project tester Project scientist Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0	Message 48219 - Posted: 25 Apr 2011, 13:55:21 UTC - in response to Message 48216.
I've had enough of the massive lag that MW causes me, & since I also found out that about the same time (ish,11/4) as the MW update my F@H output has nose dived I've stopped running MW, at least until I can find settings that counter those problems. There's not really a clear explanation of what either the GPU target freq does nor the GPU polling mode. Like wth does 'busy waiting' refer to???
The target frequency influences how large the work sent to the GPU at once is. A higher value reduces it, which decreases lag. The frequency number controls how long it attempts to make the GPU tasks. The default of 30 tries to reach (1000 (milliseconds / second) / 30) = 33.3ms for the maximum time the GPU will take, which should allow a chance for the screen to redraw at ~30fps. The flag is meant as a fallback if the time estimates aren't good for your GPU. It looks like the time estimates you're getting are a bit low for what you actually get. I'm going to do work on the time estimates (particularly for non-5xxx GPUs) before the next release. Ideally you shouldn't need to set anything. Doubling it should theoretically mean half the lag. In another thread, someone set it to 300 to reduce their GPU usage to ~10%.
Busy waiting is constantly checking if the GPU is done, which causes 100% CPU usage but is least likely to waste extra time waiting after a packet is done on the GPU.
Nevertheless I did try increasing the target freq to 40, which stopped it crunching!, 60,90 & 120 all of which had no significant impact, still huge GUI lag, although 120 helped very slightly. I'll try increasing the polling mode, but again there is no guidance on how much I should increase it, in whole numbers or tenths??
Sunny Got ya :), probably my HD 4830 isn't quick enough to worry about that ;), plus I get enough lag running just 1 WU. ID: 48219 · Rating: 0 · rate: / Reply Quote
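The trade-off between busy waiting and sleeping between checks, as described in the post above, can be illustrated with a small simulation. It does not talk to a GPU. The function name is hypothetical, and the 10 microsecond cost of one busy-wait check is an assumption made only so the numbers can be compared.

```python
from typing import Tuple


def simulate_wait(job_us: int, check_interval_us: int) -> Tuple[int, int]:
    """Simulate polling for a GPU packet that takes job_us microseconds.

    The CPU checks once every check_interval_us microseconds. Returns the
    number of checks made and the microseconds wasted between the packet
    finishing and the CPU noticing.
    """
    if job_us <= 0 or check_interval_us <= 0:
        raise ValueError("both durations must be positive")
    checks = -(-job_us // check_interval_us)  # ceiling division
    wasted_us = checks * check_interval_us - job_us
    return checks, wasted_us


if __name__ == "__main__":
    job_us = 33_333  # one packet at the default target frequency of 30
    for label, interval_us in (
        ("busy waiting (assumed 10 us per check)", 10),
        ("sleep 1 ms between checks", 1_000),
        ("sleep 30 ms between checks", 30_000),
    ):
        checks, wasted = simulate_wait(job_us, interval_us)
        print(f"{label}: {checks} checks, {wasted} us wasted")
```

Short intervals notice the finished packet almost immediately but keep the CPU checking constantly; long sleeps free the CPU but can leave the GPU idle for most of a time slice.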