79XX Dont Run
Author | Message |
---|---|
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
"It blue screened after my last post" Sure is good to have someone (else) on the bleeding edge :) |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
"on a side note, i've never been able to eliminate invalids altogether on my HD 5870" Looks like from Matt's post above there is still an issue lurking with the app, so it may not be all your problem. "and it never occurred to me that it might be a memory clock (or more appropriately a memory bandwidth issue)" At MW it usually is not a memory clock issue. The bandwidth needs at MW are tiny, so memory bandwidth can be at rock bottom. Over at Moo I run that WU with 2x5970s via app_info with 1 per GPU, and card memory at 175. MW memory needs are way less than Moo's, so 175 is rock solid; take it down there and leave it there, it's not an issue. (The 5970 and 5870 use the same GPU.) ".... yet i wonder if i should be concerned about this and continue to troubleshoot it like i have, or just let it go." Probably let it go, especially given Matt's post above re stderr. However, it's good housekeeping to make sure all other issues are solved, so it's worth testing in a methodical way before accepting the 1% as such. "i increased the memory clock from 600mhz to 700mhz (since underclocking the VRAM too low can also yield adverse effects just like overclocking the VRAM too much can) and let it run for several hours, and it actually increased the number of invalids i usually get. i then tried VRAM at 800mhz, and the number of invalids came back down to about as many as i was getting when VRAM was set to 600mhz. perhaps if i continue to raise the VRAM frequency closer to the stock 1200mhz in 100mhz increments might yield positive results, but for now i've stopped testing..." I suggest as a way ahead, go read the Guru3D review on the 5870, and get to know the inside of the card from the review. In particular note the overclocking session, voltages and results. Then you will know what's possible, and save time by backing off from their values a little as a start point for your test, knowing you're 90% there; they will have done the hard work for you. 
It's worth investing time in Guru3D reviews, they are very very good. http://www.guru3d.com/category/Videocards/ .... and the page on 5870 overclock is : http://www.guru3d.com/article/radeon-hd-5870-review-test/26 You will never reproduce their result as they only test against a game and we hammer the hell out of the GPU with a Compute application, so back off from their end point, and give yourself space. Probably 890 GPU / 175 memory would be a start point. Then step up GPU by increments of 5. Once you get it invalid free or as much as you can, step back five and leave it at that. Difficult to be precise, play the end game by ear as you see it. To get it down to 175 will mean manually editing the profile file; if you don't fancy that, just turn down memory as low as it will go inside CCC, that's fine. Then use CCC up to the point where you can't increase GPU any more, or it falls over anyway. If you are still going when you reach the CCC GPU limits, going further will mean voltage changes. Don't do voltage changes unless you really do know what you're doing on that ..... don't take risks with voltage, it's a fast route to burning a card ..... apologies if you already do know ... I'm just being cautious, hate to see a card burnt, and you not knowing the risks. Stay inside an unmodified CCC with any changes you make, and you will not burn it, can't happen, blue screen maybe, but that goes with the territory, but .... please .... don't over-volt unless you have done it before. Regards Zy |
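The step-up-then-back-off routine described above can be sketched in a few lines of Python. This is only an illustration, not a tool that talks to CCC or Afterburner: `count_invalids` is a hypothetical callback standing in for "run at this clock for a while and count the invalid results by hand", and `ceiling_mhz` is a hypothetical stand-in for whatever limit CCC allows on your card. Treating "step back five" as one step below the highest clean clock is also an interpretation of the advice.

```python
from typing import Callable, Optional


def find_gpu_clock(
    count_invalids: Callable[[int], int],
    start_mhz: int,
    ceiling_mhz: int,
    step_mhz: int = 5,
) -> Optional[int]:
    """Step the GPU clock up until invalids appear, then back off one step.

    count_invalids is a hypothetical callback: it should return how many
    invalid results were seen during a test run at the given clock.
    Returns None if even the starting clock produces invalids.
    """
    last_clean = None
    clock = start_mhz
    while clock <= ceiling_mhz:
        if count_invalids(clock) > 0:
            break  # first clock that produced invalids, stop climbing
        last_clean = clock
        clock += step_mhz
    if last_clean is None:
        return None
    # Back off one step from the highest clean clock for safety margin,
    # but never go below the starting point.
    return max(start_mhz, last_clean - step_mhz)


if __name__ == "__main__":
    # Made-up card for demonstration: invalids start showing at 910 MHz.
    fake_card = lambda mhz: 0 if mhz < 910 else 3
    print(find_gpu_clock(fake_card, start_mhz=890, ceiling_mhz=950))
```

With the made-up card above, the climb passes 890 to 905 cleanly, fails at 910, and settles on 900. In practice each call to `count_invalids` is hours of crunching, so the loop is really a checklist.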
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
thanks for all the tips Zy... i think for now i'm going to hold off on overclocking the GPU clock and focus on underclocking the VRAM. i especially appreciate the tips on using CCC, as i'm just now finding that it'll allow me to take VRAM lower than the minimum of 600mhz allowed by MSI Afterburner (beta version w/ unofficial OCing unlocked of course). i've always just assumed that CCC's range of adjustments would never be broad enough for my needs, and so i've always used MSI Afterburner in its place. i'll start off by seeing how low it'll allow me to take VRAM, and if i'm not happy with the minimum value CCC allows, then i'll worry about manually editing the profile file... *EDIT* - i just realized that the VRAM clock slider tells you the minimum, and 300mhz is as low as CCC will take it. so i'll let her run @ 850mhz/300mhz for now and start a new invalid/error count. |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
"Sure is good to have someone (else) on the bleeding edge :)" I don't mind the bleeding bit ...... it's hemorrhaging that gets tedious .... I leave the latter to extreme overclock lunatics, and those insane freaks who play with LN2. I have never understood the attraction of the latter .... it's so freakin dangerous it's unreal, they are mental, it's the only conclusion I can come up with :) Regards Zy |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
"i've always just assumed that CCC's range of adjustments would never be broad enough for my needs, and so i've always used MSI Afterburner in its place. i'll start off by seeing how low it'll allow me to take VRAM, and if i'm not happy with the minimum value CCC allows, then i'll worry about manually editing the profile file..." If you are into Afterburner ... great.... sounds good to me. Unofficial mode gives the ability to change voltages though .... Satan has a way of rewarding its reckless use :) EDIT: "*EDIT* - i just realized that the VRAM clock slider tells you the minimum, and 300mhz is as low as CCC will take it. so i'll let her run @ 850mhz/300mhz for now and start a new invalid/error count." That's normal. Leave it at 300, it's fine. OK, you will save some heat and power by going to 175 with a manual profile change .... but it can be a pain to redo after driver reloads, and at the end of the day there is not much saving 300->175 to fuss over frankly Regards Zy |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
Sure is good to have someone (else) on the bleeding edge :) Not sure what OCing with LN2 proves, but humans like to push the limits: a game like so many others we play. Similar to top fuel dragsters I guess. |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
"..... Similar to top fuel dragsters ...." They are just barking mad, no question, they all need permanent therapy rofl :) Regards Zy |
Send message Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
I keep my 5830 at 800/500, that is as low as Afterburner will go on the card. Of course the GTX560 is 1620/810. |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Was about to hit the pit .... and did a last check, the completion times were slowly creeping up, albeit only by 2 secs or so, however it was at around the usual timeframe 3hrs(ish). There also had been two invalids. So I've rebooted (at 0330 UTC), put voltage up a notch. Settings are now: 1.224v / 1220 GPU / 1375 Memory / 53 fan, card temps 65 & 66 degrees. Let's hope it's still up when I get up :) It is looking better after that voltage change, because after the first couple of runs settling in, so far, mostly, WUs have been within a few hundredths of a second of each other, and that's more stable than before. It's settled slightly higher at around +/- 45.6 secs per WU. So ... to my pit, fingers crossed :) Regards Zy |
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
"any particular reason you're crunching with the cards in X-fire? seeing as how neither X-fire nor SLI scales perfectly, 2 AMD/nVidia GPUs in X-fire/SLI will never produce twice the performance of one of those GPUs. 2 AMD cards (or nVidia cards) not in X-fire (or SLI) on the other hand will have twice the compute power. i'm assuming you game part of the time and don't want to hassle with regularly enabling and disabling X-fire, or have some other good reason for running those GPUs in X-fire even though its generally counterproductive to GPGPU computing?" I've got two HD5870s in Crossfire and I disabled it after reading this post. Run times are now only about 1 second shorter (from 62 to 61 secs on average). |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Not too much difference overnight, still getting about 2 to 4 per hour average in total for both cards, which is around 0.01% error rate. I backed off to 1210, and am going to leave it at that; could well be the error issue that Matt posted above, don't know. Still, ended up much better than start of the session yesterday, so take it and run as they say :) Settings now 1.218v / 1210 GPU / 1375 Memory / fan 53 Regards Zy |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 1 |
Not too much difference overnight, still getting about 2 to 4 per hour average in total for both cards, which is around 0.01% error rate. Hey I was over on Collatz this morning and they got the 7990 software working! http://boinc.thesonntags.com/collatz/forum_thread.php?id=831 Maybe Matt can talk to Slicker and see what he did. In the users pc's over there it now says: "CAL Tahiti (3072MB) driver: 1.4.1658" for his gpu! |
Send message Joined: 25 Jan 11 Posts: 271 Credit: 346,072,284 RAC: 0 |
any particular reason you're crunching with the cards in X-fire? seeing as how neither X-fire nor SLI scales perfectly, 2 AMD/nVidia GPUs in X-fire/SLI will never produce twice the performance of one of those GPUs. 2 AMD cards (or nVidia cards) not in X-fire (or SLI) on the other hand will have twice the compute power. i'm assuming you game part of the time and don't want to hassle with regularly enabling and disabling X-fire, or have some other good reason for running those GPUs in X-fire even though its generally counterproductive to GPGPU computing? i suppose some DC projects are more adversely affected by crunching in SLI/X-fire than others. plus i've never had more than one 5870 in a single machine to test them in X-fire and then separately. in theory though, there should be an increase in DC productivity if you un-crossfire your GPUs. unlike games, where multiple GPUs must be synchronized via SLI/X-fire in order to contribute to driving the single graphical output of a game, those same GPUs do not have to be in SLI/X-fire in order for DC projects to take full advantage of their compute power (thus allowing perfect scalability of GPU compute power). i suppose the reason your MW@H performance hardly improved when you un-crossfired your GPUs is b/c X-fire wasn't holding DC productivity back that much in the first place. in other words, even though in theory DC should be slower in X-fire b/c X-fire does not scale perfectly, the fact that MW@H still involves crunching through data with massive amounts of ILP (instruction-level parallelism) means that individual tasks are still generally sent to one GPU or the other (and not split somehow between the two), allowing the GPUs to crunch like individuals, even though they're still in X-fire... ...this is all just speculation though. i imagine it would take someone far more knowledgeable than me to help us understand what's really going on here... |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
In the users pc's over there it now says: "CAL Tahiti (3072MB) driver: 1.4.1658" for his gpu! <core_client_version>7.0.12</core_client_version> <stderr_txt> Collatz Conjecture v3.06 for OpenCL Based on the AMD Brook kernels by Gipsel Device 0 Device Vendor Advanced Micro Devices, Inc. Name Tahiti Driver version CAL 1.4.1658 (VM) Version OpenCL 1.1 AMD-APP (851.6) Start 2373716052793973516648 Checking 824,633,720,832 numbers Threads 256 Numbers/Kernel 4,194,304 Kernels/Reduction 256 Numbers/Reduction 1,073,741,824 Reductions/WU 768 Highest Steps 1,829 for 2373716052868342915369 Total Steps 418,663,445,214,577 GPU time 1,082.99 seconds CPU time 0.998406 seconds Total time 1,083.49 seconds Looks like it's (the 7970) about 30% faster than the 5850 and 20% faster than the 5870 at this point with the OpenCl app. Now if someone can get a CAL app going... |
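The figures in that stderr output are internally consistent, and they give a rough throughput number for the 7970 on this one task. The short Python check below only uses values copied from the log above; the variable names are mine, and a single work unit is of course not a benchmark.

```python
# Values copied from the Collatz stderr output quoted above.
numbers_per_kernel = 4_194_304
kernels_per_reduction = 256
reductions_per_wu = 768
total_steps = 418_663_445_214_577
gpu_seconds = 1082.99

# Numbers/Reduction should equal Numbers/Kernel * Kernels/Reduction.
numbers_per_reduction = numbers_per_kernel * kernels_per_reduction

# "Checking N numbers" should equal Numbers/Reduction * Reductions/WU.
numbers_checked = numbers_per_reduction * reductions_per_wu

# Derived figures (not printed by the app itself).
numbers_per_second = numbers_checked / gpu_seconds
average_steps = total_steps / numbers_checked

print(f"numbers per reduction: {numbers_per_reduction:,}")
print(f"numbers checked:       {numbers_checked:,}")
print(f"throughput:            {numbers_per_second / 1e6:.1f} million numbers/s")
print(f"average steps/number:  {average_steps:.1f}")
```

Both products match the log (1,073,741,824 per reduction and 824,633,720,832 per WU), and the throughput works out to a little over 760 million numbers per second of GPU time.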
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Going to try an extended run with crossfire @1210/1375 1.2v .... I know what should happen, quite a bit slower, but it's claimed they have improved it, let's see .... Regards Zy |
Send message Joined: 1 Sep 08 Posts: 204 Credit: 219,354,537 RAC: 0 |
@Crossfire: does that matter at all? It's for games and only used if AMD gives you a profile for the game. This should have nothing to do with GP-GPU, where you usually run 1 program per GPU (unless specified otherwise). If XFire was working as intended you'd have to see halved times per WU, but only running 1 WU at a time per device, compared to several WUs at once in the normal mode. @Testing: with the recent server upgrade and insta-purge being gone we've finally got a good method of testing stability: at the bottom of "show tasks for your computer" it shows valid results, invalids and errors. For me about 3000 tasks are kept in this record, which is statistically relevant. At about 1600 WUs/day it takes 2 days for this statistic to update completely after I change something. I suggest you guys use this instead of "only 1 or 2 errors in x hours". That number is too small to judge stability. On my Cayman I observed a failure rate of 0.7 % at 900 MHz @ 1.10 V. That's what I settled at in painful hand-tuning. Now I increased the GPU clock to 905 MHz and observe an increased failure rate of 1.4 %. That's a significant increase and certainly not worth it. Trying 1.11 V now. MrS Scanning for our furry friends since Jan 2002 |
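To put a number on "statistically relevant", here is a small Python sketch using a standard two-proportion z-test. It is not necessarily how the 0.7 % and 1.4 % figures were judged in the post; it assumes both rates were each measured over a full window of roughly 3000 tasks (so about 21 and 42 failures), and it uses the normal approximation, which is rough for counts this small.

```python
import math


def two_proportion_z(fail_a: int, n_a: int, fail_b: int, n_b: int) -> float:
    """z statistic for the difference between two failure rates (pooled SE)."""
    p_a = fail_a / n_a
    p_b = fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


# Assumed sample sizes: ~3000 tasks kept in the task list for each setting.
tasks_kept = 3000
fails_900mhz = round(0.007 * tasks_kept)  # 0.7 % -> 21 failures
fails_905mhz = round(0.014 * tasks_kept)  # 1.4 % -> 42 failures

z_big = two_proportion_z(fails_900mhz, tasks_kept, fails_905mhz, tasks_kept)

# Same rates, but judged from only ~150 tasks each ("1 or 2 errors in x hours").
z_small = two_proportion_z(1, 150, 2, 150)

# How long the task list takes to turn over at the quoted throughput.
days_to_refresh = tasks_kept / 1600

print(f"z with 3000 tasks each: {z_big:.2f}")
print(f"z with 150 tasks each:  {z_small:.2f}")
print(f"days to refresh list:   {days_to_refresh:.2f}")
```

Under those assumptions the 3000-task comparison clears the usual 1.96 threshold, while the same doubling seen in a sample of 150 tasks does not come close, which is the point about "1 or 2 errors" being too few to judge by.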
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
Matt, are there any clues yet, as such, as to how many of the E Truncated errors may in fact be valids underneath it all? At present I have the 2x7970 cards stable as such, just playing with preferred voltage level 1.218 to 1.23. Where it stays obviously depends on the invalid rate being at or near zero. That's difficult to judge whilst still having to take into account the E Truncation, as it's not known, on the face of it, how many of those are actually invalids, and how many are valids masked by that error. At present I am working on the worst case, where the majority are genuine invalids masked by the E Truncation error. What's your gut feeling as to the proportion of genuine invalids inside those affected by E Truncation? To a degree it's a bit like picking the lottery numbers, I am aware, but a gut feeling (however that turns out, I know it's impossible to gauge accurately) would be helpful. Regards Zy |
Send message Joined: 29 Aug 07 Posts: 486 Credit: 576,548,171 RAC: 0 |
You can find the applications right here on Arkayn´s page: http://www.arkayn.us/forum/index.php?action=downloads;cat=11 This Thread has gotten so long it's a Nightmare to try & figure out how to get the WUs to run. Anyway I must be doing something wrong, I just get WUs that are going to take over 2 Hrs to run ??? They say they're ATI WUs though ... ??? STE\/E |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
app_info is fine. Running correctly ..... Task Manager should show one instance of the openCL.exe per WU, the name column in BAM should show ps_separation_82 .... , and the application column should show "local: milkyway 0.82 (ati14ati)" Did you restart BAM after putting the app_info into the project directory? Once the app_info is there, and the names above check out, it's then a case of what is wrong at your end re setup. Suggest you start by setting cache to zero to prevent trashing, and in CCC whack it up to 1125 GPU, 20% power. Once all that's checked out, there's not much else as such, it will need head scratching on the hardware config at your end. Initially they will show long completion times due to the classic BOINC counting, other than that they are ok, showing up the normal stuff fine. As always, given a couple of dozen running through, they settle down to realistic times. Yell with any symptoms, I'll try and scratch the ol' brain. EDIT: ahhh, what BAM version are you running? You need a 7.XX, suggest 7.0.8 to start with. Don't update to AMD 12.1, which does not yet support 7970s; stay with the release day RC drivers. If you need new AMD drivers, I will put them up in my webspace for you to grab. Regards Zy |
Send message Joined: 29 Aug 07 Posts: 486 Credit: 576,548,171 RAC: 0 |
Running 7.0.3 Client, 12.1 Drivers, what are release day RC drivers ??? The Wu's ran over 2 Minutes & hadn't shown any signs of Progression ... What is the app_info I should be using ??? Any other files I need ??? Thanks STE\/E |
©2024 Astroinformatics Group