Welcome to MilkyWay@home

79XX Dont Run

Message boards : Number crunching : 79XX Dont Run

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 52697 - Posted: 28 Jan 2012, 1:25:27 UTC - in response to Message 52696.  

It blue screened after my last post

Sure is good to have someone (else) on the bleeding edge :)

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52698 - Posted: 28 Jan 2012, 1:48:42 UTC - in response to Message 52692.  
Last modified: 28 Jan 2012, 1:53:01 UTC

on a side note, i've never been able to eliminate invalids altogether on my HD 5870


Looks like from Matt's post above there is an issue lurking still with the app, so it may not be all your problem.

and it never occurred to me that it might be a memory clock (or more appropriately a memory bandwidth issue)


At MW it usually is not a memory clock issue. The bandwidth needs at MW are tiny, so the memory clock can sit at rock bottom. Over at Moo I run that WU with 2x5970s via app_info with 1 per GPU, and card memory at 175 MHz. MW memory needs are way less than Moo's, so 175 is rock solid: take it down there and leave it there, it's not an issue. (The 5970 and 5870 use the same GPU.)

. .... yet i wonder if i should be concerned about this and continue to troubleshoot it like i have, or just let it go.


Probably let it go, especially given Matt's post above re stderr. However, its good housekeeping to make sure all other issues are solved, so worth testing out in a methodical way before accepting the 1% as such.

i increased the memory clock from 600mhz to 700mhz (since underclocking the VRAM too low can also yield adverse affects just like overclocking the VRAM too much can) and let it run for several hours, and it actually increased the number of invalids i usually get. i then tried VRAM at 800mhz, and the number of invalids came back down to about as many as i was getting when VRAM was set to 600mhz. perhaps if i continue to raise the VRAM frequency closer to the stock 1200mhz in 100mhz increments might yield positive results, but for now i've stopped testing...


I suggest as a way ahead, go read the Guru3D review on the 5870, and get to know the inside of the card from the review. In particular note the overclocking session, voltages and results. Then you will know what's possible, and save time by backing off from their values a little as a start point for your test, knowing you're 90% there. They will have done the hard work for you. It's worth investing time in Guru3D reviews, they are very very good.

http://www.guru3d.com/category/Videocards/

.... and the page on 5870 overclock is :
http://www.guru3d.com/article/radeon-hd-5870-review-test/26

You will never reproduce their result, as they only test against a game and we hammer the hell out of the GPU with a compute application, so back off from their end point and give yourself space. Probably 890 MHz GPU / 175 MHz memory would be a start point. Then step up the GPU in increments of 5 MHz. Once you have found the highest clock that is invalid free, or as near as you can get, step back five and leave it at that. Difficult to be precise, play the end game by ear as you see it.
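The stepping routine described above can be sketched roughly as follows. This is only an illustration: `measure_invalid_rate` is a hypothetical callback (you would apply the clock by hand in CCC, let a batch of WUs run, and report the invalid fraction), and the 1000 MHz ceiling is an assumed placeholder, not a real CCC limit.

```python
from typing import Callable


def find_stable_clock(
    measure_invalid_rate: Callable[[int], float],
    start_mhz: int = 890,
    step_mhz: int = 5,
    max_mhz: int = 1000,  # assumed ceiling, adjust to your own slider limit
    tolerance: float = 0.0,
) -> int:
    """Step the GPU clock up until invalids appear, then back off one step.

    measure_invalid_rate is a hypothetical callback: it should return the
    fraction of WUs that came back invalid at the given clock. Nothing in
    this function talks to CCC or the driver.
    """
    clock = start_mhz
    if measure_invalid_rate(clock) > tolerance:
        # Even the starting point is unstable, so start lower instead.
        raise ValueError(f"{clock} MHz is already producing invalids")

    while clock + step_mhz <= max_mhz:
        candidate = clock + step_mhz
        if measure_invalid_rate(candidate) > tolerance:
            break  # candidate is unstable, keep the last clean clock
        clock = candidate

    # Leave one step of headroom below the last clean clock.
    return max(start_mhz, clock - step_mhz)


if __name__ == "__main__":
    # Fake card for demonstration: clean up to 905 MHz, invalids above that.
    fake_card = lambda mhz: 0.0 if mhz <= 905 else 0.02
    print(find_stable_clock(fake_card))
```

With the fake card in the demo, the last clean clock is 905 MHz, so the function settles one step lower. In practice each measurement means hours of crunching, so the loop is really a checklist rather than something to automate.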

To get it down to 175 will mean manually editing the profile file. If you don't fancy that, just turn memory down as low as it will go inside CCC, that's fine. Then use CCC up to the point where you can't increase the GPU clock any more, or it falls over anyway. If you are still going when you reach the CCC GPU limit, going further will mean voltage changes. Don't do voltage changes unless you really do know what you're doing ..... don't take risks with voltage, it's a fast route to burning a card ..... apologies if you already do know ... I'm just being cautious, hate to see a card burnt, and you not knowing the risks.

Stay inside an unmodified CCC with any changes you make and you will not burn it, can't happen. Blue screen maybe, but that goes with the territory .... but please .... don't over-volt unless you have done it before.

Regards
Zy

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 52699 - Posted: 28 Jan 2012, 2:05:37 UTC
Last modified: 28 Jan 2012, 2:08:51 UTC

thanks for all the tips Zy...

i think for now i'm going to hold off on overclocking the GPU clock and focus on underclocking the VRAM. i especially appreciate the tips on using CCC, as i'm just now finding that it'll allow me to take VRAM lower than the minimum of 600mhz allowed by MSI Afterburner (beta version w/ unofficial OCing unlocked of course). i've always just assumed that CCC's range of adjustments would never be broad enough for my needs, and so i've always used MSI Afterburner in its place. i'll start off by seeing how low it'll allow me to take VRAM, and if i'm not happy with the minimum value CCC allows, then i'll worry about manually editing the profile file...

*EDIT* - i just realized that the VRAM clock slider tells you the minimum, and 300mhz is as low as CCC will take it. so i'll let her run @ 850mhz/300mhz for now and start a new invalid/error count.

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52700 - Posted: 28 Jan 2012, 2:05:52 UTC

Sure is good to have someone (else) on the bleeding edge :)


I dont mind the bleeding bit ...... its hemorrhaging that gets tedious .... I leave the latter to extreme overclock lunatics, and those insane freaks who play with LN2.

I have never understood the attraction of the latter .... its so freakin dangerous its unreal, they are mental, its the only conclusion I can come up with :)

Regards
Zy

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52701 - Posted: 28 Jan 2012, 2:09:22 UTC
Last modified: 28 Jan 2012, 2:14:54 UTC

i've always just assumed that CCC's range of adjustments would never be broad enough for my needs, and so i've always used MSI Afterburner in its place. i'll start off by seeing how low it'll allow me to take VRAM, and if i'm not happy with the minimum value CCC allows, then i'll worry about manually editing the profile file...


If you are into Afterburner ... great.... sounds good to me. Unofficial mode gives the ability to change voltages though .... Satan has a way of rewarding its reckless use :)

EDIT:
*EDIT* - i just realized that the VRAM clock slider tells you the minimum, and 300mhz is as low as CCC will take it. so i'll let her run @ 850mhz/300mhz for now and start a new invalid/error count.

That's normal. Leave it at 300, it's fine. OK, you will save some heat and power by going to 175 with a manual profile change .... but it can be a pain to redo after driver reloads, and at the end of the day there is not much saving from 300->175 to fuss over, frankly.

Regards
Zy

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 52702 - Posted: 28 Jan 2012, 2:34:13 UTC - in response to Message 52700.  

Sure is good to have someone (else) on the bleeding edge :)

I dont mind the bleeding bit ...... its hemorrhaging that gets tedious .... I leave the latter to extreme overclock lunatics, and those insane freaks who play with LN2.
I have never understood the attraction of the latter .... its so freakin dangerous its unreal, they are mental, its the only conclusion I can come up with :) Regards Zy

Not sure what OCing with LN2 proves, but humans like to push the limits: a game like so many others we play. Similar to top fuel dragsters I guess.

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52703 - Posted: 28 Jan 2012, 2:42:07 UTC

..... Similar to top fuel dragsters ....

They are just barking mad, no question, they all need permanent therapy rofl :)

Regards
Zy

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 52704 - Posted: 28 Jan 2012, 3:28:49 UTC

I keep my 5830 at 800/500, that is as low as Afterburner will go on the card.

Of course the GTX560 is 1620/810.

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52705 - Posted: 28 Jan 2012, 3:59:22 UTC

Was about to hit the pit .... and did a last check, the completion times were slowly creeping up, albeit only by 2 secs or so, however it was at around the usual timeframe 3hrs(ish). There also had been two invalids. So I've rebooted (at 0330 UTC), put voltage up a notch. Settings are now:

1.224v / 1220 GPU / 1375 Memory / 53 fan card temps 65 & 66 degrees

Let's hope it's still up when I get up :) It is looking better after that voltage change: after the first couple of runs settling in, WUs have so far mostly been within a few hundredths of a second of each other, and that's more stable than before. It's settled slightly higher, at around 45.6 secs per WU.

So ... to my pit, fingers crossed :)

Regards
Zy

Vortac
Joined: 22 Apr 09
Posts: 95
Credit: 4,808,181,963
RAC: 0
Message 52709 - Posted: 28 Jan 2012, 8:55:33 UTC - in response to Message 52666.  

any particular reason you're crunching with the cards in X-fire? seeing as how neither X-fire nor SLI scales perfectly, 2 AMD/nVidia GPUs in X-fire/SLI will never produce twice the performance of one of those GPUs. 2 AMD cards (or nVidia cards) not in X-fire (or SLI) on the other hand will have twice the compute power. i'm assuming you game part of the time and don't want to hassle with regularly enabling and disabling X-fire, or have some other good reason for running those GPUs in X-fire even though its generally counterproductive to GPGPU computing?

I've got two HD5870 in Crossfire and I disabled it after reading this post. Run times are now only about 1 second shorter (from 62 to 61 secs in average).
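For scale, a drop from 62 to 61 secs per WU is a small gain in throughput. A minimal sketch of the arithmetic, assuming the same number of WUs run concurrently before and after (the function name is made up for illustration):

```python
def throughput_gain(seconds_before: float, seconds_after: float) -> float:
    """Fractional gain in WUs per hour when the run time per WU drops."""
    return seconds_before / seconds_after - 1.0


# Run times quoted in the post above.
gain = throughput_gain(62.0, 61.0)
print(f"{gain:.1%}")
```

That is under 2%, which fits the idea that Crossfire was not holding the cards back much in the first place.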

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52710 - Posted: 28 Jan 2012, 10:40:16 UTC

Not too much difference overnight, still getting about 2 to 4 per hour average in total for both cards, which at around 45.6 secs per WU is roughly a 1 to 2.5% error rate.
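A quick way to turn "per hour" counts into a percentage is sketched below. It assumes each card runs one WU at a time and takes the roughly 45.6 secs per WU quoted earlier in the thread (measured at 1220 MHz, so only approximate for 1210); the function and parameter names are made up for illustration.

```python
SECONDS_PER_HOUR = 3600


def invalid_rate(
    invalids_per_hour: float,
    seconds_per_wu: float,
    cards: int = 2,
    wus_per_card: int = 1,  # assumption: one WU at a time per card
) -> float:
    """Fraction of completed WUs that are invalid, across all cards."""
    wus_per_hour = cards * wus_per_card * SECONDS_PER_HOUR / seconds_per_wu
    return invalids_per_hour / wus_per_hour


low = invalid_rate(2, 45.6)
high = invalid_rate(4, 45.6)
print(f"{low:.1%} to {high:.1%}")
```

Under those assumptions two cards finish about 158 WUs per hour, so 2 to 4 bad results per hour lands in the low single-digit percent range rather than hundredths of a percent.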

I backed off to 1210, and going to leave it at that, could well be the error issue that Matt posted above, dont know. Still, ended up much better than start of the session yesterday, so take it and run as they say :)

settings now 1.218v / 1210 GPU / 1375 Memory / fan 53

Regards
Zy

mikey
Joined: 8 May 09
Posts: 3329
Credit: 524,005,258
RAC: 29,507
Message 52711 - Posted: 28 Jan 2012, 12:13:18 UTC - in response to Message 52710.  

Not too much difference overnight, still getting about 2 to 4 per hour average in total for both cards, which at around 45.6 secs per WU is roughly a 1 to 2.5% error rate.

I backed off to 1210, and going to leave it at that, could well be the error issue that Matt posted above, dont know. Still, ended up much better than start of the session yesterday, so take it and run as they say :)

settings now 1.218v / 1210 GPU / 1375 Memory / fan 53

Regards
Zy


Hey I was over on Collatz this morning and they got the 7970 software working!
http://boinc.thesonntags.com/collatz/forum_thread.php?id=831

Maybe Matt can talk to Slicker and see what he did. In the users pc's over there it now says: "CAL Tahiti (3072MB) driver: 1.4.1658" for his gpu!

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 52713 - Posted: 28 Jan 2012, 13:16:02 UTC - in response to Message 52709.  

any particular reason you're crunching with the cards in X-fire? seeing as how neither X-fire nor SLI scales perfectly, 2 AMD/nVidia GPUs in X-fire/SLI will never produce twice the performance of one of those GPUs. 2 AMD cards (or nVidia cards) not in X-fire (or SLI) on the other hand will have twice the compute power. i'm assuming you game part of the time and don't want to hassle with regularly enabling and disabling X-fire, or have some other good reason for running those GPUs in X-fire even though its generally counterproductive to GPGPU computing?

I've got two HD5870 in Crossfire and I disabled it after reading this post. Run times are now only about 1 second shorter (from 62 to 61 secs in average).

i suppose some DC projects are more adversely affected by crunching in SLI/X-fire than others. plus i've never had more than one 5870 in a single machine to test them in X-fire and then separately. in theory though, there should be an increase in DC productivity if you un-crossfire your GPUs. unlike games, where multiple GPUs must be synchronized via SLI/X-fire in order to contribute to driving the single graphical output of a game, those same GPUs do not have to be in SLI/X-fire in order for DC projects to take full advantage of their compute power (thus allowing perfect scalability of GPU compute power). i suppose the reason your MW@H performance hardly improved when you un-crossfired your GPUs is b/c X-fire wasn't holding DC productivity back that much in the first place. in other words, even though in theory DC should be slower in X-fire b/c X-fire does not scale perfectly, the fact that MW@H still involves crunching through data with massive amounts of ILP (instruction-level parallelism) means that individual tasks are still generally sent to one GPU or the other (and not split somehow between the two), allowing the GPUs to crunch like individuals, even though they're still in X-fire...

...this is all just speculation though. i imagine it would take someone far more knowledgeable than me to help us understand what's really going on here...

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 52714 - Posted: 28 Jan 2012, 14:25:01 UTC - in response to Message 52711.  
Last modified: 28 Jan 2012, 14:26:36 UTC

In the users pc's over there it now says: "CAL Tahiti (3072MB) driver: 1.4.1658" for his gpu!

<core_client_version>7.0.12</core_client_version>
<stderr_txt>
Collatz Conjecture v3.06 for OpenCL
Based on the AMD Brook kernels by Gipsel
Device 0
Device Vendor Advanced Micro Devices, Inc.
Name Tahiti
Driver version CAL 1.4.1658 (VM)
Version OpenCL 1.1 AMD-APP (851.6)
Start 2373716052793973516648
Checking 824,633,720,832 numbers
Threads 256
Numbers/Kernel 4,194,304
Kernels/Reduction 256
Numbers/Reduction 1,073,741,824
Reductions/WU 768
Highest Steps 1,829 for 2373716052868342915369
Total Steps 418,663,445,214,577
GPU time 1,082.99 seconds
CPU time 0.998406 seconds
Total time 1,083.49 seconds

Looks like it's (the 7970) about 30% faster than the 5850 and 20% faster than the 5870 at this point with the OpenCL app. Now if someone can get a CAL app going...
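The figures in that stderr output hang together, and they give a raw throughput number for the card. A small sketch using only the values from the log (variable names are made up for illustration):

```python
# Values copied from the Collatz stderr output above.
numbers_per_kernel = 4_194_304
kernels_per_reduction = 256
reductions_per_wu = 768
gpu_seconds = 1082.99

# Each reduction covers kernels_per_reduction kernel launches.
numbers_per_reduction = numbers_per_kernel * kernels_per_reduction

# The whole WU covers reductions_per_wu reductions.
numbers_per_wu = numbers_per_reduction * reductions_per_wu

# Numbers checked per second of GPU time.
rate = numbers_per_wu / gpu_seconds

print(f"numbers/reduction: {numbers_per_reduction:,}")
print(f"numbers/WU:        {numbers_per_wu:,}")
print(f"rate:              {rate / 1e6:.0f} million numbers per second")
```

The two products match the "Numbers/Reduction" and "Checking ... numbers" lines in the log, so the reported totals are consistent.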

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52715 - Posted: 28 Jan 2012, 15:51:18 UTC

Going to try an extended run with crossfire @1210/1375 1.2v .... I know what should happen, quite a bit slower, but its claimed they have improved it, lets see ....

Regards
Zy

ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 52717 - Posted: 28 Jan 2012, 23:30:19 UTC

@Crossfire: does that matter at all? It's for games and only used if AMD gives you a profile for the game. This should have nothing to do with GP-GPU, where you usually run 1 program per GPU (unless specified otherwise). If XFire was working as intended you'd have to see halved times per WU, but only running 1 WU at a time per device, compared to several WUs at once in the normal mode.

@Testing: with the recent server upgrade and insta-purge being gone we've finally got a good method of testing stability: at the bottom of "show tasks for your computer" it shows valid results, invalids and errors.

For me about 3000 tasks are kept in this record, which is statistically relevant. At about 1600 WUs/day it takes 2 days for this statistic to update completely after I change something. I suggest you guys use this instead of "only 1 or 2 errors in x hours". That number is too small to judge stability.

On my Cayman I observed a failure rate of 0.7 % at 900 MHz @ 1.10 V. That's what I settled at in painful hand-tuning. Now I increased the GPU clock to 905 MHz and observe an increased failure rate to 1.4 %. That's a significant increase and certainly not worth it. Trying 1.11 V now.
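To put rough numbers on the sample-size point, here is a sketch of a pooled two-proportion z-test. It assumes both failure rates were judged on the full record of about 3000 tasks, uses a hypothetical short run of about 150 tasks per setting for contrast, and relies on the normal approximation, which is only a rough guide when the counts are this small.

```python
import math


def two_proportion_z(fail_a: int, n_a: int, fail_b: int, n_b: int) -> float:
    """Pooled z statistic for the difference between two failure rates."""
    p_a = fail_a / n_a
    p_b = fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_b - p_a) / std_err


# Assumed: 0.7 % and 1.4 % were each measured over about 3000 tasks,
# which is 21 and 42 failures respectively.
z_long = two_proportion_z(21, 3000, 42, 3000)

# Hypothetical short run: 1 error vs 2 errors in about 150 tasks each.
z_short = two_proportion_z(1, 150, 2, 150)

# |z| above about 1.96 is the usual 95 % threshold.
print(f"3000-task record: z = {z_long:.2f}")
print(f"150-task run:     z = {z_short:.2f}")

# Days for the 3000-task record to refresh fully at 1600 WUs/day.
days_to_refresh = 3000 / 1600
print(f"record refresh:   {days_to_refresh:.1f} days")
```

The 3000-task comparison comes out well above the 1.96 threshold, while the same doubling seen as "1 error vs 2 errors" in a short run comes out well below it, which is the reason for using the full task record.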

MrS
Scanning for our furry friends since Jan 2002

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52720 - Posted: 29 Jan 2012, 12:32:45 UTC
Last modified: 29 Jan 2012, 12:35:17 UTC

Matt

Are there any clues yet as to how many of the E Truncated errors may in fact be valids underneath it all? At present I have the 2x7970 cards stable as such, just playing with the preferred voltage level, 1.218 to 1.23. Where it stays obviously depends on the invalid rate being at or near zero. That's difficult to judge whilst still having to take the E Truncation into account, as it's not known, on the face of it, how many of those are actually invalids and how many are valids masked by that error.

At present I am working on the worst case, where the majority are genuine invalids masked by the E Truncation error. What's your gut feeling as to the proportion of genuine invalids inside those affected by E Truncation? To a degree it's a bit like picking the lottery numbers, I am aware, but a gut feeling (however that turns out, I know it's impossible to gauge accurately) would be helpful.

Regards
Zy

STE\/E
Joined: 29 Aug 07
Posts: 486
Credit: 576,548,171
RAC: 10
Message 52721 - Posted: 29 Jan 2012, 14:26:40 UTC - in response to Message 52421.  
Last modified: 29 Jan 2012, 14:27:10 UTC

You can find the applications right here on Arkayn´s page: http://www.arkayn.us/forum/index.php?action=downloads;cat=11

thanks it works:)
download NVidia and edit app_info to


<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <name>milkyway_separation_0.82_windows_x86_64__cuda_opencl.exe</name>
    <executable />
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>82</version_num>
    <flops>1.0e11</flops>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>1</max_ncpus>
    <plan_class>ati14ati</plan_class>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
    <cmdline></cmdline>
    <file_ref>
      <file_name>milkyway_separation_0.82_windows_x86_64__cuda_opencl.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
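Since a mistyped file name is an easy way to break an app_info like the one above, a small consistency check can help before restarting BOINC. This is an unofficial sketch, not a BOINC tool: `check_app_info` is a made-up helper, it only checks that names referenced in `<app_version>` are declared elsewhere in the file, and `SAMPLE` is a shortened copy of the file from this post.

```python
import xml.etree.ElementTree as ET
from typing import List

# Shortened copy of the app_info from the post, for demonstration only.
SAMPLE = """
<app_info>
  <app><name>milkyway</name></app>
  <file_info>
    <name>milkyway_separation_0.82_windows_x86_64__cuda_opencl.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>82</version_num>
    <plan_class>ati14ati</plan_class>
    <file_ref>
      <file_name>milkyway_separation_0.82_windows_x86_64__cuda_opencl.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""


def check_app_info(xml_text: str) -> List[str]:
    """Return a list of naming problems found in an app_info document."""
    root = ET.fromstring(xml_text.strip())
    problems: List[str] = []

    # Names declared at the top level of the file.
    declared_apps = {a.findtext("name", "").strip() for a in root.findall("app")}
    declared_files = {
        f.findtext("name", "").strip() for f in root.findall("file_info")
    }

    for version in root.findall("app_version"):
        app_name = version.findtext("app_name", "").strip()
        if app_name not in declared_apps:
            problems.append(f"app_name '{app_name}' has no matching <app>")
        for ref in version.findall("file_ref"):
            file_name = ref.findtext("file_name", "").strip()
            if file_name not in declared_files:
                problems.append(f"file_ref '{file_name}' has no matching <file_info>")

    return problems


if __name__ == "__main__":
    for problem in check_app_info(SAMPLE) or ["no naming problems found"]:
        print(problem)
```

A clean result only means the names line up. It says nothing about whether the plan class, the driver, or the BOINC version is right for the card.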


This Thread has gotten so long it's a Nightmare to try & figure out how to get the Wu's to run. Anyway I must be doing something wrong, I just get Wu's that are going to take over 2 Hr's to run ??? They say they're ATI Wu's though ... ???
STE\/E

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 52723 - Posted: 29 Jan 2012, 16:43:48 UTC - in response to Message 52721.  
Last modified: 29 Jan 2012, 16:55:33 UTC

app_info is fine.

Running correctly ..... Task Manager should show one instance of the OpenCL .exe per WU, the name column in BOINC Manager should show ps_separation_82 .... , and the application column should show "local: milkyway 0.82 (ati14ati)"

Did you restart BOINC after putting the app_info into the project directory?

Once the app_info is there, and the names above check out, it's then a case of what is wrong at your end re setup. Suggest you start by setting cache to zero to prevent trashing, and in CCC whack it up to 1125 GPU, 20% power.

Once all that's checked out there's not much else as such; it will need head scratching on the hardware config at your end. Initially the WUs will show long estimated completion times due to the classic BOINC counting, but other than that they show up as normal. As always, once a couple of dozen have run through, the estimates settle to realistic times.

Yell with any symptoms, I'll try and scratch the ol' brain.

EDIT: ahhh, what BOINC version are you running? You need a 7.XX, suggest 7.0.8 to start with. Don't update to AMD 12.1, that does not yet support 7970s; stay with the release day RC drivers. If you need new AMD drivers, I will put them up in my webspace for you to grab.

Regards
Zy

STE\/E
Joined: 29 Aug 07
Posts: 486
Credit: 576,548,171
RAC: 10
Message 52724 - Posted: 29 Jan 2012, 16:56:35 UTC
Last modified: 29 Jan 2012, 16:56:51 UTC

Running 7.0.3 Client, 12.1 Drivers, what are release day RC drivers ???

The Wu's ran over 2 Minutes & hadn't shown any signs of Progression ...

What is the app_info I should be using ???

Any other files I need ???

Thanks
STE\/E

©2024 Astroinformatics Group