Welcome to MilkyWay@home

How can I crunch on both GPUs in an ATI4870x2?

Profile Edboard
Joined: 22 Feb 09
Posts: 20
Credit: 105,156,399
RAC: 0
Message 24922 - Posted: 11 Jun 2009, 6:26:09 UTC

I would like to know whether it is already possible to use both GPUs of a dual-GPU ATI 4870 graphics card, or not yet.
ID: 24922 · Rating: 0
Profile borandi
Joined: 21 Feb 09
Posts: 180
Credit: 27,806,824
RAC: 0
Message 24958 - Posted: 11 Jun 2009, 12:57:21 UTC

Wasn't this enabled with 0.19c?
ID: 24958 · Rating: 0
Profile [P3D] Crashtest
Joined: 8 Jan 09
Posts: 58
Credit: 53,161,741
RAC: 0
Message 24988 - Posted: 11 Jun 2009, 15:26:29 UTC - in response to Message 24958.  

It's possible to crunch n-WUs on y-GPUs with Gipsel's app 0.19f:

- n WUs per GPU (the setting n2 = 2 WUs per GPU) on y single-GPU cards, or on (y/2) 38?0x2 / 48?0x2 cards, or on a mix of both.

Example:
You can crunch 6 MW WUs at the same time with the setting n2 if you have:

a 4870x2 and a 4870 = 3x 4870 GPUs
or
3x ATI 4870 cards = 3x 4870 GPUs

The current maximum is 4 4870x2 cards = 8 GPUs: 16 WUs with n2, 24 WUs with n3, ...

BUT:
All cards must be active (monitor or dummy plugged in, or Xfire active).
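The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration: the function name and the small card table are made up here and are not part of Gipsel's app, and `n` simply stands for the number in the n2/n3 setting.

```python
# Hypothetical helper, not part of Gipsel's app: counts how many WUs
# run at once for a given mix of cards and an "n" setting (n2 -> n=2).
GPUS_PER_CARD = {"4870": 1, "4870x2": 2}  # illustrative table, two entries only


def concurrent_wus(cards, n):
    """Return (total number of GPUs) * n for a list of card names."""
    total_gpus = sum(GPUS_PER_CARD[card] for card in cards)
    return total_gpus * n


print(concurrent_wus(["4870x2", "4870"], 2))  # a 4870x2 plus a 4870 with n2
print(concurrent_wus(["4870"] * 3, 2))        # three single 4870 cards with n2
print(concurrent_wus(["4870x2"] * 4, 3))      # four 4870x2 cards with n3
```

The three calls correspond to the two 6-WU setups and the 24-WU maximum mentioned in the post.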

REAL EXAMPLE = http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=48886

<core_client_version>6.5.0</core_client_version>
<![CDATA[
<stderr_txt>
Running Milkyway@home ATI GPU application version 0.19f by Gipsel
allowing 2 concurrent WUs per GPU
CPU: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz (8 cores/threads) 2.67772 GHz (282ms)

CAL Runtime: 1.3.158
Found 3 CAL devices

Device 0: ATI Radeon HD 4800 (RV770) 1024 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 780 MHz, memory clock: 1000 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Device 1: ATI Radeon HD 4800 (RV770) 1024 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 780 MHz, memory clock: 1000 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Device 2: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 780 MHz, memory clock: 1000 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

2 WUs already running on GPU 0
2 WUs already running on GPU 1
2 WUs already running on GPU 2
No free GPU! Waiting ... 27.8616 seconds.
Starting WU on GPU 0

....
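To check quickly whether a host is really using every GPU, the relevant lines of such a stderr output can be pulled out with a short script. This is a rough sketch that assumes the log wording is exactly as in the excerpt above; the function name `summarize` is made up for illustration.

```python
import re

# Sample lines copied from the stderr output quoted above.
LOG = """\
allowing 2 concurrent WUs per GPU
Found 3 CAL devices
2 WUs already running on GPU 0
2 WUs already running on GPU 1
2 WUs already running on GPU 2
"""


def summarize(log):
    """Extract the WUs-per-GPU limit, the device count and the per-GPU load."""
    per_gpu = int(re.search(r"allowing (\d+) concurrent WUs per GPU", log).group(1))
    devices = int(re.search(r"Found (\d+) CAL devices", log).group(1))
    running = {
        int(gpu): int(count)
        for count, gpu in re.findall(r"(\d+) WUs already running on GPU (\d+)", log)
    }
    return per_gpu, devices, running


per_gpu, devices, running = summarize(LOG)
# Capacity (limit * devices) next to the number of WUs actually running.
print(per_gpu * devices, sum(running.values()))
```

If the two printed numbers are equal, every GPU slot the app allows is occupied.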
ID: 24988 · Rating: 0
Profile arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 25032 - Posted: 11 Jun 2009, 18:48:42 UTC

With base settings, I crunch 3 at a time on the 4830.
ID: 25032 · Rating: 0
Profile Edboard
Joined: 22 Feb 09
Posts: 20
Credit: 105,156,399
RAC: 0
Message 25061 - Posted: 11 Jun 2009, 21:03:30 UTC - in response to Message 24988.  

It's possible to crunch n-WUs on y-GPUs with Gipsel's app 0.19f: ...


If I only have a 4870x2 graphics card, can I crunch 4 WUs at a time (two on each GPU) with n2 and the internal Xfire active?

BUT:
All cards must be active (monitor or dummy plugged in, or Xfire active).


I know that with an Nvidia card and CUDA, the internal SLI of a dual-GPU graphics card MUST BE inactive.
ID: 25061 · Rating: 0
Profile [P3D] Crashtest
Joined: 8 Jan 09
Posts: 58
Credit: 53,161,741
RAC: 0
Message 25062 - Posted: 11 Jun 2009, 21:06:25 UTC - in response to Message 25061.  

Yes, you can / should be able to do this.
ID: 25062 · Rating: 0
frigens
Joined: 25 Mar 09
Posts: 11
Credit: 10,178,231
RAC: 0
Message 25511 - Posted: 15 Jun 2009, 8:45:13 UTC

Is it possible to add a 3870 card to my current 4870 system? I know that they cannot be CrossFired with each other, but what about using a dummy plug or a monitor?
ID: 25511 · Rating: 0
John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 25513 - Posted: 15 Jun 2009, 10:39:39 UTC - in response to Message 25511.  

Is it possible to add a 3870 card to my current 4870 system? I know that they cannot be CrossFired with each other, but what about using a dummy plug or a monitor?


Provided you have empty PCI-E slots, and use a dummy monitor plug, the additional GPU should crunch fine (as I understand it).
Go away, I was asleep


ID: 25513 · Rating: 0
Profile borandi
Joined: 21 Feb 09
Posts: 180
Credit: 27,806,824
RAC: 0
Message 25679 - Posted: 16 Jun 2009, 19:44:01 UTC

You can use it without a dummy plug or a second monitor. Right-click the desktop -> Properties -> Settings. Select your second card, then tick 'Extend the desktop to this Monitor' and 'Make my primary monitor/desktop'. Click Apply, then move your monitor cable over to the second card. Now both cards will be enabled for crunching.
ID: 25679 · Rating: 0
sysfried
Joined: 25 Apr 08
Posts: 19
Credit: 31,151,552
RAC: 0
Message 25713 - Posted: 16 Jun 2009, 21:09:18 UTC

Enable Crossfire via the CCC.

I have TWO 4850X2s in one system and they work fine with Crossfire enabled.
ID: 25713 · Rating: 0
Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 25720 - Posted: 16 Jun 2009, 21:53:02 UTC - in response to Message 25713.  

Enable Crossfire via the CCC.

I have TWO 4850X2s in one system and they work fine with Crossfire enabled.

Or to cite one of your WU's task details:

CPU time 78.30197
stderr out <core_client_version>6.6.20</core_client_version>
<![CDATA[
<stderr_txt>
Running Milkyway@home ATI GPU application version 0.19f by Gipsel
allowing 2 concurrent WUs per GPU
CPU: AMD Phenom(tm) II X4 940 Processor (4 cores/threads) 3.0002 GHz (425ms)

CAL Runtime: 1.4.283
Found 4 CAL devices

Device 0: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 250 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Device 1: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 250 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Device 2: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 250 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Device 3: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 250 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

2 WUs already running on GPU 0
2 WUs already running on GPU 1
2 WUs already running on GPU 2
2 WUs already running on GPU 3
No free GPU! Waiting ... 18.892 seconds.
Starting WU on GPU 1

main integral, 320 iterations
predicted runtime per iteration is 174 ms (33.3333 ms are allowed), dividing each iteration in 6 parts
borders of the domains at 0 272 536 800 1072 1336 1600
Calculated about 7.40024e+012 floatingpoint ops on GPU, 1.23583e+008 on FPU. Approximate GPU time 78.302 seconds.

probability calculation (stars)
Calculated about 1.95269e+009 floatingpoint ops on FPU.

WU completed.
CPU time: 9.50046 seconds, GPU time: 78.302 seconds, wall clock time: 198.067 seconds, CPU frequency: 3.00021 GHz

</stderr_txt>
]]>

Validate state Valid
Claimed credit 0.443718160719715
Granted credit 55.51647
application version 0.19

I would think you could gain a tiny bit of performance if you raise the memory clock slightly. It's a 4850X2 with only GDDR3 after all (even on an HD4870 with GDDR5, 250 MHz already reduces the performance by ~5%). Furthermore, if you enable CrossFire, the memory contents for the GPUs are duplicated, thus raising the bandwidth requirements.
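For anyone who wants to put numbers on the task details above, here is a small back-of-the-envelope sketch. All inputs are copied from the log; the rule used for the number of parts (round the predicted runtime divided by the allowed runtime up to the next integer) is only an assumption that happens to reproduce the logged value and is not confirmed by the app's author.

```python
import math

# Figures copied from the task details quoted above.
gpu_flops = 7.40024e12   # "floatingpoint ops on GPU"
gpu_seconds = 78.302     # "GPU time"
wall_seconds = 198.067   # "wall clock time"
predicted_ms = 174.0     # predicted runtime per iteration
allowed_ms = 33.3333     # allowed runtime per iteration

# Throughput while the WU actually occupies the GPU.
gflops = gpu_flops / gpu_seconds / 1e9

# Assumed rule: round up so that each part fits into the allowed time.
parts = math.ceil(predicted_ms / allowed_ms)

# Fraction of the wall clock time during which this WU was on the GPU.
gpu_share = gpu_seconds / wall_seconds

print(f"{gflops:.1f} GFLOP/s while on the GPU")
print(f"{parts} parts per iteration")
print(f"{gpu_share:.0%} of wall clock time spent on the GPU")
```

The GPU share well below 100% would be in line with two WUs taking turns on each GPU, although the log itself does not state the reason.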
ID: 25720 · Rating: 0
Profile borandi
Joined: 21 Feb 09
Posts: 180
Credit: 27,806,824
RAC: 0
Message 25742 - Posted: 17 Jun 2009, 1:33:39 UTC - in response to Message 25713.  

Enable Crossfire via the CCC.

I have TWO 4850X2s in one system and they work fine with Crossfire enabled.


CrossFire defaults to the slowest card if you're on a mismatched combo. So 3850+4850 in CF ~= 2x3850 when it comes to crunching.

ID: 25742 · Rating: 0
Profile Westsail and *Pyxey*
Joined: 22 Mar 08
Posts: 65
Credit: 15,715,071
RAC: 0
Message 25744 - Posted: 17 Jun 2009, 1:56:04 UTC

Is there any advantage to CrossFiring a headless dedicated BOINC cruncher? Would the answer be different for matched or unmatched cards? Thanks, just wondering for future purchase decisions. Currently most of my experience is with Nvidia cards, on which SLI and BOINC cannot be used together.
ID: 25744 · Rating: 0
Profile Edboard
Joined: 22 Feb 09
Posts: 20
Credit: 105,156,399
RAC: 0
Message 25751 - Posted: 17 Jun 2009, 5:53:23 UTC

Now I would like to know whether I will get approximately the same points/day with a 4870x2 as with two single 4870 cards.

ID: 25751 · Rating: 0
Profile borandi
Joined: 21 Feb 09
Posts: 180
Credit: 27,806,824
RAC: 0
Message 25759 - Posted: 17 Jun 2009, 7:09:26 UTC - in response to Message 25744.  
Last modified: 17 Jun 2009, 7:10:24 UTC

Is there any advantage to CrossFiring a headless dedicated BOINC cruncher? Would the answer be different for matched or unmatched cards? Thanks, just wondering for future purchase decisions. Currently most of my experience is with Nvidia cards, on which SLI and BOINC cannot be used together.


It's easier to set up :) Means you don't have to faff around sorting out the second card to crunch. A matched card makes it a lot easier in my experience. Also, more cores help for the time being...

Now I would like to know whether I will get approximately the same points/day with a 4870x2 as with two single 4870 cards.


If you have a second card in the system, you reduce the CPU:GPU ratio, meaning that you are more likely to run out of work. It's the same whether you use two separate cards or one X2 card, though with two separate cards, if you come across another quad-core machine, you can move one over there. I find 2 CPU : 1 GPU a good minimum ratio - a 1:2 ratio means your cache will be sucked dry in about 3-4 minutes. I plan to supply my i7 machine with two 3850X2s come pay day...
ID: 25759 · Rating: 0
sysfried
Joined: 25 Apr 08
Posts: 19
Credit: 31,151,552
RAC: 0
Message 25779 - Posted: 17 Jun 2009, 11:59:34 UTC - in response to Message 25720.  
Last modified: 17 Jun 2009, 12:08:57 UTC

Is there any advantage to CrossFiring a headless dedicated BOINC cruncher? Would the answer be different for matched or unmatched cards? Thanks, just wondering for future purchase decisions. Currently most of my experience is with Nvidia cards, on which SLI and BOINC cannot be used together.


It's easier to set up :) Means you don't have to faff around sorting out the second card to crunch. A matched card makes it a lot easier in my experience. Also, more cores help for the time being...

Now I would like to know whether I will get approximately the same points/day with a 4870x2 as with two single 4870 cards.


If you have a second card in the system, you reduce the CPU:GPU ratio, meaning that you are more likely to run out of work. It's the same whether you use two separate cards or one X2 card, though with two separate cards, if you come across another quad-core machine, you can move one over there. I find 2 CPU : 1 GPU a good minimum ratio - a 1:2 ratio means your cache will be sucked dry in about 3-4 minutes. I plan to supply my i7 machine with two 3850X2s come pay day...


Not a problem at all. I have a quad-core CPU and I get 24 workunits at maximum, enough to keep four GPUs (with 2 workunits per GPU) busy. I think you can even go to a GPU:CPU ratio of 4:1 with the new MilkyWay_GPU project, because we will get bigger workunits there and, I bet, even more workunits at a time...

Enable Crossfire via the CCC.

I have TWO 4850X2s in one system and they work fine with Crossfire enabled.

Or to cite one of your WU's task details:

CPU time 78.30197
stderr out <core_client_version>6.6.20</core_client_version>
<![CDATA[
<stderr_txt>
Running Milkyway@home ATI GPU application version 0.19f by Gipsel
allowing 2 concurrent WUs per GPU
CPU: AMD Phenom(tm) II X4 940 Processor (4 cores/threads) 3.0002 GHz (425ms)

CAL Runtime: 1.4.283
Found 4 CAL devices

Device 0: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 250 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Device 1: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 250 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Device 2: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 250 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Device 3: ATI Radeon HD 4800 (RV770) 512 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 625 MHz, memory clock: 250 MHz
800 shader units organized in 10 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

2 WUs already running on GPU 0
2 WUs already running on GPU 1
2 WUs already running on GPU 2
2 WUs already running on GPU 3
No free GPU! Waiting ... 18.892 seconds.
Starting WU on GPU 1

I would think you could gain a tiny bit of performance if you raise the memory clock slightly. It's a 4850X2 with only GDDR3 after all (even on an HD4870 with GDDR5, 250 MHz already reduces the performance by ~5%). Furthermore, if you enable CrossFire, the memory contents for the GPUs are duplicated, thus raising the bandwidth requirements.


Well, two 4850X2s produce a nice amount of heat, and one of the GPUs is running at a high temperature. I won't overclock before I get that straightened out.

Also, I'm going to solder some of those "VGA dummy" devices so I can go without CrossFire in the future, so that the bandwidth problems you mentioned are no longer an issue.

But thanks for your comments on this. :-)

Sysfried
ID: 25779 · Rating: 0
Profile borandi
Joined: 21 Feb 09
Posts: 180
Credit: 27,806,824
RAC: 0
Message 25787 - Posted: 17 Jun 2009, 13:02:28 UTC

You'd be surprised - even with work flowing as it is, I have seen one of my dual cores, with a 4850 running the n1 setting, sit without a WU for a couple of minutes every so often, as it only takes 6 minutes to empty the cache.

With a 4:1 ratio on 30-second WUs you'd compute 4 WUs in the first 30 s, then spend the next 30 s with 2 GPUs idle before you could contact the server again. Then you'd only have work for the next minute, if you get any.

Agreed on the MW_GPU front though - it won't matter then :)
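The two cases above can be reproduced with a toy model. It is only a sketch under stated assumptions: a cache of 6 WUs per CPU core (inferred from the 24 WUs reported for a quad core earlier in the thread, not an official figure), 30-second WUs, one WU per GPU at a time, and no new work arriving while the cache drains. The function name is made up for illustration.

```python
WUS_PER_CORE = 6  # assumption: inferred from "24 workunits" on a quad core


def drain_schedule(cpu_cores, gpus, wu_seconds=30):
    """Return (seconds until the cache is empty, WUs started in each round)."""
    cache = cpu_cores * WUS_PER_CORE
    rounds = []
    while cache > 0:
        started = min(gpus, cache)  # each GPU takes one WU per round
        rounds.append(started)
        cache -= started
    return len(rounds) * wu_seconds, rounds


print(drain_schedule(cpu_cores=2, gpus=1))  # dual core with one 4850 on n1
print(drain_schedule(cpu_cores=1, gpus=4))  # the 4:1 GPU:CPU case
```

Under these assumptions the dual core with one GPU drains its cache in 360 seconds (the 6 minutes mentioned above), and the 4:1 case runs 4 WUs in the first round and only 2 in the second, leaving 2 GPUs idle.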
ID: 25787 · Rating: 0
Profile [TiDC] Anlupa
Joined: 17 Nov 08
Posts: 2
Credit: 33,115,365
RAC: 0
Message 26027 - Posted: 19 Jun 2009, 19:00:53 UTC

Is there any special configuration for the app_info file?
Thanks!
ID: 26027 · Rating: 0
localizer
Joined: 28 Jan 08
Posts: 40
Credit: 379,931,801
RAC: 0
Message 26226 - Posted: 22 Jun 2009, 14:50:33 UTC

...... Hi!

Just installed a new 4870X2 - I can only get it to run WUs on a single GPU. As it is a dual card Xfire is enabled automatically - can't see any obvious way to disable it - any thoughts?

From reading this post it looked as though the X2s would run WUs on both GPUs - well not for me!

My app_info.xml is set up the same way as on a host with multiple cards (which works fine) - is there some trick I'm missing?

Thanks,
P.
ID: 26226 · Rating: 0
STE\/E
Joined: 29 Aug 07
Posts: 486
Credit: 576,515,913
RAC: 37,034
Message 26227 - Posted: 22 Jun 2009, 15:20:19 UTC

Pat, have you tried using a dummy plug yet on the 2nd half of the card?
ID: 26227 · Rating: 0

©2024 Astroinformatics Group