Welcome to MilkyWay@home

7970 crunching more WUs

Message boards : Number crunching : 7970 crunching more WUs
Filip Falta

Joined: 19 Feb 12
Posts: 2
Credit: 12,265,516
RAC: 35
Message 53722 - Posted: 20 Mar 2012, 13:31:06 UTC

Hi, is there any way I can run more than one WU on my 7970 at the same time?
FruehwF

Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 53723 - Posted: 20 Mar 2012, 15:11:59 UTC

Put an app_info.xml file into the MW@H project directory.

Content:

<app_info>
    <app>
        <name>milkyway</name>
    </app>
    <file_info>
        <name>milkyway_separation_1.02_windows_intelx86__opencl_amd_ati.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway</app_name>
        <version_num>102</version_num>
        <flops>1.0e11</flops>
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>1</max_ncpus>
        <plan_class>ati14ati</plan_class>
        <coproc>
            <type>ATI</type>
            <count>0.5</count>
        </coproc>
        <cmdline>--gpu-target-frequency 60 --gpu-disable-checkpointing </cmdline>
        <file_ref>
            <file_name>milkyway_separation_1.02_windows_intelx86__opencl_amd_ati.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>


If the exe file has a different name on 64-bit Windows, you have to replace the name (it appears in both <file_info> and <file_ref>).
Please report all your WUs before you install the app_info.xml, then request new WUs.
Otherwise they will be lost.

regards

Franz
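A note on the <count> value in the file above: 0.5 means each task claims half a GPU, so two WUs run on the card at once. The short Python sketch below only illustrates that arithmetic on a trimmed copy of the <app_version> block. The helper names are made up for this example and are not part of BOINC, and rounding the fraction down to four decimals is an assumption made here so that the shares never add up to more than one GPU.

```python
import math
import xml.etree.ElementTree as ET

# Trimmed-down copy of the <app_version> block from the app_info.xml above.
APP_VERSION = """
<app_version>
  <app_name>milkyway</app_name>
  <coproc>
    <type>ATI</type>
    <count>0.5</count>
  </coproc>
</app_version>
"""

def set_tasks_per_gpu(app_version_xml: str, tasks: int) -> str:
    """Return the XML with <count> set so that `tasks` WUs share one GPU.

    Hypothetical helper for illustration only.
    """
    if tasks < 1:
        raise ValueError("tasks must be at least 1")
    root = ET.fromstring(app_version_xml)
    count = root.find("coproc/count")
    if count is None:
        raise ValueError("no <coproc><count> element found")
    # Each task claims 1/tasks of a GPU. Round down to 4 decimals so that
    # `tasks` shares never sum to more than one whole GPU.
    share = math.floor(10000 / tasks) / 10000
    count.text = f"{share:g}"
    return ET.tostring(root, encoding="unicode")

def tasks_per_gpu(app_version_xml: str) -> int:
    """Read <count> back and report how many tasks fit on one GPU."""
    count = float(ET.fromstring(app_version_xml).find("coproc/count").text)
    return int(1 / count + 1e-9)

if __name__ == "__main__":
    print(tasks_per_gpu(APP_VERSION))           # the file as posted
    print(set_tasks_per_gpu(APP_VERSION, 3))    # variant for three tasks
```

Whether more than two concurrent WUs actually help depends on the card, so treat any other value as something to experiment with.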
Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 53725 - Posted: 21 Mar 2012, 0:44:04 UTC

With 7970s generally speaking you can bring down the:

--gpu-target-frequency 60

to:

--gpu-target-frequency 10

without a problem. If you do hit a sluggish screen (frankly highly unlikely with 7970s), bring it up in steps of five until the screen responds OK. I've run mine lower than that, but below 10 it depends on the individual setup, so experiment if you want; going down to 10 is pretty much as much as you will squeeze out of it in practice.

Keep an eye on temperatures when doing this. It should be fine, since 7970s are cooled well, but watch it until you are happy it has settled.

Regards
Zy
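For anyone who would rather script the change than edit the line by hand, here is a minimal Python sketch that swaps the number after --gpu-target-frequency in the cmdline string and lists the values to try when stepping up by five. The function names are hypothetical, and the range of 10 to 60 simply reflects the two values mentioned in this thread.

```python
import re

def set_target_frequency(cmdline: str, value: int) -> str:
    """Replace the number after --gpu-target-frequency, leaving other flags alone."""
    new_cmdline, replaced = re.subn(
        r"(--gpu-target-frequency\s+)\d+(\.\d+)?",
        rf"\g<1>{value}",
        cmdline,
    )
    if replaced == 0:
        raise ValueError("--gpu-target-frequency not found in cmdline")
    return new_cmdline

def candidates(start: int = 10, stop: int = 60, step: int = 5) -> list:
    """Values to try, lowest first; move up one step if the screen gets sluggish."""
    return list(range(start, stop + 1, step))

if __name__ == "__main__":
    posted = "--gpu-target-frequency 60 --gpu-disable-checkpointing "
    print(set_target_frequency(posted, 10))
    print(candidates())
```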
ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 53733 - Posted: 21 Mar 2012, 20:23:00 UTC - in response to Message 53725.  

You can also set this in the account settings for MW now.

MrS
Scanning for our furry friends since Jan 2002
Terminal123*

Joined: 29 May 10
Posts: 19
Credit: 5,917,319
RAC: 0
Message 53772 - Posted: 24 Mar 2012, 22:07:07 UTC - in response to Message 53733.  
Last modified: 24 Mar 2012, 22:10:19 UTC

I changed this to 10 instead of 60 and don't notice a difference... Is there something specific I should be looking for?

One thing I do notice is that my GPUs only fluctuate between 65% and 80% load when 12 CPU cores are crunching 12 WUs and my 2 GPUs are crunching 2 WUs. But if I limit CPU usage from 100% to 10%, it runs 2 GPU WUs and 1 CPU WU, my GPUs sit at 95-100% load, and units finish faster. So I think the 2 GPU WUs need some CPU assistance, but since all the cores are busy crunching their own WUs, there's not enough left over. I see that you can set a maximum number of cores used, but can you specify 11 CPU cores and 2 GPUs?

Edit: It looks like the sweet spot is to set it to use 90% of the cores. That way it runs 10 CPU WUs and 2 GPU WUs, leaving the 2 extra CPU cores free to assist the 2 GPU WUs.
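The numbers above (90% of 12 cores giving 10 CPU WUs, 10% giving 1) are consistent with the client rounding the core count down. Assuming that is what happens, which is an inference from these observations rather than something confirmed in this thread, the percentage for any number of free cores can be worked out as in this Python sketch. Both function names are made up for the example.

```python
def cpu_tasks_allowed(total_cores: int, percent: int) -> int:
    """CPU WUs that run at a given 'use at most N% of processors' setting.

    Assumes the client rounds down and never goes below one core.
    """
    return max(1, total_cores * percent // 100)

def percent_to_reserve(total_cores: int, reserved: int = 1) -> int:
    """Smallest whole percentage that still leaves exactly `reserved` cores free."""
    wanted = total_cores - reserved
    if wanted < 1:
        raise ValueError("must leave at least one core for CPU WUs")
    # Ceiling division without floats: ceil(wanted * 100 / total_cores).
    return -(-wanted * 100 // total_cores)

if __name__ == "__main__":
    print(cpu_tasks_allowed(12, 90))   # the 90% setting from the edit above
    print(percent_to_reserve(12, 1))   # setting that would leave 1 of 12 cores free
```

Under that assumption, running 11 CPU WUs on a 12-core machine would need a setting of at least 92%.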

I'm currently crunching with the following:
CPU: i7-3960x @ Stock Speeds
RAM: 32 GB of quad-channel 2133 MHz
GPU1: Radeon HD 7970 3GB @ Stock Speeds
GPU2: Radeon HD 7970 3GB @ Stock Speeds
NullCoding*
Joined: 23 Sep 10
Posts: 24
Credit: 58,711,243
RAC: 0
Message 53789 - Posted: 25 Mar 2012, 17:18:40 UTC

I will try this on my 6950 later today (after unlocking it). Currently it's at 6950 stock speeds and shader count, showing 40%-45% load while running one task at a time, about one every 170 seconds. I suppose this app_info.xml will work OK as it is for my card? I'll keep fps at 60 as it's fine right now, and I'm not usually using the machine anyway.


Tobias

Joined: 6 Mar 12
Posts: 10
Credit: 8,070,771
RAC: 0
Message 53813 - Posted: 27 Mar 2012, 10:38:35 UTC

@FruehwF: Thanks so much for posting the XML file!
Got my Sapphire 7950 OC just an hour ago and it's crunching 2 WUs at < 90s!
(GPU 1 GHz, Mem 1270 MHz, Temp. < 60°C, fan 40%; set by CCC)
Even though it's an open case, I can hardly hear the card... I'm amazed! :)
FruehwF

Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 53814 - Posted: 27 Mar 2012, 12:58:31 UTC - in response to Message 53813.  

Pleasure

[German, translated]
You're very welcome, we Austrians are always happy to help out :-)

I'll also send you a PM.
[/German]
Ghostrider

Joined: 11 Nov 09
Posts: 4
Credit: 20,410,621
RAC: 0
Message 54035 - Posted: 15 Apr 2012, 18:06:01 UTC

Single WU takes 44 seconds to run on my 7970 (stock 925/1375) without any config changes. Card reports ~98% GPU load.
Mike Simicsak

Joined: 14 Jan 11
Posts: 5
Credit: 51,204,877
RAC: 0
Message 54040 - Posted: 15 Apr 2012, 22:36:01 UTC
Last modified: 15 Apr 2012, 22:38:21 UTC

For me, the bulk of my credits come from the GPU. I recently had an issue with Apple Mobile Device manager where it was "running away" and consuming an entire core. Add in contention with the 5 CPU tasks I generally run, and the GPGPU OpenCL WUs were getting no CPU time and the GPU was idle. It is a good illustration that keeping the GPU busy requires some CPU cycles. It took me about a month to figure out what the problem was and to get it corrected. My averages are still depressed from this incident.

Previously the MW@H WUs showed about 5% CPU load. Even with the recent change (upgrading the BOINC client to 7.0.25), my experience is that it does not take 96% of a core to keep the GPU busy; it's still much more like 5%. My observation is that this appears to be a front-end/back-end load to get the WU in and out of the GPU.

I do have an IT background, so hopefully this will make sense. My idea was that the GPU tasks should be isolated from the CPU tasks to avoid any contention. As has already been posted, this is partially accomplished by "reserving" cores: reduce the maximum CPU% available to the BOINC client by converting the fraction 1/cores to a percentage and subtracting it from 100%. In my case, with 6 cores, 1/6 is about 16%, so I set the maximum CPU utilization to 84%. If you watch Task Manager after making this change, it will appear that all cores are loaded but none is 100% busy.

The next part uses the concept of process/processor affinity. In simple terms, this means "gluing" a task/program/process to a CPU core. You can see this in Task Manager: go to the "Processes" tab, right-click a process and select "Set affinity" from the popup menu. There will be a check box for each core, and you change the CPU/core affinity by changing the check boxes. On my system a GPU WU takes about 1 minute, but CPU WUs run for about an hour. The "easiest" way to make a difference is to deselect core zero for all the CPU tasks, since they run the longest. This should produce a nice 100% busy line for all the CPU WU cores. The ever-changing GPU WUs will not prefer any core, but since all the cores except zero are busy, they will run on core zero.

I've been working with my son to write a C# program that changes a task's processor affinity or, to put it another way, automates the process I've just described. The program reads the list of active processes and changes the processor affinity of GPU ("OpenCL") tasks to "prefer" processor zero. It also changes the CPU tasks to "avoid" processor zero. It's cool to watch this in Task Manager as the "curve" changes from random ups and downs on all 6 cores to core zero being random and the other 5 cores being straight lines at 100%. With the different workloads isolated to different processor cores and with "nothing" else running on the system, it's easy to see that a GPU WU still only uses about 5% of a processor core. Given the CPU load I'm seeing, I suggest it might be possible, even with a 79x0, to use 1 core to keep that beast busy, provided the GPU pre/post-processing task always has access to CPU time through processor affinity.
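The C# program itself is not shown in this thread, so the following is only a rough Python sketch of the decision logic as described: tasks whose executable name contains "opencl" go to core zero, everything else goes to the remaining cores. The function names and the CPU task name are hypothetical, and the code only builds the plan. Applying it would need an OS call (for example the cpu_affinity method of the third-party psutil package), which is left out here.

```python
def core_mask(cores) -> int:
    """Bitmask form of a core list, as used by affinity APIs (bit 0 = core 0)."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask

def plan_affinity(task_names, total_cores: int, gpu_marker: str = "opencl") -> dict:
    """Map each BOINC task executable to the cores it should be allowed on.

    GPU tasks (name contains `gpu_marker`) "prefer" core 0;
    CPU tasks "avoid" core 0 and get all the other cores.
    """
    if total_cores < 2:
        raise ValueError("need at least two cores to separate the workloads")
    gpu_cores = [0]
    cpu_cores = list(range(1, total_cores))
    plan = {}
    for name in task_names:
        is_gpu_task = gpu_marker in name.lower()
        plan[name] = gpu_cores if is_gpu_task else cpu_cores
    return plan

if __name__ == "__main__":
    tasks = [
        "milkyway_separation_1.02_windows_intelx86__opencl_amd_ati.exe",
        "example_cpu_project_app.exe",  # hypothetical CPU task name
    ]
    for name, cores in plan_affinity(tasks, total_cores=6).items():
        print(name, cores, bin(core_mask(cores)))
```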

I've had this working for about 2 weeks now and my average MW@H scores have jumped from about 180,000 credits a day to close to 215,000 credits a day. We're considering generalizing this, making it more user friendly, making it more cross-project usable, and posting it for general availability. Is anyone interested in using a tool like this?

Is there anyone out there who uses NVIDIA and/or multiple GPUs and might be willing to help us test?
Sabroe_SMC
Joined: 2 Aug 08
Posts: 24
Credit: 374,440,641
RAC: 0
Message 54046 - Posted: 16 Apr 2012, 10:57:16 UTC - in response to Message 54040.  
Last modified: 16 Apr 2012, 10:58:05 UTC

I think a program like the one you are writing already exists: http://bitsum.com/prolasso.php
Greetz to all
kashi

Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 54048 - Posted: 16 Apr 2012, 11:16:34 UTC

I use Process Lasso, as mentioned by Sabroe_SMC, to set the processor affinity of applications when I need to. I used it in the past with my HD 5970 on PrimeGrid, because the earlier Catalyst drivers had the busy-wait bug with dual-core GPUs, so each GPU task would use a full CPU core although it only needed about half a CPU core to run efficiently. I used Process Lasso to assign the ppsieve OpenCL application to one CPU core, so both GPU cores were supported by one CPU core instead of one each. This freed a CPU core for CPU projects. For the CPU project applications I set the affinity to the other 7 CPU cores.
ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 54050 - Posted: 16 Apr 2012, 20:44:40 UTC - in response to Message 54040.  

I was running my OC'ed HD6970 here, so also plenty of horsepower. All it took to keep it busy was to reserve one core and run 2 WUs concurrently. The drawback is that this requires an app_info.xml, i.e. in case of a client update you're not getting it automatically.

Besides achieving permanent 99-100% GPU utilization (after some time to settle in) during a WU, the ~3 s of CPU time at the end of each WU are not wasted on the GPU (it just continues to crunch the other WU in the meantime). With your method and without an app_info you can also achieve maximum GPU utilization during each WU, but you can't work around the CPU time at the end. However, if you use your method with an app_info and 2 concurrent WUs, I cannot see any benefit over just reserving one core.

MrS
Scanning for our furry friends since Jan 2002
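To put a rough number on the point about the ~3 s of CPU time at the end of each WU, here is a toy Python model. It assumes the GPU sits idle during that tail when WUs run one at a time, and that a second concurrent WU hides the tail completely. The 60 s of GPU time per WU is a made-up figure for illustration; only the 3 s comes from the post above.

```python
def gpu_utilization_single(gpu_seconds: float, cpu_tail_seconds: float) -> float:
    """Fraction of wall time the GPU is busy when WUs run one after another."""
    return gpu_seconds / (gpu_seconds + cpu_tail_seconds)

def wus_per_hour(gpu_seconds: float, cpu_tail_seconds: float, overlap: bool) -> float:
    """Throughput with (overlap=True) or without a second WU hiding the CPU tail."""
    wall_per_wu = gpu_seconds if overlap else gpu_seconds + cpu_tail_seconds
    return 3600 / wall_per_wu

if __name__ == "__main__":
    gpu_s, tail_s = 60.0, 3.0  # 60 s is hypothetical, 3 s is from the post
    print(f"utilization, one WU at a time: {gpu_utilization_single(gpu_s, tail_s):.1%}")
    print(f"WUs/hour, one at a time:       {wus_per_hour(gpu_s, tail_s, False):.2f}")
    print(f"WUs/hour, two concurrent:      {wus_per_hour(gpu_s, tail_s, True):.2f}")
```

The shorter the WU, the larger the share of time lost to the tail, so under this model fast cards gain the most from overlapping.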


©2024 Astroinformatics Group