Welcome to MilkyWay@home

HD5870 dead - help with replacement!


Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 50804 - Posted: 23 Aug 2011, 11:09:43 UTC - in response to Message 50378.

A 6950 will consistently outperform - when measured across a broad range of tasks - a 5870 by around 10%. The margin for a 6970 is around 20%.
In theory this should be true. However, I have two workstations executing identical series WU's and the HD5870 (Win/XP) is consistently running away from the HD6970 (WIN7 64-bit). MW seems to have a hard time keeping the HD6970 more than 75% busy even if I prevent BOINC from using more than 90% of the processor cores available. The HD5870 is running about 96% busy with no restrictions on processor core use.

It is a puzzlement...

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 50805 - Posted: 23 Aug 2011, 14:35:59 UTC - in response to Message 50804.

MW seems to have a hard time keeping the HD6970 more than 75% busy even if I prevent BOINC from using more than 90% of the processor cores available

Bit puzzled at that one ....

"even if I.."

Why would you want to?

Regards
Zy

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 503,249,851
RAC: 35,341
Message 50807 - Posted: 23 Aug 2011, 14:49:58 UTC - in response to Message 50804.

In theory this should be true. However, I have two workstations executing identical series WU's and the HD5870 (Win/XP) is consistently running away from the HD6970 (WIN7 64-bit). MW seems to have a hard time keeping the HD6970 more than 75% busy even if I prevent BOINC from using more than 90% of the processor cores available. The HD5870 is running about 96% busy with no restrictions on processor core use.

DL Process Lasso, install it and configure the default priority of MW to "Above normal". MW should now run at 98-99% and will still use very little CPU. Running 2 WUs at a time may help too.

Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 50811 - Posted: 23 Aug 2011, 20:34:32 UTC - in response to Message 50807.

DL Process Lasso, install it and configure the default priority of MW to "Above normal". MW should now run at 98-99%...
I have previously tried setting the MW ATI app to "above normal" or "high" with Task Manager, with no visible improvement at all.

Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 50812 - Posted: 23 Aug 2011, 20:35:31 UTC - in response to Message 50805.

Why would you want to?
Several people have claimed on here that leaving a whole core available for MW ATI app results in improved throughput. I'm not seeing it...

ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 50813 - Posted: 23 Aug 2011, 20:53:06 UTC - in response to Message 50812.

Several people have claimed on here that leaving a whole core available for MW ATI app results in improved throughput. I'm not seeing it...

Absolutely. With all cores loaded I'm seeing either full utilization or ~75% on my HD6950 with unlocked shaders in Win 7 64. It somehow depends on the CPU projects, but I haven't been able to identify which one is to blame.

If I leave one logical core of my i7 free I'm getting consistent 98-99% utilization. I'm using
app_info.xml wrote:
<cmdline>--process-priority 3 --gpu-disable-checkpointing</cmdline>

but I don't think it matters much here. Feel free to check out the performance of that card - that's totally impossible for a Cypress (without liquid nitrogen cooling). What you're seeing is some weird software problem not restricted to Cayman. I suppose you'd see similar figures, just the other way around, if you switched the cards between the PCs.

MrS
Scanning for our furry friends since Jan 2002

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 50814 - Posted: 23 Aug 2011, 21:05:42 UTC - in response to Message 50813.

Several people have claimed on here that leaving a whole core available for MW ATI app results in improved throughput. I'm not seeing it...


Depends hugely on the CPU apps you run at the same time. If those apps don't utilise all the CPU then there is not much effect; if they run at 100% CPU, there is a big effect. There is no one-size-fits-all rule. If I am running CPU apps as well as GPU, I will leave at least one CPU core free, sometimes more, as you also need to take into account the CPU usage of the GPU app you are running.

However, you originally said you restricted BOINC to 90%. I am still mystified by that: nothing gets done if BOINC does not have access, and output is reduced if its access is restricted.

Regards
Zy

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 50816 - Posted: 23 Aug 2011, 23:01:47 UTC

Your Cypress is in a 4-core machine.
Your Cayman is in a 24(!) core machine.


Test: Leave 20 cores unused and see how the Cayman performs now. It should now be fully utilized. Then slowly use more cores until Task Manager shows overall CPU usage close to (but below) 100% or GPU utilization drops. Which one comes first? How many cores is BOINC allowed to use at that point?

ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 50827 - Posted: 24 Aug 2011, 21:01:55 UTC

Wow, 2 6-core Xeons for 24 threads overall. That's a lot of thread juggling. Could it be that your MW CPU process is starved for memory bandwidth? As Len already suggested, I'd vary BOINC CPU usage to find the point up to which GPU usage stays good (using interval halving, i.e. start with 0 cores, then 12, and then 6 or 18 depending on whether 12 is still good).
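
The interval-halving search described here can be sketched roughly as follows. This is only an illustration: the function name and the `gpu_ok` callback are hypothetical, and it assumes the GPU is fine with 0 cores in use and that utilization, once it drops, stays low for every larger core count.

```python
def max_cores_with_good_gpu(total_cores, gpu_ok):
    """Return the largest core count in [0, total_cores] for which gpu_ok(n) is True.

    Assumes gpu_ok(0) is True and that the result is monotonic:
    once GPU usage drops, it stays dropped for all larger core counts.
    """
    lo, hi = 0, total_cores  # lo is known good; hi is the largest candidate
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the interval always shrinks
        if gpu_ok(mid):
            lo = mid       # still good: search the upper half
        else:
            hi = mid - 1   # too many cores: search the lower half
    return lo


# Hypothetical stand-in for a manual measurement: pretend the GPU
# stays well utilized as long as BOINC uses at most 22 cores.
tested = []

def fake_measurement(n):
    tested.append(n)
    return n <= 22

best = max_cores_with_good_gpu(24, fake_measurement)
print(best, tested)
```

In practice each call to `gpu_ok` would be a manual step: set the BOINC core limit, wait a while, and read the GPU load.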

MrS
Scanning for our furry friends since Jan 2002

Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 50840 - Posted: 26 Aug 2011, 13:24:40 UTC - in response to Message 50816.

Test: Leave 20 cores unused and see how the Cayman performs now. It should now be fully utilized.
Even with BOINC allowed to use only 10% of the cores, the GPU was still running around 75%. Utterly bizarre.

One thing that is noticeably different with this machine is the NUMA memory access mode is enabled by default in the BIOS. I tried disabling it. Now everything is running fine!

With BOINC allowed to use 95% of the processor cores, I get 98% GPU usage with 22 CPU WU's running.

At 96% of the processor cores, GPU drops to 94% usage with 23 CPU WU's running.

At 100% of the processor cores, GPU drops to 65% (!).
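
The task counts above are consistent with the percentage being applied to the 24 logical cores and rounded down. A quick check of that arithmetic, where the rounding rule is an assumption rather than something confirmed from the BOINC source:

```python
def usable_cores(total_cores, percent):
    # Assumption: the limit is total_cores * percent / 100, rounded down,
    # with at least one core always allowed.
    return max(1, total_cores * percent // 100)


for pct in (95, 96, 100):
    print(pct, usable_cores(24, pct))
```

Under that assumption 95% gives 22 cores and 96% gives 23, matching the reported 22 and 23 CPU WUs, while 100% leaves no logical core free to feed the GPU.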

ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 50841 - Posted: 26 Aug 2011, 20:01:01 UTC - in response to Message 50840.

Wow, that's highly unexpected! NUMA stands for "non-uniform memory access", meaning that one Xeon can access the memory of the other one via the quick path link, right? That increases average memory latency and improves overall bandwidth. I don't see how this could slow a GPU down that much. Maybe a bug in the BIOS? Is anything else running faster now?

MrS
Scanning for our furry friends since Jan 2002

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 50845 - Posted: 27 Aug 2011, 0:39:37 UTC

Reading http://msdn.microsoft.com/en-us/library/aa363804(v=vs.85).aspx it seems a program needs to be optimized to use NUMA. Otherwise the traditional memory access model can be faster when running many different programs that are not optimized for NUMA. On the other hand, a multithreaded app using most/all cores can be faster with NUMA even if it is unaware of it.

ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 50854 - Posted: 28 Aug 2011, 11:59:16 UTC - in response to Message 50845.

The GPU needs assistance every few ms, whereas in NUMA we're talking about a difference of a few tens of ns in access times to non-local memory. Even in the slow case that's still 5 orders of magnitude faster than what the GPU needs.
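
The "5 orders of magnitude" figure can be checked with round numbers. The two values below are illustrative picks for "every few ms" and "a few tens of ns", not measurements:

```python
import math

gpu_interval_s = 1e-3    # assumed: GPU needs CPU attention about every 1 ms
numa_penalty_s = 10e-9   # assumed: about 10 ns extra latency for non-local memory

ratio = gpu_interval_s / numa_penalty_s
orders = round(math.log10(ratio))
print(orders)
```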

So I'm still puzzled. The only possible reason for this performance difference that I can think of is the following: in the old app Gipsel estimated when the GPU might need the next CPU intervention. This was pretty accurate, but he also included an option to scale the calculated wait times by some factor. If the CPU needs much longer for its calculations than expected, the calculated time point when the CPU thread is being woken up might be too late, leading to a GPU starving for data. In this case a simple "scale wait times by 0.95" should do the trick.

Or something else: is MS using a coarser granularity for the scheduler in multiprocessor systems with NUMA enabled? Let's say 5 ms instead of 1 ms (standard)? In that case I could see how performance would drop, as the CPU thread might get activated too late.
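
A toy model of that idea, to show how a coarser wake-up granularity could cut GPU load: assume the CPU thread can only react at whole multiples of the granularity after it launches a kernel, so the GPU idles from the end of the kernel until the next wake-up. The kernel durations below are made up for illustration, and this is not how the MW app is known to behave.

```python
import math


def gpu_utilization(kernel_ms, wake_granularity_ms):
    """Fraction of time the GPU is busy when the CPU thread only wakes
    at multiples of wake_granularity_ms after starting a kernel."""
    # The next kernel starts at the first wake-up at or after the kernel ends.
    cycle_ms = math.ceil(kernel_ms / wake_granularity_ms) * wake_granularity_ms
    return kernel_ms / cycle_ms


for granularity_ms in (1, 5):
    for kernel_ms in (3, 4):  # hypothetical kernel run times in ms
        print(granularity_ms, kernel_ms, gpu_utilization(kernel_ms, granularity_ms))
```

With a 1 ms granularity both hypothetical kernels keep the GPU fully busy; with 5 ms the same kernels leave it idle for part of every cycle.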

MrS
Scanning for our furry friends since Jan 2002

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 50860 - Posted: 28 Aug 2011, 20:20:02 UTC

The current v0.82 is doing an endless loop with 1 ms sleep time in between.
That's with the default of 1 for polling mode.
(The next released version should have a working 'initialwait'. The code change for it has been in the source since mid-June.)

The MSDN paper didn't say anything about differences in the scheduler if you change the memory access model between NUMA and shared.
This leaves NUMA with NUMA-unaware apps as the reason for the slowdown; a closer investigation of why NUMA is slower in those cases should be interesting. I would look into what happens if an app is switching between cores from one ms scheduler time slice to the next.

Would be interesting to see if running the app in 'busy wait' (polling of -1) would help in NUMA; maybe together with process priority of 3 instead of default 2. This should keep a full core busy full time and maybe prevent changing cores.
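
For readers unfamiliar with the two waiting styles, here is a generic sketch of a sleep-based polling loop versus a busy wait. It is not the MW app's code; the function names and the `is_done` callback are hypothetical.

```python
import time


def wait_until_done(is_done, sleep_s=0.001):
    """Poll is_done() until it returns True and return the number of polls.

    sleep_s > 0   : sleep between polls (little CPU use, but the wake-up
                    time depends on the OS timer and scheduler)
    sleep_s None  : busy wait (keeps one core fully occupied, reacts at once)
    """
    polls = 0
    while True:
        polls += 1
        if is_done():
            return polls
        if sleep_s is not None:
            time.sleep(sleep_s)


def finishes_after(n_polls):
    """Hypothetical stand-in for a GPU job that completes on the n-th poll."""
    state = {"calls": 0}

    def is_done():
        state["calls"] += 1
        return state["calls"] >= n_polls

    return is_done


print(wait_until_done(finishes_after(3)))        # polling with 1 ms sleeps
print(wait_until_done(finishes_after(3), None))  # busy wait
```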

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 50864 - Posted: 29 Aug 2011, 0:36:55 UTC - in response to Message 50402.

I lost the fan and video out on my Sapphire HD5830, it is currently somewhere in California, maybe already delivered. I had to stick my old HD4830 in for the time being.

But I do have a GTX560 coming tomorrow and it will replace the GTX460 and the 460 will replace the 4830 for the time being.


Currently, I have the HD5830 installed and running so I can get my RAC up a little faster than what I was getting with the OpenCL app for Nvidia on the GTX560.

Once I get the 850 watt power supply, I will be able to run them both at the same time. They are both currently in there, the 560 just does not have any 6-pin connectors.

David Steel
Joined: 4 Sep 10
Posts: 5
Credit: 13,384,800
RAC: 0
Message 50882 - Posted: 30 Aug 2011, 13:20:46 UTC - in response to Message 50864.

Can anyone tell me how the GTX 590 goes on MW?

It’s now been over 10 weeks and I'm still waiting for XFX to sort out my faulty card.
Also, any new replacement card I might receive will mirror the warranty of the original card I bought, so at best I will have about 2 1/2 months warranty remaining!
As far as I’m concerned, a company that does this to its customers doesn't deserve anyone’s business.
I will never buy from XFX again and will tell as many people as I know to do the same.
Imagine it was a car, or mobile phone, or fridge, would or could anyone wait over 10 weeks for repairs???
They can shove their card where the sun doesn’t shine...

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 50885 - Posted: 30 Aug 2011, 13:50:52 UTC - in response to Message 50882.

Nvidia is still slow on Milkyway; my HD5830 is still about twice as fast on MW as my GTX560.

ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 50890 - Posted: 30 Aug 2011, 20:55:06 UTC - in response to Message 50882.

Imagine it was a car, or mobile phone, or fridge, would or could anyone wait over 10 weeks for repairs???

OT: a colleague of mine has been waiting for a year for his new car. It's some Kia, so maybe not the best idea in the first place anyway..

MrS
Scanning for our furry friends since Jan 2002

BladeD
Joined: 2 Nov 10
Posts: 731
Credit: 131,536,342
RAC: 0
Message 50893 - Posted: 31 Aug 2011, 0:02:45 UTC - in response to Message 50890.

Imagine it was a car, or mobile phone, or fridge, would or could anyone wait over 10 weeks for repairs???

OT: a colleague of mine has been waiting for a year for his new car. It's some Kia, so maybe not the best idea in the first place anyway..

MrS

Waiting for a new car is different than waiting to get a car repaired.