Welcome to MilkyWay@home

Nvidia tasks not equal to AMD performance...


melk
Joined: 10 Dec 17
Posts: 47
Credit: 652,951,670
RAC: 45,524
Message 67744 - Posted: 29 Aug 2018, 23:41:15 UTC

It appears that the Nvidia OpenCL tasks are not as efficient as the AMD OpenCL tasks?

I have a Titan Black card, which on paper has very strong FP64 performance, around 1500 GFLOPS, which is better than the AMD 280X at around 1000 GFLOPS.

My various 280X cards return anywhere between 360,000 - 460,000 credit each.

The Titan Black is currently earning about 167,000 credit. Disappointing :(

The card is at steady 99% utilization, running 4 WU at once. There are no Error or Invalid tasks.

I have to assume this is just due to the efficiency of the tasks and how they are coded? Or maybe something else? Any ideas? Thanks!
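
To put the gap described above into numbers, here is a minimal Python sketch that divides the quoted credit by the quoted paper FP64 throughput for each card. It assumes the Titan Black figure is meant as 167,000 and that all credit figures cover the same time period; the dictionary layout and the helper name `credit_per_gflops` are illustrative only.

```python
# Figures as quoted in the post. Reading the Titan Black credit as 167,000
# is an assumption, as is treating all figures as covering the same period.
cards = {
    "Titan Black": {"fp64_gflops": 1500, "credit": 167_000},
    "R9 280X (low)": {"fp64_gflops": 1000, "credit": 360_000},
    "R9 280X (high)": {"fp64_gflops": 1000, "credit": 460_000},
}


def credit_per_gflops(card):
    """Credit earned per GFLOPS of paper FP64 throughput."""
    return card["credit"] / card["fp64_gflops"]


ratios = {name: credit_per_gflops(card) for name, card in cards.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f} credit per FP64 GFLOPS")

# How much more credit the 280X earns per paper GFLOPS than the Titan Black.
gap_low = ratios["R9 280X (low)"] / ratios["Titan Black"]
gap_high = ratios["R9 280X (high)"] / ratios["Titan Black"]
print(f"280X advantage per paper GFLOPS: {gap_low:.1f}x to {gap_high:.1f}x")
```

Under those assumptions the 280X earns roughly three to four times as much credit per paper GFLOPS, which is the size of the discrepancy in question.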

Keith Myers
Joined: 24 Jan 11
Posts: 230
Credit: 110,638,317
RAC: 87,721
Message 67745 - Posted: 30 Aug 2018, 3:27:27 UTC

A task is just a slice of the dataset. The science application is what processes the task. You have only discovered that the Nvidia OpenCL application is not as optimized as the AMD OpenCL application. Nothing more. Just the differences in compiler optimizations or possibly OpenCL libraries. A task is just a task. No difference between crunching it on Nvidia hardware or AMD hardware other than the efficiency of the science applications themselves.

mikey
Joined: 8 May 09
Posts: 2247
Credit: 322,807,865
RAC: 848,185
Message 67746 - Posted: 30 Aug 2018, 11:35:06 UTC - in response to Message 67745.

Quote from Message 67745:
"A task is just a slice of the dataset. The science application is what processes the task. You have only discovered that the Nvidia OpenCL application is not as optimized as the AMD OpenCL application. Nothing more. Just the differences in compiler optimizations or possibly OpenCL libraries. A task is just a task. No difference between crunching it on Nvidia hardware or AMD hardware other than the efficiency of the science applications themselves."


Yes, you are right. In the beginning Nvidia offered help optimizing the code for BOINC projects while AMD declined, so the Nvidia code is much better at most projects. In most cases the coding was done by volunteers or existing coders, and they did what they could to get things up and running. With the lack of money, and projects not sharing coders, it just is what it is.

bill1024
Joined: 13 Feb 14
Posts: 2
Credit: 42,466,155
RAC: 65,239
Message 67749 - Posted: 31 Aug 2018, 1:35:05 UTC - in response to Message 67744.

My Titan 6GB can do 8 tasks at once in 300-325 seconds per task, at 227 points each for most of the tasks. I would say the average task is 310 seconds.
I have DP turned on in the Nvidia control settings.

That was a lot more than my R9-280x was able to do. Would think the Black would be just as good.

bill1024
Joined: 13 Feb 14
Posts: 2
Credit: 42,466,155
RAC: 65,239
Message 67750 - Posted: 31 Aug 2018, 2:10:52 UTC - in response to Message 67749.

Just opened Afterburner and slid the voltage and power sliders all the way over, and also gave it +90. Now my times are down to 272-290 seconds, 280 average or so I would say.
Points are still mostly 227.62 with a 203.92 now and then.
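
The runtimes in this post and the previous one can be turned into a rough daily credit estimate. The Python sketch below is only an approximation: it assumes 8 concurrent tasks in both cases, that every task awards 227.62 points (the occasional 203.92 task is ignored), and that the card never sits idle between tasks. The function name `daily_credit` is made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def daily_credit(concurrent_tasks, seconds_per_task, credit_per_task):
    """Estimate credit per day for a GPU running tasks back to back."""
    tasks_per_day = concurrent_tasks * SECONDS_PER_DAY / seconds_per_task
    return tasks_per_day * credit_per_task


# Assumed inputs: 8 tasks at once, 227.62 points per task in both cases.
stock = daily_credit(8, 310, 227.62)        # average runtime before the change
overclocked = daily_credit(8, 280, 227.62)  # average runtime after the change

# Relative throughput gain from the shorter runtime.
gain = overclocked / stock - 1

print(f"stock:       {stock:,.0f} credit/day")
print(f"overclocked: {overclocked:,.0f} credit/day")
print(f"gain:        {gain:.1%}")
```

With these inputs the estimate lands a little above 500,000 credit per day at 310 seconds per task, and the drop to 280 seconds is worth a gain of about 11%.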

Keith Myers
Joined: 24 Jan 11
Posts: 230
Credit: 110,638,317
RAC: 87,721
Message 67759 - Posted: 31 Aug 2018, 16:23:36 UTC

A case in point for the science applications having the only real impact on the production of credit was just seen at SETI and the just-finished WoW-Event contest.

The highest producers of credit were the hosts running a Linux platform and a new science application developed by our volunteer coders. The CUDA9.2 application users dominated the contest after two weeks and occupied the top 20 finishing positions. I myself improved on last year's finishing position, from 13th place to 3rd place. My Team finished in 1st place and my Group placed 3rd.

The newest science application runs only on Nvidia with the latest drivers. It produces 10X more credit and runs 10X faster than the science applications written for Intel or AMD GPUs, even with the effects of BOINC's CreditNew credit mechanism so evident.

So don't blame the hardware, blame the credit mechanism and the science application developers for the lackluster credit awarded to your Nvidia Titan.

melk
Joined: 10 Dec 17
Posts: 47
Credit: 652,951,670
RAC: 45,524
Message 67761 - Posted: 1 Sep 2018, 0:42:03 UTC

Maybe I did not choose the best words. I am not blaming anything, just trying to understand the large discrepancy in credit.

It would appear it boils down to the coding efficiency of the task applications themselves. Fair enough.

mmonnin
Joined: 2 Oct 16
Posts: 136
Credit: 135,506,573
RAC: 406,711
Message 67762 - Posted: 1 Sep 2018, 18:17:24 UTC

Or it could be NV's implementation of OpenCL. AMD GPUs running OpenCL typically use very little CPU, but the same app on NV cards can use a full CPU thread.
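
One practical consequence when several tasks share a card: if each Nvidia OpenCL task really does occupy a full CPU thread, the BOINC client can be told to budget for that. The Python sketch below builds an `app_config.xml` using BOINC's `gpu_usage` and `cpu_usage` elements. The app name `milkyway`, the helper name `build_app_config`, and the chosen values (4 tasks per GPU, one full CPU thread each) are assumptions for illustration and should be checked against the project's own application list.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, tasks_per_gpu, cpu_per_task):
    """Return app_config.xml text that shares one GPU among several tasks."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Fraction of a GPU each task claims: 0.25 lets four tasks share one card.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(1 / tasks_per_gpu)
    # CPU threads budgeted per GPU task.
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_per_task)
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# "milkyway" is an assumed app name, not confirmed anywhere in this thread.
config = build_app_config("milkyway", tasks_per_gpu=4, cpu_per_task=1.0)
print(config)
```

The resulting text would go in the project's directory under the BOINC data folder as `app_config.xml`. On an AMD card a much smaller `cpu_per_task` would likely be enough, given the lower CPU use described above.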

Keith Myers
Joined: 24 Jan 11
Posts: 230
Credit: 110,638,317
RAC: 87,721
Message 67763 - Posted: 1 Sep 2018, 19:07:52 UTC - in response to Message 67762.

That might be it. AMD is now fully behind open source and is working with the Vulkan development team so they might be more focused on interoperability rather than speed of OpenCL applications.

I also think that whether you overclock, or at least run your GPU memory close to P0 speeds rather than P2 speeds, has the most effect on task runtimes.

mmonnin
Joined: 2 Oct 16
Posts: 136
Credit: 135,506,573
RAC: 406,711
Message 67764 - Posted: 2 Sep 2018, 3:49:18 UTC

The SETI instance is more of an exception. Increasing the core clock is pretty much always better than increasing the memory clock.

©2019 Astroinformatics Group