How to reduce VRAM usage for NVIDIA GPU tasks?

Message boards : Number crunching : How to reduce VRAM usage for NVIDIA GPU tasks?

Cautilus
Joined: 29 Jul 14
Posts: 9
Credit: 651,014,695
RAC: 591,591

Message 67084 - Posted: 15 Feb 2018, 7:20:56 UTC

So I have a TITAN V that I want to use on this project while it's not doing anything important. The problem is that I can't max out its usage by running more WUs simultaneously, because I max out the VRAM on the TITAN and all of the work units end in 'computation error'. I can run about 8 or so WUs simultaneously if I micromanage them so they don't hit 12 GB of VRAM usage, but surely there's a way to set the WUs to use less VRAM somehow, right?
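For anyone hitting the same wall, the arithmetic behind the micromanaging is simple enough to sketch. This assumes the roughly 1.5 GB-per-WU figure reported further down the thread; the safety margin is an illustrative choice, not an actual project setting:

```python
# Back-of-the-envelope: how many Milkyway WUs fit in VRAM at once?
# 12 GB card and ~1.5 GB per WU are the figures from this thread;
# safety_margin_gb is a hypothetical buffer so tasks don't error
# out right at the ceiling.
def max_concurrent_wus(vram_gb, per_wu_gb, safety_margin_gb=0.5):
    """Return how many WUs fit in usable VRAM, leaving some headroom."""
    usable = vram_gb - safety_margin_gb
    return int(usable // per_wu_gb)

print(max_concurrent_wus(12.0, 1.5))       # 7 WUs with a 0.5 GB margin
print(max_concurrent_wus(12.0, 1.5, 0.0))  # 8 WUs with no margin at all
```

With no margin you land exactly on the 8 WUs that were erroring out, which is why a small buffer matters.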

mmonnin
Joined: 2 Oct 16
Posts: 102
Credit: 81,199,642
RAC: 6

Message 67085 - Posted: 15 Feb 2018, 11:32:47 UTC
Last modified: 15 Feb 2018, 11:35:46 UTC

Don't run 8x...

I am running 4x with 330 MB of memory usage. Is it really so much higher on NV cards?

Is that really 70-90 seconds at 8x?

Cautilus
Joined: 29 Jul 14
Posts: 9
Credit: 651,014,695
RAC: 591,591

Message 67086 - Posted: 15 Feb 2018, 15:24:46 UTC

Yeah look, trust me, I'd probably need 16 WUs running simultaneously to saturate the TITAN V's FP64. I know that on 280Xs the VRAM usage is significantly lower; for some reason, on the TITAN each WU uses about 1.5 GB of VRAM. I'm not sure if this is because of the new architecture or if it's just an NVIDIA thing. The WUs process in about 55 - 65 seconds even with 10 WUs running simultaneously, and that's still peaking at only 70 - 75% usage, indicating there's still headroom left.

mmonnin
Joined: 2 Oct 16
Posts: 102
Credit: 81,199,642
RAC: 6

Message 67089 - Posted: 15 Feb 2018, 22:54:25 UTC

That's ridiculous. They probably complete too fast to keep it busy. A longer task, like the individual pieces of the bundle, would probably suit that card very well. Then it could probably stay busy with fewer tasks.

I don't run MW on any of my Pascal/Maxwell cards cause... well, they suck at FP64, ha. I guess it would be NV's implementation of the OpenCL code, or one NV card vs. another. Just a guess.

Keith Myers
Joined: 24 Jan 11
Posts: 160
Credit: 104,244,493
RAC: 26,297

Message 67094 - Posted: 16 Feb 2018, 20:16:43 UTC - in response to Message 67084.
Last modified: 16 Feb 2018, 20:18:35 UTC

Generally, most current OpenCL applications are limited to 25% of the VRAM on graphics cards. So only approximately 3 GB of the 12 GB of VRAM on your Titan V is accessible for MW tasks to use.

If and when applications start using the OpenCL 2.0 specification, which opens up the global_work_size memory space, then you would be able to fully access the 12 GB.

From the Nvidia driver release notes:


Experimental OpenCL 2.0 Features

Select features in OpenCL 2.0 are available in the driver for evaluation purposes only. The following are the features, as well as a description of known issues with these features in the driver:

Device side enqueue
• The current implementation is limited to 64-bit platforms only.
• OpenCL 2.0 allows kernels to be enqueued with global_work_size larger than the compute capability of the NVIDIA GPU. The current implementation supports only combinations of global_work_size and local_work_size that are within the compute capability of the NVIDIA GPU. The maximum supported CUDA grid and block size of NVIDIA GPUs is available at http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#computecapabilities. For a given grid dimension, the global_work_size can be determined by CUDA grid size × CUDA block size.
• For executing kernels (whether from the host or the device), OpenCL 2.0 supports non-uniform ND-ranges where global_work_size does not need to be divisible by local_work_size. This capability is not yet supported in the NVIDIA driver, and is therefore not supported for device-side kernel enqueues.

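As a side note, the 25% figure likely isn't arbitrary: the OpenCL spec only requires CL_DEVICE_MAX_MEM_ALLOC_SIZE (the largest single buffer an app may allocate) to be at least max(global memory / 4, 128 MB), and NVIDIA's driver has historically reported exactly that minimum. A quick illustrative sketch of the arithmetic (plain Python, no actual OpenCL calls):

```python
# Sketch of the OpenCL spec's minimum for CL_DEVICE_MAX_MEM_ALLOC_SIZE:
# the largest single buffer must be at least max(global_mem / 4, 128 MiB).
# Illustrative arithmetic only, not a driver query.
MIB = 1024 * 1024

def min_max_alloc(global_mem_bytes):
    """Spec-minimum value a driver may report for the max allocation size."""
    return max(global_mem_bytes // 4, 128 * MIB)

titan_v_vram = 12 * 1024 * MIB             # 12 GiB of VRAM
print(min_max_alloc(titan_v_vram) // MIB)  # 3072 MiB, i.e. the ~3 GB figure
```

Note this caps each individual allocation, not the card's total usage, which would be consistent with several tasks together still filling all 12 GB.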

Cautilus
Joined: 29 Jul 14
Posts: 9
Credit: 651,014,695
RAC: 591,591

Message 67101 - Posted: 18 Feb 2018, 3:43:06 UTC
Last modified: 18 Feb 2018, 3:44:00 UTC

Well, maybe that's how it's set out in NVIDIA's guidelines, but Milkyway still allocates up to 12 GB of VRAM. Here's a graph from HWiNFO64 that shows my VRAM usage with 8 WUs running simultaneously, clearly above the 3 GB threshold (the Y-axis runs from 0 to 12,500 MB of VRAM allocation).

https://i.imgur.com/665amcH.png

mmonnin
Joined: 2 Oct 16
Posts: 102
Credit: 81,199,642
RAC: 6

Message 67102 - Posted: 19 Feb 2018, 0:22:14 UTC

Maybe it's 3 GB per task, then. Otherwise, why put 4+ GB on any AMD card?

mikey
Joined: 8 May 09
Posts: 2201
Credit: 250,014,553
RAC: 91,778

Message 67107 - Posted: 19 Feb 2018, 16:36:43 UTC - in response to Message 67102.

Maybe it's 3 GB per task, then. Otherwise, why put 4+ GB on any AMD card?


Because crunching is NOT their primary market, it's gaming, and games can access it all. Building supercomputers is a big market share too, and they can access it all as well.

Chooka
Joined: 13 Dec 12
Posts: 48
Credit: 101,772,329
RAC: 19,461

Message 67132 - Posted: 23 Feb 2018, 20:18:06 UTC

A Titan V as in the $3000 Volta card Cautilus???
Jeezus. I thought I had the BOINC bug bad...LOL.

mmonnin
Joined: 2 Oct 16
Posts: 102
Credit: 81,199,642
RAC: 6

Message 67133 - Posted: 23 Feb 2018, 22:12:22 UTC - in response to Message 67132.

A Titan V as in the $3000 Volta card Cautilus???
Jeezus. I thought I had the BOINC bug bad...LOL.


Yes, two of them. NV put a hefty price tag on their top compute card.

Chooka
Joined: 13 Dec 12
Posts: 48
Credit: 101,772,329
RAC: 19,461

Message 67134 - Posted: 24 Feb 2018, 8:55:14 UTC - in response to Message 67133.

WOW... that has a staggering DP score: 6144.0 (7449.6).
Crazy! (Guess it comes with a crazy price tag too.)

Cautilus couldn't do any better to try and catch Gary Roberts ;)




Copyright © 2018 AstroInformatics Group