Welcome to MilkyWay@home

Amount of CPU time on GPU WUs

Wailing Angus Beef

Joined: 24 Dec 07
Posts: 33
Credit: 1,918,711,579
RAC: 3,343
Message 71574 - Posted: 2 Jan 2022, 22:18:47 UTC

What controls the amount of CPU time used for the GPU WUs?

I have a P100 in a system running Linux Mint 20.1 with the 470.86 driver, and it uses practically a full CPU thread for each WU. Here are the host's tasks: https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=906244

I also have a Titan V running in a nearly identical system, also Linux Mint 20.1 with the 470.86 driver, but it only uses about 1/3 of a CPU thread. Why? I understand the differences between the two GPUs, but what controls how much CPU is used?

I've also noticed the Titan V uses up to 1266MiB GPU memory per WU while the P100 only uses up to 1066MiB. Both GPUs have 12GB of memory.

Thanks,
scole of TSBT
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71575 - Posted: 3 Jan 2022, 13:13:24 UTC - in response to Message 71574.  

I think some cards can do more of the work themselves than others, so how much the CPU has to help out varies from card to card. I also think AMD cards need less CPU assistance.

For example, I have an AMD Radeon R9 280X in each of three different machines, all running Windows 10:
Ryzen 9 3900XT: 4 MilkyWay tasks on the GPU, GPU fully utilised, 150 MB of GPU memory per WU (yours is rather high!), 0.1% of total CPU = 2.4% of a core per WU.
Pentium N3700: 1 MilkyWay task on the GPU (old GPU, crashes if I use too much RAM), GPU 70% utilised, 300 MB of GPU memory per WU, 10% of total CPU = 40% of a core per WU.
Intel Core2 Q8400: 2 MilkyWay tasks on the GPU (again, old GPU, crashes if given too much to do), GPU 90% utilised, 100 MB of GPU memory per WU, 4% of total CPU = 16% of a core per WU.
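
The conversions in the list above appear to multiply the whole-CPU percentage by the number of logical CPUs. Here is a minimal Python sketch of that arithmetic. The function name is made up for illustration, and the thread counts (24, 4, 4) are inferred from the figures quoted, not stated in the post.

```python
def percent_of_one_core(total_cpu_percent: float, logical_cpus: int) -> float:
    """Convert a whole-CPU percentage into a percentage of one logical core."""
    return total_cpu_percent * logical_cpus


# (machine, whole-CPU % per WU, assumed logical CPU count)
machines = [
    ("Ryzen 9 3900XT", 0.1, 24),
    ("Pentium N3700", 10.0, 4),
    ("Core2 Q8400", 4.0, 4),
]

for name, total_percent, threads in machines:
    per_core = percent_of_one_core(total_percent, threads)
    print(f"{name}: {per_core:.1f}% of a core per WU")
```

The same 1% of "total CPU" therefore means very different things on a 4-thread and a 56-thread machine, which matters when comparing hosts.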

Note that the CPU usage is intermittent, peaking 4 times per WU. Are you taking the average?
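
One way to take that average is to divide a task's CPU time by its run time, both of which the host's task list normally shows. A small Python sketch, assuming those two figures are available in seconds; the function name and the example numbers are made up and are not taken from either host:

```python
def average_thread_fraction(cpu_time_s: float, run_time_s: float) -> float:
    """Average fraction of one CPU thread a task used over its whole run."""
    if run_time_s <= 0:
        raise ValueError("run time must be positive")
    return cpu_time_s / run_time_s


# Hypothetical tasks, not real results from the thread.
print(average_thread_fraction(cpu_time_s=58.0, run_time_s=60.0))  # close to a full thread
print(average_thread_fraction(cpu_time_s=20.0, run_time_s=60.0))  # about a third of a thread
```

Because this averages over the whole task, short CPU peaks are smoothed out, unlike a momentary reading from a system monitor.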
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71577 - Posted: 3 Jan 2022, 15:20:37 UTC - in response to Message 71574.  
Last modified: 3 Jan 2022, 15:21:04 UTC

Your host looks very odd:
https://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=906244
What's a 0000 CPU? And why does it have a weird number of cores (56)? "Genuine Intel(R) CPU 0000 @ 2.40GHz"
Why does your card report 12GB and 4GB on the same line? And when I look up a P100, it should have 16GB.
Wailing Angus Beef

Joined: 24 Dec 07
Posts: 33
Credit: 1,918,711,579
RAC: 3,343
Message 71578 - Posted: 3 Jan 2022, 16:03:14 UTC - in response to Message 71577.  

System has dual Xeon V4 engineering sample CPUs in it, 28 threads per CPU, 56 threads total for system.

Don't know why it reports 12GB and 4GB. As for 12GB instead of 16GB, there are both 12GB and 16GB model P100s. Something else I don't understand: if you look at the task info, it says it found 2 CL devices. On WUs for all other devices I've only seen 1 CL device reported.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71579 - Posted: 3 Jan 2022, 16:38:59 UTC - in response to Message 71578.  

I didn't realise they made CPUs with 28 cores. But then I suppose I've seen 6 and 24, so not always powers of 2.

Is your card a dual processor GPU? Like the AMD 7990?
Wailing Angus Beef

Joined: 24 Dec 07
Posts: 33
Credit: 1,918,711,579
RAC: 3,343
Message 71580 - Posted: 3 Jan 2022, 16:55:35 UTC - in response to Message 71579.  

If it does have multiple devices, it's not like a 7990, which actually appeared as 2 GPUs to your computer and BOINC. I think the Tesla K80 is like that. I think the P100 and later Teslas can be partitioned so they can be shared in a data center environment, but I'm not sure how that's done.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71581 - Posted: 3 Jan 2022, 17:12:24 UTC - in response to Message 71578.  

System has dual Xeon V4 engineering sample CPUs in it, 28 threads per CPU, 56 threads total for system.


Are those a lot cheaper? Sounds like a good way to get the latest thing for less cash.
Wailing Angus Beef

Joined: 24 Dec 07
Posts: 33
Credit: 1,918,711,579
RAC: 3,343
Message 71582 - Posted: 3 Jan 2022, 18:07:40 UTC - in response to Message 71581.  

Engineering sample CPUs are cheaper, but there are usually several different versions of each sample. They don't have the same clock specs as production CPUs (usually lower), there may be some bugs, and they don't work in all motherboards. Research well before buying any.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71583 - Posted: 3 Jan 2022, 18:19:02 UTC - in response to Message 71582.  

There don't seem to be many on eBay, but the one I saw was $300 instead of $1500!
Wailing Angus Beef

Joined: 24 Dec 07
Posts: 33
Credit: 1,918,711,579
RAC: 3,343
Message 71584 - Posted: 3 Jan 2022, 22:32:51 UTC

I just realized why the task info is reporting 2 CL devices. I also have a GTX 1050 Ti installed just to use for video output. The BOINC client is configured to ignore it, but the MilkyWay app still knows it's there. It doesn't use it but it sees it. I forget it's there.
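
The post doesn't say how the client was told to ignore the card. One common way is the `<ignore_nvidia_dev>` option in BOINC's cc_config.xml, and the Python sketch below simply builds the text of such a file. The function name and the device index 1 are assumptions for illustration; the real index is whatever number the client assigns to the 1050 Ti on that host.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ignored_nvidia_index: int) -> str:
    """Return cc_config.xml text telling the BOINC client to skip one Nvidia GPU."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ignore = ET.SubElement(options, "ignore_nvidia_dev")
    ignore.text = str(ignored_nvidia_index)
    return ET.tostring(root, encoding="unicode")


# Index 1 is a placeholder, not the actual device number on this host.
print(build_cc_config(1))
```

As observed above, a setting like this only affects what the client schedules; the application can still list the card when it enumerates CL devices.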

For the engineering sample CPUs, search for Xeon V4 ES. Then you can filter the search results for the number of cores you want. Identify the specification code, like QH2N, QHV7, etc. Then research that ES spec on other forums to see if anyone is using it and what they report about it, good or bad. I have some 14-core QHV7s that haven't given any problems. Again, they don't work in all mobos. I use Supermicro X10DAL-i mobos on a few systems. Look for a seller that lists the compatible mobos.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 71585 - Posted: 3 Jan 2022, 22:39:06 UTC - in response to Message 71584.  

I just realized why the task info is reporting 2 CL devices. I also have a GTX 1050 Ti installed just to use for video output. The BOINC client is configured to ignore it, but the MilkyWay app still knows it's there. It doesn't use it but it sees it. I forget it's there.


Handy for getting 600 instead of 300 WUs at a time.

For the engineering sample CPUs, search for Xeon V4 ES. Then you can filter the search results for the number of cores you want. Identify the specification code, like QH2N, QHV7, etc. Then research that ES spec on other forums to see if anyone is using it and what they report about it, good or bad. I have some 14-core QHV7s that haven't given any problems. Again, they don't work in all mobos. I use Supermicro X10DAL-i mobos on a few systems. Look for a seller that lists the compatible mobos.


Thanks for the info.
Keith Myers
Joined: 24 Jan 11
Posts: 696
Credit: 539,987,748
RAC: 87,094
Message 71590 - Posted: 4 Jan 2022, 18:00:01 UTC

All Nvidia GPUs report 4GB of memory in BOINC because BOINC only uses an outdated 32-bit API. A 32-bit value can only report 4GB of addressable memory, even when the GPU has much more memory installed.
This is just a reporting issue. All of the memory of an Nvidia GPU gets used. The BOINC developers have never acted on the reported issue or merged the solution that the GPUUG Team came up with.
https://github.com/BOINC/boinc/issues/1773
You can modify the necessary code module and compile BOINC yourself to fix the issue and have your Nvidia cards report the proper amount of memory they contain.
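
The 4GB ceiling mentioned above follows from the size of a 32-bit value. The short Python sketch below shows the arithmetic. The `reported_memory_gib` helper is purely illustrative and assumes the reported figure saturates at the 32-bit limit; it is not BOINC's actual code, and the names are made up.

```python
GIB = 1024 ** 3

# Number of distinct byte addresses an unsigned 32-bit value can express.
limit_32bit_bytes = 2 ** 32
print(limit_32bit_bytes / GIB, "GiB is the most a 32-bit byte count can describe")


def reported_memory_gib(actual_gib: float) -> float:
    """Illustrative only: assume the reported size saturates at the 32-bit limit."""
    return min(actual_gib * GIB, limit_32bit_bytes) / GIB


for actual in (2, 4, 12, 16):
    print(actual, "GiB installed ->", reported_memory_gib(actual), "GiB reported")
```

Under that assumption, both a 12GB and a 16GB card would show up as 4GB, which matches the second figure on the host page.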


©2024 Astroinformatics Group