Welcome to MilkyWay@home

New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new!

Profile kashi

Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 67100 - Posted: 17 Feb 2018, 20:34:50 UTC

GTX 970, stock GPU 1329 MHz, (CPU, i7 6700K @4.2GHz)..................................273s......kashi

Driver 382.53, Win 8.1, BOINC 7.6.22.
Average of seven 227.23-credit tasks.
ID: 67100 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
sohleks

Joined: 27 Sep 17
Posts: 2
Credit: 76,052
RAC: 0
Message 67137 - Posted: 25 Feb 2018, 3:21:45 UTC

Hmm. If MilkyWay wants double precision, and my Nvidia 750 Ti is pretty limited on that compared to AMD cards, maybe my card is better off serving other projects. Ehh, the quick gratification of getting 200 points in rapid succession is still nice.

Modestly, my EVGA 750 Ti Superclocked at stock gets these done in 665 seconds with an i5-4460 running at stock 3.2 GHz (one core free).

Yeah, somewhere I just learned that AMD cards have far more double-precision compute power and that MilkyWay uses it, so I came over here to see if there's anything about it. I now have more respect for AMD hardware.

I have wanted to upgrade to an 8-core Ryzen. I didn't think I'd have reason to consider AMD in the GPU slot (other than that the driver interface looks pretty nice). Too bad modern AMD graphics cards are pretty much all gone. Doesn't seem like that's going to get much better soon.
ID: 67137 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
mmonnin

Joined: 2 Oct 16
Posts: 167
Credit: 1,008,061,493
RAC: 7,988
Message 67138 - Posted: 25 Feb 2018, 6:58:31 UTC - in response to Message 67137.  

Older AMD cards have better double precision (FP64): the 7970/280X generation and its derivatives. All newer consumer cards from Nvidia and AMD have poor FP64 performance, including that Maxwell 750 Ti. Strong FP64 is left to prosumer or higher cards like the Nvidia Titan V (Volta) for $3k.
ID: 67138 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Jozef J

Joined: 4 Mar 10
Posts: 65
Credit: 639,958,626
RAC: 0
Message 67202 - Posted: 5 Mar 2018, 5:25:58 UTC - in response to Message 63479.  
Last modified: 5 Mar 2018, 5:26:46 UTC

Fake results from khryl. Do not compare old units and some "time" to today's units. What a .
Here are the real results from khryl:
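Columns: task ID, work unit ID, sent, time reported, status, run time (sec), CPU time (sec), credit, application
Completed, validation inconclusive 70.00 13.91 pending MilkyWay@Home v1.46 (opencl_ati_101)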
2268037982 1580436922 24 Feb 2018, 4:57:44 UTC 24 Feb 2018, 5:42:43 UTC Completed, validation inconclusive 62.98 13.77 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267653732 1580248511 23 Feb 2018, 15:26:22 UTC 24 Feb 2018, 1:09:54 UTC Completed, validation inconclusive 76.09 13.64 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267644665 1580243934 23 Feb 2018, 15:11:30 UTC 23 Feb 2018, 15:57:39 UTC Completed, validation inconclusive 62.28 14.38 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267629937 1580236708 23 Feb 2018, 14:46:44 UTC 23 Feb 2018, 15:31:18 UTC Completed and validated 71.96 12.41 227.26 MilkyWay@Home v1.46 (opencl_ati_101)
2267530589 1580186570 23 Feb 2018, 12:04:10 UTC 23 Feb 2018, 12:51:58 UTC Completed, validation inconclusive 68.30 13.72 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267513420 1580177755 23 Feb 2018, 11:35:02 UTC 23 Feb 2018, 12:22:09 UTC Completed, validation inconclusive 72.32 14.05 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267469346 1580156059 23 Feb 2018, 10:21:09 UTC 23 Feb 2018, 11:06:31 UTC Completed, validation inconclusive 64.34 12.44 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267238655 1580034207 23 Feb 2018, 3:52:28 UTC 23 Feb 2018, 4:37:22 UTC Completed, validation inconclusive 65.85 12.00 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267119524 1579966285 23 Feb 2018, 0:29:12 UTC 23 Feb 2018, 1:13:52 UTC Completed, validation inconclusive 64.20 13.06 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267118431 1579965540 23 Feb 2018, 0:27:30 UTC 23 Feb 2018, 1:13:52 UTC Completed, validation inconclusive 61.27 12.78 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267116504 1579964234 23 Feb 2018, 0:24:14 UTC 23 Feb 2018, 1:08:56 UTC Completed, validation inconclusive 65.89 13.77 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267115908 1579963902 23 Feb 2018, 0:22:35 UTC 23 Feb 2018, 1:08:56 UTC Completed, validation inconclusive 68.44 12.39 pending MilkyWay@Home v1.46 (opencl_ati_101)
2267115909 1579963903 23 Feb 2018, 0:22:35 UTC 23 Feb 2018, 1:08:56 UTC Completed, validation inconclusive 68.23 12.98 pending MilkyWay@Home v1.46 (opencl_ati_101)
2266724060 1579880429 22 Feb 2018, 15:12:24 UTC 22 Feb 2018, 15:57:26 UTC Completed, validation inconclusive 66.12 12.39 pending MilkyWay@Home v1.46 (opencl_ati_101)
2266255461 1579840695 22 Feb 2018, 2:58:17 UTC 22 Feb 2018, 3:44:39 UTC Completed, validation inconclusive 6

The 7970 is still the best, about 38-45 sec :)
ID: 67202 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
DVDL

Joined: 5 Jul 12
Posts: 6
Credit: 7,136,476
RAC: 0
Message 67213 - Posted: 6 Mar 2018, 7:25:00 UTC

The new Volta is out and has a 1/2 FP64-to-FP32 ratio. It delivers 6900 GFLOPS at FP64, while the 7990 delivers 1946. That's a huge improvement, but the card will cost around $3k :(
ID: 67213 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
melk

Joined: 10 Dec 17
Posts: 47
Credit: 695,662,962
RAC: 0
Message 67361 - Posted: 19 Apr 2018, 15:04:43 UTC

You can pick up used server-version FirePro cards for under $400 that have strong FP64. The S9100/S9150 have the best price/performance ratio. They are around 2500 GFLOPS. The 280X is around 1000 GFLOPS for comparison.

I just got my S9150 up and running; I paid $340 shipped from eBay. Two things to note with these: they are headless (no video connectors, so compute only), and they are passive (no fans), so you'll have to rig up your own cooling.

The workstation-version W8100/W9100 cards offer good FP64 performance and include video outputs and a built-in cooling fan, but are significantly more expensive: about $1000 and way, way up.

The older 8000/9000 series do not offer competitive GFLOPS per watt (3.5 vs 9.5 for the 91xx and 10.7 for the 9150 specifically).

For Nvidia, the original Titan, Titan Black and dual-GPU Titan Z all offer strong FP64 at a favorable GFLOPS per watt (6.0, 6.6 and 7.2 respectively).

6990: 3.4 GFLOPS per watt
7990: 5.0 GFLOPS per watt

7970: 3.7 GFLOPS per watt
280X: 4.1 GFLOPS per watt

This of course all depends on your specific clock speeds, any thermal throttling, etc., but it's a useful reference point anyway.
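
For anyone who wants to redo this arithmetic for their own card, here is a minimal Python sketch. It assumes efficiency is simply FP64 GFLOPS divided by board power in watts. The function names are made up for illustration, and the inputs are the round figures quoted above (about 2500 GFLOPS at 10.7 GFLOPS per watt for the S9150, about 1000 GFLOPS at 4.1 for the 280X), not measurements.

```python
def gflops_per_watt(fp64_gflops: float, board_watts: float) -> float:
    """Efficiency as FP64 GFLOPS divided by board power in watts."""
    if board_watts <= 0:
        raise ValueError("board_watts must be positive")
    return fp64_gflops / board_watts


def implied_watts(fp64_gflops: float, efficiency: float) -> float:
    """Invert the ratio: board power implied by a GFLOPS and GFLOPS/W pair."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    return fp64_gflops / efficiency


# Round figures quoted in the post above, not measured values.
cards = {"S9150": (2500.0, 10.7), "280X": (1000.0, 4.1)}

for name, (gflops, eff) in cards.items():
    watts = implied_watts(gflops, eff)
    print(f"{name}: about {watts:.0f} W implied at {eff} GFLOPS/W")
```

Plugging in your own measured wall power instead of the implied figure would give a more honest number for your setup.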

I have been compiling a spreadsheet and will share it here later. (No Google Docs access at work.)
ID: 67361 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
vseven

Joined: 26 Mar 18
Posts: 24
Credit: 102,912,937
RAC: 0
Message 67432 - Posted: 8 May 2018, 13:04:22 UTC
Last modified: 8 May 2018, 13:08:56 UTC

Nvidia Tesla V100 16 GB, SXM2 interface, Ubuntu 16.04, CUDA 9.1:

GPU - CPU - Credit
16.31 - 13.95 - 227.63
17.30 - 15.26 - 229.05
16.21 - 13.95 - 227.63
16.23 - 11.73 - 228.52
15.20 - 13.08 - 228.13

That said, it can run 6 at a time, averaging about 31 s each, so a bit above 5 s per WU.
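
The "per WU" figure is just the average run time divided by how many WUs run at once. A small Python sketch of that, assuming the five times listed above were run one at a time (the post does not say so explicitly) and using a made-up helper name:

```python
from statistics import mean

# GPU times (sec) from the table above, assumed to be one-at-a-time runs.
single_run_times_s = [16.31, 17.30, 16.21, 16.23, 15.20]


def effective_seconds_per_wu(avg_run_time_s: float, concurrent: int) -> float:
    """Average wall time per finished WU when several WUs share the GPU."""
    if concurrent < 1:
        raise ValueError("concurrent must be at least 1")
    return avg_run_time_s / concurrent


single_s = mean(single_run_times_s)
shared_s = effective_seconds_per_wu(31.0, 6)  # 6 at a time, about 31 s each

print(f"one at a time: {single_s:.2f} s/WU")
print(f"six at a time: {shared_s:.2f} s/WU")
print(f"throughput gain: {single_s / shared_s:.2f}x")
```

Under those assumptions, running six at once comes out roughly three times faster per WU than running them singly.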


It's beautiful: https://imgur.com/Gsb3NiR
ID: 67432 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
vseven

Joined: 26 Mar 18
Posts: 24
Credit: 102,912,937
RAC: 0
Message 67433 - Posted: 8 May 2018, 15:02:29 UTC - in response to Message 67361.  

For Nvidia the original Titan, Titan Black and dual-gpu Titan Z all offer strong FP64 at a favorable GFLOPS per Watt (6.0, 6.6, 7.2)


The Wikipedia page on these is pretty interesting too:

https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units
ID: 67433 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 67448 - Posted: 10 May 2018, 1:13:54 UTC

Just saw this thread and got the following computations using this program.

Single S9100 with Q9550s (Core 2 quad) - Three concurrent work units
        Run Time     CPU Time     Credit
         (sec)         (sec)
            93.3          19.1        227.7
           100.3          19.2        227.7
            86.3          17.9        227.6
            96.3          21.1        227.7
           114.4          18.5        229.7
           105.3          17.8        229.4
           111.4          17.7        227.6
            78.2          18.0        227.6
            93.4          13.9        228.5
            93.3          14.9        231.2
            90.3          17.7        227.6
            97.4          16.3        227.6
            94.3          16.2        227.6
           128.5          14.8        231.2
            97.4          14.3        228.5
           102.4          21.4        227.7
           110.4          18.5        229.4
           119.4          14.8        231.2
            91.3          18.0        227.6
           123.4          18.6        229.7
         ----------------------------------
AVG:       101.3          17.4        228.6
STD:        12.7           2.1          1.3


101.3 / 3 = about 34 seconds for a single work unit
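
The AVG and STD rows for the S9100 table can be reproduced with the Python standard library. This is only a sketch using the 20 run times copied from the table above. The posted STD of 12.7 seems to match the population standard deviation (`statistics.pstdev`), while the sample standard deviation would come out closer to 13.0.

```python
import statistics

# Run times (sec) copied from the S9100 table above.
run_times_s = [
    93.3, 100.3, 86.3, 96.3, 114.4, 105.3, 111.4, 78.2, 93.4, 93.3,
    90.3, 97.4, 94.3, 128.5, 97.4, 102.4, 110.4, 119.4, 91.3, 123.4,
]
CONCURRENT = 3  # three work units were running at once

mean_s = statistics.mean(run_times_s)
std_s = statistics.pstdev(run_times_s)  # population standard deviation
per_wu_s = mean_s / CONCURRENT

print(f"AVG {mean_s:.1f}  STD {std_s:.1f}  per work unit {per_wu_s:.1f} s")
```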


Pair of S9000 (same as HD7950) on X5470 w/771->775 adapter - Five concurrent tasks
        Run Time     CPU Time     Credit
         (sec)         (sec)
           170.5          12.0        227.6
           211.8          19.0        227.7
           271.5          20.5        229.3
           213.1          16.5        229.4
           223.1          16.9        229.7
           252.5          16.6        230.8
           252.5          16.5        230.8
           197.8          16.6        229.4
           204.3          19.4        227.7
           185.9          16.3        227.6
           235.5          22.3        227.3
           239.4          20.3        229.3
           264.2          16.9        230.8
           172.7          16.7        227.6
           266.4          16.4        230.8
           235.0          22.3        227.3
           172.0          16.4        227.7
           178.4          18.8        227.7
           179.7          17.0        227.7
           221.3          16.6        227.6
         ----------------------------------
AVG:       217.4          17.7        228.7
STD:        33.1           2.3          1.3


217 / 5 = about 44 seconds per work unit on each board.
ID: 67448 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
zzrhardy

Joined: 22 Jun 18
Posts: 2
Credit: 917,408,116
RAC: 0
Message 67619 - Posted: 22 Jun 2018, 20:49:03 UTC - in response to Message 67448.  
Last modified: 22 Jun 2018, 20:49:28 UTC

Ryzen 1700X - 32GB RAM - 2x Titan V - Windows 10 64
Running 8 concurrent WU per card
Average 65s per WU
https://imgur.com/Jd7qCDE

<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>0.125</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
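
In this file, gpu_usage is the fraction of a GPU each task reserves, so 0.125 works out to 8 tasks per card. For anyone who wants to try other task counts, here is a rough Python sketch that writes out the same structure. The function name and its parameters are made up for illustration. It uses only the standard library (`ET.indent` needs Python 3.9 or newer).

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float) -> str:
    """Return app_config XML text that lets tasks_per_gpu tasks share one GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # Each task reserves 1/N of the GPU, e.g. 8 tasks -> 0.125.
    ET.SubElement(gpu, "gpu_usage").text = f"{1 / tasks_per_gpu:g}"
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_per_task:g}"
    ET.indent(root)  # pretty-print in place (Python 3.9+)
    return ET.tostring(root, encoding="unicode")


print(build_app_config("milkyway", tasks_per_gpu=8, cpu_per_task=1.0))
```

Called with 8 tasks and 1 CPU per task, it should reproduce the values in the config above.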
ID: 67619 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
[TA]Assimilator1
Joined: 22 Jan 11
Posts: 375
Credit: 64,707,046
RAC: 495
Message 67630 - Posted: 26 Jun 2018, 20:53:01 UTC

Thanks for the replies & data, guys; good to see an ongoing discussion :).
I will post an updated table when I get time (not much for here atm, I'm afraid), & when I've had answers to the questions below.

Mikey, JoeM, tictoc, ultraZ: did you use times from WUs with 227.23 credits?

JoeM
"Both the HD5870 and the HD6970 were mated with the 8350 CPUs. The HD5870 mated with the slightly faster one. Both 8350 CPUs are running stock, no overclock"
A? Lol, err, you say they're both with 8350s running stock, but one CPU is faster??? You've lost me mate ;).

ultraZ
Good to know that's still the case & to see numbers for it :).
I think I need to create a 2nd table showing WU times for concurrently run WUs, if the common WU type isn't changing all the time...

DVDL
Wow! The Volta is insane, both in terms of FP64 performance & price! lol

melk
Interesting, & good info :)

vseven
Quite a range of credit number WUs & times there, is that typical now? (I haven't looked in ~6mths).
Btw, if you want your time in the table, it'll need to be an average of at least 5 of the 227.23 credit WUs.

BeemerBiker
Hmm, that's concerning, no 227.23 credit WUs?

(PS. I've got an e46 330d, if you care about 4 wheeled vehicles ;))
Team AnandTech - SETI@H, DPAD, F@H, MW@H, A@H, LHC, POGS, R@H, Einstein@H, DHEP, WCG

Main rig - Ryzen 5 3600, MSI B450 G.Pro C. AC, RTX 3060Ti 8GB, 32GB DDR4 3200, Win 10 64bit
2nd rig - i7 4930k @4.1 GHz, HD 7870 XT 3GB(DS), 16GB DDR3 1866, Win7
ID: 67630 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
vseven

Joined: 26 Mar 18
Posts: 24
Credit: 102,912,937
RAC: 0
Message 67661 - Posted: 9 Jul 2018, 13:57:07 UTC - in response to Message 67630.  
Last modified: 9 Jul 2018, 14:04:03 UTC


vseven
Quite a range of credit number WUs & times there, is that typical now? (I haven't looked in ~6mths).
Btw, if you want your time in the table, it'll need to be an average of at least 5 of the 227.23 credit WUs.


Here is something more interesting. Titan V running stock clock rates:

2377879357 1633136006 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.26 10.05 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879358 1633136007 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.40 10.00 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879360 1633136009 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.29 10.08 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879368 1633031063 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.33 10.00 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879369 1633069821 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.29 9.78 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879117 1633015264 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:41:41 UTC Completed and validated 64.30 10.34 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879373 1633119141 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 64.35 9.63 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)


They are averaging just under 64 seconds each, BUT I'm running 7 at a time, so technically averaging just over 9 seconds each overall. GPU load floats around 90%. Adding another WU makes it hit 100% and is slower overall: fewer WUs get done and my average goes up. I'm also CPU limited on this machine (4 cores total), so I'm only giving each WU 0.4 CPU, which is hurting me slightly. Also this:

Device peak FLOPS 5,964.50 GFLOPS

It's rated for a theoretical 6900, so I'd say it's doing OK.

I should get a chance to test the Tesla V100 again in a week or so. I'll try to get updated results for just the 227.23-credit WUs. If given the full number of CPU cores and run 8 at a time, it should place just under 6 seconds average per WU.
ID: 67661 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
vseven

Joined: 26 Mar 18
Posts: 24
Credit: 102,912,937
RAC: 0
Message 67662 - Posted: 9 Jul 2018, 15:28:19 UTC - in response to Message 67630.  


vseven
Quite a range of credit number WUs & times there, is that typical now? (I haven't looked in ~6mths).
Btw, if you want your time in the table, it'll need to be an average of at least 5 of the 227.23 credit WUs.


Got to grab 20 minutes on our V100 machine:

2377935709 1633165069 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 35.35 32.82 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935710 1633165186 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 35.34 33.26 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935457 1633166969 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 34.32 32.82 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935472 1633160555 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 36.35 34.24 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935747 1633055494 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 33.26 31.61 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935759 1633137483 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 35.26 33.70 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935761 1633160810 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 33.24 30.70 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)

Again running 7 WUs at a time, so each is just under 35 sec, but taking into account 7 running at once puts you right at 5 seconds a WU. Task details show 6,270.17 GFLOPS, just barely higher than the Titan V.


I'm really surprised at how much better the Tesla V100 does than the Titan V though. Clock speeds are about the same; the Tesla has higher memory bandwidth, but I can't imagine such a huge difference. The Tesla on paper technically has lower double-precision specs but runs higher in my testing and performs almost twice as fast.
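
To put a number on "almost twice as fast", here is a small Python sketch that compares the two cards. It uses the run times from the task list above and from the Titan V list in the previous post, both with 7 WUs running at once. The helper name is made up for illustration.

```python
from statistics import mean

# Run times (sec) copied from the two task lists in this thread.
titan_v_s = [63.26, 63.40, 63.29, 63.33, 63.29, 64.30, 64.35]
tesla_v100_s = [35.35, 35.34, 34.32, 36.35, 33.26, 35.26, 33.24]
CONCURRENT = 7  # both cards were running 7 WUs at a time


def effective_seconds(run_times_s, concurrent):
    """Average run time divided by the number of WUs running at once."""
    return mean(run_times_s) / concurrent


titan_eff = effective_seconds(titan_v_s, CONCURRENT)
tesla_eff = effective_seconds(tesla_v100_s, CONCURRENT)
speedup = titan_eff / tesla_eff

print(f"Titan V: {titan_eff:.2f} s/WU")
print(f"Tesla V100: {tesla_eff:.2f} s/WU")
print(f"ratio: {speedup:.2f}x")
```

On these seven-task samples the ratio lands a bit above 1.8, so "almost twice" holds up, though two short samples are not much to go on.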
ID: 67662 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Chooka
Joined: 13 Dec 12
Posts: 101
Credit: 1,782,758,310
RAC: 0
Message 67675 - Posted: 17 Jul 2018, 9:19:10 UTC - in response to Message 67662.  

That.....is insane! 5sec/wu!

ID: 67675 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Artemis Entrari
Joined: 16 Jul 18
Posts: 2
Credit: 64,668,103
RAC: 0
Message 67694 - Posted: 30 Jul 2018, 19:27:19 UTC

Hello,
I'm newish to the numbers game. I have built a computer with the sole purpose of smashing data at home. I fear I don't know all the lingo yet for the BOINC world, as well as the message boards. I have a few questions.

1. What program is everyone using for their benchmark for the "WU"?
2. Are all WUs the same size across the board on BOINC?
3. What is a "WU"? Work unit?

No Bad Days,
Artemis
ID: 67694 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Artemis Entrari
Joined: 16 Jul 18
Posts: 2
Credit: 64,668,103
RAC: 0
Message 67695 - Posted: 30 Jul 2018, 19:48:52 UTC - in response to Message 67694.  
Last modified: 30 Jul 2018, 19:49:28 UTC

This is the only benchmark I can find



7/30/2018 12:47:12 PM | | Benchmark results:
7/30/2018 12:47:12 PM | | Number of CPUs: 28
7/30/2018 12:47:12 PM | | 4871 floating point MIPS (Whetstone) per CPU
7/30/2018 12:47:12 PM | | 16805 integer MIPS (Dhrystone) per CPU
7/30/2018 12:47:13 PM | | Resuming computation
ID: 67695 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Toby Broom

Joined: 13 Jun 09
Posts: 24
Credit: 137,665,719
RAC: 684
Message 67699 - Posted: 6 Aug 2018, 17:18:50 UTC
Last modified: 6 Aug 2018, 17:26:12 UTC

I have a W8100; I ran 2 WUs at the same time and saw 100% usage with the default app. It seems like others are often running many more concurrently? All WUs were either 227.23 or 227.25 credits.


	Run Time CPU Time
	(sec)	(sec)
	69.12	17.61
	77.13	16.77
	73.16	17.39
	77.13	17.34
	77.15	16.91
	66.1	16.56
	65.15	16.34
	77.16	15.14
	69.15	14.89
	78.15	15.02
	83.16	15.91
	71.15	17.47
	78.15	19.14
	79.15	16.66
	71.13	15.88
	69.15	17.09
	79.16	17.95
	76.15	17.08
	67.11	17.5
	73.16	18.17
	-------------	
Average	74	17
SD	5.1	1.1




74 / 2 = about 37 seconds for a single work unit


The app only reports 500 GFLOPS, which is well below the theoretical 2000 GFLOPS.
ID: 67699 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Evans MWAH

Joined: 23 Jan 10
Posts: 1
Credit: 23,566,873
RAC: 2,192
Message 67795 - Posted: 7 Sep 2018, 15:33:15 UTC

Ryzen 2200G integrated GPU running at stock 1.1GHz. CPU is normally throttled to 2.3GHz. Both are undervolted 15% and I think they will go lower. DDR4 at 3GT/s. The machine is 100% passively cooled and draws only 45W at the wall.

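Columns: task ID, work unit ID, computer ID, sent, time reported, status, run time (sec), CPU time (sec), credit, application
2543621 1653232458 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 13:58:38 UTC Completed and validated 482.49 24.08 203.92 MilkyWay@Home v1.46 (opencl_ati_101)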
2543889 1653232725 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 11:15:00 UTC Completed and validated 453.40 19.09 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
2543379 1653232216 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 12:48:26 UTC Completed and validated 480.38 23.84 203.92 MilkyWay@Home v1.46 (opencl_ati_101)
2543891 1653232727 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 12:32:41 UTC Completed and validated 463.37 20.11 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
2543896 1653232732 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 11:22:27 UTC Completed and validated 450.45 18.98 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
2543898 1653232734 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 14:30:06 UTC Completed and validated 459.55 18.69 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
2543900 1653232736 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 14:46:00 UTC Completed and validated 488.53 18.61 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
2543905 1653232741 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 13:27:00 UTC Completed and validated 459.24 19.75 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
2543907 1653232743 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 10:43:31 UTC Completed and validated 467.55 20.61 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
2543908 1653232744 702195 6 Sep 2018, 10:10:31 UTC 6 Sep 2018, 13:42:29 UTC Completed and validated 465.41 19.27 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
ID: 67795 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
vseven

Joined: 26 Mar 18
Posts: 24
Credit: 102,912,937
RAC: 0
Message 67822 - Posted: 26 Sep 2018, 18:41:51 UTC
Last modified: 26 Sep 2018, 18:42:41 UTC

Tesla V100 SXM2 16 GB running CUDA 10.0 and Nvidia 410.28 drivers. In previous testing on CUDA 9.2 / 390.* drivers, the best I could do was run 6 WUs at a time, which averaged 31 seconds per WU, or 31/6 = 5.17 seconds a WU. On the new software:

19155972 1661153815 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:47:54 UTC Completed and validated 26.16 24.37 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155974 1661153915 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:49:31 UTC Completed and validated 25.13 23.43 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155977 1661158879 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:49:31 UTC Completed and validated 28.12 26.69 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155978 1661158880 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:47:54 UTC Completed and validated 26.11 24.48 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155983 1661168505 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:51:07 UTC Completed and validated 28.18 26.03 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155984 1661168506 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:39:06 UTC Completed and validated 35.40 32.05 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155986 1661169020 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:49:31 UTC Completed and validated 28.15 26.10 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155988 1661169637 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:49:31 UTC Completed and validated 27.11 25.12 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155995 1661172790 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:39:06 UTC Completed and validated 33.39 29.64 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155841 1661157523 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:47:54 UTC Completed and validated 26.17 24.44 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155635 1661163438 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:46:19 UTC Completed and validated 28.13 26.18 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)

Averaging about 28.4 with 6 at a time, so 4.73 sec a WU, an overall drop of about 0.45 sec on average from the older software. So I did some testing to see if 6 WUs at a time was still optimal.

Bumped it up to 7 WUs at a time and got an average of 33.2 / 7 = 4.74, so almost the exact same average and no reason to tax the GPU more.

Dropped it to 5 WUs at a time and got an average of 21.34 per WU / 5 = 4.27... a pretty big drop on average.

Tried 4 WUs at a time and it averaged 18.4 per WU / 4 = 4.6 average, so better than 6 or 7 but not as good as 5.

So the sweet spot for a V100 on CUDA 9.2 / 390.* was 6 WUs at a time, but now it's 5 WUs, at least when talking about the 227-credit WUs. And although the change is small, the volume is huge. I.e. the fastest I could get on the old software was 5.17 seconds average and it is now 4.27 on this software, so gaining an average of 0.9 seconds per WU equates to about 3500 extra WUs per day, which for a software-only change is pretty neat.
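
The sweet-spot search and the "extra WUs per day" estimate can be checked with a few lines of Python. This is only a sketch built from the averages quoted in this post; the variable and function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400

# Average run time (sec) per WU at each concurrency level, from the post above.
avg_run_time_s = {4: 18.4, 5: 21.34, 6: 28.4, 7: 33.2}
OLD_BEST_S = 31 / 6  # old software: 6 at a time, 31 sec each


def effective_seconds(avg_s: float, concurrent: int) -> float:
    """Wall time per finished WU when `concurrent` WUs share the GPU."""
    return avg_s / concurrent


effective = {n: effective_seconds(t, n) for n, t in avg_run_time_s.items()}
best_n = min(effective, key=effective.get)

# Extra WUs finished per day at the new best setting versus the old best.
extra_per_day = SECONDS_PER_DAY / effective[best_n] - SECONDS_PER_DAY / OLD_BEST_S

for n in sorted(effective):
    print(f"{n} at a time: {effective[n]:.2f} s/WU")
print(f"best: {best_n} at a time, about {extra_per_day:.0f} extra WUs/day")
```

With these inputs the best setting is 5 at a time, and the daily gain comes out close to the 3500 figure, assuming the card crunches 227-credit WUs around the clock.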


Now if we count all WU's in total, the 227's and 203's and everything, the average is actually no better on the new software. For example a snapshot (again running 5 at a time):

19189880 1661146709 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:31:55 UTC Completed and validated 22.11 20.25 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189881 1661152293 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:31:55 UTC Completed and validated 23.11 20.54 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189890 1661174490 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:28:40 UTC Completed and validated 31.15 29.10 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19190181 1661189571 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:28:40 UTC Completed and validated 31.16 29.37 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189930 1660606696 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:31:55 UTC Completed and validated 29.12 27.49 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189931 1660626975 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:27:03 UTC Completed and validated 28.21 26.36 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189177 1661161661 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:31:55 UTC Completed and validated 21.13 19.92 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19184908 1661139963 26 Sep 2018, 18:17:09 UTC 26 Sep 2018, 18:23:48 UTC Completed and validated 28.16 26.41 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19184909 1661140017 26 Sep 2018, 18:17:09 UTC 26 Sep 2018, 18:22:41 UTC Completed and validated 21.11 19.19 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)


The 203-credit WUs are taking 20% - 30% longer than the 227s, which I can't figure out. In previous tests they were quicker. So my net average with all WUs considered is slightly worse.


I also ran some WUs on an old GTX Titan (original) with FP32 mode enabled (which doesn't speed up the WU but does allow multiples to run without pegging the GPU):

19109285 1661134247 26 Sep 2018, 16:14:33 UTC 26 Sep 2018, 17:15:41 UTC Completed and validated 298.58 10.08 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19104584 1661094666 26 Sep 2018, 16:06:32 UTC 26 Sep 2018, 17:10:52 UTC Completed and validated 301.60 12.81 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19102151 1661110962 26 Sep 2018, 16:03:18 UTC 26 Sep 2018, 17:06:01 UTC Completed and validated 296.60 8.03 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19100232 1661145317 26 Sep 2018, 16:00:09 UTC 26 Sep 2018, 17:01:13 UTC Completed and validated 295.63 8.33 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19099268 1661139650 26 Sep 2018, 15:58:33 UTC 26 Sep 2018, 17:01:13 UTC Completed and validated 296.61 8.45 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)

Averaging about 298 sec with 4 running at a time (GPU still under 100%) so around 75 sec per WU.


Will test with my RTX 2080 in a couple of days, but I'm assuming that with its lack of double-precision power it will do worse than the very old Titan and horribly worse than the Tesla.
ID: 67822 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
tictoc
Joined: 31 Dec 11
Posts: 17
Credit: 3,172,591,853
RAC: 4,671
Message 67824 - Posted: 26 Sep 2018, 23:28:34 UTC - in response to Message 67822.  
Last modified: 26 Sep 2018, 23:28:56 UTC

Will test with my RTX 2080 in a couple of days, but I'm assuming that with its lack of double-precision power it will do worse than the very old Titan and horribly worse than the Tesla.


The 20xx series GPUs' FP64 rate is 1/32 of FP32, so only marginally better than a 1080 Ti, which puts it in the same class as an RX 480 or an HD 5850.
ID: 67824 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote

©2024 Astroinformatics Group