New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new!
Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0
GTX 970, stock GPU 1329 MHz (CPU: i7 6700K @ 4.2 GHz) .......... 273 s ...... kashi
Driver 382.53, Win 8.1, BOINC 7.6.22. Average of 7 of the 227.23-credit tasks.
Joined: 27 Sep 17 Posts: 2 Credit: 76,052 RAC: 0
Hmm. If MilkyWay wants double precision and my Nvidia 750 Ti is pretty limited there compared to AMD cards, maybe my card is better off serving other projects. Still, the quick gratification of getting 200 points in rapid succession is nice. For what it's worth, my EVGA 750 Ti Superclocked at stock gets these done in 665 seconds with an i5-4460 running at stock 3.2 GHz (one core left free).

I only just learned that AMD cards have far more double-precision compute power and that MilkyWay uses it, so I came over here to see what was being said about it. I now have more respect for AMD hardware. I have wanted to upgrade to an 8-core Ryzen, but didn't think I'd have reason to consider AMD in the GPU slot (other than the driver interface looking pretty nice). Too bad modern AMD graphics cards are pretty much all gone; it doesn't seem like that's going to get much better soon.
Joined: 2 Oct 16 Posts: 167 Credit: 1,007,165,349 RAC: 12,296
"Hmm. If MilkyWay wants double precision and my Nvidia 750 Ti is pretty limited there compared to AMD cards, maybe my card is better off serving other projects. Still, the quick gratification of getting 200 points in rapid succession is nice."

Older AMD cards have better double precision (FP64): the 7970/280X generation and its derivatives. All newer consumer cards from Nvidia and AMD have poor FP64 performance, including that Maxwell 750 Ti. Strong FP64 is left to prosumer-or-above cards like the Nvidia Titan V at $3k.
Joined: 4 Mar 10 Posts: 65 Credit: 639,958,626 RAC: 0
Fake results from khryl. Do not compare old units and some "time" to today's units. Here are real results from khryl (all tasks MilkyWay@Home v1.46, opencl_ati_101):

| Task | Workunit | Sent | Reported | Status | Run time (s) | CPU time (s) | Credit |
|---|---|---|---|---|---|---|---|
| 2268037982 | 1580436922 | 24 Feb 2018, 4:57:44 UTC | 24 Feb 2018, 5:42:43 UTC | Completed, validation inconclusive | 70.00 | 13.91 | pending |
| 2267653732 | 1580248511 | 23 Feb 2018, 15:26:22 UTC | 24 Feb 2018, 1:09:54 UTC | Completed, validation inconclusive | 62.98 | 13.77 | pending |
| 2267644665 | 1580243934 | 23 Feb 2018, 15:11:30 UTC | 23 Feb 2018, 15:57:39 UTC | Completed, validation inconclusive | 76.09 | 13.64 | pending |
| 2267629937 | 1580236708 | 23 Feb 2018, 14:46:44 UTC | 23 Feb 2018, 15:31:18 UTC | Completed, validation inconclusive | 62.28 | 14.38 | pending |
| 2267530589 | 1580186570 | 23 Feb 2018, 12:04:10 UTC | 23 Feb 2018, 12:51:58 UTC | Completed and validated | 71.96 | 12.41 | 227.26 |
| 2267513420 | 1580177755 | 23 Feb 2018, 11:35:02 UTC | 23 Feb 2018, 12:22:09 UTC | Completed, validation inconclusive | 68.30 | 13.72 | pending |
| 2267469346 | 1580156059 | 23 Feb 2018, 10:21:09 UTC | 23 Feb 2018, 11:06:31 UTC | Completed, validation inconclusive | 72.32 | 14.05 | pending |
| 2267238655 | 1580034207 | 23 Feb 2018, 3:52:28 UTC | 23 Feb 2018, 4:37:22 UTC | Completed, validation inconclusive | 64.34 | 12.44 | pending |
| 2267119524 | 1579966285 | 23 Feb 2018, 0:29:12 UTC | 23 Feb 2018, 1:13:52 UTC | Completed, validation inconclusive | 65.85 | 12.00 | pending |
| 2267118431 | 1579965540 | 23 Feb 2018, 0:27:30 UTC | 23 Feb 2018, 1:13:52 UTC | Completed, validation inconclusive | 64.20 | 13.06 | pending |
| 2267116504 | 1579964234 | 23 Feb 2018, 0:24:14 UTC | 23 Feb 2018, 1:08:56 UTC | Completed, validation inconclusive | 61.27 | 12.78 | pending |
| 2267115908 | 1579963902 | 23 Feb 2018, 0:22:35 UTC | 23 Feb 2018, 1:08:56 UTC | Completed, validation inconclusive | 65.89 | 13.77 | pending |
| 2267115909 | 1579963903 | 23 Feb 2018, 0:22:35 UTC | 23 Feb 2018, 1:08:56 UTC | Completed, validation inconclusive | 68.44 | 12.39 | pending |
| 2266724060 | 1579880429 | 22 Feb 2018, 15:12:24 UTC | 22 Feb 2018, 15:57:26 UTC | Completed, validation inconclusive | 68.23 | 12.98 | pending |
| 2266255461 | 1579840695 | 22 Feb 2018, 2:58:17 UTC | 22 Feb 2018, 3:44:39 UTC | Completed, validation inconclusive | 66.12 | 12.39 | pending |

A 7970 is still best, about 38-45 sec )))
Joined: 5 Jul 12 Posts: 6 Credit: 7,136,476 RAC: 0
The new Volta is out and has a 1:2 FP64:FP32 ratio. It does about 6,900 GFLOPS at FP64, while the 7990 does about 1,946. That's a huge improvement, but the card will cost around $3k :(
Joined: 10 Dec 17 Posts: 47 Credit: 695,662,962 RAC: 0
You can pick up used server-class FirePro cards with strong FP64 for under $400. The S9100/S9150 have the best price/performance ratio: around ~2,500 GFLOPS FP64, versus roughly 1,000 GFLOPS for a 280X. I just got my S9150 up and running, paid $340 shipped from eBay. Two things to note with these: they are headless (no video connectors, so compute only) and they are passive (no fans), so you'll have to rig up your own cooling.

The workstation-class W8100/W9100 cards offer good FP64 performance and include video outputs and a built-in cooling fan, but are significantly more expensive: ~$1,000 and way, way up. The older FirePro 8000/9000 series do not offer competitive GFLOPS per watt (3.5, versus 9.5 for the 91xx and 10.7 for the S9150 specifically).

For Nvidia, the original Titan, Titan Black and dual-GPU Titan Z all offer strong FP64 at a favorable GFLOPS per watt (6.0, 6.6 and 7.2 respectively).

6990: 3.4 GFLOPS/W
7990: 5.0 GFLOPS/W
7970: 3.7 GFLOPS/W
280X: 4.1 GFLOPS/W

This is of course all dependent on your specific clock speeds, any thermal throttling, etc., but it's a useful reference point anyway. I have been compiling a spreadsheet and will share it here later. (No Google Docs access at work.)
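For anyone wanting to sanity-check GFLOPS-per-watt figures like these, here is a minimal sketch (Python). The FP64 throughput and board-power numbers in it are my own approximations from public spec sheets, not values taken from this thread, so treat the output as a rough cross-check rather than anything authoritative.

```python
# Rough FP64 efficiency comparison. Throughput and board-power figures are
# approximate, taken from public spec sheets (assumptions, not measurements).
cards = {
    # name: (peak FP64 GFLOPS, board power in watts)
    "FirePro S9150":   (2530, 235),
    "FirePro S9100":   (2110, 225),
    "R9 280X":         (1024, 250),
    "HD 7970":         (947,  250),
    "GTX Titan":       (1500, 250),
    "GTX Titan Black": (1707, 250),
}

# Print the cards sorted from most to least efficient.
for name, (gflops, watts) in sorted(cards.items(),
                                    key=lambda kv: kv[1][0] / kv[1][1],
                                    reverse=True):
    print(f"{name:>16}: {gflops:5d} GFLOPS FP64, {watts:3d} W, "
          f"{gflops / watts:4.1f} GFLOPS/W")
```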
Joined: 26 Mar 18 Posts: 24 Credit: 102,912,937 RAC: 0
Nvidia Tesla V100 16 GB, SXM2 interface, Ubuntu 16.04, CUDA 9.1:

| GPU time (s) | CPU time (s) | Credit |
|---|---|---|
| 16.31 | 13.95 | 227.63 |
| 17.30 | 15.26 | 229.05 |
| 16.21 | 13.95 | 227.63 |
| 16.23 | 11.73 | 228.52 |
| 15.20 | 13.08 | 228.13 |

With the above said, it can run 6 at a time averaging about 31 s each, so a bit above 5 s per WU. It's beautiful: https://imgur.com/Gsb3NiR
Joined: 26 Mar 18 Posts: 24 Credit: 102,912,937 RAC: 0
"For Nvidia, the original Titan, Titan Black and dual-GPU Titan Z all offer strong FP64 at a favorable GFLOPS per watt (6.0, 6.6, 7.2)."

The Wikipedia list covering these is pretty interesting too: https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units
Joined: 18 Nov 08 Posts: 291 Credit: 2,461,693,501 RAC: 0
Just saw this thread and got the following computations using this program.

Single S9100 with a Q9550S (Core 2 Quad), three concurrent work units:

| Run time (s) | CPU time (s) | Credit |
|---|---|---|
| 93.3 | 19.1 | 227.7 |
| 100.3 | 19.2 | 227.7 |
| 86.3 | 17.9 | 227.6 |
| 96.3 | 21.1 | 227.7 |
| 114.4 | 18.5 | 229.7 |
| 105.3 | 17.8 | 229.4 |
| 111.4 | 17.7 | 227.6 |
| 78.2 | 18.0 | 227.6 |
| 93.4 | 13.9 | 228.5 |
| 93.3 | 14.9 | 231.2 |
| 90.3 | 17.7 | 227.6 |
| 97.4 | 16.3 | 227.6 |
| 94.3 | 16.2 | 227.6 |
| 128.5 | 14.8 | 231.2 |
| 97.4 | 14.3 | 228.5 |
| 102.4 | 21.4 | 227.7 |
| 110.4 | 18.5 | 229.4 |
| 119.4 | 14.8 | 231.2 |
| 91.3 | 18.0 | 227.6 |
| 123.4 | 18.6 | 229.7 |
| AVG 101.3 | 17.4 | 228.6 |
| STD 12.7 | 2.1 | 1.3 |

101.3 / 3 = about 34 seconds for a single work unit.

Pair of S9000s (same GPU as the HD 7950) on an X5470 with a 771-to-775 adapter, five concurrent tasks:

| Run time (s) | CPU time (s) | Credit |
|---|---|---|
| 170.5 | 12.0 | 227.6 |
| 211.8 | 19.0 | 227.7 |
| 271.5 | 20.5 | 229.3 |
| 213.1 | 16.5 | 229.4 |
| 223.1 | 16.9 | 229.7 |
| 252.5 | 16.6 | 230.8 |
| 252.5 | 16.5 | 230.8 |
| 197.8 | 16.6 | 229.4 |
| 204.3 | 19.4 | 227.7 |
| 185.9 | 16.3 | 227.6 |
| 235.5 | 22.3 | 227.3 |
| 239.4 | 20.3 | 229.3 |
| 264.2 | 16.9 | 230.8 |
| 172.7 | 16.7 | 227.6 |
| 266.4 | 16.4 | 230.8 |
| 235.0 | 22.3 | 227.3 |
| 172.0 | 16.4 | 227.7 |
| 178.4 | 18.8 | 227.7 |
| 179.7 | 17.0 | 227.7 |
| 221.3 | 16.6 | 227.6 |
| AVG 217.4 | 17.7 | 228.7 |
| STD 33.1 | 2.3 | 1.3 |

217 / 5 = about 44 seconds per work unit on each board.
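For anyone reproducing the summary rows above, here is a minimal sketch (Python) of the arithmetic: mean and standard deviation of the run times, then divide by the number of concurrent tasks to get the effective wall time per work unit.

```python
import statistics

# S9100 run times from the table above (seconds), three tasks running at once.
run_times = [
    93.3, 100.3, 86.3, 96.3, 114.4, 105.3, 111.4, 78.2, 93.4, 93.3,
    90.3, 97.4, 94.3, 128.5, 97.4, 102.4, 110.4, 119.4, 91.3, 123.4,
]
concurrent = 3

avg = statistics.mean(run_times)
std = statistics.pstdev(run_times)   # population standard deviation
per_wu = avg / concurrent            # effective wall time per work unit

print(f"avg {avg:.1f} s, std {std:.1f} s, ~{per_wu:.0f} s per WU")
# -> avg 101.3 s, std 12.7 s, ~34 s per WU
```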
Joined: 22 Jun 18 Posts: 2 Credit: 917,408,116 RAC: 0
Ryzen 1700X - 32 GB RAM - 2x Titan V - Windows 10 64-bit
Running 8 concurrent WUs per card, averaging 65 s per WU: https://imgur.com/Jd7qCDE

```xml
<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>0.125</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```
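For anyone adapting that app_config to a different concurrency, a minimal sketch (Python; the helper name is mine, not part of BOINC): <gpu_usage> is simply the fraction of a GPU each task claims, so roughly 1 / gpu_usage tasks run per card, provided enough CPU threads are free to cover the <cpu_usage> reservation.

```python
def gpu_usage_for(tasks_per_gpu: int) -> float:
    """Fraction of a GPU each task should claim so that `tasks_per_gpu`
    MilkyWay tasks run concurrently on one card (hypothetical helper)."""
    return 1.0 / tasks_per_gpu

print(gpu_usage_for(8))   # 0.125, as in the app_config above
print(gpu_usage_for(5))   # 0.2

# Note: BOINC also needs <cpu_usage> free CPU threads per running task,
# otherwise it will start fewer tasks than this fraction suggests.
```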
Joined: 22 Jan 11 Posts: 375 Credit: 64,657,871 RAC: 0
Thanks for the replies and data, guys, good to see an ongoing discussion :). I will post an updated table when I get time (not much for here at the moment, I'm afraid), and when I've had answers to the questions below.

Mikey, JoeM, tictoc, ultraZ: did you use times from WUs with 227.23 credits?

JoeM: "Both the HD5870 and the HD6970 were mated with the 8350 CPUs. The HD5870 mated with the slightly faster one. Both 8350 CPUs are running stock, no overclock A?" Lol, err, you say they're both with 8350s running stock, but one CPU is faster??? You've lost me, mate ;).

ultraZ: Good to know that's still the case, and good to see numbers for it :). I think I need to create a second table showing WU times for concurrently run WUs, if the common type of WU isn't changing all the time.

DVDL: Wow! The Volta is insane, both in terms of FP64 performance and price! lol

mlek: Interesting, and good info :)

vseven: Quite a range of credit numbers and times there, is that typical now? (I haven't looked in ~6 months.) Btw, if you want your time in the table, it'll need to be an average of at least 5 of the 227.23-credit WUs.

BeemerBiker: Hmm, that's concerning, no 227.23-credit WUs? (PS. I've got an E46 330d, if you care about 4-wheeled vehicles ;))

Team AnandTech - SETI@H, DPAD, F@H, MW@H, A@H, LHC, POGS, R@H, Einstein@H, DHEP, WCG
Main rig - Ryzen 5 3600, MSI B450 G.Pro C. AC, RTX 3060Ti 8GB, 32GB DDR4 3200, Win 10 64bit
2nd rig - i7 4930k @4.1 GHz, HD 7870 XT 3GB(DS), 16GB DDR3 1866, Win7
Joined: 26 Mar 18 Posts: 24 Credit: 102,912,937 RAC: 0
Here is something more interesting. Titan V running stock clock rates (all tasks MilkyWay@Home v1.46, opencl_nvidia_101, Completed and validated, 227.23 credits, sent 9 Jul 2018, 13:28:03 UTC):

| Task | Workunit | Reported | Run time (s) | CPU time (s) |
|---|---|---|---|---|
| 2377879357 | 1633136006 | 9 Jul 2018, 13:51:29 UTC | 63.26 | 10.05 |
| 2377879358 | 1633136007 | 9 Jul 2018, 13:51:29 UTC | 63.40 | 10.00 |
| 2377879360 | 1633136009 | 9 Jul 2018, 13:51:29 UTC | 63.29 | 10.08 |
| 2377879368 | 1633031063 | 9 Jul 2018, 13:51:29 UTC | 63.33 | 10.00 |
| 2377879369 | 1633069821 | 9 Jul 2018, 13:51:29 UTC | 63.29 | 9.78 |
| 2377879117 | 1633015264 | 9 Jul 2018, 13:41:41 UTC | 64.30 | 10.34 |
| 2377879373 | 1633119141 | 9 Jul 2018, 13:51:29 UTC | 64.35 | 9.63 |

They are averaging just under 64 seconds each, BUT I'm running 7 at a time, so effectively they average just over 9 seconds each overall. GPU load floats around 90%. Adding another WU makes it hit 100% and is overall slower; running fewer WUs and my average per WU goes up. I'm also CPU-limited on this machine (4 cores total), so I'm only giving each WU 0.4 of a CPU, which is hurting me slightly.

Also this: Device peak FLOPS 5,964.50 GFLOPS. It's rated for a theoretical 6,900, so I'd say it's doing OK. I should get a chance to test the Tesla V100 again in a week or so; I'll try to get updated results for just the 227.23s. Given the full number of CPU cores and running 8 at a time, it should place just under 6 seconds average per WU.
Joined: 26 Mar 18 Posts: 24 Credit: 102,912,937 RAC: 0
Got to grab 20 minutes on our V100 machine (all tasks MilkyWay@Home v1.46, opencl_nvidia_101, Completed and validated, 227.23 credits, sent 9 Jul 2018, 15:07:37 UTC, reported 9 Jul 2018, 15:20:37 UTC):

| Task | Workunit | Run time (s) | CPU time (s) |
|---|---|---|---|
| 2377935709 | 1633165069 | 35.35 | 32.82 |
| 2377935710 | 1633165186 | 35.34 | 33.26 |
| 2377935457 | 1633166969 | 34.32 | 32.82 |
| 2377935472 | 1633160555 | 36.35 | 34.24 |
| 2377935747 | 1633055494 | 33.26 | 31.61 |
| 2377935759 | 1633137483 | 35.26 | 33.70 |
| 2377935761 | 1633160810 | 33.24 | 30.70 |

Again running 7 WUs at a time, so each is just under 35 sec; taking into account 7 running at once puts you right at 5 seconds a WU. Task details show 6,270.17 GFLOPS, just barely higher than the Titan V. I'm really surprised at how much better the Tesla V100 does than the Titan V, though. Clock speeds are about the same, and the Tesla has higher memory bandwidth, but I can't imagine that makes such a huge difference. On paper the Tesla technically has lower double-precision specs, yet it reports higher in my testing and performs almost twice as fast.
Joined: 13 Dec 12 Posts: 101 Credit: 1,782,758,310 RAC: 0
That... is insane! 5 sec/WU!
Joined: 16 Jul 18 Posts: 2 Credit: 64,668,103 RAC: 0
Hello, I'm newish to the numbers game. I have built a computer with the sole purpose of smashing data at home. I fear I don't know all the lingo yet for the BOINC world or these message boards. I have a few questions:

1. What program is everyone using to benchmark the "WU"?
2. Are all WUs the same size across the board on BOINC?
3. What is a "WU"? Work unit?

No Bad Days, Artemis
Joined: 16 Jul 18 Posts: 2 Credit: 64,668,103 RAC: 0
This is the only benchmark I can find:

7/30/2018 12:47:12 PM | | Benchmark results:
7/30/2018 12:47:12 PM | | Number of CPUs: 28
7/30/2018 12:47:12 PM | | 4871 floating point MIPS (Whetstone) per CPU
7/30/2018 12:47:12 PM | | 16805 integer MIPS (Dhrystone) per CPU
7/30/2018 12:47:13 PM | | Resuming computation
Joined: 13 Jun 09 Posts: 24 Credit: 137,536,729 RAC: 0
I have a W8100 and I ran 2 WUs at the same time and saw 100% usage with the default app. It seems like others are often running many more concurrently? All WUs were worth either 227.23 or 227.25 credits.

| Run time (s) | CPU time (s) |
|---|---|
| 69.12 | 17.61 |
| 77.13 | 16.77 |
| 73.16 | 17.39 |
| 77.13 | 17.34 |
| 77.15 | 16.91 |
| 66.1 | 16.56 |
| 65.15 | 16.34 |
| 77.16 | 15.14 |
| 69.15 | 14.89 |
| 78.15 | 15.02 |
| 83.16 | 15.91 |
| 71.15 | 17.47 |
| 78.15 | 19.14 |
| 79.15 | 16.66 |
| 71.13 | 15.88 |
| 69.15 | 17.09 |
| 79.16 | 17.95 |
| 76.15 | 17.08 |
| 67.11 | 17.5 |
| 73.16 | 18.17 |
| Average 74 | 17 |
| SD 5.1 | 1.1 |

74 / 2 = about 37 seconds for a single work unit.

The app only reports ~500 GFLOPS, which is well below the theoretical ~2,000 GFLOPS.
Joined: 23 Jan 10 Posts: 1 Credit: 23,515,226 RAC: 2,452
Ryzen 2200G integrated GPU running at stock 1.1 GHz. The CPU is normally throttled to 2.3 GHz. Both are undervolted 15% and I think they will go lower. DDR4 at 3 GT/s. The machine is 100% passively cooled and draws only 45 W at the wall. (All tasks MilkyWay@Home v1.46, opencl_ati_101, Completed and validated, computer 702195, sent 6 Sep 2018, 10:10:31 UTC.)

| Task | Workunit | Reported | Run time (s) | CPU time (s) | Credit |
|---|---|---|---|---|---|
| 2543621 | 1653232458 | 6 Sep 2018, 13:58:38 UTC | 482.49 | 24.08 | 203.92 |
| 2543889 | 1653232725 | 6 Sep 2018, 11:15:00 UTC | 453.40 | 19.09 | 227.62 |
| 2543379 | 1653232216 | 6 Sep 2018, 12:48:26 UTC | 480.38 | 23.84 | 203.92 |
| 2543891 | 1653232727 | 6 Sep 2018, 12:32:41 UTC | 463.37 | 20.11 | 227.62 |
| 2543896 | 1653232732 | 6 Sep 2018, 11:22:27 UTC | 450.45 | 18.98 | 227.62 |
| 2543898 | 1653232734 | 6 Sep 2018, 14:30:06 UTC | 459.55 | 18.69 | 227.62 |
| 2543900 | 1653232736 | 6 Sep 2018, 14:46:00 UTC | 488.53 | 18.61 | 227.62 |
| 2543905 | 1653232741 | 6 Sep 2018, 13:27:00 UTC | 459.24 | 19.75 | 227.62 |
| 2543907 | 1653232743 | 6 Sep 2018, 10:43:31 UTC | 467.55 | 20.61 | 227.62 |
| 2543908 | 1653232744 | 6 Sep 2018, 13:42:29 UTC | 465.41 | 19.27 | 227.62 |
Joined: 26 Mar 18 Posts: 24 Credit: 102,912,937 RAC: 0
Tesla V100 SXM2 16 GB running CUDA 10.0 and Nvidia 410.28 drivers. In previous testing on CUDA 9.2 / 390.* drivers, the best I could do was run 6 WUs at a time, which averaged 31 seconds per WU, or 31/6 = 5.17 seconds a WU. On the new software (all tasks MilkyWay@Home v1.46, opencl_nvidia_101, Completed and validated, 227.62 credits, sent 26 Sep 2018, 17:30:01 UTC):

| Task | Workunit | Reported | Run time (s) | CPU time (s) |
|---|---|---|---|---|
| 19155972 | 1661153815 | 26 Sep 2018, 17:47:54 UTC | 26.16 | 24.37 |
| 19155974 | 1661153915 | 26 Sep 2018, 17:49:31 UTC | 25.13 | 23.43 |
| 19155977 | 1661158879 | 26 Sep 2018, 17:49:31 UTC | 28.12 | 26.69 |
| 19155978 | 1661158880 | 26 Sep 2018, 17:47:54 UTC | 26.11 | 24.48 |
| 19155983 | 1661168505 | 26 Sep 2018, 17:51:07 UTC | 28.18 | 26.03 |
| 19155984 | 1661168506 | 26 Sep 2018, 17:39:06 UTC | 35.40 | 32.05 |
| 19155986 | 1661169020 | 26 Sep 2018, 17:49:31 UTC | 28.15 | 26.10 |
| 19155988 | 1661169637 | 26 Sep 2018, 17:49:31 UTC | 27.11 | 25.12 |
| 19155995 | 1661172790 | 26 Sep 2018, 17:39:06 UTC | 33.39 | 29.64 |
| 19155841 | 1661157523 | 26 Sep 2018, 17:47:54 UTC | 26.17 | 24.44 |
| 19155635 | 1661163438 | 26 Sep 2018, 17:46:19 UTC | 28.13 | 26.18 |

That averages about 28.4 s with 6 at a time, so 4.73 sec a WU, an overall drop of about 0.45 sec on average versus the older software.

So I did some testing to see whether 6 WUs at a time was still optimal. Bumped it up to 7 WUs at a time and got an average of 33.2 s / 7 = 4.74, almost exactly the same average and no reason to tax the GPU more. Dropped it to 5 WUs at a time and got an average of 21.34 s per WU / 5 = 4.27, a pretty big drop on average. Tried 4 WUs at a time and it averaged 18.4 s per WU / 4 = 4.6, better than 6 or 7 but not as good as 5.

So the sweet spot for a V100 on CUDA 9.2 / 390.* was 6 WUs at a time, but now it's 5, at least for the 227-credit WUs. And although the change is small, the volume is huge: the fastest I could get on the old software was 5.17 seconds average, and it is now 4.27 on this software, so gaining an average of 0.9 seconds per WU equates to about 3,500 extra WUs per day, which for a software-only change is pretty neat.

Now if we count all WUs in total, the 227s and 203s and everything, the average is actually no better on the new software. For example, a snapshot (again running 5 at a time, all Completed and validated, opencl_nvidia_101):

| Task | Workunit | Sent | Reported | Run time (s) | CPU time (s) | Credit |
|---|---|---|---|---|---|---|
| 19189880 | 1661146709 | 26 Sep 2018, 18:25:25 UTC | 26 Sep 2018, 18:31:55 UTC | 22.11 | 20.25 | 227.62 |
| 19189881 | 1661152293 | 26 Sep 2018, 18:25:25 UTC | 26 Sep 2018, 18:31:55 UTC | 23.11 | 20.54 | 227.62 |
| 19189890 | 1661174490 | 26 Sep 2018, 18:25:25 UTC | 26 Sep 2018, 18:28:40 UTC | 31.15 | 29.10 | 203.92 |
| 19190181 | 1661189571 | 26 Sep 2018, 18:25:25 UTC | 26 Sep 2018, 18:28:40 UTC | 31.16 | 29.37 | 203.92 |
| 19189930 | 1660606696 | 26 Sep 2018, 18:25:25 UTC | 26 Sep 2018, 18:31:55 UTC | 29.12 | 27.49 | 203.92 |
| 19189931 | 1660626975 | 26 Sep 2018, 18:25:25 UTC | 26 Sep 2018, 18:27:03 UTC | 28.21 | 26.36 | 203.92 |
| 19189177 | 1661161661 | 26 Sep 2018, 18:25:25 UTC | 26 Sep 2018, 18:31:55 UTC | 21.13 | 19.92 | 227.62 |
| 19184908 | 1661139963 | 26 Sep 2018, 18:17:09 UTC | 26 Sep 2018, 18:23:48 UTC | 28.16 | 26.41 | 203.92 |
| 19184909 | 1661140017 | 26 Sep 2018, 18:17:09 UTC | 26 Sep 2018, 18:22:41 UTC | 21.11 | 19.19 | 227.62 |

The 203-credit WUs are taking 20% - 30% longer than the 227s, which I can't figure out; in previous tests they were quicker. So my net average with all WUs considered is slightly worse.

I also ran some WUs on an old GTX Titan (original) with FP32 mode enabled (which doesn't speed up the WU but does allow multiples to run without pegging the GPU):

| Task | Workunit | Sent | Reported | Run time (s) | CPU time (s) | Credit |
|---|---|---|---|---|---|---|
| 19109285 | 1661134247 | 26 Sep 2018, 16:14:33 UTC | 26 Sep 2018, 17:15:41 UTC | 298.58 | 10.08 | 227.62 |
| 19104584 | 1661094666 | 26 Sep 2018, 16:06:32 UTC | 26 Sep 2018, 17:10:52 UTC | 301.60 | 12.81 | 227.62 |
| 19102151 | 1661110962 | 26 Sep 2018, 16:03:18 UTC | 26 Sep 2018, 17:06:01 UTC | 296.60 | 8.03 | 227.62 |
| 19100232 | 1661145317 | 26 Sep 2018, 16:00:09 UTC | 26 Sep 2018, 17:01:13 UTC | 295.63 | 8.33 | 227.62 |
| 19099268 | 1661139650 | 26 Sep 2018, 15:58:33 UTC | 26 Sep 2018, 17:01:13 UTC | 296.61 | 8.45 | 227.62 |

Averaging about 298 sec with 4 running at a time (GPU still under 100%), so around 75 sec per WU. Will test with my RTX 2080 in a couple days, but I'm assuming that with its lack of double-precision power it will do worse than the very old Titan and horribly worse than the Tesla.
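A minimal sketch (Python) of the throughput arithmetic behind that "extra WUs per day" figure, using the per-WU times quoted in the post above.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def wus_per_day(seconds_per_wu: float) -> float:
    """Sustained work units per day at a given effective per-WU time."""
    return SECONDS_PER_DAY / seconds_per_wu

old = wus_per_day(5.17)   # CUDA 9.2 / 390.* drivers, 6 concurrent tasks
new = wus_per_day(4.27)   # CUDA 10.0 / 410.28 drivers, 5 concurrent tasks

print(f"old: {old:.0f} WU/day, new: {new:.0f} WU/day, "
      f"gain: {new - old:.0f} WU/day")   # roughly 3,500 extra per day
```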
Joined: 31 Dec 11 Posts: 17 Credit: 3,172,528,345 RAC: 0
"Will test with my RTX 2080 in a couple days, but I'm assuming that with its lack of double-precision power it will do worse than the very old Titan and horribly worse than the Tesla."

FP64 on the 20xx-series GPUs is 1/32 of FP32, so only marginally better than a 1080 Ti. That puts it in the same class as an RX 480 or an HD 5850.
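A minimal sketch (Python) of where that puts the 2080: FP64 throughput estimated as FP32 times the FP64:FP32 ratio. The FP32 figures and ratios are approximations from public spec sheets (my assumptions, not from this thread), but they show why the 2080 lands in the same ballpark as an RX 480 or HD 5850.

```python
# Approximate FP64 throughput derived from FP32 specs and FP64:FP32 ratios
# (assumed spec-sheet values, not measurements).
cards = {
    # name: (FP32 GFLOPS, FP64 ratio)
    "RTX 2080":    (10000, 1 / 32),
    "GTX 1080 Ti": (11300, 1 / 32),
    "RX 480":      (5830,  1 / 16),
    "HD 5850":     (2090,  1 / 5),
}

for name, (fp32, ratio) in cards.items():
    print(f"{name:>12}: ~{fp32 * ratio:.0f} GFLOPS FP64")
```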