Posts by vseven

1) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68901)
Posted 17 Jul 2019 by vseven
Post:
Thanks for the examples. I'll implement the script to rerun the update command and see how it goes.
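A rough sketch of what such a script could look like, assuming the BOINC command-line tool `boinccmd` is on the PATH and the client is attached under the project URL shown (worth checking locally with `boinccmd --get_project_status`). The function names, the 2-minute interval, and the URL constant are illustrative assumptions, not something taken from the examples in this thread.

```python
import subprocess
import time

# Assumption: the URL the BOINC client is attached to; verify it locally.
PROJECT_URL = "http://milkyway.cs.rpi.edu/milkyway/"


def build_update_command(project_url):
    """Return the boinccmd invocation that asks the client to contact the project."""
    return ["boinccmd", "--project", project_url, "update"]


def run_update_loop(project_url, interval_s, iterations,
                    runner=subprocess.run, sleeper=time.sleep):
    """Issue a project update every interval_s seconds, `iterations` times.

    runner and sleeper are injectable so the loop can be dry-run without BOINC.
    """
    for _ in range(iterations):
        runner(build_update_command(project_url), check=False)
        sleeper(interval_s)


# Example: poke the scheduler every 2 minutes for roughly a day.
# run_update_loop(PROJECT_URL, interval_s=120, iterations=720)
```

The server may enforce its own minimum delay between requests, so a very short interval probably won't fetch work any faster.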
2) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68891)
Posted 12 Jul 2019 by vseven
Post:
Not sure if there is a definitive fix for this but I'm sitting idle for 10 - 15 minutes at a time waiting on work. If I manually update I get it but that's not feasible in the middle of the night.

Long story short, I have access to an AI learning machine with a Tesla V100 for 2 - 3 days per month. Once I'm done with my actual work I'll run this project. Currently I can churn through WUs at an average of 5.2 seconds per WU, so I kill all 300 in about 26 minutes. Then my machine sits idle for 10 - 14 minutes, gets another batch of 300, and repeats. You can even see it in the logs for the CPU:



That 12 minutes of average idle time after every 26 minutes of work is the equivalent of 220 WU not being done an hour, or over 5,000 a day. Granted, I only run the machine for a couple of days a month, but if there is a solution for this I'd appreciate it.
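For anyone who wants to check the math, here is a small sketch of the idle-time estimate. It uses only the numbers from this post (5.2 seconds per WU, 300 WU per batch, 12 minutes idle); the function name is made up for illustration.

```python
SECONDS_PER_WU = 5.2      # average effective time per WU from the post
BATCH_SIZE = 300          # WUs received per request
IDLE_MINUTES = 12.0       # average idle gap between batches


def lost_wu_per_hour(seconds_per_wu, batch_size, idle_minutes):
    """WUs that could have been crunched during idle gaps, per wall-clock hour."""
    busy_minutes = batch_size * seconds_per_wu / 60.0
    idle_fraction = idle_minutes / (busy_minutes + idle_minutes)
    idle_seconds_per_hour = idle_fraction * 3600.0
    return idle_seconds_per_hour / seconds_per_wu


per_hour = lost_wu_per_hour(SECONDS_PER_WU, BATCH_SIZE, IDLE_MINUTES)
per_day = per_hour * 24
print(f"busy time per batch: {BATCH_SIZE * SECONDS_PER_WU / 60:.0f} min")
print(f"lost per hour: {per_hour:.0f}, lost per day: {per_day:.0f}")
```

This works out to roughly 219 WU per hour and a bit over 5,200 per day, in line with the figures above.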

I'm not sure whether the WU cache limit could be raised based on how quickly a host is turning work back in. Obviously I would be happy if I got 1,000 WUs at a time.
3) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 68883)
Posted 8 Jul 2019 by vseven
Post:
Rechecking with the latest nVidia drivers and the updated CUDA software.

OS: Ubuntu 19.04
CPU: Intel(R) Xeon(R) CPU @ 2.20GHz [Family 6 Model 79 Stepping 0]
GPU: NVIDIA Tesla V100-SXM2-16GB
nVidia Driver: 418.67
CUDA: 10.1 Patch 1


261707342 1777594136 8 Jul 2019, 16:39:53 UTC 8 Jul 2019, 16:48:31 UTC Completed and validated 10.04 8.19 227.53 Milkyway@home Separation v1.46 (opencl_nvidia_101) x86_64-pc-linux-gnu
261708170 1777194521 8 Jul 2019, 16:39:53 UTC 8 Jul 2019, 16:48:31 UTC Completed and validated 10.07 8.52 227.53 Milkyway@home Separation v1.46 (opencl_nvidia_101) x86_64-pc-linux-gnu
261708449 1777589688 8 Jul 2019, 16:39:53 UTC 8 Jul 2019, 16:48:31 UTC Completed and validated 10.03 8.25 227.52 Milkyway@home Separation v1.46 (opencl_nvidia_101) x86_64-pc-linux-gnu
261707986 1777527520 8 Jul 2019, 16:39:53 UTC 8 Jul 2019, 16:48:31 UTC Completed and validated 10.03 8.23 227.52 Milkyway@home Separation v1.46 (opencl_nvidia_101) x86_64-pc-linux-gnu
261708294 1777626487 8 Jul 2019, 16:39:52 UTC 8 Jul 2019, 16:48:31 UTC Completed and validated 10.03 8.52 227.53 Milkyway@home Separation v1.46 (opencl_nvidia_101) x86_64-pc-linux-gnu
261708298 1777628745 8 Jul 2019, 16:39:52 UTC 8 Jul 2019, 16:48:31 UTC Completed and validated 10.03 8.62 227.53 Milkyway@home Separation v1.46 (opencl_nvidia_101) x86_64-pc-linux-gnu

Shaved almost a second off compared with the older driver/software, which is impressive considering it was already down to 10.7. The current "244.*" tasks showed a similar decrease, from around 11.5 to 10.5. The results above were run one at a time. Running multiples gave just slightly better performance than before: I was at 5 WU at a time for an average of 5.7 seconds per WU and now it's at 5.2 seconds per WU (which looks like it scaled correctly, like the single WU did, at around 10% faster).
4) Message boards : Number crunching : Download Stalled? (Message 68477)
Posted 5 Apr 2019 by vseven
Post:
Yeah, I'm getting 6 WU, then nothing for 20 - 30 minutes, then another 6. Those 6 last less than a minute.
5) Message boards : Number crunching : Benchmark thread 1-2019 on - GPU & CPU times wanted for new WUs, old & new hardware! (Message 68222)
Posted 7 Mar 2019 by vseven
Post:
Please state what speed & type CPU you have: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Please state GPU clock speeds if overclocked or state 'stock': All stock
It would also be useful if you could state your BOINC & driver version & OS: BOINC 7.12.0, nVidia graphics drivers 418.39, CUDA 10.1, Ubuntu 18.10.


Gonna break this down between a bunch of higher end nVidia Tesla machines. Freshly installed Ubuntu 18.10 virtual machine with 4 CPU cores and 15 GB RAM assigned. Did 10 of each, all 227.* WU.

Single WU Runs

nVidia Tesla K80 - Average: 49.43 seconds
165570106 1732138206 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 14:41:39 UTC Completed and validated 49.09 26.92 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165570113 1732158674 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 14:41:39 UTC Completed and validated 50.07 27.20 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165570118 1732223740 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 15:03:10 UTC Completed and validated 49.23 27.02 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165570141 1732196470 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 15:03:10 UTC Completed and validated 49.11 27.14 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165569918 1732134432 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 15:03:10 UTC Completed and validated 49.11 26.93 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165570175 1732138120 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 15:03:10 UTC Completed and validated 50.22 27.43 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
165570176 1732138124 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 15:03:10 UTC Completed and validated 50.18 27.35 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
165570177 1732138184 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 15:03:10 UTC Completed and validated 49.08 26.82 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165569941 1732159261 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 14:41:39 UTC Completed and validated 49.18 27.04 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165569947 1732227368 7 Mar 2019, 14:30:35 UTC 7 Mar 2019, 15:03:10 UTC Completed and validated 49.07 26.15 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)


nVidia Tesla P100 PCIe - Average: 14.96 seconds
165630500 1732177626 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 16:05:20 UTC Completed and validated 14.06 12.69 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630251 1732267476 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 16:09:33 UTC Completed and validated 15.06 13.31 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630849 1732210271 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 16:05:20 UTC Completed and validated 15.03 13.05 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630800 1732006193 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:05:20 UTC Completed and validated 15.08 13.68 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630835 1731849889 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:09:33 UTC Completed and validated 15.03 12.99 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630600 1732260102 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:11:30 UTC Completed and validated 15.12 13.11 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630667 1732288252 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:05:20 UTC Completed and validated 15.07 12.99 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165577925 1732158655 7 Mar 2019, 14:41:39 UTC 7 Mar 2019, 16:00:11 UTC Completed and validated 15.03 12.91 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
165577928 1732161593 7 Mar 2019, 14:41:39 UTC 7 Mar 2019, 16:05:20 UTC Completed and validated 15.04 13.31 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
165577929 1732161594 7 Mar 2019, 14:41:39 UTC 7 Mar 2019, 16:05:20 UTC Completed and validated 15.05 13.26 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)


nVidia Tesla V100 SXM2 - Average: 11.04 seconds
165630250 1732267472 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.03 9.34 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630577 1732299799 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.03 9.07 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630578 1732300744 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.02 9.52 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630647 1732259839 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.04 9.10 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630832 1731849877 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.03 9.10 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630834 1731849884 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.03 9.56 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630844 1732133330 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.03 9.08 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630595 1732208528 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.05 8.86 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630669 1732289928 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.04 9.55 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630674 1732290272 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:52:41 UTC Completed and validated 11.06 9.20 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)



Note: There is some type of "rounding" going on when a WU processes this quickly. If you look at both the P100 and V100 results, all the times are just over an even second. I find it hard to believe that I have a bunch of 11 second times with no 10.83, 11.24, etc. in between, but that's what I got.



Multiple WU Runs

nVidia Tesla V100 SXM2 - Average: 24.83 seconds @ 5 WU at a time (4.97 sec per WU average)
(Note: CPU cores increased from 4 -> 8 for this run to support the number of WU's)
165630828 1731799179 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 16:59:11 UTC Completed and validated 26.15 24.14 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630850 1732228852 7 Mar 2019, 16:00:12 UTC 7 Mar 2019, 17:00:48 UTC Completed and validated 26.16 24.01 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630803 1732050297 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:55:56 UTC Completed and validated 24.10 22.38 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630807 1732229295 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:55:56 UTC Completed and validated 23.12 21.78 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630297 1732242479 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:57:33 UTC Completed and validated 23.12 20.85 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630298 1732242485 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:59:11 UTC Completed and validated 24.09 22.18 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630300 1732242487 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 17:03:17 UTC Completed and validated 25.12 23.43 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630571 1732093880 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:57:33 UTC Completed and validated 25.08 22.97 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630572 1732149794 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 17:01:40 UTC Completed and validated 27.17 23.75 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
165630575 1732260945 7 Mar 2019, 16:00:11 UTC 7 Mar 2019, 16:59:11 UTC Completed and validated 24.17 22.22 227.16 MilkyWay@Home v1.46 (opencl_nvidia_101)
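
The per-WU figure for the multiple-WU run comes from dividing the average run time by the number of WUs sharing the GPU. A quick sketch of that calculation, using the ten run times listed above (the helper name is just for illustration):

```python
from statistics import mean

# Run times (seconds) copied from the ten V100 results listed above.
run_times = [26.15, 26.16, 24.10, 23.12, 23.12, 24.09, 25.12, 25.08, 27.17, 24.17]
CONCURRENT_WUS = 5  # WUs sharing the GPU at once


def effective_seconds_per_wu(times, concurrent):
    """Mean wall-clock run time divided by the number of WUs running in parallel."""
    return mean(times) / concurrent


avg = mean(run_times)
eff = effective_seconds_per_wu(run_times, CONCURRENT_WUS)
print(f"average run time: {avg:.2f} s, effective: {eff:.2f} s per WU")
```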
6) Message boards : Number crunching : AMD recently announced Radeon VII with 6.9TFLOPS of FP64 (1:2) for only $699???!!! (Message 68029)
Posted 17 Jan 2019 by vseven
Post:


- Titan V costs $2,999 (DP of 7,450 GFLOPS)



Just an FYI - the Titan V, on paper, has a DP of 7.4 TFLOPS, but in real-world performance for Milkyway it's about half the speed of a Tesla V100 (which is rated the same). Then again, it is half the price of a V100, so I guess that makes sense.
7) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 68028)
Posted 17 Jan 2019 by vseven
Post:

Btw Vseven, you mentioned 'new software', what new s/w?
Also what CPU was that V100 running on?

Sorry, just saw this. New software was CUDA 10.0 (old was 9.2) and a driver update. And here is an updated Tesla V100 run on the same computer as above. The WU credits changed on me... I'm getting a mix of 243.63 and 227.?? where the last digits are changing. I'll just post both:

121234701 1710812038 17 Jan 2019, 15:53:19 UTC 17 Jan 2019, 16:13:24 UTC Completed and validated 11.02 8.83 243.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
121234702 1710812041 17 Jan 2019, 15:53:19 UTC 17 Jan 2019, 16:13:24 UTC Completed and validated 11.02 8.84 243.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
121234703 1710812771 17 Jan 2019, 15:53:19 UTC 17 Jan 2019, 16:13:24 UTC Completed and validated 11.02 8.84 243.63 MilkyWay@Home v1.46 (opencl_nvidia_101)
121234713 1710700163 17 Jan 2019, 15:53:19 UTC 17 Jan 2019, 16:13:24 UTC Completed and validated 11.02 9.69 243.63 MilkyWay@Home v1.46 (opencl_nvidia_101)
121234930 1710821654 17 Jan 2019, 15:53:19 UTC 17 Jan 2019, 16:13:24 UTC Completed and validated 12.01 9.79 243.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
121233199 1710670465 17 Jan 2019, 15:51:40 UTC 17 Jan 2019, 16:13:24 UTC Completed and validated 12.02 9.83 243.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
121233201 1710698699 17 Jan 2019, 15:51:40 UTC 17 Jan 2019, 16:13:24 UTC Completed and validated 12.01 9.74 243.63 MilkyWay@Home v1.46 (opencl_nvidia_101)

So for the 243.63 WU's I'm averaging 11.45 seconds per WU. And the other ones:

121250325 1710802738 17 Jan 2019, 16:15:01 UTC 17 Jan 2019, 16:27:26 UTC Completed and validated 11.04 9.63 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
121250345 1710762998 17 Jan 2019, 16:15:01 UTC 17 Jan 2019, 16:27:26 UTC Completed and validated 10.02 8.04 227.13 MilkyWay@Home v1.46 (opencl_nvidia_101)
121250348 1710815959 17 Jan 2019, 16:15:01 UTC 17 Jan 2019, 16:27:26 UTC Completed and validated 12.04 9.95 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)
121250493 1710818169 17 Jan 2019, 16:15:01 UTC 17 Jan 2019, 16:27:26 UTC Completed and validated 10.02 7.97 227.13 MilkyWay@Home v1.46 (opencl_nvidia_101)
121250495 1710818832 17 Jan 2019, 16:15:01 UTC 17 Jan 2019, 16:27:26 UTC Completed and validated 10.03 8.03 227.13 MilkyWay@Home v1.46 (opencl_nvidia_101)
121250244 1710765704 17 Jan 2019, 16:15:01 UTC 17 Jan 2019, 16:27:26 UTC Completed and validated 10.06 8.76 227.15 MilkyWay@Home v1.46 (opencl_nvidia_101)
121249480 1710829880 17 Jan 2019, 16:15:01 UTC 17 Jan 2019, 16:27:26 UTC Completed and validated 10.03 7.90 227.13 MilkyWay@Home v1.46 (opencl_nvidia_101)
121250523 1710454414 17 Jan 2019, 16:15:01 UTC 17 Jan 2019, 16:27:26 UTC Completed and validated 12.06 10.10 227.17 MilkyWay@Home v1.46 (opencl_nvidia_101)

For the 227.?? WU's I'm averaging about 10.7 seconds per WU.


But as I mentioned a couple posts up the sweet spot is 5 WU at a time. Taking into account all the different WU's this nets around a 27.9 second average which divided by 5 at a time is around 5.6 seconds per WU.


NOTE: There are two different Tesla V100s, a PCIe version and an SXM2 version. The SXM2 is what I'm using and it runs faster than the PCIe version due to a quicker interface. The exact model is NVIDIA Tesla V100-SXM2-16GB. In my testing for Milkyway the SXM2 version runs about 10% faster. Might want to note "Tesla V100 (SXM2)" in the benchmarks.


If you want a Tesla P100 I can probably get that too. I can tell you it will be roughly half the speed of a V100.
8) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 68026)
Posted 17 Jan 2019 by vseven
Post:

vseven
What CPU does that machine have? (I'm not trawling through pages of this thread & I don't see you listed in the AnandTech thread).


For that benchmark (Tesla T4) it was an Intel Xeon CPU E5-2683 v4 @ 2.10GHz, but only 8 cores were assigned to the Ubuntu VM. I don't think HT matters since it was a single WU at a time.

I can also rerun the Tesla V100 to get you the CPU on that but from what I remember it was the same.
9) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 68014)
Posted 16 Jan 2019 by vseven
Post:
Got to play with a Tesla T4 for a little bit. Same machine as my Tesla V100 test a couple months ago. CUDA 10.0 and nVidia 410.28 drivers, 8 CPU cores. The T4, kinda as expected, is slower in double precision than its predecessor V100:

119256158 1709623437 15 Jan 2019, 15:26:28 UTC 16 Jan 2019, 13:43:18 UTC Completed and validated 151.29 150.00 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
119256165 1709847295 15 Jan 2019, 15:26:28 UTC 16 Jan 2019, 14:53:50 UTC Completed and validated 151.34 149.90 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
119256166 1709847329 15 Jan 2019, 15:26:28 UTC 16 Jan 2019, 13:35:04 UTC Completed and validated 151.17 149.10 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
119256168 1709847342 15 Jan 2019, 15:26:28 UTC 16 Jan 2019, 14:37:58 UTC Completed and validated 151.21 150.09 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
119255936 1709840508 15 Jan 2019, 15:26:28 UTC 16 Jan 2019, 14:19:00 UTC Completed and validated 151.35 149.79 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
119256225 1709852404 15 Jan 2019, 15:26:28 UTC 16 Jan 2019, 14:25:53 UTC Completed and validated 151.43 150.00 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)


One WU maxes out the GPU and takes about 151 seconds, so overall it's about 30x slower than the V100 when you consider it's taking almost 6 times as long per WU and can only do one at a time.

Tesla V100 is still the king for the foreseeable future.
10) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 67822)
Posted 26 Sep 2018 by vseven
Post:
Tesla V100 SXM2 16Gb running CUDA 10.0 and nVidia 410.28 drivers. In previous testing on CUDA 9.2 / 390.* drivers the best I could do was run 6 WU at a time which averaged 31 seconds per WU or 31/6 = 5.17 seconds a WU. On the new software:

19155972 1661153815 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:47:54 UTC Completed and validated 26.16 24.37 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155974 1661153915 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:49:31 UTC Completed and validated 25.13 23.43 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155977 1661158879 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:49:31 UTC Completed and validated 28.12 26.69 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155978 1661158880 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:47:54 UTC Completed and validated 26.11 24.48 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155983 1661168505 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:51:07 UTC Completed and validated 28.18 26.03 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155984 1661168506 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:39:06 UTC Completed and validated 35.40 32.05 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155986 1661169020 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:49:31 UTC Completed and validated 28.15 26.10 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155988 1661169637 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:49:31 UTC Completed and validated 27.11 25.12 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155995 1661172790 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:39:06 UTC Completed and validated 33.39 29.64 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155841 1661157523 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:47:54 UTC Completed and validated 26.17 24.44 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19155635 1661163438 26 Sep 2018, 17:30:01 UTC 26 Sep 2018, 17:46:19 UTC Completed and validated 28.13 26.18 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)

Averaging about 28.4 seconds with 6 at a time, so 4.73 sec a WU, an overall drop of about 0.45 sec on average from the older software. So I did some testing to see if 6 WU at a time was still optimal.

Bumped it up to 7 WU at a time and got an average of 33.2 / 7 = 4.74, so almost the exact same average and no reason to tax the GPU more.

Dropped it to 5 WU at a time and got an average of 21.34 per WU / 5 = 4.27...a pretty big drop on average.

Tried 4 WU at a time and it averaged 18.4 per WU / 4 = 4.6, so better than 6 or 7 but not as good as 5.

So the sweet spot for a V100 on CUDA 9.2 / 390.* was 6 WU at a time but now it's 5 WU, at least when talking about the 227 credit WUs. And although the change is small, the volume is huge. I.e. the fastest I could get on the old software, a 5.17 second average, is now 4.27 on this software, so gaining an average of 0.9 seconds per WU equates to about 3,500 extra WU per day, which for a software-only change is pretty neat.
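
The sweet-spot comparison and the extra-WU-per-day estimate can be reproduced with a few lines. This is only a sketch built from the averages quoted in this post; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Average run time per WU (seconds) at each concurrency level, from the tests above.
avg_run_time = {4: 18.4, 5: 21.34, 6: 28.4, 7: 33.2}
OLD_BEST = 31.0 / 6  # CUDA 9.2 / 390.* drivers: 31 s at 6 WU at a time

# Effective time per WU = run time / number of WUs sharing the GPU.
effective = {n: t / n for n, t in avg_run_time.items()}
best_n = min(effective, key=effective.get)

extra_per_day = SECONDS_PER_DAY / effective[best_n] - SECONDS_PER_DAY / OLD_BEST

for n in sorted(effective):
    print(f"{n} at a time: {effective[n]:.2f} s per WU")
print(f"sweet spot: {best_n} at a time, about {extra_per_day:.0f} extra WU per day")
```

With these inputs the best setting is 5 at a time, and the gain comes out at roughly 3,500 WU per day.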


Now if we count all WU's in total, the 227's and 203's and everything, the average is actually no better on the new software. For example a snapshot (again running 5 at a time):

19189880 1661146709 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:31:55 UTC Completed and validated 22.11 20.25 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189881 1661152293 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:31:55 UTC Completed and validated 23.11 20.54 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189890 1661174490 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:28:40 UTC Completed and validated 31.15 29.10 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19190181 1661189571 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:28:40 UTC Completed and validated 31.16 29.37 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189930 1660606696 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:31:55 UTC Completed and validated 29.12 27.49 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189931 1660626975 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:27:03 UTC Completed and validated 28.21 26.36 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19189177 1661161661 26 Sep 2018, 18:25:25 UTC 26 Sep 2018, 18:31:55 UTC Completed and validated 21.13 19.92 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19184908 1661139963 26 Sep 2018, 18:17:09 UTC 26 Sep 2018, 18:23:48 UTC Completed and validated 28.16 26.41 203.92 MilkyWay@Home v1.46 (opencl_nvidia_101)
19184909 1661140017 26 Sep 2018, 18:17:09 UTC 26 Sep 2018, 18:22:41 UTC Completed and validated 21.11 19.19 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)


The 203 credit WUs are taking 20% - 30% longer than the 227s, which I can't figure out. In previous tests they were quicker. So my net average with all WUs considered is slightly worse.




I also ran some WUs on an old GTX Titan (original) with FP32 mode enabled (which doesn't speed up the WU but does allow multiples to run without pegging the GPU):

19109285 1661134247 26 Sep 2018, 16:14:33 UTC 26 Sep 2018, 17:15:41 UTC Completed and validated 298.58 10.08 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19104584 1661094666 26 Sep 2018, 16:06:32 UTC 26 Sep 2018, 17:10:52 UTC Completed and validated 301.60 12.81 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19102151 1661110962 26 Sep 2018, 16:03:18 UTC 26 Sep 2018, 17:06:01 UTC Completed and validated 296.60 8.03 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19100232 1661145317 26 Sep 2018, 16:00:09 UTC 26 Sep 2018, 17:01:13 UTC Completed and validated 295.63 8.33 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)
19099268 1661139650 26 Sep 2018, 15:58:33 UTC 26 Sep 2018, 17:01:13 UTC Completed and validated 296.61 8.45 227.62 MilkyWay@Home v1.46 (opencl_nvidia_101)

Averaging about 298 sec with 4 running at a time (GPU still under 100%) so around 75 sec per WU.


Will test with my RTX 2080 in a couple of days, but I'm assuming that with its lack of double precision power it will do worse than the very old Titan and horribly worse than the Tesla.
11) Message boards : News : Database Maintenance 9-4-2014 (Message 67766)
Posted 4 Sep 2018 by vseven
Post:
It would be nice if we could have the WU limit increased and maybe the deadline decreased a bit so when things like this happen we can keep crunching. I'm using a Volta based card and 80 WU are gone in a couple minutes.
12) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 67662)
Posted 9 Jul 2018 by vseven
Post:

vseven
Quite a range of credit number WUs & times there, is that typical now? (I haven't looked in ~6mths).
Btw, if you want your time in the table, it'll need to be an average of at least 5 of the 227.23 credit WUs.


Got to grab 20 minutes on our V100 machine:

2377935709 1633165069 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 35.35 32.82 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935710 1633165186 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 35.34 33.26 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935457 1633166969 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 34.32 32.82 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935472 1633160555 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 36.35 34.24 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935747 1633055494 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 33.26 31.61 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935759 1633137483 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 35.26 33.70 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377935761 1633160810 9 Jul 2018, 15:07:37 UTC 9 Jul 2018, 15:20:37 UTC Completed and validated 33.24 30.70 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)

Again running 7 WUs at a time, so each is just under 35 sec, but taking into account 7 running at once puts you right at 5 seconds a WU. Task details show 6,270.17 GFLOPS, just barely higher than the Titan V.


I'm really surprised at how much better the Tesla V100 does than the Titan V though. Clock speeds are about the same and the Tesla has higher memory bandwidth, but I can't imagine such a huge difference. The Tesla on paper technically has lower double precision specs but runs higher in my testing and performs almost twice as fast.
13) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 67661)
Posted 9 Jul 2018 by vseven
Post:

vseven
Quite a range of credit number WUs & times there, is that typical now? (I haven't looked in ~6mths).
Btw, if you want your time in the table, it'll need to be an average of at least 5 of the 227.23 credit WUs.


Here is something more interesting. Titan V running stock clock rates:

2377879357 1633136006 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.26 10.05 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879358 1633136007 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.40 10.00 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879360 1633136009 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.29 10.08 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879368 1633031063 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.33 10.00 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879369 1633069821 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 63.29 9.78 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879117 1633015264 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:41:41 UTC Completed and validated 64.30 10.34 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)
2377879373 1633119141 9 Jul 2018, 13:28:03 UTC 9 Jul 2018, 13:51:29 UTC Completed and validated 64.35 9.63 227.23 MilkyWay@Home v1.46 (opencl_nvidia_101)


They are averaging just under 64 seconds each BUT I'm running 7 at a time, so technically averaging just over 9 seconds each overall. GPU load floats around 90%. Adding another WU makes it hit 100% and is slower overall: fewer WUs get done and my average goes up. I'm also CPU limited on this machine (4 cores total), so I'm only giving each WU 0.4 CPU, which is hurting me slightly. Also this:

Device peak FLOPS 5,964.50 GFLOPS

It's rated for a theoretical 6,900 GFLOPS so I'd say it's doing ok.

I should get a chance to test the Tesla V100 again in a week or so. I'll try to get updated results for just the 227.23. If given the full number of CPU cores and run 8 at a time it should place just under 6 seconds average per WU.
14) Message boards : Number crunching : Huge number of 'Validation inconclusive' WUs (Message 67592)
Posted 12 Jun 2018 by vseven
Post:
I have 0 errors on every card I've tried so far. In fact, just looking through the computers of some of the participants, I find tons of AMD cards with lots of credit. Maybe it's not the project but your system/computer?
15) Message boards : Number crunching : Huge number of 'Validation inconclusive' WUs (Message 67449)
Posted 10 May 2018 by vseven
Post:
Oh...they are not "my" Teslas. I wish. I got permission to play with the machines they were in. That's also why they are hidden...didn't want the host names to get out. And I no longer have access to them, but I might get another chance in the future.

Here is the output from your program for a v100 16Gb SXM2 interface:
        Run Time     CPU Time     Credit
         (sec)         (sec)
            32.6          28.1        231.2
            33.6          29.5        227.3
            36.6          32.0        229.3
            24.5          21.1        228.5
            35.6          30.8        230.8
            29.6          26.3        227.6
            29.6          24.4        227.6
            32.7          28.5        227.7
            30.5          26.5        227.6
            30.5          26.6        227.7
            33.7          29.3        227.7
            30.6          27.3        231.2
            31.6          28.4        227.6
            40.6          35.8        229.3
            34.6          30.4        229.7
            32.5          26.3        227.6
            25.5          22.1        227.6
            33.7          30.2        227.7
            34.5          31.2        229.4
            33.7          28.9        229.4
         ----------------------------------
AVG:        32.3          28.2        228.6
STD:         3.5           3.3          1.3

Keep in mind I'm running 6 WU at a time in the above.
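
For reference, the AVG and STD rows for the Run Time column can be re-derived as below. This is a sketch, not the program that produced the table: using the population standard deviation (`pstdev`) is my assumption, chosen because it reproduces the reported 3.5.

```python
from statistics import mean, pstdev

# Run Time column (seconds) from the V100 table above.
run_time = [32.6, 33.6, 36.6, 24.5, 35.6, 29.6, 29.6, 32.7, 30.5, 30.5,
            33.7, 30.6, 31.6, 40.6, 34.6, 32.5, 25.5, 33.7, 34.5, 33.7]
CONCURRENT_WUS = 6

avg = mean(run_time)
std = pstdev(run_time)  # population standard deviation (an assumption about the tool)
print(f"AVG: {avg:.1f}  STD: {std:.1f}  per-WU effective: {avg / CONCURRENT_WUS:.1f} s")
```

Dividing the average by the 6 WUs running at once gives the effective time per WU for this card.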

Here is from a p100 16Gb PCIe interface:

        Run Time     CPU Time     Credit
         (sec)         (sec)
            53.4          51.4        227.3
            61.5          59.4        229.1
            53.4          51.5        227.3
            59.5          57.6        227.3
            48.4          46.9        227.2
            54.4          52.7        227.3
            65.5          63.8        229.1
            57.4          55.8        227.3
            49.3          46.3        227.2
            57.4          54.6        227.3
            60.4          58.3        227.3
            50.4          48.8        227.3
            54.4          52.4        228.1
            55.4          53.6        227.3
            53.4          51.3        227.2
            53.4          51.9        228.1
            56.4          54.7        229.3
            57.4          55.3        227.3
            57.6          49.3        227.3
            55.6          48.1        227.2
         ----------------------------------
AVG:        55.7          53.2        227.6
STD:         4.0           4.3          0.7


Also running 6 WU at a time.

I do not have 20 consecutive valid results running 1 WU at a time, but I might be able to borrow a V100 for 20 minutes to get them if you think it would make a big difference. I know overall 1 WU at a time is slower since the card is barely loaded with 1.
16) Message boards : Number crunching : GPU stopped due to quad core task (Message 67444)
Posted 9 May 2018 by vseven
Post:
Highlight a task in your task list and hit properties. Try the name in there.
17) Message boards : Number crunching : Titan V (Message 67443)
Posted 9 May 2018 by vseven
Post:

But if it could run that much faster I imagine it would run through a couple milkyway units in only a minute! Sounds nutty.


On a Tesla V100, which is pretty much the same underlying core but the 24/7 server version, I was averaging a WU in under 6 seconds (around 36ish seconds per WU running 7 at a time). I would imagine the Titan V to be similar. It has less memory than a V100 so you can only run 6 WU without errors, but it can probably do those 6 slightly faster so the average probably wouldn't change much.


Yep I got my S9150 for $340 shipped on ebay which works out to $0.13 per GFLOP (@ 10.77 GFLOPS per Watt)

TITAN V is $0.40 per GFLOP @ 29.80 GFLOPS per Watt


Yeah....at $3k there are better options. For 3k you could setup a 4 GPU machine with S9150s. Powering it wouldn't be fun though.
18) Message boards : Number crunching : GPU stopped due to quad core task (Message 67441)
Posted 9 May 2018 by vseven
Post:
That's a good question....I can barely figure that out myself. :) I'd start with that name, do an Options -> Read Config Files and see what happens. If it's wrong you will see a notice pop up saying something like "Invalid application name in config".
19) Message boards : Number crunching : Huge number of 'Validation inconclusive' WUs (Message 67440)
Posted 9 May 2018 by vseven
Post:
I was crunching on 3 Tesla V100s at the same time. A WU took around 35 seconds while running 6 at a time. So that's a little under 6 seconds per WU per card, and across 3 cards it averages out to under 2 seconds per WU. That's over 43,000 a day. Now I only ran like this for a couple of hours (testing), but my inconclusive total was over 2,000 in those couple of hours. Out of those I threw 6 errors and 2 invalids, I have 134 still at inconclusive, and all the rest validated.


I still don't know what the issue is that you guys are posting about.....
20) Message boards : Number crunching : GPU stopped due to quad core task (Message 67438)
Posted 9 May 2018 by vseven
Post:
https://boinc.berkeley.edu/wiki/Client_configuration

You should be able to do this in your app_config.xml file:

<app_config>
  <app>
    <name>Application_Name</name>
    <max_concurrent>1</max_concurrent>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>.4</cpu_usage>
    </gpu_versions>
  </app>
  <app_version>
    <app_name>Application_Name</app_name>
    <avg_ncpus>x</avg_ncpus>
  </app_version>
</app_config>


From the link above:

avg_ncpus - the number of CPU instances (possibly fractional) used by the app version.

So I would think once you figure out the app running you can set this to 3 and it would prevent a 4 thread WU.
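
If you would rather generate the file than hand-edit it, something like the following sketch should work. It only uses the element names shown in the config above; `build_app_config` is a made-up helper, and "Application_Name" remains a placeholder for whatever name the task's properties show.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage, cpu_usage, max_concurrent=None):
    """Build an app_config.xml tree using the elements shown in the post above."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    if max_concurrent is not None:
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return root


# "Application_Name" is the placeholder from the post, not a real app name.
config = build_app_config("Application_Name", gpu_usage=0.5, cpu_usage=0.4,
                          max_concurrent=1)
xml_text = ET.tostring(config, encoding="unicode")
print(xml_text)
```

Save the output as app_config.xml in the project's folder inside the BOINC data directory and have the client re-read its config files. The `<app_version>` / `<avg_ncpus>` part is left out here because the right value is still the open question.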


©2024 Astroinformatics Group