Posts by Sebastian*

1) Message boards : News : New Poll Regarding GPU Application of N-Body (Message 71019)
Posted 23 Jul 2021 by Sebastian*
Post:
If I have a high FP64 GPU then I assume it would be similar to the current WU, in terms of speed up?


Not really. N-Body is pretty complex to calculate. I don't know exactly how it works, though. But if there are a lot of parts that have to finish before you can continue your calculations, then you have to wait. CPUs handle branches and dependencies well, GPUs not so much. Most of the GPU can end up stalled, waiting for one important calculation to finish.
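
A rough way to put numbers on the waiting problem is Amdahl's law, which caps the speedup by the share of work that cannot run in parallel. This is only a sketch: Amdahl's law is swapped in here as an illustration, the function name is made up, and the fractions are hypothetical values, not measurements of the N-Body code.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical fractions (assumption), not taken from the N-Body app.
    for fraction in (0.50, 0.90, 0.99):
        # 8 workers stands in for a CPU, 2048 for a GPU's shader count.
        for workers in (8, 2048):
            bound = amdahl_speedup(fraction, workers)
            print(f"parallel={fraction:.2f} workers={workers:5d} speedup<={bound:7.2f}")
```

With a large serial share, going from a handful of workers to thousands barely moves the bound, which is the stall described above.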

Video encoding is such a case, which is why GPUs and some Intel CPUs have dedicated hardware for it. Even a CPU is faster at video encoding than a GPU when you don't use the GPU's dedicated encoding hardware.
2) Message boards : News : New Poll Regarding GPU Application of N-Body (Message 71015)
Posted 23 Jul 2021 by Sebastian*
Post:
There is an "anomaly" in the latest version of the BOINC client's handling of tasks that use the GPU option. It has something to do with the Intel GPU. It can hang a task indefinitely. Tread carefully before checking this out. As Redbeard the pirate might say if he were still alive, "Matey, ye be warned!"


I have had that "hanging indefinitely" on AMD GPUs as well. It seems to depend on the driver. On my R9 390X it was fixed in the 19.7.2 driver, I think. Not sure if it is still working. On the FirePro W9100, with professional drivers, it still hangs. And on the Radeon VII the last working driver is the professional 20.Q4 driver. The desktop drivers always cause hangs.
That is why I would like to see longer-running tasks like the N-Body ones on the GPU. A single WU at a time runs fine, though.
3) Message boards : News : New Poll Regarding GPU Application of N-Body (Message 71014)
Posted 23 Jul 2021 by Sebastian*
Post:
To clarify, the GPU test ran the same task as the CPU. From a purely teraflop standpoint, AMD GPUs should perform much faster than the RTX 3070 when simply looking at the spec sheet, as AMD invests more into this feature; however, it remains to be seen whether this advantage will be realized in practice. The main idea here is to recognize that if the GPU and CPU are doing the same amount of work, and an average computer has both a GPU and a CPU, it is likely to double the amount of computation performed overall on part of the network. I would like to point out that, although this slightly affects CPU performance, it is a minimal cost, as the CPU section of the GPU code is designed to be lightweight and serve only for control purposes; this was observed when testing both running at the same time. If anyone wants to get an idea of performance for their specific system, the GitHub repo does have a working version of the GPU code, although it does not support as many features, such as the LMC, at the moment.


Thank you for that info.
About the 10900K: did it run with the Intel-suggested PL1, PL2 and tau limits, so that it stays at 125 W after the boost window, or was it allowed to boost all day at the maximum sustainable frequency?
If it stayed at 125 W, we could judge the comparison more accurately, even in terms of performance per watt. But if it was allowed to boost all day, then it would be less efficient than a GPU.

I am not sure how far you are willing to optimize the app (CPU and GPU), but if you do, you can get far more results in the same time than before the optimisations. Which compiler do you use, by the way? And are all the flags set to support Ryzen CPUs to their full potential?
4) Message boards : News : New Poll Regarding GPU Application of N-Body (Message 71004)
Posted 22 Jul 2021 by Sebastian*
Post:
If the GPU version is doing the same work as the CPU version in roughly the same time, then, no, don't distribute it. GPU versions have a habit of also needing some CPU, so the net result is some CPU plus a whole GPU for the same effect as a CPU. Waste of resources.

That is the reason why we asked for the comparison: which CPU was used and which GPU. Consumer Nvidia cards in particular don't have a lot of double precision performance. But even so, they have a lot more memory bandwidth, and will be more efficient that way.
A 10900K is also not a very common CPU. People will likely have worse CPUs, but with GPUs that have even more double precision performance than the 3070. So it makes sense to release a GPU app :)
5) Message boards : News : New Poll Regarding GPU Application of N-Body (Message 71001)
Posted 22 Jul 2021 by Sebastian*
Post:
I would be very happy to see longer-running work units on GPUs. Especially on high-performance AMD cards (with a lot of double precision performance) I have to run several WUs in parallel, which causes driver issues.
I got some of them fixed by AMD, but not all.
If I could run an N-Body WU on my GPU, and it took several hours and loaded the GPU well, that would be great.

The comparison specifically performed was between an i9-10900K and an RTX 3070, using all available compute cores for both.


If that is the comparison, then AMD cards with a lot of double precision performance should do well, as should Nvidia cards. A 3070 has roughly 0.3 TFLOPS of double precision performance. A 10900K should turn out about the same, since memory bandwidth is limited on the CPU.
I would expect an R9 280X to be 3 times as fast as a 3070 then.
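
The factor of 3 can be checked with a back-of-the-envelope calculation of theoretical peak FP64 throughput. This is only a sketch: the shader counts, boost clocks and FP64:FP32 ratios are spec-sheet values quoted from memory and should be treated as assumptions, and the function name is made up. Real N-Body throughput will depend on much more than peak FLOPS.

```python
def peak_fp64_tflops(shaders: int, boost_ghz: float, fp64_ratio: float) -> float:
    """Theoretical peak FP64 TFLOPS.

    Counts 2 FLOPs (one fused multiply-add) per shader per clock for FP32,
    then scales by the card's FP64:FP32 rate.
    """
    fp32_tflops = shaders * 2 * boost_ghz / 1000.0
    return fp32_tflops * fp64_ratio


# Assumed spec-sheet values, not measurements.
rtx_3070 = peak_fp64_tflops(shaders=5888, boost_ghz=1.725, fp64_ratio=1 / 64)
r9_280x = peak_fp64_tflops(shaders=2048, boost_ghz=1.000, fp64_ratio=1 / 4)

if __name__ == "__main__":
    print(f"RTX 3070: {rtx_3070:.2f} TFLOPS FP64")
    print(f"R9 280X : {r9_280x:.2f} TFLOPS FP64")
    print(f"Ratio   : {r9_280x / rtx_3070:.1f}x")
```

Under these assumptions the 3070 lands near 0.3 TFLOPS and the 280X near 1 TFLOPS, so the ratio comes out at a bit over 3.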
6) Message boards : News : GPU Issues Mega Thread (Message 66172)
Posted 10 Feb 2017 by Sebastian*
Post:
Hello again, and a happy new year to everybody.

I still have issues when running several WUs in parallel on the Hawaii-based GPUs. One WU at a time still runs fine.

Could someone look at my invalid WUs with the validate errors? I can't make anything of the text.

https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=705276&offset=0&show_names=0&state=5&appid=

Maybe someone can help me figure out what is broken.

AMD drivers have improved and the WUs don't hang any longer, but there are still validate errors.
7) Message boards : News : Scheduled Maintenance Concluded (Message 65869)
Posted 16 Nov 2016 by Sebastian*
Post:
Hello everyone,

Is anyone running a 390 or 390X? (The 290 or 290X may have the same problem.)

I still have the problem that when I run several WUs at once, after some time (from minutes to an hour or so) some WUs start to hang and go on forever, while one or two crunch on.

I have tested drivers since 15.9, always with the same problem; Win 7 or Win 10 does not matter either. I tried different hardware setups, and new or old installations of Windows, with no difference.

I hope someone can confirm the problem, so we can start searching for the root cause and maybe even a fix.

PS: Running on the 280X or 7970 doesn't give me the error. Running one WU at a time is also fine.
Running 2 Einstein@home WUs at the same time causes calculation errors (invalid tasks).
8) Message boards : News : Updated Server Daemons and Libraries (Message 65586)
Posted 3 Nov 2016 by Sebastian*
Post:
Hi everyone,

Does anyone experience similar problems to mine?

Hardware: i7 3930K, not overclocked, Asus Rampage IV Formula, 16 GB of 2133 memory and one R9 280X from Asus.

Since about 2 weeks ago, the connected display goes black when running Milkyway@home, while the computer still seems to run. When I connect to this PC from another one via the BOINC client, Milkyway is still running, but all the tasks are stuck.

I run 4 GPU-WUs at the same time, using the app_config.xml file.
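
For anyone who wants to reproduce this setup, here is a minimal sketch that generates such an app_config.xml for 4 WUs per GPU. The helper name and the cpu_usage value are made up for illustration, and the app name "milkyway" is an assumption; check the short app name in your own client_state.xml. The gpu_usage value is the fraction of a GPU each task claims, so 0.25 lets BOINC schedule 4 tasks per GPU.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float) -> str:
    """Return app_config.xml text that lets BOINC run several tasks per GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Each task claims 1/N of a GPU, so N tasks fit on one GPU.
    ET.SubElement(gpu_versions, "gpu_usage").text = f"{1 / tasks_per_gpu:g}"
    ET.SubElement(gpu_versions, "cpu_usage").text = f"{cpu_per_task:g}"
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "milkyway" and 0.5 CPUs per task are assumptions, adjust for your host.
    print(build_app_config("milkyway", tasks_per_gpu=4, cpu_per_task=0.5))
```

The printed text goes into app_config.xml in the project's folder inside the BOINC data directory; the client picks it up after a restart or after re-reading the config files.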

Milkyway is running alongside Cosmology@home, which only uses the CPU.

I swapped the GPU (for the same model), changed the power supply unit and completely reinstalled Windows 10. Still the same problem.

It all worked fine until 2 weeks ago, and I didn't change a thing.

When I run games on this computer, there are no problems either.

I will try the 16.10.3 hotfix driver now, but my hopes are very low.

Does anybody know the cause of this problem? Or has anyone experienced something similar?

Computer with the problem:
https://milkyway.cs.rpi.edu/milkyway//show_host_detail.php?hostid=708929

Thank you all for your answers.
9) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65402)
Posted 6 Oct 2016 by Sebastian*
Post:
For those who have an R9 390 or 390X and want to run several workunits at once: stay away from Windows 10 right now. I tried the 16.10.1 hotfix, 16.7.2 and 16.9.2 hotfix drivers. All of them crash when one WU reaches 100%, and all WUs get stuck where they are. One WU sometimes keeps running when I run 8 WUs at once.

I use the app_config.xml file to run WUs in parallel on one GPU.

I will try the 390X on Win7 tomorrow, to see if it is Win10 related, which I think it is. The problem has occurred since Windows 10 Version 1607.

If anyone experiences the same problems, please post.
10) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65401)
Posted 6 Oct 2016 by Sebastian*
Post:
For those who have an R9 390 or 390X and want to run several workunits at once: stay away from Windows 10 right now. I tried the 16.10.1 hotfix, 16.7.2 and 16.9.2 hotfix drivers. All of them crash when one WU reaches 100%, and all WUs get stuck where they are. One WU sometimes keeps running when I run 8 WUs at once.

I use the app_config.xml file to run WUs in parallel on one GPU.

I will try the 390X on Win7 tomorrow, to see if it is Win10 related, which I think it is. The problem has occurred since Windows 10 Version 1607.

If anyone experiences the same problems, please post.
11) Message boards : News : Updated Server Daemons and Libraries (Message 65391)
Posted 5 Oct 2016 by Sebastian*
Post:
Thank you very much Jake for all your hard work :)

Any update on the 4xxx series GPUs yet? Or are you just taking a little timeout? ;)

Milkyway seems to run well again, well done.
12) Message boards : News : Updated Server Daemons and Libraries (Message 65363)
Posted 1 Oct 2016 by Sebastian*
Post:
After updating my HD 5970 computer to Windows 10 1607 (Cumulative Update for Windows 10 Version 1607 for x64-based Systems (KB3194496)), I get some strange behavior there too. I run 4 WUs on one GPU core, so 8 WUs at once on the dual-GPU card.
Only after nearly a minute of heavy CPU work (3 of 6 cores running at almost 100%) does the GPU get into action.

It was running well before the update to 1607. Does anyone have information about what Microsoft changed in the drivers?

The update also affects the R9 390X, and I guess other GPUs as well.

If anyone has trouble with it too, please post.
13) Message boards : News : Updated Server Daemons and Libraries (Message 65362)
Posted 30 Sep 2016 by Sebastian*
Post:
captainjack is right, Will Guerin: your GPUs don't have double precision, which is needed for Milkyway@home.

My 390X seems to work now, but only when running without an app_config file. It is running one WU at a time for now.

After a quick search on the web, it looks like the latest Microsoft Windows 10 update really messed up the GPU drivers. Milkyway has caused errors since the update, and so has Einstein@home (2 WUs at once). But for some reason my 280X cards seem unaffected.

Can anyone confirm this?

I guess GPUs like the 280X and below still work well, while the 290X and above cause problems.

Windows 10 with the latest update, and an app_config to run multiple WUs at once:
all of these conditions have to apply for the WUs to hang when one WU finishes at 100% and stays there.
14) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65344)
Posted 28 Sep 2016 by Sebastian*
Post:
Could anyone look at this:

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=1804922945

Any idea why I get the warnings? The other 2 people running the task did not get them.

It only shows up on the Win10 computer with the 390X GPU. There I've installed BOINC in the standard directory. On all my other computers I install it in D:\Boinc\...
15) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65341)
Posted 28 Sep 2016 by Sebastian*
Post:
I only used DDU and installed 16.7.3. But the problem is still there. I am running some Einstein for now, to see if something similar happens there.

Nvidia also lets you select between Windows 10 and Windows 10 Anniversary Edition when looking for their drivers. I wonder if Microsoft changed something related to the GPU drivers.

I am using an app_config file to run 4 WUs at once. That might cause the problem; I have to test it later. The strange thing is that it worked more or less well before the Windows update.
16) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65339)
Posted 28 Sep 2016 by Sebastian*
Post:
I did a clean install on the Win 10 computer with the R9 390X and installed the 16.9.2 (21st of September) driver.

I still get the same problem. The driver crashes, and the WU which causes the crash at 100% gets stuck. Restarting just BOINC only makes the WUs start from 0%, but they are not running.

Vortac, which driver are you using on Win7 for BOINC?

And all my 7970 and 280X GPUs don't get work any longer. The Milkyway server currently has 1150 WUs available. The 390X gets work instantly, but with the error mentioned above.

The PC has a new computer ID because of the clean install.

http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=705276
17) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65328)
Posted 27 Sep 2016 by Sebastian*
Post:
Reinstalling drivers did not help.

Since the system is running on an SSD, I will try resetting the project tomorrow. It could be that the Win10 update caused so much wear on the SSD that it is starting to degrade now.

Any update on the HD 4xxx cards? (Scheduler fix for older cards)

And my 280X cards don't get work any longer at the moment. I will report tomorrow whether it got better.
18) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65322)
Posted 27 Sep 2016 by Sebastian*
Post:
On my R9 390X computer, Windows 10 just did an update, to the Anniversary Edition I guess. Now, after one WU finishes (several others run through fine), the GPU driver seems to crash and the WUs are stuck.
I had to install the GPU drivers again, so I am not sure which driver Windows used, but BOINC detected 2 390X GPUs although only one is installed. With the official AMD driver everything runs fine, until one WU causes the driver crash.

The 280X GPUs seem to run fine on other Win10 boxes with the same version. Does anyone have an idea what the root cause could be? I used DDU (Display Driver Uninstaller) to get rid of the Win10 drivers.

For now something is really broken.
19) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65300)
Posted 27 Sep 2016 by Sebastian*
Post:
OK, the R9 390X seems to work. It is getting work fine. Not sure if it will give good results. And I had to plug in a monitor to get it to run the WUs through. Without one, they were stuck at 100%.

Computer: http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=704045

Let me know when you have updated the scheduler. Then I will give the HD 4850s another try.
20) Message boards : News : MilkyWay@home Version 1.38 Released (Message 65286)
Posted 26 Sep 2016 by Sebastian*
Post:
Looks like my HD 4850 GPUs don't get work any longer. They got work on the 24th. When I update BOINC, I now get the message that I got 0 new tasks. Has something changed between then and now? Did the downtime cause the issue?

http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=307550

The computers with the HD 7970 get work, though.

And I get a lot of this in the BOINC Event Log:

26/09/2016 21:18:26 | Milkyway@Home | Message from task: 0

It seems to show up once a WU is finished.

Something changed because of the downtime. Any ideas?


