GPU Issues Mega Thread

ErickGriffin
Joined: 15 Jul 12
Posts: 2
Credit: 9,495,240
RAC: 44,919

Message 66092 - Posted: 7 Jan 2017, 13:49:52 UTC

I'm seeing a BSOD that I think is caused by the MilkyWay GPU software. I have several Win10 minidumps of the issue (I hope), or whatever Windows captures when it goes blue. The reason I suspect the GPU application is that I have GPU work set to run only when I am away from the computer and the screensaver is running, and that is when the crash always happens.

CPU = AMD FX-8370 Eight-Core Processor, 4013 MHz, 4 cores, 8 logical processors
Mem = 16GB
Graphics = GeForce GTX 1080
Driver = Nvidia 376.33
OS = Microsoft Windows 10 Pro, 10.0.14393 Build 14393

Boinc = 7.5.33 (x64), wxWidgets = 3.0.1

Not sure if you get informed by Microsoft on issues related to your product or not. Please advise what you need next.

captainjack
Joined: 22 Jun 13
Posts: 40
Credit: 34,884,017
RAC: 10,465

Message 66093 - Posted: 7 Jan 2017, 14:21:38 UTC

wb8ili,

Glad you got it fixed.

Just in case you are still wondering, most of the client files are in /var/lib/boinc-client (projects folder and slots folder included). The procedure that starts/restarts the boinc client is in /etc/init.d and is called boinc-client.
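For anyone who wants to confirm that layout on their own machine, here is a minimal Python sketch. It only checks whether the paths mentioned above exist; the function name `describe_boinc_layout` is made up for this example, and other distributions may put things elsewhere.

```python
from pathlib import Path

# Paths as given in the post above; other installs may differ.
BOINC_DATA_DIR = Path("/var/lib/boinc-client")
INIT_SCRIPT = Path("/etc/init.d/boinc-client")


def describe_boinc_layout(data_dir=BOINC_DATA_DIR, init_script=INIT_SCRIPT):
    """Return {label: (path, exists)} for the BOINC locations named in the post.

    Hypothetical helper for illustration; it is not part of BOINC.
    """
    data_dir = Path(data_dir)
    locations = (
        ("data", data_dir),
        ("projects", data_dir / "projects"),
        ("slots", data_dir / "slots"),
        ("init script", Path(init_script)),
    )
    return {label: (str(path), path.exists()) for label, path in locations}


if __name__ == "__main__":
    for label, (path, present) in describe_boinc_layout().items():
        status = "found" if present else "missing"
        print(f"{label:12s} {path}: {status}")
```

If the init script is reported as found, restarting the client is normally done by running that script with a `restart` argument as root.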

Happy crunching.

ErickGriffin
Joined: 15 Jul 12
Posts: 2
Credit: 9,495,240
RAC: 44,919

Message 66094 - Posted: 7 Jan 2017, 17:20:42 UTC - in response to Message 66092.

I think this is actually a BOINC issue: I stopped MilkyWay and loaded a different project that uses the GPU, and hit the same issue even faster. I will post on their forum.

Chris Rampson
Joined: 14 May 11
Posts: 6
Credit: 28,934,867
RAC: 34,525

Message 66139 - Posted: 25 Jan 2017, 0:53:59 UTC

Speaking about the "#define Q_INV_SQR inf" Milkyway error:

I have been running BOINC/SETI for over 20 years. I have projects in Einstein, SETI, GPUGrid, Rosetta and Milkyway. Within the Milkyway project, I have noticed 3 different types of jobs. ONLY ONE OF THOSE IS FAILING.

I do not subscribe to the idea that there is a configuration problem when all my other projects are screaming along (including alternate Milkyway jobs).

The most reasonable explanation is a TYPO in the project files. Someone fat-fingered a "t" into an "f", which is not too hard to do since they are next to each other on the keyboard.

What needs to happen is for a Milkyway developer to explain why this happens - and hopefully fix his own code.

captainjack
Joined: 22 Jun 13
Posts: 40
Credit: 34,884,017
RAC: 10,465

Message 66141 - Posted: 25 Jan 2017, 15:12:04 UTC
Last modified: 25 Jan 2017, 15:48:26 UTC

Chris Rampson,

If you believe that there is a problem with the code, you might ask yourself this question: Why are other Milkyway users able to successfully complete those tasks using Linux and NVIDIA GPUs?

Edit:

Re: this thread http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4087&postid=66127
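One thing worth ruling out before settling on a typo, offered purely as an assumption and not confirmed anywhere in this thread: "inf" is also exactly what C-style number formatting prints for a floating-point infinity. If the define were generated at run time from a computed value such as 1/q^2 (the name Q_INV_SQR hints at that, but it is a guess), a q of zero would produce the same text. The Python sketch below only illustrates that formatting behaviour; `q_inv_sqr` and `define_line` are hypothetical names, not project code.

```python
import math


def q_inv_sqr(q: float) -> float:
    """Return 1/q^2 the way IEEE-754 arithmetic in C would.

    Assumption for illustration: Python raises ZeroDivisionError for 1.0/0.0,
    so the q == 0 case is mapped to infinity by hand to mimic C.
    """
    if q == 0.0:
        return math.inf
    return 1.0 / (q * q)


def define_line(name: str, value: float) -> str:
    """Format a preprocessor define using printf-style %f (hypothetical helper)."""
    return "#define %s %f" % (name, value)


if __name__ == "__main__":
    print(define_line("Q_INV_SQR", q_inv_sqr(0.5)))  # an ordinary finite value
    print(define_line("Q_INV_SQR", q_inv_sqr(0.0)))  # infinity is printed as "inf"
```

Whether the failing tasks actually reach that state is something only the task parameters or a developer could confirm.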

Sebastian*
Joined: 8 Apr 09
Posts: 64
Credit: 4,386,077,288
RAC: 3,783,074

Message 66172 - Posted: 10 Feb 2017, 18:26:52 UTC

Hello again, and a happy new year to everybody.

I still have issues when running several WUs in parallel on the Hawaii-based GPUs. One WU at a time still runs fine.

Could someone look at my invalid WUs with the Validate errors? I can't make anything out of the text.

https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=705276&offset=0&show_names=0&state=5&appid=

Maybe someone can help me figure out what is broken.

AMD drivers have improved: the WUs no longer hang, but there are still Validate errors.

blitzzkreeg
Joined: 15 Aug 10
Posts: 1
Credit: 6,497,936
RAC: 13,175

Message 66231 - Posted: 19 Mar 2017, 17:13:41 UTC

Good day,

Lately I've been having issues specifically with "MilkyWay@home 1.43 (opencl_ati_101)" units, which fail 100% of the time about 2 seconds after starting.

The "MilkyWay@home 1.43 (opencl_nvidia_101)" application works great on a separate machine, which leads me to believe that this problem is specific to the AMD Radeon GPU (6800 series, an HD6870 in my case). The adapter has never been overclocked and sits in a machine crunching for 16 distinct BOINC projects without issues on any other project.

I'm running the latest AMD Catalyst Software Suite available on Windows 10 (non-beta drivers).

Driver Packaging Version: 15.201.1151.1008-151104a-296217E
Provider: Advanced Micro Devices, Inc.
2D Driver Version: 8.01.01.1500
Direct3D Version: 9.14.10.01128
OpenGL Version: 6.14.10.13399
Mantle Driver Version: 9.1.10.0083
Mantle API Version: Not Available
AMD Catalyst CCV: 2015.1104.1643.30033

Breakdown is as follows for the end result of completed units...

- MilkyWay@home 1.43 (opencl_ati_101) => Computational error after 2 secs.

- MilkyWay@home 1.43 (opencl_nvidia_101) => 100% successful completion.

- MilkyWay@Home N-Body Simulation 1.62(mt) => 100% successful completion.

- MilkyWay@Home 1.42 => 100% successful completion.


regards,

Peter

bluestang
Joined: 13 Oct 16
Posts: 32
Credit: 53,815,885
RAC: 85,879

Message 66232 - Posted: 19 Mar 2017, 22:06:34 UTC - in response to Message 66231.

Did Win 10 automatically update your drivers? I think that screws things up if it did. You need to uninstall, clean out the old drivers, and then reinstall from the AMD site.

mikey
Joined: 8 May 09
Posts: 1988
Credit: 98,851,024
RAC: 181,757

Message 66234 - Posted: 20 Mar 2017, 10:29:01 UTC - in response to Message 66231.

The 2- or 3-second errors can also be caused by not having this installed:
For Windows, the most recent Visual Studio 2012 C++ runtime
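A quick way to see whether that runtime is present is to look for its DLLs, which for Visual Studio 2012 are msvcr110.dll and msvcp110.dll. The Python sketch below is only a rough check: `missing_runtime_dlls` is a hypothetical helper, and looking in System32 assumes the 64-bit runtime on 64-bit Windows (the 32-bit copies live in SysWOW64).

```python
import os
from pathlib import Path

# File names of the Visual Studio 2012 (version 11.0) C and C++ runtime DLLs.
VS2012_RUNTIME_DLLS = ("msvcr110.dll", "msvcp110.dll")


def missing_runtime_dlls(system_dir):
    """Return the VS 2012 runtime DLLs that are not present in system_dir.

    Hypothetical helper for illustration. Names are compared case-insensitively
    because Windows file names are not case-sensitive.
    """
    system_dir = Path(system_dir)
    if not system_dir.is_dir():
        return list(VS2012_RUNTIME_DLLS)
    present = {entry.name.lower() for entry in system_dir.iterdir()}
    return [dll for dll in VS2012_RUNTIME_DLLS if dll not in present]


if __name__ == "__main__":
    windows_root = os.environ.get("SystemRoot", r"C:\Windows")
    missing = missing_runtime_dlls(Path(windows_root) / "System32")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Visual Studio 2012 C++ runtime DLLs found.")
```

An empty result does not prove the installed runtime is the most recent build, only that the files exist.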

Tom*
Joined: 4 Oct 11
Posts: 29
Credit: 222,429,279
RAC: 496,477

Message 66236 - Posted: 20 Mar 2017, 21:49:32 UTC

According to this URL:
https://en.wikipedia.org/wiki/Radeon_HD_6000_Series

The HD6870 does not support double precision.
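If double precision is indeed what the GPU application needs, one way to check a card is to look at the extension list its OpenCL driver reports (the space-separated CL_DEVICE_EXTENSIONS string, shown by tools such as clinfo). Double-precision support is advertised there as cl_khr_fp64, or cl_amd_fp64 on some older AMD drivers. The Python sketch below just parses such a string; the function name and the sample strings are made up for illustration and are not taken from a real HD6870.

```python
# Extension names that advertise double-precision support in OpenCL.
FP64_EXTENSIONS = {"cl_khr_fp64", "cl_amd_fp64"}


def supports_double_precision(extensions: str) -> bool:
    """Return True if a space-separated OpenCL extension string lists fp64 support.

    Hypothetical helper for illustration; paste in the extension string that
    your own driver reports.
    """
    return any(ext in FP64_EXTENSIONS for ext in extensions.split())


if __name__ == "__main__":
    # Made-up sample strings, not real device output.
    with_fp64 = "cl_khr_global_int32_base_atomics cl_khr_fp64 cl_khr_gl_sharing"
    without_fp64 = "cl_khr_global_int32_base_atomics cl_khr_gl_sharing"
    print(supports_double_precision(with_fp64))
    print(supports_double_precision(without_fp64))
```

A card whose extension string lists neither name cannot run double-precision OpenCL kernels, which would fit tasks erroring out within a couple of seconds.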

Copyright © 2017 AstroInformatics Group