Welcome to MilkyWay@home

A failed workunit, apparently when turning on monitor

Message boards : Number crunching : A failed workunit, apparently when turning on monitor

robertmiles

Joined: 30 Sep 09
Posts: 211
Credit: 36,977,315
RAC: 0
Message 55236 - Posted: 26 Jul 2012, 0:49:50 UTC
Last modified: 26 Jul 2012, 0:51:39 UTC

A failed workunit:

MilkyWay@Home 1.02 (opencl_nvidia)
de_separation_09_2s_sample_2_1341007502_14765734
Computation error

computer 286265
task 264175252

An error message appeared at about the same time, but it did not stay on screen long enough for me to copy it exactly.
It said that the graphics card driver had stopped responding and then recovered.

This happened when I turned the monitor back on after leaving it off for a few hours, with BOINC still running the whole time.

A workunit from another project is expected to take 190 hours before reaching a checkpoint, so any software changes are currently on hold.

Comparing with a verified and validated de_separation workunit suggests that one of these two sections of the log file shows where the problem started:

<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>

Failed to wait for integral event (-5): CL_OUT_OF_RESOURCES
Failed to run nu step (-5): CL_OUT_OF_RESOURCES
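
For anyone comparing logs like the one above, here is a minimal Python sketch that pulls the OpenCL failures out of a stderr excerpt. In the OpenCL headers, status code -5 is CL_OUT_OF_RESOURCES, which the specification describes as a failure to allocate resources required by the OpenCL implementation on the device. The function name, the regular expression, and the small lookup table are assumptions made for illustration; they are not part of MilkyWay@Home or BOINC, and the pattern only covers lines shaped like the two shown here.

```python
import re

# Subset of OpenCL status codes as defined in the Khronos cl.h header.
OPENCL_ERRORS = {
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Assumed line shape: "Failed to run nu step (-5): CL_OUT_OF_RESOURCES"
FAILURE_LINE = re.compile(
    r"^(?P<what>Failed to .+?) \((?P<code>-?\d+)\): (?P<name>CL_\w+)$"
)


def find_opencl_failures(stderr_text):
    """Return (description, code, name, name_matches_table) per failure line."""
    failures = []
    for line in stderr_text.splitlines():
        match = FAILURE_LINE.match(line.strip())
        if match:
            code = int(match.group("code"))
            name = match.group("name")
            # True when the printed name agrees with the lookup table above.
            consistent = OPENCL_ERRORS.get(code) == name
            failures.append((match.group("what"), code, name, consistent))
    return failures


# The two stderr lines quoted in this post.
SAMPLE = """\
Failed to wait for integral event (-5): CL_OUT_OF_RESOURCES
Failed to run nu step (-5): CL_OUT_OF_RESOURCES
"""

for failure in find_opencl_failures(SAMPLE):
    print(failure)
```

Running the same function over the stderr of a validated task and a failed one makes it easy to see which step first reported an error.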
ID: 55236 · Rating: 0
BulletMagnetEd

Joined: 8 Jul 10
Posts: 15
Credit: 69,266,958
RAC: 0
Message 55279 - Posted: 2 Aug 2012, 9:49:30 UTC
Last modified: 2 Aug 2012, 9:51:10 UTC

ID: 55279 · Rating: 0
BulletMagnetEd

Joined: 8 Jul 10
Posts: 15
Credit: 69,266,958
RAC: 0
Message 55280 - Posted: 2 Aug 2012, 10:03:41 UTC

Okay, I just detached MW@H, rebooted, restarted BOINC and rejoined the project. After downloading the first batch of jobs, the very first WU was the one below, which failed.

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=269317349

MilkyWay@Home 1.00 runs fine. It's the MilkyWay@Home 1.02 (OpenCL NVidia) that's failing all of a sudden. Thanks for any help!
ID: 55280 · Rating: 0
mikey

Joined: 8 May 09
Posts: 3315
Credit: 519,950,829
RAC: 21,429
Message 55281 - Posted: 2 Aug 2012, 11:15:18 UTC - in response to Message 55280.  

BulletMagnetEd wrote:
Okay, I just detached MW@H, rebooted, restarted BOINC and rejoined the project. After downloading the first batch of jobs, the very first WU was the one below, which failed.

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=269317349

MilkyWay@Home 1.00 runs fine. It's the MilkyWay@Home 1.02 (OpenCL NVidia) that's failing all of a sudden. Thanks for any help!


The key here is "Microsoft Windows": in Windows, when a GPU fails, the ONLY way to reset it is to restart the whole PC, not JUST BOINC! It is a Windows thing.
ID: 55281 · Rating: 0
Link

Joined: 19 Jul 10
Posts: 578
Credit: 18,845,239
RAC: 856
Message 55284 - Posted: 2 Aug 2012, 18:51:37 UTC - in response to Message 55281.  

BulletMagnetEd wrote:
Okay, I just detached MW@H, rebooted, restarted BOINC and rejoined the project.

mikey wrote:
The key here is "Microsoft Windows": in Windows, when a GPU fails, the ONLY way to reset it is to restart the whole PC, not JUST BOINC! It is a Windows thing.

He rebooted; that's restarting the whole PC.
ID: 55284 · Rating: 0
BulletMagnetEd

Joined: 8 Jul 10
Posts: 15
Credit: 69,266,958
RAC: 0
Message 55285 - Posted: 2 Aug 2012, 19:43:30 UTC - in response to Message 55284.  

BulletMagnetEd wrote:
Okay, I just detached MW@H, rebooted, restarted BOINC and rejoined the project.

mikey wrote:
The key here is "Microsoft Windows": in Windows, when a GPU fails, the ONLY way to reset it is to restart the whole PC, not JUST BOINC! It is a Windows thing.

Link wrote:
He rebooted; that's restarting the whole PC.


Exact-a-mundo! For the time being, I'll just disable GPU WUs on MW@H.
ID: 55285 · Rating: 0
BulletMagnetEd

Joined: 8 Jul 10
Posts: 15
Credit: 69,266,958
RAC: 0
Message 55291 - Posted: 3 Aug 2012, 19:30:14 UTC

Alright, here's what some diagnostic testing ultimately revealed: my GTX 560 Ti was failing. It finally decided to die today, so I replaced it with another. So far the new card is processing with no problems! Stupid video card...
ID: 55291 · Rating: 0
