Welcome to MilkyWay@home

A failed workunit, apparently when turning on monitor


robertmiles

Joined: 30 Sep 09
Posts: 211
Credit: 33,818,982
RAC: 45
Message 55236 - Posted: 26 Jul 2012, 0:49:50 UTC
Last modified: 26 Jul 2012, 0:51:39 UTC

A failed workunit:

MilkyWay@Home 1.02 (opencl_nvidia)
de_separation_09_2s_sample_2_1341007502_14765734
Computation error

computer 286265
task 264175252

An error message appeared at about the same time, but it did not stay on screen long enough for me to copy it exactly.
It said that the graphics card driver had stopped responding, then recovered.

This happened when I turned the monitor back on after a few hours with it off, but with BOINC still running anyway.

A workunit from another project is expected to take 190 hours before reaching a checkpoint, so any software changes are currently on hold.

Comparing with a verified and validated de_separation workunit suggests that one of these two sections of the log file shows where the problem started:

<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>

Failed to wait for integral event (-5): CL_OUT_OF_RESOURCES
Failed to run nu step (-5): CL_OUT_OF_RESOURCES
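
For anyone comparing the stderr output of several failed tasks, here is a minimal Python sketch of how lines like the two above could be tallied by OpenCL error name. It is not part of BOINC or the MilkyWay@Home application. The function name `summarize_cl_errors` is hypothetical, and the line format it expects is an assumption based only on the two lines quoted above.

```python
import re
from collections import Counter

# Assumed line format, taken from the two log lines quoted above, e.g.:
#   Failed to run nu step (-5): CL_OUT_OF_RESOURCES
ERROR_LINE = re.compile(
    r"^(?P<what>.+?) \((?P<code>-?\d+)\): (?P<name>CL_[A-Z0-9_]+)\s*$"
)


def summarize_cl_errors(stderr_text):
    """Count (error name, error code) pairs found in a task's stderr text.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for line in stderr_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            key = (match.group("name"), int(match.group("code")))
            counts[key] += 1
    return counts


sample = """\
Failed to wait for integral event (-5): CL_OUT_OF_RESOURCES
Failed to run nu step (-5): CL_OUT_OF_RESOURCES
"""

for (name, code), n in summarize_cl_errors(sample).items():
    print(f"{name} ({code}): {n} line(s)")
```

Lines that do not match the assumed format are skipped, so the rest of a stderr dump can be passed in unchanged.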
BulletMagnetEd

Joined: 8 Jul 10
Posts: 15
Credit: 69,266,958
RAC: 0
Message 55279 - Posted: 2 Aug 2012, 9:49:30 UTC
Last modified: 2 Aug 2012, 9:51:10 UTC

BulletMagnetEd

Joined: 8 Jul 10
Posts: 15
Credit: 69,266,958
RAC: 0
Message 55280 - Posted: 2 Aug 2012, 10:03:41 UTC

Okay, I just detached MW@H, rebooted, restarted BOINC and rejoined the project. After downloading the first batch of jobs, the very first WU was the below WU which failed.

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=269317349

MilkyWay@Home 1.00 runs fine. It's the MilkyWay@Home 1.02 (OpenCL NVidia) that's failing all of a sudden. Thanks for any help!
mikey

Joined: 8 May 09
Posts: 2561
Credit: 462,762,741
RAC: 293
Message 55281 - Posted: 2 Aug 2012, 11:15:18 UTC - in response to Message 55280.  

BulletMagnetEd wrote:
> Okay, I just detached MW@H, rebooted, restarted BOINC and rejoined the project. After downloading the first batch of jobs, the very first WU was the below WU which failed.
>
> http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=269317349
>
> MilkyWay@Home 1.00 runs fine. It's the MilkyWay@Home 1.02 (OpenCL NVidia) that's failing all of a sudden. Thanks for any help!

The key here is "Microsoft Windows": in Windows, when a GPU fails, the ONLY way to reset it is to restart the whole PC, not JUST BOINC! It is a Windows thing.
Link
Joined: 19 Jul 10
Posts: 357
Credit: 16,324,081
RAC: 0
Message 55284 - Posted: 2 Aug 2012, 18:51:37 UTC - in response to Message 55281.  

BulletMagnetEd wrote:
> Okay, I just detached MW@H, rebooted, restarted BOINC and rejoined the project.

mikey wrote:
> The key here is "Microsoft Windows": in Windows, when a GPU fails, the ONLY way to reset it is to restart the whole PC, not JUST BOINC! It is a Windows thing.

He rebooted; that's restarting the whole PC.
BulletMagnetEd

Joined: 8 Jul 10
Posts: 15
Credit: 69,266,958
RAC: 0
Message 55285 - Posted: 2 Aug 2012, 19:43:30 UTC - in response to Message 55284.  

BulletMagnetEd wrote:
> Okay, I just detached MW@H, rebooted, restarted BOINC and rejoined the project.

mikey wrote:
> The key here is "Microsoft Windows": in Windows, when a GPU fails, the ONLY way to reset it is to restart the whole PC, not JUST BOINC! It is a Windows thing.

Link wrote:
> He rebooted; that's restarting the whole PC.

Exact-a-mundo! For the time being, I'll just disable GPU WUs on MW@H.
BulletMagnetEd

Joined: 8 Jul 10
Posts: 15
Credit: 69,266,958
RAC: 0
Message 55291 - Posted: 3 Aug 2012, 19:30:14 UTC

Alright, here's what some diagnostic testing ultimately revealed: my GTX560Ti was failing. It finally decided to die today, so I replaced it with another. So far it's processing with no problems! Stupid video card...