Welcome to MilkyWay@home

WU just stops


Message boards : Number crunching : WU just stops
S@NL - Mellowman
Joined: 9 Dec 07
Posts: 65
Credit: 8,015,709
RAC: 0
Message 1103 - Posted: 16 Dec 2007, 15:14:03 UTC

This WU just stopped processing after about 3 minutes. I aborted it so one of the cores won't be idle.

The longer I live, the more reasons I develop for wanting to die.

[B^S] Acmefrog
Joined: 28 Aug 07
Posts: 49
Credit: 556,559
RAC: 0
Message 1104 - Posted: 16 Dec 2007, 19:31:06 UTC

I have had that happen once in a while. I have found that if I suspend that WU and let the processor work on another, when I un-suspend it another core will finish it.

S@NL - Mellowman
Joined: 9 Dec 07
Posts: 65
Credit: 8,015,709
RAC: 0
Message 1105 - Posted: 16 Dec 2007, 19:46:16 UTC

I tried suspending/resuming but it didn't work. The WU had stopped almost half an hour before I noticed it. Maybe it was my system, as I also lost 2 SETI WUs in progress during a reboot later on.

The longer I live, the more reasons I develop for wanting to die.

Misfit
Joined: 27 Aug 07
Posts: 915
Credit: 1,503,319
RAC: 0
Message 1106 - Posted: 16 Dec 2007, 20:40:31 UTC

I've had this happen whenever a crashed unit caused the MS popup. Usually the core this happened on freezes but then resets itself within a few minutes.
me@rescam.org

dduggan47
Joined: 6 Sep 07
Posts: 2
Credit: 1,632,291
RAC: 772
Message 1185 - Posted: 28 Dec 2007, 16:05:36 UTC

I'm seeing this problem as well on at least two machines. The WU is shown as running but gets no CPU time. It clogs up the system pretty effectively. Another task can run at the same time, but if I get two MW tasks in there, I'm out of business until I happen to check that machine and fix it.

Suspending the task and then letting it go again doesn't seem to help. OTOH restarting the BOINC service does seem to get it going again.

I'm running 5.10.30.
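
Editor's note: the symptom described in this post (a task listed as running whose CPU time never advances) can be checked mechanically. What follows is only a minimal Python sketch, assuming you can periodically sample each task's accumulated CPU time by some means. The sampling itself, the task names, and the thresholds are hypothetical and are not part of BOINC.

```python
from typing import Dict, List, Tuple

# Each sample is (wall_clock_seconds, cpu_seconds) for one task,
# listed in increasing wall-clock order.
Sample = Tuple[float, float]


def find_stalled(history: Dict[str, List[Sample]],
                 window: float = 600.0,
                 min_cpu_gain: float = 1.0) -> List[str]:
    """Return names of tasks whose CPU time rose by less than
    min_cpu_gain over at least `window` seconds of wall-clock time."""
    stalled = []
    for name, samples in history.items():
        if len(samples) < 2:
            continue  # not enough data to judge
        last_wall, last_cpu = samples[-1]
        # Find the newest sample that is at least `window` seconds old.
        baseline = None
        for wall, cpu in samples:
            if last_wall - wall >= window:
                baseline = (wall, cpu)
            else:
                break
        if baseline is None:
            continue  # history does not span the window yet
        if last_cpu - baseline[1] < min_cpu_gain:
            stalled.append(name)
    return sorted(stalled)


# Hypothetical samples: one task that is progressing, one that is stuck.
history = {
    "task_a": [(0.0, 0.0), (300.0, 290.0), (600.0, 585.0)],
    "task_b": [(0.0, 180.0), (300.0, 180.0), (600.0, 180.0)],
}
print(find_stalled(history))  # ['task_b']
```

A flagged task would still need a human decision (suspend, abort, or restart the client); the sketch only detects the "running but no CPU time" condition.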

Jayargh
Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0
Message 1197 - Posted: 30 Dec 2007, 0:02:14 UTC
Last modified: 30 Dec 2007, 0:47:14 UTC

I found this one stuck for a number of hours. I just suspended it and will see if I have to restart BOINC to get it to finish. Looking at everyone who has posted about this problem, the one thing in common is Windows, btw.

EDIT: Had to restart BOINC to get it going again.

[B@H] Ray
Joined: 27 Dec 07
Posts: 35
Credit: 1,432,926
RAC: 0
Message 1199 - Posted: 30 Dec 2007, 18:14:29 UTC

I have had two of them so far. Hopefully they will see what is causing it and fix it.

Webmaster Yoda
Joined: 21 Dec 07
Posts: 69
Credit: 7,048,412
RAC: 0
Message 1203 - Posted: 31 Dec 2007, 5:18:57 UTC

It's not only Windows. I have one right now that's stalled on a 64bit Linux computer.

I would restart BOINC, but since it's running CPDN on the other 3 cores, with fairly lengthy intervals between checkpoints, I waste less CPU time by aborting this work unit.
Join the #1 Aussie Alliance on MilkyWay!
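
Editor's note: the trade-off in the post above can be put in numbers. This is only a minimal sketch, assuming that a restart makes every other running task fall back to its last checkpoint, while an abort discards only the CPU time already spent on the stalled work unit. All figures are hypothetical.

```python
from typing import List


def cpu_lost_by_restart(seconds_since_checkpoint: List[float]) -> float:
    """CPU seconds discarded if the client is restarted: each other
    running task loses the work done since its last checkpoint."""
    return sum(seconds_since_checkpoint)


def cheaper_action(stalled_cpu_seconds: float,
                   seconds_since_checkpoint: List[float]) -> str:
    """Return 'abort' or 'restart', whichever discards less CPU time."""
    restart_cost = cpu_lost_by_restart(seconds_since_checkpoint)
    return "abort" if stalled_cpu_seconds < restart_cost else "restart"


# Hypothetical: a work unit stalled after 3 minutes of CPU time, while
# three long-running tasks last checkpointed 10, 25 and 40 minutes ago.
print(cheaper_action(180.0, [600.0, 1500.0, 2400.0]))  # abort

# Hypothetical: the other tasks all checkpointed within the last minute.
print(cheaper_action(180.0, [20.0, 30.0, 45.0]))       # restart
```

With long checkpoint intervals on the other cores, aborting a short work unit costs less; once those tasks have just checkpointed, a restart becomes the cheaper option.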

seti@elrcastor.com
Joined: 22 Dec 07
Posts: 11
Credit: 5,943,029
RAC: 0
Message 1263 - Posted: 2 Jan 2008, 21:09:48 UTC

I've seen a few of them too, on Linux and Windows.

[B^S]breathesgelatin
Joined: 7 Dec 07
Posts: 3
Credit: 7,256
RAC: 0
Message 1336 - Posted: 3 Jan 2008, 22:47:07 UTC

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=1866922

This one hung up on me just now. I think it had been running most of the day, so I aborted it. Next time I'll try the suspending trick and see if that works. My first hung WU. The problems I was having with missing downloads yesterday do seem to be resolved however.

Webmaster Yoda
Joined: 21 Dec 07
Posts: 69
Credit: 7,048,412
RAC: 0
Message 1337 - Posted: 3 Jan 2008, 23:17:33 UTC

Observation: one of my Windows XP computers gets these problems several times a day. The other two haven't (yet). They run different versions of BOINC, so it may be related to the client.

The one that has regular problems is running BOINC 5.10.13 (32bit)
The others are running BOINC 5.10.30 (64bit) and BOINC 5.8.11 (32bit)

I will upgrade the one with 5.10.13 to the latest version when I can get access to it and see if the problem disappears. Will report here if problems persist.

When it does happen, suspending the work unit and resuming it again later does not solve the problem - it stays stuck. Restarting BOINC does seem to work however and it completes within seconds of the restart.


Join the #1 Aussie Alliance on MilkyWay!

CTPAHHNK
Joined: 21 Sep 07
Posts: 1
Credit: 1,387,208
RAC: 0
Message 1347 - Posted: 4 Jan 2008, 9:19:20 UTC

I have these problems on versions:
BOINC 5.10.20(Win 32Bit)
BOINC 5.10.24(Win 32Bit)
BOINC 5.10.30(Win 32Bit)
BOINC 6.01.00(Win 32Bit)
On computers ranging from a Celeron 800 MHz up to a Quad Core 6600 and a Dual Xeon 5160
OS: WinXP Pro (32-bit) and Win Svr 2003 Std Ed R2 (32-bit)
What should I do? Where should I move to?

Webmaster Yoda
Joined: 21 Dec 07
Posts: 69
Credit: 7,048,412
RAC: 0
Message 1348 - Posted: 4 Jan 2008, 11:54:45 UTC - in response to Message 1337.  

Observation: one of my Windows XP computers gets these problems several times a day. The other two haven't (yet). They run different versions of BOINC, so it may be related to the client.

The one that has regular problems is running BOINC 5.10.13 (32bit)


Update: I upgraded BOINC and it just got another one, so it's more likely the app is not always sending the right signals to BOINC when it's finishing.

Result name is gs_92_1199482628_108631_0 if it's any help. I will pause it for now - don't want to restart BOINC until Climate checkpoints again.



Join the #1 Aussie Alliance on MilkyWay!

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 1349 - Posted: 4 Jan 2008, 12:33:23 UTC - in response to Message 1348.  

Observation: one of my Windows XP computers gets these problems several times a day. The other two haven't (yet). They run different versions of BOINC, so it may be related to the client.

The one that has regular problems is running BOINC 5.10.13 (32bit)


Update: I upgraded BOINC and it just got another one, so it's more likely the app is not always sending the right signals to BOINC when it's finishing.

Result name is gs_92_1199482628_108631_0 if it's any help. I will pause it for now - don't want to restart BOINC until Climate checkpoints again.




I'll take a look into the freezing WUs. I'm not quite sure what's causing it, because I'm pretty sure we're using the API correctly.

Jayargh
Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0
Message 1350 - Posted: 4 Jan 2008, 13:48:51 UTC - in response to Message 1348.  

Observation: one of my Windows XP computers gets these problems several times a day. The other two haven't (yet). They run different versions of BOINC, so it may be related to the client.

The one that has regular problems is running BOINC 5.10.13 (32bit)


Update: I upgraded BOINC and it just got another one, so it's more likely the app is not always sending the right signals to BOINC when it's finishing.

Result name is gs_92_1199482628_108631_0 if it's any help. I will pause it for now - don't want to restart BOINC until Climate checkpoints again.




I have only had 1 unit freeze on my Windows machine, and it wasn't on finish. It was in the middle, at about 40-50% progress, so mine had nothing to do with the finish file.

Allen
Joined: 30 Dec 07
Posts: 8
Credit: 356,682
RAC: 0
Message 1351 - Posted: 4 Jan 2008, 14:25:40 UTC
Last modified: 4 Jan 2008, 14:26:45 UTC

Mornin Travis. Overnight, 3 of my puters could not crunch: 1 case of a stopped (frozen) WU and 2 computation errors. All puters are 32-bit, 2 XP Pro and 1 2000 Pro. BOINC versions are 2 @ 5.10.30 and 1 @ 5.10.28. One is an x86 single core; the other 2 are Intel dual cores. Hope this info helps fix bugs. Cleared the jam and they are crunching again.

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1365 - Posted: 6 Jan 2008, 16:21:56 UTC

I still have 1-2 out of each 20 that stop. A couple have stopped at just a few seconds.

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 1367 - Posted: 6 Jan 2008, 17:33:54 UTC - in response to Message 1365.  

I still have 1-2 out of each 20 that stop. A couple have stopped at just a few seconds.


On what architecture?

meshmar
Joined: 29 Aug 07
Posts: 7
Credit: 187,002
RAC: 0
Message 1370 - Posted: 6 Jan 2008, 18:02:56 UTC
Last modified: 6 Jan 2008, 18:04:48 UTC

I've had an increase in 'stopped' WUs. Most get stuck in the 40-60% completed range. The affected CPUs are P3 and P4 Intel and a K6-2 AMD; none of the Athlon 64s or the Mac Core2 have 'stopped'. The OSes are Win2K, Win2K Adv Server, WinXP and 32-bit Linux. BOINC is 5.10.30 on Windows and 5.10.21 on Linux.

Only the older architecture cpus seem to be affected on my end if that's any help.


banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1371 - Posted: 6 Jan 2008, 19:11:45 UTC

Well, the latest ones: the one that stopped at 7 secs is a 90, and the one at 21 min is a 92. I've had others; the few-second ones I just abort.

©2019 Astroinformatics Group