WU just stops
S@NL - Mellowman
Joined: 9 Dec 07
Posts: 65
Credit: 8,015,709
RAC: 0

Message 1103 - Posted: 16 Dec 2007, 15:14:03 UTC

This WU just stopped processing after about 3 minutes. I aborted it so one of the cores won't be idle.
____________

The longer I live, the more reasons I develop for wanting to die.

[B^S] Acmefrog
Joined: 28 Aug 07
Posts: 49
Credit: 556,559
RAC: 0

Message 1104 - Posted: 16 Dec 2007, 19:31:06 UTC

I have had that happen once in a while. I have found that if I suspend that WU and let the processor work on another, when I un-suspend it another core will finish it.
____________

S@NL - Mellowman
Joined: 9 Dec 07
Posts: 65
Credit: 8,015,709
RAC: 0

Message 1105 - Posted: 16 Dec 2007, 19:46:16 UTC

I tried suspending/resuming but it didn't work. The WU had stopped almost half an hour before I noticed it. Maybe it was my system, as I also lost 2 SETI WUs in progress during a reboot later on.
____________

The longer I live, the more reasons I develop for wanting to die.

Misfit
Joined: 27 Aug 07
Posts: 915
Credit: 1,503,319
RAC: 0

Message 1106 - Posted: 16 Dec 2007, 20:40:31 UTC

I've had this happen whenever a crashed unit caused the MS popup. Usually the core this happened on freezes but then resets itself within a few minutes.
____________
me@rescam.org

dduggan47
Joined: 6 Sep 07
Posts: 2
Credit: 1,304,297
RAC: 1,679

Message 1185 - Posted: 28 Dec 2007, 16:05:36 UTC

I'm seeing this problem as well on at least two machines. The WU is shown as running but gets no CPU time. It clogs up the system pretty effectively. Another task can run at the same time, but if I get two stuck MW tasks in there, I'm out of business until I happen to check that machine and fix it.

Suspending the task and then letting it go again doesn't seem to help. OTOH restarting the BOINC service does seem to get it going again.

I'm running 5.10.30.
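
The symptom described here (a task listed as running whose CPU time stops growing) can be checked mechanically. Below is a minimal Python sketch of that check, not anything shipped with BOINC: the `Sample` type, the `is_stalled` helper and both thresholds are hypothetical, and it assumes you already have some way to record a task's accumulated CPU time at intervals.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of a task (hypothetical structure, not a BOINC type)."""
    wall_seconds: float  # wall-clock time of the observation
    cpu_seconds: float   # accumulated CPU time reported for the task


def is_stalled(samples, min_window=300.0, min_cpu_gain=1.0):
    """Return True if CPU time grew by less than min_cpu_gain over the
    most recent span of at least min_window wall-clock seconds."""
    if len(samples) < 2:
        return False
    latest = samples[-1]
    # Walk backwards to the newest sample that is at least min_window old.
    for earlier in reversed(samples[:-1]):
        if latest.wall_seconds - earlier.wall_seconds >= min_window:
            return latest.cpu_seconds - earlier.cpu_seconds < min_cpu_gain
    return False  # not enough history yet


# Made-up readings: one task that keeps accumulating CPU time, one that stops.
healthy = [Sample(0, 0.0), Sample(300, 290.0), Sample(600, 585.0)]
stuck = [Sample(0, 0.0), Sample(300, 180.0), Sample(600, 180.2)]

print(is_stalled(healthy))
print(is_stalled(stuck))
```

The five-minute window and one-second gain are arbitrary starting points. A task that is legitimately suspended or preempted would also show no CPU gain, so a real check would need to look at the task state as well.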

Jayargh
Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0

Message 1197 - Posted: 30 Dec 2007, 0:02:14 UTC
Last modified: 30 Dec 2007, 0:47:14 UTC

I found this one stuck for a number of hours. I just suspended it and will see if I have to restart BOINC to get it to finish. Looking at everyone who has posted about this problem, the one thing in common is Windows, btw.

EDIT: Had to restart BOINC to get it going again.
____________

[B@H] Ray
Joined: 27 Dec 07
Posts: 35
Credit: 1,432,926
RAC: 0

Message 1199 - Posted: 30 Dec 2007, 18:14:29 UTC

I have had two of them so far. Hopefully they will see what is causing it and fix it.

Webmaster Yoda
Joined: 21 Dec 07
Posts: 69
Credit: 7,048,412
RAC: 0

Message 1203 - Posted: 31 Dec 2007, 5:18:57 UTC

It's not only Windows. I have one right now that's stalled on a 64bit Linux computer.

I would restart BOINC, but since it's running CPDN on the other 3 cores, with fairly lengthy intervals between checkpoints, I waste less CPU time by aborting this work unit.
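
The trade-off above is simple arithmetic, and a rough Python sketch makes it concrete. It assumes a restart sends every running task back to its last checkpoint, while an abort only discards the CPU time already spent on the stuck WU. The function names and all the numbers are made up for illustration.

```python
def cpu_lost_by_restart(seconds_since_checkpoint):
    """CPU seconds discarded if every running task rolls back to its last checkpoint."""
    return sum(seconds_since_checkpoint)


def cheaper_option(stuck_task_cpu_seconds, seconds_since_checkpoint):
    """Return 'abort' or 'restart', whichever discards less CPU time."""
    restart_cost = cpu_lost_by_restart(seconds_since_checkpoint)
    return "abort" if stuck_task_cpu_seconds < restart_cost else "restart"


# Hypothetical numbers: a stuck WU with 5 minutes of CPU time invested, and
# three other tasks that last checkpointed 20, 35 and 50 minutes ago.
print(cheaper_option(5 * 60, [20 * 60, 35 * 60, 50 * 60]))  # abort
```

With short work units and long checkpoint intervals on the other cores, aborting comes out ahead. Right after the other tasks checkpoint, the balance tips toward restarting.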
____________
Join the #1 Aussie Alliance on MilkyWay!

seti@elrcastor.com
Joined: 22 Dec 07
Posts: 11
Credit: 5,943,029
RAC: 0

Message 1263 - Posted: 2 Jan 2008, 21:09:48 UTC

I've seen a few of them too, on Linux and Windows.
____________

[B^S]breathesgelatin
Joined: 7 Dec 07
Posts: 3
Credit: 7,256
RAC: 0

Message 1336 - Posted: 3 Jan 2008, 22:47:07 UTC

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=1866922

This one hung up on me just now. I think it had been running most of the day, so I aborted it. Next time I'll try the suspending trick and see if that works. My first hung WU. The problems I was having with missing downloads yesterday do seem to be resolved however.
____________

Webmaster Yoda
Joined: 21 Dec 07
Posts: 69
Credit: 7,048,412
RAC: 0

Message 1337 - Posted: 3 Jan 2008, 23:17:33 UTC

Observation: one of my Windows XP computers gets these problems several times a day. The other two haven't (yet). They run different versions of BOINC, so it may be related to the client.

The one that has regular problems is running BOINC 5.10.13 (32bit)
The others are running BOINC 5.10.30 (64bit) and BOINC 5.8.11 (32bit)

I will upgrade the one with 5.10.13 to the latest version when I can get access to it and see if the problem disappears. Will report here if problems persist.

When it does happen, suspending the work unit and resuming it again later does not solve the problem - it stays stuck. Restarting BOINC does seem to work, however, and the work unit completes within seconds of the restart.


____________
Join the #1 Aussie Alliance on MilkyWay!

CTPAHHNK
Joined: 21 Sep 07
Posts: 1
Credit: 1,387,208
RAC: 0

Message 1347 - Posted: 4 Jan 2008, 9:19:20 UTC

I have these problems on versions:
BOINC 5.10.20(Win 32Bit)
BOINC 5.10.24(Win 32Bit)
BOINC 5.10.30(Win 32Bit)
BOINC 6.01.00(Win 32Bit)
On computers ranging from a Celeron 800 MHz up to a QuadCore 6600 and a Dual Xeon 5160
OS: WinXP Pro (32-bit) and Win Svr 2003 Std Ed R2 (32-bit)
What to do? Where to move?





____________

Webmaster Yoda
Joined: 21 Dec 07
Posts: 69
Credit: 7,048,412
RAC: 0

Message 1348 - Posted: 4 Jan 2008, 11:54:45 UTC - in response to Message 1337.

Observation: one of my Windows XP computers gets these problems several times a day. The other two haven't (yet). They run different versions of BOINC, so it may be related to the client.

The one that has regular problems is running BOINC 5.10.13 (32bit)


Update: I upgraded BOINC and it just got another one, so it's more likely the app is not always sending the right signals to BOINC when it's finishing.

Result name is gs_92_1199482628_108631_0 if it's any help. I will pause it for now - don't want to restart BOINC until Climate checkpoints again.



____________
Join the #1 Aussie Alliance on MilkyWay!

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 1349 - Posted: 4 Jan 2008, 12:33:23 UTC - in response to Message 1348.

Observation: one of my Windows XP computers gets these problems several times a day. The other two haven't (yet). They run different versions of BOINC, so it may be related to the client.

The one that has regular problems is running BOINC 5.10.13 (32bit)


Update: I upgraded BOINC and it just got another one, so it's more likely the app is not always sending the right signals to BOINC when it's finishing.

Result name is gs_92_1199482628_108631_0 if it's any help. I will pause it for now - don't want to restart BOINC until Climate checkpoints again.




I'll take a look into the freezing WUs. Not quite sure what's causing it, because I'm pretty sure we're using the API correctly.

Jayargh
Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0

Message 1350 - Posted: 4 Jan 2008, 13:48:51 UTC - in response to Message 1348.

Observation: one of my Windows XP computers gets these problems several times a day. The other two haven't (yet). They run different versions of BOINC, so it may be related to the client.

The one that has regular problems is running BOINC 5.10.13 (32bit)


Update: I upgraded BOINC and it just got another one, so it's more likely the app is not always sending the right signals to BOINC when it's finishing.

Result name is gs_92_1199482628_108631_0 if it's any help. I will pause it for now - don't want to restart BOINC until Climate checkpoints again.




I have only had 1 unit freeze on my Windows machine, and it wasn't on finish. It was in the middle, at about 40-50% progress, so mine had nothing to do with the finish file.

Allen
Joined: 30 Dec 07
Posts: 8
Credit: 356,682
RAC: 0

Message 1351 - Posted: 4 Jan 2008, 14:25:40 UTC
Last modified: 4 Jan 2008, 14:26:45 UTC

Mornin' Travis. Overnight, 3 of my computers could not crunch: 1 case of a stopped (frozen) WU and 2 computation errors. All are 32-bit: 2 on XP Pro and 1 on 2000 Pro. BOINC versions are 5.10.30 on two and 5.10.28 on one. One is an x86 single core; the other 2 are Intel dual cores. Hope this info helps fix the bugs. I cleared the jam and they are crunching again.
____________

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0

Message 1365 - Posted: 6 Jan 2008, 16:21:56 UTC

I still have 1-2 out of each 20 that stop. A couple have stopped at just a few seconds.

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 1367 - Posted: 6 Jan 2008, 17:33:54 UTC - in response to Message 1365.

I still have 1-2 out of each 20 that stop. A couple have stopped at just a few seconds.


On what architecture?

meshmar
Joined: 29 Aug 07
Posts: 7
Credit: 187,002
RAC: 0

Message 1370 - Posted: 6 Jan 2008, 18:02:56 UTC
Last modified: 6 Jan 2008, 18:04:48 UTC

I've had an increase in 'stopped' WUs. Most get stuck in the 40-60% completed range. Affected CPUs are P3 and P4 Intel and a K6-2 AMD; none of the Athlon 64s or the Mac Core2 have 'stopped'. The OSes are Win2K, Win2K Adv Server, WinXP and 32-bit Linux. BOINC is 5.10.30 on Windows and 5.10.21 on Linux.

Only the older architecture cpus seem to be affected on my end if that's any help.

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0

Message 1371 - Posted: 6 Jan 2008, 19:11:45 UTC

Well, of the latest ones, the one that stopped at 7 secs is a 90, and the one at 21 min is a 92. I've had others; the ones that stop after a few seconds I just abort.



Copyright © 2018 AstroInformatics Group