Welcome to MilkyWay@home

Client errors

Message boards : Number crunching : Client errors

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1684 - Posted: 11 Feb 2008, 16:54:41 UTC
Last modified: 11 Feb 2008, 16:56:01 UTC

Another errored out as I got on computer, wu id #3453228. Got a pop-up on this one.

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 1685 - Posted: 11 Feb 2008, 18:29:43 UTC - in response to Message 1683.  

errored out, wu id #3452929.
The errors I have been getting are failing around 1300 sec.


What's the progress looking like?

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1686 - Posted: 11 Feb 2008, 19:49:38 UTC

Another errored out as I got on computer, wu id #3453269. Got a pop-up on this one. It seems that if I move my mouse to turn the monitor back on it errors.

I think the progress has usually switched back to 0.0% when it errors.

Stick
Joined: 8 Oct 07
Posts: 52
Credit: 5,884,118
RAC: 4,655
Message 1687 - Posted: 11 Feb 2008, 20:34:00 UTC - in response to Message 1686.  

Another errored out as I got on computer, wu id #3453269. Got a pop-up on this one. It seems that if I move my mouse to turn the monitor back on it errors.

I think the progress has usually switched back to 0.0% when it errors.


I've only skimmed this thread, so I may have missed earlier discussion of this point. Have you been running Windows Update regularly? Sometimes out-of-date DirectX or device drivers can cause problems like the ones you've been experiencing.

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1688 - Posted: 11 Feb 2008, 20:47:46 UTC

I do keep DirectX and the media players and such that I use up to date. I have been using DirectX 9.0c since it came out (required for a game I play), which I believe is the last version that will be released for XP, as DX 10 is only for Vista.

I have also noticed three times that the currently running unit just hangs and doesn't actually run or switch to another. But I haven't been running anything that would max out my PC's usage and stop it from running.

Stick
Joined: 8 Oct 07
Posts: 52
Credit: 5,884,118
RAC: 4,655
Message 1689 - Posted: 11 Feb 2008, 21:18:33 UTC - in response to Message 1688.  
Last modified: 11 Feb 2008, 21:23:22 UTC

I have also noticed three times that the currently running unit just hangs and doesn't actually run or switch to another. But I haven't been running anything that would max out my PC's usage and stop it from running.


I posted this message on uFluids several months ago when I had a WU hang-up there. Could this be what is happening here? If so, giving permission for MilkyWay@home to contact the internet through your firewall may stop the hang-ups - but it won't fix anything else.

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1690 - Posted: 11 Feb 2008, 22:25:29 UTC - in response to Message 1689.  

I posted this message on uFluids several months ago when I had a WU hang-up there. Could this be what is happening here? If so, giving permission for MilkyWay@home to contact the internet through your firewall may stop the hang-ups - but it won't fix anything else.


It isn't trying to contact anything; it happens while it's running, not when sending info.

Stick
Joined: 8 Oct 07
Posts: 52
Credit: 5,884,118
RAC: 4,655
Message 1691 - Posted: 11 Feb 2008, 23:14:36 UTC - in response to Message 1690.  
Last modified: 11 Feb 2008, 23:16:57 UTC

I posted this message on uFluids several months ago when I had a WU hang-up there. Could this be what is happening here? If so, giving permission for MilkyWay@home to contact the internet through your firewall may stop the hang-ups - but it won't fix anything else.


It isn't trying to contact anything; it happens while it's running, not when sending info.


Just making sure we are clear on this: My problem at uFluids started with a WU client error. When that happened, BOINC froze up. The uFluids unit was still "Running" but nothing was happening. The uFluids app had called Windows Debugger (because of the client error) but that call was blocked by my firewall. And everything related to BOINC came to a halt. It was only after I got the alert from my firewall (several minutes later) that I was able to figure it out. If, for example, you are using Windows Firewall, you might not get an alert and, therefore, you might never know what was causing the hang-up.

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1692 - Posted: 11 Feb 2008, 23:22:09 UTC

I understand you, Stick. That doesn't quite sound like what is happening to me, but I will keep it in mind. I can still change menus in BOINC, so the whole program doesn't freeze; just the currently running unit seems to. But that has been a rare thing recently.

I did think to check Boinc messages and got this, maybe it can help more:

2/11/2008 5:29:44 PM|Milkyway@home|Reason: Unrecoverable error for result gs_211_1202829585_6839_0 (One or more arguments are invalid (0x80000003) - exit code -2147483645 (0x80000003))
2/11/2008 5:29:44 PM|Milkyway@home|Output file gs_211_1202829585_6839_0_0 for task gs_211_1202829585_6839_0 absent
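
For reference, the two numbers in that log message are the same 32-bit value written two ways: -2147483645 is 0x80000003 read as a signed (two's-complement) integer. The short Python sketch below shows the conversion. The function names are made up for illustration and are not part of BOINC. As far as I know, 0x80000003 is also the value Windows uses for its breakpoint exception status, so the "arguments are invalid" text may be a generic lookup of the code and not a description of what the app did, but the log alone does not confirm that.

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the number is negative in two's complement.
    return value - 0x100000000 if value & 0x80000000 else value


def to_unsigned32(value: int) -> int:
    """Return the unsigned 32-bit representation of a (possibly negative) integer."""
    return value & 0xFFFFFFFF


exit_code = to_signed32(0x80000003)
print(exit_code)                      # -2147483645
print(hex(to_unsigned32(exit_code)))  # 0x80000003
```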

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 1695 - Posted: 12 Feb 2008, 16:25:35 UTC - in response to Message 1692.  

I understand you, Stick. That doesn't quite sound like what is happening to me, but I will keep it in mind. I can still change menus in BOINC, so the whole program doesn't freeze; just the currently running unit seems to. But that has been a rare thing recently.

I did think to check Boinc messages and got this, maybe it can help more:

2/11/2008 5:29:44 PM|Milkyway@home|Reason: Unrecoverable error for result gs_211_1202829585_6839_0 (One or more arguments are invalid (0x80000003) - exit code -2147483645 (0x80000003))
2/11/2008 5:29:44 PM|Milkyway@home|Output file gs_211_1202829585_6839_0_0 for task gs_211_1202829585_6839_0 absent


Hmm, if these are consistently happening when the progress resets, it looks like the problem might be happening when we're writing the checkpoint file. I'm going to look around there and see if anything is amiss.
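
As background on why checkpoint writing is a plausible suspect: if an app overwrites its checkpoint file in place and is interrupted partway through, the next read can see a half-written file. One common way to guard against that is to write to a temporary file and then rename it over the old checkpoint. The sketch below illustrates that general pattern in Python. It is not the MilkyWay@home code (which is not shown in this thread), and `write_checkpoint`, `read_checkpoint`, and the JSON format are assumptions chosen only for the example.

```python
import json
import os
import tempfile


def write_checkpoint(path: str, state: dict) -> None:
    """Write state to path so that readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # swap the new file in place of the old one
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def read_checkpoint(path: str, default):
    """Return the saved state, or default if the file is missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default
```

With this pattern a crash during the write leaves the previous checkpoint untouched, so a restart resumes from the last good state. Whether that matches what the app does here is something only the developers can say.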

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1696 - Posted: 13 Feb 2008, 18:19:08 UTC

This one errored out when I clicked to open BOINC to view progress.

2/13/2008 1:15:54 PM|Milkyway@home|Reason: Unrecoverable error for result gs_221_1202991964_97010_0 (One or more arguments are invalid (0x80000003) - exit code -2147483645 (0x80000003))
2/13/2008 1:15:54 PM|Milkyway@home|Computation for task gs_221_1202991964_97010_0 finished
2/13/2008 1:15:54 PM|Milkyway@home|Output file gs_221_1202991964_97010_0_0 for task gs_221_1202991964_97010_0 absent
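
For anyone collecting these to look for a pattern, the message lines pasted in this thread appear to follow a `timestamp|project|message` layout. The Python sketch below pulls the task name and exit code out of the "Unrecoverable error" lines. The layout is inferred only from the lines quoted here, and `parse_line`, `find_errors`, and the regular expression are illustrative names, not anything provided by BOINC.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional, Tuple


class LogEntry(NamedTuple):
    timestamp: str
    project: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one 'timestamp|project|message' line; return None if it doesn't fit."""
    parts = line.rstrip("\n").split("|", 2)
    if len(parts) != 3:
        return None
    return LogEntry(*parts)


# Matches e.g. "... result gs_..._0 (One or more ... - exit code -2147483645 (0x80000003))"
ERROR_RE = re.compile(
    r"Unrecoverable error for result (\S+) \(.*exit code (-?\d+) \((0x[0-9a-fA-F]+)\)\)"
)


def find_errors(lines: Iterable[str]) -> Iterator[Tuple[str, str, int, str]]:
    """Yield (timestamp, task name, exit code, hex code) for each error line."""
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        match = ERROR_RE.search(entry.message)
        if match:
            yield entry.timestamp, match.group(1), int(match.group(2)), match.group(3)
```

Feeding it the saved message lines and comparing the timestamps against when the progress bar reset would be one way to check the checkpoint theory.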

Stick
Joined: 8 Oct 07
Posts: 52
Credit: 5,884,118
RAC: 4,655
Message 1699 - Posted: 13 Feb 2008, 19:45:39 UTC - in response to Message 1696.  

This one errored out when I clicked to open BOINC to view progress.

2/13/2008 1:15:54 PM|Milkyway@home|Reason: Unrecoverable error for result gs_221_1202991964_97010_0 (One or more arguments are invalid (0x80000003) - exit code -2147483645 (0x80000003))
2/13/2008 1:15:54 PM|Milkyway@home|Computation for task gs_221_1202991964_97010_0 finished
2/13/2008 1:15:54 PM|Milkyway@home|Output file gs_221_1202991964_97010_0_0 for task gs_221_1202991964_97010_0 absent


Since Travis indicated these problems may be related to checkpointing, it might be a good idea to run your disk utilities just to make sure you don't have any indexing or file problems, etc. You might also try moving BOINC to another HD or flash drive just to see if it makes any difference (i.e., just copy your BOINC folder to the new drive, then run the BOINC installation program and point it to the new BOINC folder's location).

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1703 - Posted: 13 Feb 2008, 21:17:19 UTC

I run disk cleanup, etc. once a week. My stuff is fine. I keep telling everyone that; if I had a problem I would have said so.

Stick
Joined: 8 Oct 07
Posts: 52
Credit: 5,884,118
RAC: 4,655
Message 1706 - Posted: 14 Feb 2008, 0:29:48 UTC - in response to Message 1703.  
Last modified: 14 Feb 2008, 0:30:25 UTC

I run disk cleanup, etc. once a week.

Disk cleanup only gets rid of old files you don't need anymore. It helps free up space - but not much else. Have you also run the disk "Error checking" utility recently? It checks for file structure errors, bad sectors, etc. (Go to "My Computer" and highlight your disk, then go to "Properties" under the "File" menu. The "Error checking" utility is on the "Tools" tab.) If, for example, your disk has some unidentified bad sectors that happen to be where the program is trying to write a checkpoint, the application might hang for a while.

My stuff is fine. I keep telling everyone that; if I had a problem I would have said so.

Since these client errors aren't being reported by "lots of different people", I think it is reasonable to assume that your system set-up is at least part of the problem. I am just trying to help in "the process of elimination".

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 1707 - Posted: 14 Feb 2008, 0:50:42 UTC
Last modified: 14 Feb 2008, 1:02:03 UTC

Did you not see the "etc" in my post?

Why must it always be me or my computer? My computer isn't the problem, and I'm not the problem. I run everything and do everything. There are many people running BOINC & MW who don't know how to do that stuff, and it seems that they aren't having any problems that they are mentioning here. I think most people have a problem with a project and move to another instead of trying to find a solution.

To add more of an explanation of one occurrence: I open BOINC about a dozen times a day to check whether WUs are running correctly. Once I saw that the WU errored out just as BOINC opened. All the other times it's fine.

Other errors seem to pop up whether I am on my computer or not.

Stick
Joined: 8 Oct 07
Posts: 52
Credit: 5,884,118
RAC: 4,655
Message 1713 - Posted: 14 Feb 2008, 20:22:48 UTC - in response to Message 1707.  
Last modified: 14 Feb 2008, 20:52:31 UTC

Did you not see the "etc" in my post?

Yes, I did. But some of your earlier posts (e.g. this one) indicate that you are not very experienced with hardware issues, and because your response did not directly address the question I had raised, I thought I would try again.

Why must it always be me or my computer? My computer isn't the problem, and I'm not the problem. I run everything and do everything.

Apparently, my attempts to help have somehow offended you. This was certainly not my intent. But, if that is what has happened, I apologize.

There are many people running BOINC & MW who don't know how to do that stuff, and it seems that they aren't having any problems that they are mentioning here. I think most people have a problem with a project and move to another instead of trying to find a solution.

Very true! And the fact that you are reporting the problems you encounter here indicates that you are interested both in the project and in helping to get those problems resolved.

I have been running various BOINC projects for over 3 years. And, over that time, I have encountered and reported numerous problems with other projects (and I have also helped a lot of other users). But, so far, neither of my computers has encountered a client error running MilkyWay@home. Your computer, on the other hand, seems to be erroring out WUs 5 to 10% of the time. Our systems are somewhat similar - Intel chips running Win XP. Why are you getting errors and I'm not? There must be some difference in our set-ups. You need to be open to the possibility that your system could be part of the problem, if you truly want to resolve the issue. Otherwise, you might as well "move to another project".

DaveSun
Joined: 10 Nov 07
Posts: 28
Credit: 2,549,231
RAC: 0
Message 1717 - Posted: 15 Feb 2008, 3:32:23 UTC

I have been following stalls on my systems for a few weeks to try to find a pattern to them, and have found that they always seem to stall at some point after the progress bar restarts (the final portion). I use BOINCView to control my hosts remotely, since most of them are headless units. When I find one stalled I suspend it, and sometimes the WU errors out at that time. Usually when I resume it, it will crunch to completion with no trouble. Since these are headless units, I don't know if any of the failures generate the pop-up errors that have been discussed on these boards; if they do occur, they clear themselves before I can get a monitor connected to check them out. I run both Intel and AMD systems with Windows 2000 Pro. Most systems have 512 MB of RAM; some have more. The systems that I run MW on are just for crunching, except one that I use at work. All of these systems have had stalls and errors as I have described.

@banditwolf, it may not be your setup; it may be BOINC Manager or a configuration in it causing your problems. What is your system and how do you have BOINC configured?

Jayargh
Joined: 8 Oct 07
Posts: 289
Credit: 3,690,838
RAC: 0
Message 1718 - Posted: 15 Feb 2008, 4:18:52 UTC - in response to Message 1717.  
Last modified: 15 Feb 2008, 4:21:21 UTC

I have been following stalls on my systems for a few weeks to try to find a pattern to them, and have found that they always seem to stall at some point after the progress bar restarts (the final portion). I use BOINCView to control my hosts remotely, since most of them are headless units. When I find one stalled I suspend it, and sometimes the WU errors out at that time. Usually when I resume it, it will crunch to completion with no trouble. Since these are headless units, I don't know if any of the failures generate the pop-up errors that have been discussed on these boards; if they do occur, they clear themselves before I can get a monitor connected to check them out. I run both Intel and AMD systems with Windows 2000 Pro. Most systems have 512 MB of RAM; some have more. The systems that I run MW on are just for crunching, except one that I use at work. All of these systems have had stalls and errors as I have described.

@banditwolf, it may not be your setup; it may be BOINC Manager or a configuration in it causing your problems. What is your system and how do you have BOINC configured?


Pop-up errors will cause the thread to stop running until closed... notice any performance problems? ;) I have had 1 recently... for me it's always Windows, never Linux.

DaveSun
Joined: 10 Nov 07
Posts: 28
Credit: 2,549,231
RAC: 0
Message 1719 - Posted: 15 Feb 2008, 4:32:28 UTC - in response to Message 1718.  
Last modified: 15 Feb 2008, 4:32:53 UTC

Pop-up errors will cause the thread to stop running until closed.....notice any performance problems? ;) I have had 1 recently...


If the pop-ups are occurring, the system is switching to another project. I have seen one pop-up on one system, when everyone first started seeing them, and I noticed at that time that it switched to another project and the pop-up went away before I could click it. I do know that the stalls stop all BOINC activity, though. I had one today, and this has been the only performance hit I've seen. The frequency of the stalls and errors has seemed to be lower of late, at least for my crunchers.

Ivor Cogdell
Joined: 9 Feb 08
Posts: 9
Credit: 473,130
RAC: 0
Message 1735 - Posted: 19 Feb 2008, 21:45:44 UTC

Hi Gang,
This is the third time that I have received a "Computational Error" on completion of a work unit. I have not been running MW long. Rosetta was running alongside it; SETI and Einstein were suspended at the time. Running XP SP2, using BOINC 5.10.28. Any ideas?

Ivor

19/02/2008 17:32:24|Milkyway@home|Output file gs_242_1203423232_20856_0_0 for task gs_242_1203423232_20856_0 absent
19/02/2008 19:20:29|Milkyway@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 1 completed tasks
19/02/2008 19:20:51||Project communication failed: attempting access to reference site
19/02/2008 19:20:52||Access to reference site succeeded - project servers may be temporarily down.
19/02/2008 19:20:54|Milkyway@home|Scheduler request failed: Couldn't connect to server


©2024 Astroinformatics Group