Welcome to MilkyWay@home

Increase in number of Computational Errors

Message boards : Number crunching : Increase in number of Computational Errors

mscharmack
Joined: 4 Dec 07
Posts: 45
Credit: 1,257,904
RAC: 0
Message 7251 - Posted: 2 Dec 2008, 18:48:06 UTC
Last modified: 2 Dec 2008, 18:51:11 UTC

Since the big change in Milkyway@Home, my computational error rate seems to have gone from 0% to about 20% of work units. It appears that when the computer is turned off for the night, or simply restarted, the current work unit comes up with a computational error. Has anyone else seen this problem yet? Can you look into this?

Here are two examples:

56889489 | 57108829 | 1 Dec 2008 12:30:51 UTC | 2 Dec 2008 18:49:38 UTC | Over | Client error | Compute error | 1,230.28 | 3.97 | ---
56884657 | 57105019 | 1 Dec 2008 11:04:10 UTC | 2 Dec 2008 15:38:21 UTC | Over | Client error | Compute error | 1,256.52 | 4.05 | ---

This has never happened in the past.

Thanks

GalaxyIce
Joined: 6 Apr 08
Posts: 2018
Credit: 100,142,856
RAC: 0
Message 7252 - Posted: 2 Dec 2008, 19:19:09 UTC - in response to Message 7251.  

Yes, I've had a few recently. One just now after I switched off my PC to install a wireless keyboard/mouse. In fact, I've seen it a few times after restarting the BOINC manager. Not a great many, just a few.


Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 7253 - Posted: 2 Dec 2008, 19:36:40 UTC - in response to Message 7252.  

Have you seen any with the 0.6 version of the application?

GalaxyIce
Joined: 6 Apr 08
Posts: 2018
Credit: 100,142,856
RAC: 0
Message 7260 - Posted: 2 Dec 2008, 20:11:34 UTC - in response to Message 7253.  
Last modified: 2 Dec 2008, 20:13:28 UTC

No Travis, all my computational errors were with the 0.4 version. I think the 0.4's are all cleared from my caches now.

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 7262 - Posted: 2 Dec 2008, 20:19:00 UTC - in response to Message 7253.  
Last modified: 2 Dec 2008, 20:45:40 UTC

Yes Travis.

This host is crunching with 0.6 and has 2 errors on the most recently completed WUs returned, as you can see here.

I have scanned through the results of the other 3 hosts I run and see no "computational errors" from any of them. All are running V0.6.

I hope the errors are not repeated on the next few WUs from the host showing 2 errors.

NOTE: Using BOINC Manager 6.1.0 (Crunch3r's version, with processor affinity and return results after 1 minute enabled).

Logan
Joined: 15 Aug 08
Posts: 163
Credit: 3,876,869
RAC: 0
Message 7263 - Posted: 2 Dec 2008, 20:37:57 UTC - in response to Message 7262.  
Last modified: 2 Dec 2008, 20:41:31 UTC

Hi John!

These work units started crunching with the 0.04 app (see the log) and were restarted with 0.06. That causes these errors. I experienced this when 0.06 was downloaded, and the errors have not repeated since.

Sorry for my bad English.

Best regards.
Logan.

BOINC FAQ Service (Ahora, también disponible en Español/Now available in Spanish)

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 7266 - Posted: 2 Dec 2008, 20:49:13 UTC

Your English is just fine, Logan, and my Spanish is non-existent (so there). Absolutely no need to apologise.

I take from your comment that the newer WUs are 0.06 ones and unlikely to error, unless I downloaded a few more of the 0.4 which need to be worked off?

Logan
Joined: 15 Aug 08
Posts: 163
Credit: 3,876,869
RAC: 0
Message 7271 - Posted: 2 Dec 2008, 21:07:42 UTC - in response to Message 7266.  
Last modified: 2 Dec 2008, 21:09:22 UTC

Yes.

I think the problem occurs when a WU is started with 0.04 (which had checkpointing problems, among others) and BOINC then tries to finish it with 0.06, after BOINC has been stopped and restarted and the 0.06 app downloaded.
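
As an illustration of the kind of failure described here, the following is a minimal Python sketch of a checkpoint that is tagged with the version of the app that wrote it. It is not the MilkyWay@home code: the JSON layout, the `app_version` field, the file name and both function names are assumptions made up for the example. The idea is only that a newer app can refuse a checkpoint written by an older one and restart the work unit from the beginning, rather than resume from data it may misread.

```python
import json
import os
import tempfile

APP_VERSION = "0.06"  # hypothetical version tag of the running application


def write_checkpoint(path, state, version=APP_VERSION):
    """Write the checkpoint to a temp file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"app_version": version, "state": state}, f)
    # os.replace swaps the file in one step, so a shutdown during the
    # write leaves the previous checkpoint untouched.
    os.replace(tmp, path)


def load_checkpoint(path, version=APP_VERSION):
    """Return the saved state, or None if the checkpoint cannot be trusted."""
    try:
        with open(path) as f:
            data = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return None  # missing or corrupt: start the work unit from scratch
    if not isinstance(data, dict) or data.get("app_version") != version:
        return None  # written by a different app version: do not resume
    return data.get("state")


with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "checkpoint.json")  # hypothetical file name
    write_checkpoint(ckpt, {"iteration": 120}, version="0.04")
    resumed_old = load_checkpoint(ckpt)  # written by 0.04, read by 0.06
    write_checkpoint(ckpt, {"iteration": 120})
    resumed_new = load_checkpoint(ckpt)  # written and read by 0.06

print(resumed_old, resumed_new)
```

With a check like this, a work unit interrupted during an app upgrade would lose its progress but would not necessarily end in a compute error.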
Logan.

BOINC FAQ Service (Ahora, también disponible en Español/Now available in Spanish)

Scratty@SETI.USA
Joined: 20 Feb 08
Posts: 8
Credit: 9,928,369
RAC: 0
Message 7298 - Posted: 3 Dec 2008, 1:17:08 UTC - in response to Message 7253.  

I'm getting the following errors with the 0.6 app for 32-bit Windows:

Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1228416560.136000
Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time
Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct
Skipping: 100.000000
Skipping: /max_ncpus_pct
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error

Travis
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 7300 - Posted: 3 Dec 2008, 1:20:48 UTC - in response to Message 7298.  

This looks like there might be a permissions problem.

Scratty@SETI.USA
Joined: 20 Feb 08
Posts: 8
Credit: 9,928,369
RAC: 0
Message 7306 - Posted: 3 Dec 2008, 2:43:20 UTC - in response to Message 7300.  
Last modified: 3 Dec 2008, 2:44:01 UTC

I'm getting this on almost all of my boxes.

In addition, when I check a box, there are 20+ instances of the 0.6 app running in Task Manager.
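
For readers unfamiliar with lockfiles, here is a small Python sketch of why "Can't acquire lockfile" can appear both when permissions are wrong and when several copies of an app start in the same directory. It is only an illustration and not BOINC's actual `FILE_LOCK` implementation: the exclusive-create approach, the `app.lock` file name and the helper names are all assumptions for the example.

```python
import os
import tempfile


def acquire_lock(path):
    """Try to create the lockfile exclusively; return a descriptor or None."""
    try:
        # O_EXCL makes the call fail if the file already exists.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None  # another instance already holds the lock
    except PermissionError:
        return None  # the directory is not writable by this user


def release_lock(fd, path):
    """Close the descriptor and delete the lockfile so others can acquire it."""
    os.close(fd)
    os.remove(path)


with tempfile.TemporaryDirectory() as slot_dir:
    lock_path = os.path.join(slot_dir, "app.lock")  # hypothetical file name
    first = acquire_lock(lock_path)    # first instance gets the lock
    second = acquire_lock(lock_path)   # second instance is refused
    release_lock(first, lock_path)
    third = acquire_lock(lock_path)    # succeeds once the lock is released
    release_lock(third, lock_path)

print(first is not None, second is None, third is not None)
```

In this sketch a refused instance simply gets `None` back. If each refused process exited and a new one was launched in its place, the result could resemble the repeated messages in the log above, though that is speculation rather than a diagnosis.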

©2024 Astroinformatics Group