Compute error: Can't acquire lockfile

RvP_LaN
Joined: 11 Oct 07
Posts: 8
Credit: 7,419,890
RAC: 23,995

Message 6070 - Posted: 11 Nov 2008, 0:22:50 UTC

Hi there,

This is the first time I have seen this message.

<core_client_version>6.3.14</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No error
Can't acquire lockfile - exiting

This has happened three times on an XP64 box: quad-core Phenom, 4 GB RAM, 18 GB free disk space, virtual memory large enough.
MW binary: astronomy_1.22_windows_x86_64.exe
BOINC client: 6.3.14 for Windows XP64.
The box stays up 24 hours a day.

Any idea what causes this error message?

By the way, on this same box, in the same setup, your binary now often remains stuck in the process list. I have to stop the BOINC service to distinguish which live processes are children of the BOINC daemon. The boinc_project processes that remain in the list afterwards are the stuck MW processes. They consume no CPU time, only a small amount of RAM.

Anyway, if I don't restart the BOINC service, after one week of stuck MW processes, one of the cores no longer computes for ANY of the other projects. Not acceptable...

I don't blame MW (not yet!), I am just reporting a fact. Maybe the problem is related to this new version of the BOINC client on a multi-core box. I am not sure I ever observed this kind of event with MW and the older BOINC client, 5.10.45.

My BOINC preferences state that projects should NOT "leave applications in memory while suspended". I hope you are following this rule...

Regards.
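
For readers unfamiliar with the message, here is a minimal sketch of what a "Can't acquire lockfile" failure generally means. It is not BOINC's actual FILE_LOCK code; the names acquire_lockfile, release_lockfile, LockUnavailable and lockfile_demo are made up for illustration, and exclusive file creation is used as a simplified stand-in for real OS-level file locking. The idea is only that a second process cannot take a lock that another process still holds. One plausible reading of the report above, not confirmed anywhere in this thread, is that a stuck process still holding the lock would make the next start fail in this way.

```python
import os
import tempfile


class LockUnavailable(Exception):
    """Raised when the lock file is already held by someone else."""


def acquire_lockfile(path):
    """Create the lock file exclusively and return its file descriptor."""
    try:
        # O_EXCL makes the creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockUnavailable("Can't acquire lockfile: " + path) from None
    # Record the holder's PID so a human can see who owns the lock.
    os.write(fd, str(os.getpid()).encode())
    return fd


def release_lockfile(fd, path):
    """Close and delete the lock file so the next caller can acquire it."""
    os.close(fd)
    os.remove(path)


def demo():
    """Return True if a second acquire succeeded while the first was held."""
    work_dir = tempfile.mkdtemp()
    lock_path = os.path.join(work_dir, "lockfile_demo")  # hypothetical name
    first = acquire_lockfile(lock_path)
    try:
        acquire_lockfile(lock_path)
        second_ok = True
    except LockUnavailable:
        second_ok = False
    # Once the holder releases the lock, acquiring works again.
    release_lockfile(first, lock_path)
    again = acquire_lockfile(lock_path)
    release_lockfile(again, lock_path)
    os.rmdir(work_dir)
    return second_ok
```

In this sketch the second attempt fails for as long as the first holder has not released the lock, which is why a holder that never exits would block every later attempt.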

Stefan Ledwina
Joined: 28 Aug 07
Posts: 16
Credit: 27,420,167
RAC: 409

Message 6073 - Posted: 11 Nov 2008, 16:23:04 UTC - in response to Message 6070.

6.3.14 was an older alpha version. 6.3.21 is the current BOINC alpha version...
Maybe try the new version and see whether things are better...
____________


pixelicious.at - My little photoblog

RvP_LaN
Joined: 11 Oct 07
Posts: 8
Credit: 7,419,890
RAC: 23,995

Message 8025 - Posted: 28 Dec 2008, 5:52:24 UTC - in response to Message 6073.
Last modified: 28 Dec 2008, 6:07:21 UTC

6.3.14 was an older alpha version. 6.3.21 is the current BOINC alpha version...

Hello everybody, best wishes to all BOINC people!!!

I am coming back to this post after a while. I wanted to spend some time on this, to check what happens with the new client versions (I'm currently using 6.5.0) and the new MW versions (currently 0.7 optimized x86_64). By the way, thanks for providing a true 64-bit client!

Anyway, I still have these "ghost processes" that stay in the process list. They are clearly identified as boinc_project processes, and they are indeed children of the boinc_master process. They use just a few kilobytes of memory, but they don't run at all (0% activity).

To purge the ghost processes, I have to stop and restart the BOINC service. When I do so, all ghost processes are detached from their parent boinc_master (even though they are still identified as boinc_project), and I have to kill them one by one in the task manager. (I'm using BoincView and Sysinternals Process Explorer to monitor this.)

It seems that when an MW process ends in an error, it is not able to die gracefully! The BOINC task list correctly shows a "compute error", but the process stays in the Windows process list.

MW is not the only project that generates ghost processes, so I won't draw a single conclusion! Maybe the new BOINC scheduler (since 6.2.x) has something to do with this on multi-core machines.

Could you check that? Specifically, whether some of your error exits fail to terminate the process.

Regards
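
To illustrate the concern that a failed process should actually go away, here is a rough sketch seen from the supervising side: a parent starts a child, waits a bounded time, and kills the child if it has not exited. This is not how the BOINC client or the MW application handles exits; run_with_watchdog and the two toy child commands are hypothetical and only stand in for "a task that exits with an error" and "a task that hangs".

```python
import subprocess
import sys


def run_with_watchdog(cmd, timeout_s):
    """Run cmd and return (exit_code, was_killed).

    If the child has not exited after timeout_s seconds it is killed,
    so it cannot linger in the process list.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        proc.kill()
        # Reap the killed child so no zombie entry is left behind.
        return proc.wait(), True


def demo():
    # A child that fails and exits on its own with code 3.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    # A child that hangs (sleeps) instead of exiting.
    hung = [sys.executable, "-c", "import time; time.sleep(60)"]
    return run_with_watchdog(failing, 10.0), run_with_watchdog(hung, 0.5)
```

The failing child is reported with its own exit code and is not killed, while the hung one is terminated by the watchdog. The exit code of the killed child differs between operating systems, so only the was_killed flag is meaningful there.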

Paul D. Buck
Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0

Message 8275 - Posted: 12 Jan 2009, 23:30:42 UTC

I only saw the lock file problem with Rosetta, and a note on the Einstein boards suggested that it was caused by using less than 100% of the CPU. In other words, the option to lower CPU usage below 100% is bugged.

I have not gotten back to running Rosetta on the computer that was so good at demonstrating the problem, but it is something you may want to look at. Please post back if it has no effect or if you are already using the 100% runtime option ...

I am personally curious because I lost about 8 tasks to this, many of them deep into processing ... so about two days' worth of run time (~48-50 hours).




Copyright © 2018 AstroInformatics Group