Compute error: Can't acquire lockfile
Author | Message |
---|---|
Joined: 11 Oct 07 Posts: 8 Credit: 10,243,352 RAC: 0 |
Hi there, This is the first time I have seen this message. <core_client_version>6.3.14</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> Can't acquire lockfile - exiting FILE_LOCK::unlock(): close failed.: No error Can't acquire lockfile - exiting FILE_LOCK::unlock(): close failed.: No error Can't acquire lockfile - exiting FILE_LOCK::unlock(): close failed.: No error Can't acquire lockfile - exiting It has happened three times on an XP64 box: quad-core Phenom, 4 GB RAM, 18 GB free disk space, VM large enough. MW binary: astronomy_1.22_windows_x86_64.exe. BOINC client: 6.3.14 for Windows XP64. The box stays up around the clock. Any clue about this error message and issue? By the way, on this same box, in the same context, your binary now often remains stuck in the process list. I have to stop the BOINC service to tell which live processes are children of the BOINC daemon. The boinc_project processes that then remain in the list are the stuck MW processes. They consume no CPU time, only a small amount of RAM. Still, if I don't restart the BOINC service, after one week of stuck MW processes one of the cores stops computing for ALL other projects. Not acceptable... I don't blame MW (not yet!), I am just reporting a fact. Maybe the problem is related to this new version of the BOINC client on a multi-core box. I'm not sure I ever observed this kind of event with MW and the older BOINC client, 5.10.45. My BOINC preferences state that projects should NOT "leave applications in memory while suspended". I hope you are following these rules... Regards. |
Joined: 28 Aug 07 Posts: 16 Credit: 70,797,368 RAC: 0 |
6.3.14 was an older alpha version. 6.3.21 is the current BOINC alpha version... Maybe see whether things are better with the new version... |
Joined: 11 Oct 07 Posts: 8 Credit: 10,243,352 RAC: 0 |
"6.3.14 was an older alpha version. 6.3.21 is the current BOINC alpha version..." Hello everybody, best wishes to all BOINC people!!! I am coming back to this post after a while. I wanted to spend some time on this, just to check what happens with the new client versions (I'm currently using 6.5.0) and the new MW versions (currently 0.7 optimized x86_64). By the way, thanks for providing us with a true 64-bit client! Anyway, I still have these "ghost processes" that stay in the process list. They are clearly identified as boinc_project processes, and they are indeed children of the boinc_master process. They use just a few kilobytes of memory, but they don't run at all (0% activity). To purge the ghost processes, I am forced to stop and restart my BOINC service. When I do so, all ghost processes are detached from their parent boinc_master (even though they are still identified as boinc_project). You then have to kill them one by one in the task manager. (I'm using BoincView and SysInternals Process Explorer to monitor this.) It seems that when an MW process ends in an error, it is not able to die gracefully! The BOINC task list does report a "compute error", but the process stays in the Windows process list. MW is not the only project that generates ghost processes, so I won't point to a single cause! Maybe the new BOINC scheduler (since 6.2.x) has something to do with this on multi-core machines. Could you check whether some of your error exits fail to terminate the process? Regards |
Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0 |
I have only seen the lock file problem with Rosetta, and a note on the Einstein boards suggested that it happens when using less than 100% of the CPU; in other words, the option to lower CPU usage below 100% is buggy. I have not gotten back to running Rosetta on the computer that was so good at demonstrating the problem, but it is something you may want to look at. Please post back if it has no effect, or if you are already using the 100% runtime option ... I am personally curious because I lost about 8 tasks to this, many of them deep into processing ... so about two days' worth of run time (~48-50 hours) |
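
For readers unfamiliar with lock files, the "Can't acquire lockfile" message in the first post can be illustrated with a minimal Python sketch of the general pattern: a task takes an exclusive lock before running, and a second task fails to get it while the first holder is still around. This is only an assumption-laden illustration, not BOINC's `FILE_LOCK` code; the `LockFile` class, the `demo` function and the `lockfile.demo` file name are hypothetical, and the thread does not confirm that the stuck processes are what holds the lock.

```python
import os
import tempfile


class LockFile:
    """Illustrative exclusive lock via atomic file creation (not BOINC's FILE_LOCK)."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def acquire(self):
        """Return True if the lock was taken, False if another holder already has it."""
        try:
            # O_EXCL makes the creation fail if the file already exists.
            self.fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        return True

    def release(self):
        """Close and remove the lock file if this object holds it."""
        if self.fd is not None:
            os.close(self.fd)
            os.remove(self.path)
            self.fd = None


def demo():
    """Simulate a stuck task holding the lock while a restarted task tries to take it."""
    with tempfile.TemporaryDirectory() as slot_dir:
        lock_path = os.path.join(slot_dir, "lockfile.demo")  # hypothetical file name
        stuck_task = LockFile(lock_path)
        new_task = LockFile(lock_path)

        results = []
        results.append(stuck_task.acquire())  # first holder succeeds
        results.append(new_task.acquire())    # fails while the holder still exists
        stuck_task.release()                  # the holder finally goes away
        results.append(new_task.acquire())    # now the lock can be taken
        new_task.release()
        return results


if __name__ == "__main__":
    print(demo())
```

If a leftover process never releases its lock, every restart of the task in the same directory would keep failing in the same way as the second `acquire()` call here, which would be consistent with the repeated "exiting" lines in the log, though that link remains a guess.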