Ghost processes in the task manager
Author | Message |
---|---|
Joined: 11 Oct 07 Posts: 8 Credit: 10,243,352 RAC: 0 |
Hello everybody, best wishes to all BOINC people! I mentioned this in another post, but its title could be confusing, so it's better to start a new thread. I spent some time on this before reporting, to check what happens with the new client versions (I'm currently using 6.5.0 for Windows x86_64) and the new MW version (0.7 optimized x86_64). By the way, thanks for providing a true 64-bit client! I have "ghost processes" that stay in the process list. They are clearly identified as boinc_project processes, and they are children of the boinc_master process. They use only a few kilobytes of RAM, but they don't run at all (0% activity). To purge the ghost processes, I have to stop and restart the BOINC service. When I do, all the ghost processes are detached from their boinc_master parent, and then they have to be killed one by one in the Task Manager. (I'm using BoincView and Sysinternals Process Explorer to monitor this; the process parenting is shown graphically.) It seems that when an MW process ends in a compute error, it is not (always?) able to die gracefully. The BOINC task list correctly shows a "compute error", but the process stays in the Windows process list. I don't know whether BOINC's scheduler has an automatic mechanism to detect dead processes. If it does, it doesn't work well! MW is not the only project that generates ghost processes, so I can't pin this on a single cause. Maybe the new BOINC scheduler (since 6.2.x for me) has something to do with this on multi-core machines. I never noticed anything like this with the very stable 5.10.45. Could you check that, in ALL cases, when MW exits with an error it actually terminates the process? Regards |
Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0 |
Hmmm... Based on what you're seeing, you'd probably be better off to complain on the BOINC Core Client Message Board, since this seems to be a CC issue and not one with MW per se. At least you'd be stating the issue closer to the problem anyway. ;-) Alinator |
Joined: 11 Oct 07 Posts: 8 Credit: 10,243,352 RAC: 0 |
"Based on what you're seeing, you'd probably be better off to complain on the BOINC Core Client Message Board, since this seems to be a CC issue and not one with MW per se." Thanks for the reply. I will post this message on the BOINC forum. |
Joined: 11 Oct 07 Posts: 8 Credit: 10,243,352 RAC: 0 |
As suggested, I posted on the BOINC Core Client message board; no answer yet. As I mentioned in the preamble, I observed the phenomenon for some time before reporting: whatever the BOINC client version from 6.4.x onward, of the 30 projects I participate in, only 4 cause dead processes: Milkyway, Simap, Spinhenge and Aqua. For example, when (rarely) an Einstein or QMC process ends in a compute error, it doesn't stay stuck as described. I understand that these dead processes are only a problem for those of us whose hosts participate 24 hours a day. Obviously, when we switch off our computers, there are no more dead processes! But the point is that we should be able to leave a host crunching 24 hours a day without carefully monitoring BOINC and its projects. |
Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0 |
FWIW, the stock SAH MB application will do the same thing too, and the problem has been around to a greater or lesser degree in one form or another for quite a while now. It seems to be most common on multi-core machines, and typically happens when two or more tasks try to exit at or close to the same time. There have been a few reports on single-core machines too, and I suspect it happens on them when the preferences leave the tasks in memory when suspended and there is a lost CC heartbeat event which causes all the running tasks to do a 'forced' exit (which is by design) at close to the same time. Alinator |
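
The manual cleanup described in the first post (after a service restart, finding idle boinc_project processes that are no longer attached to a boinc_master parent) can be sketched as a filter over a process snapshot. This is only an illustration under stated assumptions: `ProcInfo`, `find_detached_ghosts`, and the sample data are hypothetical names and values, not part of BOINC, and on a real host the snapshot would have to come from a tool such as Process Explorer or a process-listing library.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ProcInfo:
    """One row of a process snapshot (hypothetical structure)."""
    pid: int
    ppid: int           # parent process ID
    user: str           # account the process runs under
    cpu_percent: float  # recent CPU usage


def find_detached_ghosts(
    procs: Iterable[ProcInfo],
    app_user: str = "boinc_project",    # assumed account for science apps
    client_user: str = "boinc_master",  # assumed account for the client
) -> List[ProcInfo]:
    """Return idle app processes whose parent is not a live client process."""
    procs = list(procs)
    client_pids = {p.pid for p in procs if p.user == client_user}
    return [
        p for p in procs
        if p.user == app_user
        and p.cpu_percent == 0.0
        and p.ppid not in client_pids
    ]


# Made-up snapshot: PID 400 has lost its parent and is idle.
snapshot = [
    ProcInfo(pid=100, ppid=4, user="boinc_master", cpu_percent=0.1),
    ProcInfo(pid=200, ppid=100, user="boinc_project", cpu_percent=98.0),
    ProcInfo(pid=300, ppid=100, user="boinc_project", cpu_percent=0.0),
    ProcInfo(pid=400, ppid=77, user="boinc_project", cpu_percent=0.0),
    ProcInfo(pid=500, ppid=4, user="SYSTEM", cpu_percent=0.0),
]

for ghost in find_detached_ghosts(snapshot):
    print(f"candidate ghost: pid={ghost.pid} (parent {ghost.ppid} is not a client)")
```

The sketch deliberately does not flag PID 300, which is idle but still attached to the client: from the snapshot alone it cannot be told apart from a suspended task left in memory, so only detached processes are reported as candidates.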