Client rejecting WUs for no apparent reason
Author | Message |
---|---|
Joined: 17 Nov 08 Posts: 18 Credit: 130,650,263 RAC: 0 |
I have recently noticed some errors on my Linux clients that only happen occasionally. The log below shows that the problem comes and goes.

1 6/23/2011 2:10:22 PM Starting BOINC client version 6.6.41 for i686-pc-linux-gnu
182 Milkyway@home 6/23/2011 3:17:07 PM Sending scheduler request: To fetch work.
183 Milkyway@home 6/23/2011 3:17:07 PM Requesting new tasks
184 Milkyway@home 6/23/2011 3:17:12 PM Scheduler request completed: got 1 new tasks
185 Milkyway@home 6/23/2011 3:17:12 PM [error] No application found for task: i686-pc-linux-gnu 0 mt; discarding
186 Milkyway@home 6/23/2011 3:18:17 PM Sending scheduler request: To fetch work.
187 Milkyway@home 6/23/2011 3:18:17 PM Requesting new tasks
188 Milkyway@home 6/23/2011 3:18:22 PM Scheduler request completed: got 1 new tasks
189 Milkyway@home 6/23/2011 3:18:22 PM [error] No application found for task: i686-pc-linux-gnu 0 mt; discarding
190 Milkyway@home 6/23/2011 3:19:27 PM Sending scheduler request: To fetch work.
191 Milkyway@home 6/23/2011 3:19:27 PM Requesting new tasks
192 Milkyway@home 6/23/2011 3:19:32 PM Scheduler request completed: got 1 new tasks
193 Milkyway@home 6/23/2011 3:19:32 PM [error] No application found for task: i686-pc-linux-gnu 0 mt; discarding
194 Milkyway@home 6/23/2011 3:23:41 PM Computation for task de_nbody_orphan_test_2model_4_66751_1308833687_2 finished
195 Milkyway@home 6/23/2011 3:23:41 PM Starting de_nbody_orphan_test_2model_4_70890_1308841050_0
196 Milkyway@home 6/23/2011 3:23:41 PM Starting task de_nbody_orphan_test_2model_4_70890_1308841050_0 using milkyway_nbody version 62
197 Milkyway@home 6/23/2011 3:23:42 PM Sending scheduler request: To report completed tasks.
198 Milkyway@home 6/23/2011 3:23:42 PM Reporting 1 completed tasks, requesting new tasks
199 Milkyway@home 6/23/2011 3:23:47 PM Scheduler request completed: got 2 new tasks
200 Milkyway@home 6/23/2011 3:24:44 PM Computation for task de_nbody_orphan_test_2model_4_70644_1308840468_0 finished
201 Milkyway@home 6/23/2011 3:24:44 PM Starting de_nbody_orphan_test_2model_4_71045_1308841050_0
202 Milkyway@home 6/23/2011 3:24:44 PM Starting task de_nbody_orphan_test_2model_4_71045_1308841050_0 using milkyway_nbody version 62
203 Milkyway@home 6/23/2011 3:24:52 PM Sending scheduler request: To report completed tasks.
204 Milkyway@home 6/23/2011 3:24:52 PM Reporting 1 completed tasks, requesting new tasks
205 Milkyway@home 6/23/2011 3:24:57 PM Scheduler request completed: got 1 new tasks

In a 7-minute period, 3 WUs were dropped and 3 WUs were accepted. Is the problem with the client, my app_info, or the WUs themselves? I know the client is old, but it has never given me problems before, and I don't have the ability to update all of the libraries on my three systems to those required for newer clients. Any ideas? |
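For anyone who wants to check the "3 dropped, 3 accepted" count against a longer log, here is a rough Python sketch that tallies received and discarded tasks. It is only an illustration: the two regular expressions are written against the message wording seen in the log above, the function name `tally` is made up, and it assumes each "discarding" line corresponds to exactly one task.

```python
import re

# Scheduler replies and discard errors copied from the log above.
SAMPLE_LOG = """\
184 Milkyway@home 6/23/2011 3:17:12 PM Scheduler request completed: got 1 new tasks
185 Milkyway@home 6/23/2011 3:17:12 PM [error] No application found for task: i686-pc-linux-gnu 0 mt; discarding
188 Milkyway@home 6/23/2011 3:18:22 PM Scheduler request completed: got 1 new tasks
189 Milkyway@home 6/23/2011 3:18:22 PM [error] No application found for task: i686-pc-linux-gnu 0 mt; discarding
192 Milkyway@home 6/23/2011 3:19:32 PM Scheduler request completed: got 1 new tasks
193 Milkyway@home 6/23/2011 3:19:32 PM [error] No application found for task: i686-pc-linux-gnu 0 mt; discarding
199 Milkyway@home 6/23/2011 3:23:47 PM Scheduler request completed: got 2 new tasks
205 Milkyway@home 6/23/2011 3:24:57 PM Scheduler request completed: got 1 new tasks
"""

# Patterns match the wording in this thread's log; other client versions may differ.
GOT_RE = re.compile(r"Scheduler request completed: got (\d+) new tasks")
DISCARD_RE = re.compile(r"\[error\] No application found for task: .*; discarding")


def tally(log_text):
    """Return (tasks received, tasks discarded) for a block of client log text."""
    received = 0
    discarded = 0
    for line in log_text.splitlines():
        match = GOT_RE.search(line)
        if match:
            received += int(match.group(1))
        elif DISCARD_RE.search(line):
            # Assumption: one discard message per rejected task.
            discarded += 1
    return received, discarded


received, discarded = tally(SAMPLE_LOG)
print(f"received={received} discarded={discarded} kept={received - discarded}")
# prints: received=6 discarded=3 kept=3
```

Running the same function over a full day's log would show whether the discards cluster at particular times or are spread evenly.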
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"I have recently noticed some errors on my Linux clients that only happen occasionally. The log below shows that the problem comes and goes."

I think this might happen if your app_info exists but the actual application file is missing. If anything is wrong with your app_info, BOINC decides to delete the files for some stupid reason, and you end up with something like this. |
Joined: 17 Nov 08 Posts: 18 Credit: 130,650,263 RAC: 0 |
I thought the same, but the app_info is working most of the time. If you can see a problem, please let me know.

<app_info>
  <app><!-- std app for N-Body 0.62 mt 64bit -->
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
  </app>
  <file_info>
    <name>milkyway_nbody_0.62_i686-pc-linux-gnu__mt</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>62</version_num>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
      <file_name>milkyway_nbody_0.62_i686-pc-linux-gnu__mt</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info> |
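One quick way to rule out a simple mismatch inside the file is to cross-check the names it declares. The Python sketch below is only an illustration, not a BOINC tool: `check_app_info` is a made-up helper, and it tests just two things, namely that each `<app_version>` names a declared `<app>` and that each `<file_ref>` names a declared `<file_info>`. It says nothing about how the client itself validates the file.

```python
import xml.etree.ElementTree as ET

# The app_info posted above, reproduced as a string for the example.
APP_INFO = """<app_info>
  <app><!-- std app for N-Body 0.62 mt 64bit -->
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
  </app>
  <file_info>
    <name>milkyway_nbody_0.62_i686-pc-linux-gnu__mt</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>62</version_num>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
      <file_name>milkyway_nbody_0.62_i686-pc-linux-gnu__mt</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>"""


def check_app_info(xml_text):
    """Return a list of name mismatches found in an app_info document."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    problems = []
    declared_apps = {app.findtext("name") for app in root.findall("app")}
    declared_files = {f.findtext("name") for f in root.findall("file_info")}
    for version in root.findall("app_version"):
        app_name = version.findtext("app_name")
        if app_name not in declared_apps:
            problems.append(f"app_version refers to undeclared app: {app_name}")
        for ref in version.findall("file_ref"):
            file_name = ref.findtext("file_name")
            if file_name not in declared_files:
                problems.append(f"file_ref refers to undeclared file: {file_name}")
    return problems


print(check_app_info(APP_INFO) or "no name mismatches found")
```

The posted file passes both checks, which fits the observation that it works most of the time, so the cause is probably somewhere other than a name mismatch.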
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
Try it without the planclass item. |
Joined: 17 Nov 08 Posts: 18 Credit: 130,650,263 RAC: 0 |
Removing the planclass item didn't solve it. I am still getting the following error for some WUs and not others:

Milkyway@home 6/27/2011 2:48:19 PM [error] No application found for task: i686-pc-linux-gnu 0 ; discarding

If you have any other ideas, please let me know. I don't want to keep rejecting WUs, because I know that puts strain on the Milkyway database. My only other option would be to move my Linux boxes to other projects. |
Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
I am not a Linux user, so I could well be wrong. Does Linux assume an .exe extension, or do you need to state it explicitly? It's not shown in either of the two references to the executable file inside your app_info. If Linux does not assume .exe, then both references inside the app_info need .exe added. Regards Zy |
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
Linux doesn't use an .exe extension; it relies on the file's executable bit, which is set at the OS level. Let's take it down to just the basics and remove the extra info.

<app_info>
  <app>
    <name>milkyway_nbody</name>
  </app>
  <file_info>
    <name>milkyway_nbody_0.62_i686-pc-linux-gnu__mt</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>62</version_num>
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
      <file_name>milkyway_nbody_0.62_i686-pc-linux-gnu__mt</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info> |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
Check that the binary wasn't actually deleted, and make sure the permissions and owner are right. It should be read/execute and owned by the boinc user. |
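Those three checks (file present, read/execute permission, owner) can be scripted so they are easy to repeat on each box. The Python sketch below rests on a few assumptions: `check_binary` and `owner_name` are made-up helper names, the account is taken to be called `boinc` (the actual name may differ by distribution), "read/execute" is interpreted as the owner's read and execute bits, and the script is run from the project directory that holds the binary.

```python
import os
import pwd
import stat


def owner_name(uid):
    """Map a numeric uid to a user name, falling back to the number itself."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def check_binary(path, expected_user="boinc"):
    """Return a list of problems with the application binary at `path`."""
    if not os.path.isfile(path):
        return [f"{path}: file is missing"]
    problems = []
    info = os.stat(path)
    # Assumption: "read/execute" means the owner's read and execute bits.
    if not info.st_mode & stat.S_IRUSR:
        problems.append(f"{path}: owner cannot read the file")
    if not info.st_mode & stat.S_IXUSR:
        problems.append(f"{path}: owner cannot execute the file")
    owner = owner_name(info.st_uid)
    if owner != expected_user:
        problems.append(f"{path}: owned by {owner}, expected {expected_user}")
    return problems


if __name__ == "__main__":
    # File name taken from the app_info above; looked up in the current directory.
    for problem in check_binary("milkyway_nbody_0.62_i686-pc-linux-gnu__mt"):
        print(problem)
```

Since the errors are intermittent, it may be worth running this right after a discard shows up in the log, to see whether the binary is still present at that moment.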
©2024 Astroinformatics Group