"Computation error" - only on ati gpu WUs
Author | Message |
---|---|
Send message Joined: 21 Mar 09 Posts: 4 Credit: 203,656 RAC: 0 |
I think there's something wrong with my set-up, because I have (so far) been unable to complete any GPU WUs - they error after a few seconds, every time. BOINC appears to detect the GPUs ok, this shows up after it starts: Tue 09 Aug 2011 17:35:41 BST ATI GPU 0: ATI unknown (CAL version 1.4.1457, 2048MB, 3379 GFLOPS peak) Tue 09 Aug 2011 17:35:41 BST ATI GPU 1: ATI unknown (CAL version 1.4.1457, 2048MB, 3379 GFLOPS peak) My drivers are up to date - 11.7. I also installed the 2.5 SDK but I don't know if it's actually being used... Is there anything I can do to fix this? (OS is Linux - 64 bit) |
Send message Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
I wonder if the scheduler sent you the 32-bit binaries instead of the 64-bit version; that would be one cause of the error you are seeing. If you are comfortable with changing the application, you might try the 64-bit version with an app_info. http://www.arkayn.us/forum/index.php?action=downloads;sa=view;down=23 |
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
Are you using app_info? Could you try running the file command on the actual binary? It should be somewhere like /var/lib/boinc/projects/milkyway.cs.rpi.edu_milkyway/milkyway_separation_0.82_x86_64-pc-linux-gnu__ati14 |
Send message Joined: 21 Mar 09 Posts: 4 Credit: 203,656 RAC: 0 |
"I wonder if the scheduler sent you the 32-bit binaries instead of the 64-bit version; that would be one cause of the error you are seeing." I just tried this - the computation error happened like before. "Are you using app_info? Could you try running the file command on the actual binary? It should be somewhere like /var/lib/boinc/projects/milkyway.cs.rpi.edu_milkyway/milkyway_separation_0.82_x86_64-pc-linux-gnu__ati14" I'm not really sure what to do there. I have the original executable file that was in the milkyway folder, but I don't know how to run it on a work unit. |
Send message Joined: 21 Mar 09 Posts: 4 Credit: 203,656 RAC: 0 |
If it helps, this is the error I'm getting: <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> process exited with code 22 (0x16, -234) </message> <stderr_txt> execv: No such file or directory </stderr_txt> ]]> |
Send message Joined: 28 Feb 10 Posts: 120 Credit: 109,840,492 RAC: 0 |
I don't use Linux so I can't say much, but have you thought about user permissions on BOINC's working directory? |
Send message Joined: 21 Mar 09 Posts: 4 Credit: 203,656 RAC: 0 |
"I don't use Linux so I can't say much, but have you thought about user permissions on BOINC's working directory?" I don't think that's a problem. I've created a separate user called "boinc" and everything runs in the home folder /home/boinc (which is owned by that user). |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
Maybe this thread will help? |
Send message Joined: 28 Feb 10 Posts: 120 Credit: 109,840,492 RAC: 0 |
"I don't use Linux so I can't say much, but have you thought about user permissions on BOINC's working directory?" And does this user have execute permission (r-x) on this folder? |
Send message Joined: 30 Nov 09 Posts: 4 Credit: 287,953 RAC: 0 |
I have the same problem... if I try to run $ cd BOINC/projects/milkyway.cs.rpi.edu_milkyway/ $ ./milkyway_separation_0.82_x86_64-pc-linux-gnu__ati14 I get: bash: ./milkyway_separation_0.82_x86_64-pc-linux-gnu__ati14: /lib/ld-linux-x86-64.so.2: bad ELF interpreter: No such file or directory. The problem is that ld-linux-x86-64.so.2 is in /lib64/ and not in /lib/. I resolved it with ln -s /lib64/ld-linux-x86-64.so.2 /lib/ld-linux-x86-64.so.2 Now the GPU app is working :-) |
Send message Joined: 5 Nov 10 Posts: 69 Credit: 15,064,831 RAC: 0 |
Same error where tasks fail with a computation error after 2 seconds in XP Pro (32-bit) & HD3850. Just started this project running GPU tasks on my new(!) AGP card with CCC 10.5. Collatz & DNETC working 100% OK ??? Have set Milkyway to not request new tasks on that PC until this is fixed. |
Send message Joined: 24 Feb 09 Posts: 620 Credit: 100,587,625 RAC: 0 |
"Same error where tasks fail with a computation error after 2 seconds in XP Pro (32-bit) & HD3850." It's highly likely the driver: you are on 1.4.636. Try updating it and post again if it is still not working. Regards, Zy |
Send message Joined: 2 Nov 10 Posts: 731 Credit: 131,536,342 RAC: 0 |
"I think there's something wrong with my set-up, because I have (so far) been unable to complete any GPU WUs - they error after a few seconds, every time." I know nothing about Linux, but some users have problems with the latest drivers. Have you tried an older one? |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
"Same error where tasks fail with a computation error after 2 seconds in XP Pro (32-bit) & HD3850." Cat 10.5 is clearly too old to work with the ATI app. I know it was working with Cat 10.10, so you should start from there and see what the latest Cat version supporting your HD3850 (AGP) is. |
Send message Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
I'm using 11.3 with the AGP hotfix on my 3850 with no problems (XP Pro 32bit). |
Send message Joined: 5 Nov 10 Posts: 69 Credit: 15,064,831 RAC: 0 |
Just got back to this after some non-graphics-card hardware updates. OK, have just completed a task that validated OK. Run time: 462.31 sec (7.7 minutes); CPU time: 20.30 sec. WOW, that's fast! (For this rig, anyway.) CCC details: Driver Packaging Version 8.881-110728a-122947E-ATI; Catalyst™ Version 11.8; Provider AMD Technologies Inc.; 2D Driver Version 6.14.10.7213; Direct3D Version 6.14.10.0855; OpenGL Version 6.14.10.11005; Catalyst™ Control Center Version 2011.0728.1723.29300; AIW/VIVO WDM Driver Version 6.14.10.6238; AIW/VIVO WDM SP Driver Version 6.14.10.6238 |
Send message Joined: 5 Nov 10 Posts: 69 Credit: 15,064,831 RAC: 0 |
... PS. Yes this PC (Computer [ID] 231173) now shows in BOINC as having 'Coprocessors':- "CAL ATI Radeon HD 3800 (RV670) (512MB) driver: 1.4.1523" although I have no idea where it gets the driver reference 1.4.1523 ... that exact driver reference only appears in BOINC files on this PC. Can someone point me in the right direction so that I can cross-reference these utterly confusing GPU driver references? Please? (FWIW if I Google e.g., "1.4.1523" all I get is a string of links related to BOINC.) Alternatively, I respectfully suggest that maybe references to whatever BOINC calls the driver should be replaced or quantified by the correct ATI driver ref - in ATI's format. Otherwise the BOINC driver reference appears meaningless (with the greatest respect, if I'm simply missing something obvious here). Summary:- In (XP) Computer/Manage/Device Manager/Display adapters/ATI Radeon HD 3850 AGP the driver shows up as ... 8.881.0.0 In BOINC the driver shows as ... 1.4.1523 |
Send message Joined: 5 Nov 10 Posts: 69 Credit: 15,064,831 RAC: 0 |
OK, updated everything as detailed before. All was OK for a while and it processed a ton of WUs, e.g., WU 14197156. Then I updated a few BOINC preferences. Now all MW@H tasks error out after 4 seconds with the old "Computation Error" problem. HELP! FWIW this problem does not occur running Collatz GPU tasks on the same PC with the same preferences, e.g., WU 42570881, after the BOINC preferences update. |
Send message Joined: 5 Nov 10 Posts: 69 Credit: 15,064,831 RAC: 0 |
Well, this PC is now processing GPU tasks OK: ps_separation_13_3s_fix20_2_5993594_1 finished, ps_separation_13_3s_fix20_2_593217_1 78% complete, etc. Very strange. |
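
For anyone landing on this thread with the same Linux symptom ("execv: No such file or directory" even though the binary is there), the checks suggested above can be rolled into one small script. This is only a sketch, not anything shipped by BOINC or Milkyway@home: the function names elf_class, elf_interp and diagnose are made up for illustration, it assumes od from coreutils, and the loader check only works if readelf (binutils) is installed. It reports whether the binary is 32-bit or 64-bit, whether it has execute permission, and whether the dynamic loader it asks for actually exists at that path.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of BOINC or the Milkyway app.

# Print "32", "64" or "not-elf" for the file in $1.
# An ELF file starts with 0x7f 'E' 'L' 'F'; the 5th byte (EI_CLASS)
# is 1 for 32-bit objects and 2 for 64-bit objects.
elf_class() {
    header=$(od -An -tx1 -N5 "$1" 2>/dev/null | tr -d ' \n')
    case "$header" in
        7f454c4601) echo 32 ;;
        7f454c4602) echo 64 ;;
        *)          echo not-elf ;;
    esac
}

# Print the dynamic loader (ELF interpreter) requested by the binary.
# Prints nothing if readelf is not installed or the file has no interpreter.
elf_interp() {
    readelf -l "$1" 2>/dev/null |
        sed -n 's/.*Requesting program interpreter: \([^]]*\)].*/\1/p'
}

# Report architecture, execute permission and loader presence for $1.
diagnose() {
    bin=$1
    [ -e "$bin" ] || { echo "missing: $bin"; return 1; }
    echo "class: $(elf_class "$bin")"
    if [ -x "$bin" ]; then echo "executable: yes"; else echo "executable: no"; fi
    interp=$(elf_interp "$bin")
    if [ -n "$interp" ]; then
        if [ -e "$interp" ]; then
            echo "loader: $interp (found)"
        else
            echo "loader: $interp (MISSING)"
        fi
    fi
}

# Example call; the path is the one suggested earlier in the thread
# and may well be different on your system:
# diagnose /var/lib/boinc/projects/milkyway.cs.rpi.edu_milkyway/milkyway_separation_0.82_x86_64-pc-linux-gnu__ati14
```

If the loader line says MISSING, that matches the "bad ELF interpreter" case reported above, where the symlink from /lib64 into /lib was the fix. Run the script as the same user that runs BOINC so the permission check reflects what the client sees.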
©2024 Astroinformatics Group