All Milkyway@Home 1.02 tasks ending in computation error on HD6950.
Author | Message |
---|---|
Joined: 30 Oct 10 Posts: 6 Credit: 10,437,789 RAC: 0 |
Hi all, I've restarted MW on my ATI card, which I hadn't used for a very long time. It used to work fine, but now I'm getting errors on every MilkyWay@Home v1.02 (opencl_amd_ati) task. The WUs seem to run fine (GPU usage 100%), but when they reach 100% progress, they error out. The error always seems to be the same: <core_client_version>7.0.64</core_client_version> Using AMD IL kernel These tasks are computed on this host: http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=250340 ... Note that Milkyway@Home Separation (Modified Fit) v1.28 (opencl_amd_ati) tasks are doing fine on the same system, as are other BOINC apps. The host is running Windows 7 64-bit with Catalyst 13.4 (I noticed that 13.9 wouldn't update the OpenCL driver, so I didn't install it). Any advice? |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Hi all, The PC of yours with the 69?? GPU in it is working just fine, though when I click on your link it says PC not found. When I click on your name and then view your computers, I only see one PC. |
Joined: 22 Aug 10 Posts: 32 Credit: 86,014,800 RAC: 0 |
I am also having a problem with every MilkyWay@Home WU failing on my machine, though the Separation (Modified Fit) runs complete successfully. I have both a 6950 and a 7950 in my rig. |
Joined: 30 Oct 10 Posts: 6 Credit: 10,437,789 RAC: 0 |
mikey> yes, I left the last 0 outside the URL tags ... but it's the only one I have on the project with a DP-capable GPU, so it's not hard to find it in my profile :D I unchecked the standard Milkyway@Home application in my profile to avoid the failing application, but I'd rather figure out what's going wrong with my setup; it would be more useful to the project ... |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
I am also having a problem with every MilkyWay@Home WU failing on my machine, though the Separation (Modified Fit) runs complete successfully. I have both a 6950 and a 7950 in my rig. Your PCs are hidden so I can't see them. What version of the Catalyst software are you using? Are you overclocking? |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
mikey> yes, I left the last 0 outside the URL tags ... but it's the only one I have on the project with a DP-capable GPU, so it's not hard to find it in my profile :D That worked for me, thanks! The only thing that looks funky to me is the lack of system RAM in your i7 8-CPU machine: 4GB of RAM is just shabby these days, and 8GB or even 16GB is the norm now. One other thing could be that you aren't leaving a CPU core free just for the GPU to use. GPUs can be sensitive to not being fed new info when they want it, and if the CPU is busy crunching it can delay things, causing problems. |
Joined: 22 Aug 10 Posts: 32 Credit: 86,014,800 RAC: 0 |
Didn't realize I had my PCs hidden. Fixed now. I only run MW on the AMD/ATI machine. I'm using Catalyst 13.4. I do have my cards overclocked. They never error out on Separation runs, only on the MilkyWay 1.02 WUs. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Didn't realize I had my PCs hidden. Fixed now. I only run MW on the AMD/ATI machine. I'm using Catalyst 13.4. I do have my cards overclocked. They never error out on Separation runs, only on the MilkyWay 1.02 WUs. The overclocking is probably the reason then. Some units are much more sensitive to ever-so-slight differences as they go through their crunching and can error out in a heartbeat. Don't worry about it; others will pick those up and do them. Just keep your eye on the News section, and when they release new units, allow them again. If they work, great; if not, that's okay too. |
Joined: 30 Oct 10 Posts: 6 Credit: 10,437,789 RAC: 0 |
I'm pretty sure that the issue is not caused by the overclocking or the card itself. If I remember correctly, error code 0xc0000005 on Windows means that a memory access violation occurred ... but this is a Windows error, not something coming from the GPU. |
Joined: 22 Aug 10 Posts: 32 Credit: 86,014,800 RAC: 0 |
Some MW 1.02 WUs came through today even though I have only Separation runs checked in my preferences. Of these, roughly half seemed to process alright. The others resulted in errors as usual. Still, this is the first time in quite a while I've seen ANY of those WUs successful. No changes to my setup. The MilkyWay@Home site shows my machine as having two HD 7900 series cards when in reality it's one HD 6950 and one HD 7950. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
The MilkyWay@Home site shows my machine as having two HD 7900 series cards when in reality it's one HD 6950 and one HD 7950. This happens for me too. I have a PC with two GPUs in it, and the site lists both as the higher card when in fact only one of them is. |
Joined: 20 Nov 07 Posts: 2 Credit: 15,863,875 RAC: 0 |
Hi, I am getting the same errors on about 95% of the Milkyway jobs. I have had this issue since the last 3 official versions of BOINC, and only on the GPU runs. My PC has Windows 8.1 Pro 64-bit (it did this on Win 7 and regular Win 8 too). I have 3 graphics cards (4 if you count the Intel GPU that some apps can use), 2x ATI and 1 NVIDIA, and an i7-3770 CPU with 32GB RAM. http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=357704 Hope you can see the host. It seems to affect only my PCs with two or more graphics cards. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Hi, Are you using a cc_config.xml file kind of like this one: <cc_config> <options> <use_all_gpus>1</use_all_gpus> <skip_cpu_benchmarks>1</skip_cpu_benchmarks> </options> </cc_config> Essentially it tells BOINC to use all the GPUs it finds, not just one for everything. I am not SURE that is your problem though, since some units are working and some aren't. What you might end up doing is much more complicated, but that doesn't seem to be a problem for you, as multiple kinds of GPUs seem to be a piece of cake for you! One of the more complicated ways would be to use another cc_config.xml file like this one: <cc_config> <options> <use_all_gpus>1</use_all_gpus> <exclude_gpu> <url>http://boinc.fzk.de/poem/</url> <device_num>1</device_num> [<type>NVIDIA|ATI|intel_gpu</type>] </exclude_gpu> </options> </cc_config> or simply add <type>ATI</type> inside <exclude_gpu> without <device_num> specified. Then simply exclude the Nvidia or AMD GPU that is causing the problems here, attach to a second project, and exclude the cards that work here from that project. One other, much more complicated way would be to install BOINC TWICE on the PC, in two different locations, and exclude the Nvidia GPU in one installation and the AMD GPUs in the other installation. |
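Laid out as a file, the type-based variant described in the post above might look like the following. This is only a sketch: it reuses the Poem URL from the post purely as an illustration, and it assumes that an `<exclude_gpu>` entry with a `<type>` and no `<device_num>` excludes every GPU of that vendor from the named project. The URL should match the project URL exactly as the client lists it, and restarting the client is probably the safest way to make sure the change is picked up.

```xml
<cc_config>
  <options>
    <!-- Let BOINC use every GPU it detects, not only the most capable one -->
    <use_all_gpus>1</use_all_gpus>
    <!-- Keep all ATI/AMD GPUs away from this one project.
         No <device_num> is given, so the exclusion is by type only. -->
    <exclude_gpu>
      <url>http://boinc.fzk.de/poem/</url>
      <type>ATI</type>
    </exclude_gpu>
  </options>
</cc_config>
```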
Joined: 20 Nov 07 Posts: 2 Credit: 15,863,875 RAC: 0 |
Thanks for the reply. The GPUs all work by themselves, but when they are combined in one setup the issue comes up, and only for MW. My other apps run fine with this config and use all the GPUs like it's Xmas: <cc_config> <options> <http_1_0>1</http_1_0> <use_all_gpus>1</use_all_gpus> </options> </cc_config> This is mine, straightforward. SETI, Collatz, Einstein and a few others are running perfectly with all the resources at their disposal. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Thanks for the reply. The GPUs all work by themselves, but when they are combined in one setup the issue comes up, and only for MW. I don't know then; you are way beyond what I have ever done. As part of the troubleshooting, though, you might start excluding GPUs until you figure out which one is giving you the trouble, then see if you can fix it or just use it elsewhere. Right now it seems you don't know which one is the problem, right? |
Joined: 30 Oct 10 Posts: 6 Credit: 10,437,789 RAC: 0 |
I'm pretty sure that something is wrong with the MW application, but it looks like no project developer is watching the forums ... that's a shame for such a big project :( |
Joined: 4 Oct 11 Posts: 38 Credit: 309,729,457 RAC: 0 |
I PM'd ascension that all his failures are on device 0 on platform 0, which is a Turks; device 1 on platform 0 is a Cape Verde. According to Wikipedia, Turks has no double precision, while Cape Verde does. So I wonder how his Turks worked when it was by itself? Unless BOINC or Einstein is reporting an incorrect device? Example of a failing task: Found 2 CL devices Device 'Turks' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Driver version: 1268.1 (VM) Version: OpenCL 1.2 AMD-APP (1268.1) Compute capability: 0.0 Max compute units: 6 Clock frequency: 800 Mhz Global mem size: 1073741824 Local mem size: 32768 Max const buf size: 65536 Double extension: (none) Device doesn't support double precision Failed to calculate likelihood Is there any way to just disable device 0 on platform 0, leaving device 1 to process MilkyWay? |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
I PM'd ascension that all his failures are on device 0 on platform 0, which is a You are correct, the 66?? series does NOT have DP, good catch!! As for not using it here, yes, he can use the exclude_gpu line like this: <cc_config> <options> <use_all_gpus>1</use_all_gpus> <exclude_gpu> <url>http://milkyway.cs.rpi.edu/milkyway/</url> <device_num>0</device_num> </exclude_gpu> </options> </cc_config> If he replaces his current cc_config.xml file with the one above, it should work just fine and exclude GPU zero from MilkyWay. To use GPU zero on another project such as Poem instead, he will have to add lines such as: <exclude_gpu> <url>http://boinc.fzk.de/poem/</url> <device_num>1</device_num> </exclude_gpu> <exclude_gpu> <url>http://boinc.fzk.de/poem/</url> <device_num>2</device_num> </exclude_gpu> Adding the above lines would exclude GPUs 1 and 2 from Poem. The homepage of every BOINC project gives a website address to use when the project isn't on BOINC's list of projects; use that address to replace the one above if Poem is not your project of choice. |
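Put together in one file, the two sets of exclusions from the post above could look roughly like this. It is a sketch under a few assumptions rather than a tested configuration: the device numbers are taken to be BOINC's own GPU numbering (which may not coincide with the OpenCL device index printed in the task output), and the `<type>ATI</type>` lines are an addition not present in the post, included on the understanding that a host mixing AMD and NVIDIA cards needs the type stated so that the same device number on the other vendor's card is not excluded as well.

```xml
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>

    <!-- MilkyWay: skip ATI device 0 (the card without double precision) -->
    <exclude_gpu>
      <url>http://milkyway.cs.rpi.edu/milkyway/</url>
      <device_num>0</device_num>
      <type>ATI</type>
    </exclude_gpu>

    <!-- Poem (example second project): skip ATI devices 1 and 2,
         so that only device 0 is left for it -->
    <exclude_gpu>
      <url>http://boinc.fzk.de/poem/</url>
      <device_num>1</device_num>
      <type>ATI</type>
    </exclude_gpu>
    <exclude_gpu>
      <url>http://boinc.fzk.de/poem/</url>
      <device_num>2</device_num>
      <type>ATI</type>
    </exclude_gpu>
  </options>
</cc_config>
```

After a client restart, the startup messages in the event log should show which devices are excluded for which project, which is a quick way to confirm that the numbering matched the intended cards.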
Joined: 7 Dec 10 Posts: 1 Credit: 72,892,302 RAC: 0 |
Hi. I've also been getting computation errors from all of the 'de_separation' units I've been running, for a couple weeks now. I'm running an ATI 6950 w/ Cat 13.9 in my system. Should I exclude Milkyway from using it? The 'modified fit' units run without errors. Thanks. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Hi. I have a similar PC and am also running the 'modified fit' units, but on my CPU, not my GPU, and they are working just fine. If it were me, yes, I would exclude the GPU from MilkyWay and then sign on to another project and use it there. I see you also run Seti. I think they have GPU units, so you could run CPU units here and the GPU units from there, contributing to two projects at once. There are several other GPU projects out there that would love to have your GPU help them out. In no particular order there are Collatz, Moo, Prime Grid, DistrRTgen, Poem, GpuGrid, World Community Grid (although they do not always have a GPU project running), Einstein, and I am sure I am forgetting some others too. |