6.10.1 Posted.
Author | Message |
---|---|
Joined: 29 Aug 07 Posts: 327 Credit: 116,463,193 RAC: 0 |
|
Joined: 26 Jun 09 Posts: 47 Credit: 276,827,695 RAC: 0 |
I just noticed that there is a ".2" version now posted for Mac and Linux, so maybe there is a fix in the works. Bryan |
Joined: 16 Jun 09 Posts: 85 Credit: 172,476 RAC: 0 |
I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows: "If a client is requesting only GPU jobs at a point where it has no CPU jobs, that's a client bug. Please ask them to set the <work_fetch_debug> flag in their cc_config.xml, and send me the resulting message log." If possible, attach a log of this happening so it can be resolved. |
Joined: 26 Jun 09 Posts: 47 Credit: 276,827,695 RAC: 0 |
"I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows..." In my case it was the opposite: I had CPU work but it wouldn't request new work for the empty GPU. The system had worked excellently for over 3 hours after installation, filling the MW GPU cache continuously. Then it quit requesting GPU work and I couldn't manually force it. Bryan |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
Holy crap...this might actually work! After installing it, then uninstalling it when it refused to ask for any MW work, I re-installed it and set all the debts to 0, then got frustrated that it still refused to ask for MW work. I let it be for 10 minutes, then the debts started rising and it asked for MW work....WOOOHOO! I'll now let it run and see what happens. It's a shame AQUA have stopped sending out new work, as I think with it running the scheduler may still behave incorrectly. My setup: XP; BOINC 6.10.1; Aqua 24.83%, MW 50.34%, Seti 24.83%; Connect every 0.05 days; Additional Work Buffer 0.75 days; <MW> avg_ncpus = 0.15, max_ncpus = 1.0, ATI 0.33, cmdline n3 f15 w0.8 </MW>; overall ncpus=5 (on my quad). Currently all but 4 Seti WUs are suspended and no Aqua WU is available. Might go back to ncpus=4 and take the suspend off the cached Seti WUs and see what happens a little later today. But so far so good! |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
Work scheduler is still FUBAR. CPU scheduling priority (aka short term debt) = 4,412. Down to 2 WUs running from a potential of 6 running at once (I have 2 cards installed) and I get: 29/08/2009 5:40:12 AM Milkyway@home Sending scheduler request: To fetch work. 29/08/2009 5:40:12 AM Milkyway@home Reporting 6 completed tasks, not requesting new tasks 29/08/2009 5:40:17 AM Milkyway@home Scheduler request completed: got 0 new tasks 29/08/2009 5:41:22 AM Milkyway@home Sending scheduler request: To fetch work. 29/08/2009 5:41:22 AM Milkyway@home Not reporting or requesting tasks 29/08/2009 5:41:27 AM Milkyway@home Scheduler request completed: got 0 new tasks. Then when the last 2 MW WUs have completed I get: 29/08/2009 5:42:32 AM Milkyway@home Sending scheduler request: To fetch work. 29/08/2009 5:42:32 AM Milkyway@home Reporting 2 completed tasks, not requesting new tasks 29/08/2009 5:42:37 AM Milkyway@home Scheduler request completed: got 0 new tasks. Absolutely FUBAR! A minute later: 29/08/2009 5:43:42 AM Milkyway@home Sending scheduler request: To fetch work. 29/08/2009 5:43:42 AM Milkyway@home Not reporting or requesting tasks 29/08/2009 5:43:47 AM Milkyway@home Scheduler request completed: got 0 new tasks. I checked the short term debt and it's been reset to 0. WTF! How can we utilise the later versions of BOINC that have ATI support when this sort of crap occurs? Back to 6.4.7 -> not perfect, but at least it will ask for MW work! [edit] And just as I hit the post button, I see: 29/08/2009 5:44:52 AM Milkyway@home Sending scheduler request: To fetch work. 29/08/2009 5:44:52 AM Milkyway@home Requesting new tasks 29/08/2009 5:45:02 AM Milkyway@home Scheduler request completed: got 24 new tasks. Really...WTF! Can someone explain this to me, because I can't comprehend it and I know JM7 is too defensive about it. [edit#2] Also, why aren't the ATI cards listed in the computer details? [edit#3] I've dropped the connect preference to 0.01 and put the work buffer up to 0.9. |
Joined: 28 Jan 09 Posts: 31 Credit: 85,934,108 RAC: 0 |
The drivers I used are: ATI_Catalyst_Windows7_8.612_no_CCC.exe. These have a green AMD logo on the .exe, not a red ATI Catalyst logo. Anything I downloaded with an ATI logo from the ATI site does not work. I got them via a lot of wandering on the AMD site and avoiding the ATI Catalyst site. The URL I used is http://support.amd.com/us/gpudownload/windows/9-4/Pages/radeonaiw_vista64.aspx?&lang=English Now go down until you see the 4th download button; it should say 'Driver only for Windows 7' and the version is 8.612.1. These work for me. |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
Back to 6.4.7 and SNAFU! 29/08/2009 6:04:45 AM|Milkyway@home|Sending scheduler request: To fetch work. Requesting 346561 seconds of work, reporting 0 completed tasks I can't wait until a real scheduler comes out! |
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
"I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows..." Well... I just had a look at the work fetch code and there's a bug in how it logs work requests from ATI cards: instead it shows the CUDA crap, which of course will be ZERO... so the work_fetch_debug log is useless. I've sent DA an email about it. Support science! Join Team BOINC United now! |
Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0 |
"I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows..." It is the same bug. Over two months ago now I sent them logs on this issue. I had the problem with GPU requests, another user had the issue with CPU requests. Part of the problem is still that no one wants to admit that the internal models used essentially assume that there is only one processing element. This is then used to calculate the "need". Of course, when you have 10 or more processing elements, perhaps of mixed capability (CPU and GPU), the simple model is likely to come up with wrong answers. Complicating matters is the related issue that there are more than a couple of bugs in the whole work fetch and work scheduling sections of the code. Richard H. was looking at one that I demonstrated was an initialization issue, which he was seeing pretty consistently, and as far as I know that issue remains (net result is that the wrong project is selected for work fetch). |
Joined: 9 Feb 09 Posts: 166 Credit: 27,520,813 RAC: 0 |
"I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows..." The problem with the fetch of units is also on all other versions; I have 3 versions running: 6.4.7, 6.6.36 and 6.10.1. On all of these machines I had no work for some hours, so indeed you're right, guys: it's the work fetch bug and the multiple capability on machines. The funny thing is when you reinstall BOINC it prolly gets new units right away for a couple of hours, or not lol. It's new, it's relatively fast... my new bicycle |
Joined: 18 Feb 09 Posts: 158 Credit: 110,699,054 RAC: 0 |
I've been running 6.10.1 for a few hours and it seems to be fetching work just fine. It's nice to see all four cores running WCG for a change while Milkyway is going. I did also modify the app_info.xml to run 3 instead of 4 WUs at a time tho. Good stuff. :) |
Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0 |
"The problem with the fetch of units is also on all other versions, i have 3 versions running : 6.4.7 , 6.6.36 and the 6.10.1." The problems with work fetch, exclusive of single vs. multiple core, started in the 6.x.y series. The 5.x.y series had a "better" version. Sadly the emphasis has been on new features rather than trying to unbug the versions extant (shades of Microsoft). Even more interesting is the way version changes and some of the "fixes" cover and uncover longstanding bugs rather than fixing the underlying causes. The work fetch problem can sometimes be, as you noted, covered up by a reinstall or more simply by resetting debts. More odd, as some note, they don't seem to see the issues. Part of that may be from the project mix they have, phases of the moon, or they just don't notice that the problem happens at all. In my case, for the moment, for example, I am not going to see these issues as I am running all my computers on only two projects, WCG and GPU Grid: one to keep the GPUs busy and the other to run up my badge colors (I now have several Emerald, a couple Ruby, 3 Gold, even Beta is now Bronze (Yea!)) ... but, when I have multiple projects I can see that BOINC no longer maintains a "balance" of work from the various attached projects ... what happens is that I get overloads from one project and then another ... granted it balances out, but it should, on a multi-CPU system, maintain a better balance of work... Anyway, hopefully this weekend I can get my ATI card back into the fray with an install of Vista (I bought it with Snow Leopard, which seems to be running well)... |
Joined: 29 Jul 08 Posts: 267 Credit: 188,848,188 RAC: 0 |
"The problem with the fetch of units is also on all other versions, i have 3 versions running : 6.4.7 , 6.6.36 and the 6.10.1." I think yer right, Paul D. Buck. I'd stopped the MW GPU (Nvidia) work a few hours back and now the MW CPU work has started up. I use XP x64 SP2, BOINC 6.10.1 and 190.62 (WHQL). |
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
I am still getting the 24 units, crunch them, wait a minute and get another 24. Lather, rinse, repeat. |
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
"I am still getting the 24 units, grunch them, wait a minute and get another 24." You're lucky then ;) I had another look at the code today and found 6 more bugs, and a whole code block for the ATIs is completely missing. I guess that someone has some work to do on Monday... Support science! Join Team BOINC United now! |
Joined: 26 Jun 09 Posts: 47 Credit: 276,827,695 RAC: 0 |
The 1st night I ran 6.10.1 w/ AQUA / MW the GPU went dry for 6 hours and didn't do any work. I reinstalled yesterday and things are more or less working. For quite a while (while I had AQUA work) it would download a new WU when it finished one. Since last night, when I started running Collatz, it has been doing what someone else reported: it runs MW dry, waits 3 minutes, and then downloads 48 units. Even with a 5% hit on MW throughput I would be willing to use it, because I've never been able to play w/ the CPU running at 100% and the GPU at 99% load and not have to babysit the system! The problem does appear to be the STD. It goes up to 30,000 over the 48 units, resets to zero, and then it will download new units. BTW, thank you Crunch3r!!!! Bryan |
Joined: 28 Feb 09 Posts: 5 Credit: 10,708,368 RAC: 0 |
Thanks Crunch3r, x64 all - 6.10.1/W7/8.612 driver. Before: After: At the end of the current queue I'll likely try the Red Pill and see if that helps with stalled downloads of new MW WUs. SeriousCrunchers@Home |
Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0 |
Ah, my problem exactly ... Thanks for the hints... Though I opted for a different set of numbers to get what I wanted: 4 WCG tasks running on the 4 cores and one task running on the ATI GPU. So my parameters: <max_ncpus>1.0</max_ncpus> <avg_ncpus>0.05</avg_ncpus> <coproc> <type>ATI</type> <count>1.0</count> </coproc> <cmdline>n1</cmdline> I did not add the line "<flops>1.0e11</flops>" and it still seems to work as I would expect, with the status message of "Running (0.05 CPUs + 1.0 ATI GPUs)" or "Waiting to run (yada yada)". Of course, only time will tell if this works long term, which has been the bane of many a "fix" that works for a short while but fails after some time... I suspect that for those who want to run with 3 tasks on the GPU the numbers would be: <max_ncpus>1.0</max_ncpus> <avg_ncpus>0.05</avg_ncpus> <coproc> <type>ATI</type> <count>0.33</count> </coproc> <cmdline>n3</cmdline> But I have not tried this set of settings... |
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
That is exactly how I have mine set, except that <avg_ncpus> is at 0.10 on mine. |
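
The app_info.xml numbers traded in the last few posts can be sanity-checked with a little arithmetic. The sketch below is only a reading of what the posters describe, not BOINC's actual scheduler code: it assumes that a `<count>` of roughly 1/n inside `<coproc>` lets n tasks share one GPU, and that each running GPU task reserves `<avg_ncpus>` of a CPU core. The function name `gpu_task_budget` and its parameters are hypothetical.

```python
import math


def gpu_task_budget(coproc_count: float, avg_ncpus: float, num_gpus: int = 1):
    """Estimate concurrent GPU tasks and the CPU share they reserve.

    Assumption (taken from the thread, not from BOINC source): a task that
    claims `coproc_count` of a GPU lets floor(1 / coproc_count) tasks share
    one card, and each of those tasks reserves `avg_ncpus` CPU cores.
    """
    if not 0 < coproc_count <= 1:
        raise ValueError("coproc_count must be in (0, 1]")
    # Small epsilon guards against floating-point rounding just below an integer.
    tasks_per_gpu = math.floor(1.0 / coproc_count + 1e-9)
    total_tasks = tasks_per_gpu * num_gpus
    cpu_reserved = total_tasks * avg_ncpus
    return total_tasks, cpu_reserved


if __name__ == "__main__":
    # Settings mentioned in the thread: (count, avg_ncpus, number of cards).
    for count, ncpus, gpus in [(1.0, 0.05, 1), (0.33, 0.05, 1), (0.33, 0.15, 2)]:
        tasks, cpu = gpu_task_budget(count, ncpus, gpus)
        print(f"count={count} avg_ncpus={ncpus} gpus={gpus}: "
              f"{tasks} GPU task(s), about {cpu:.2f} CPU reserved")
```

Under these assumptions, `<count>0.33</count>` on two cards gives 6 concurrent tasks, which matches the "potential of 6 running at once" reported earlier in the thread, and `<count>1.0</count>` gives the single task behind the "Running (0.05 CPUs + 1.0 ATI GPUs)" status line.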