Welcome to MilkyWay@home

6.10.1 Posted.

Message boards : Number crunching : 6.10.1 Posted.

Labbie
Joined: 29 Aug 07
Posts: 327
Credit: 116,463,193
RAC: 0
Message 29880 - Posted: 28 Aug 2009, 19:08:04 UTC

@Crunch3r, I saw it for the first time yesterday.


Calm Chaos Forum...Join Calm Chaos Now

Bryan
Joined: 26 Jun 09
Posts: 47
Credit: 276,827,695
RAC: 0
Message 29882 - Posted: 28 Aug 2009, 19:27:36 UTC - in response to Message 29880.  

I just noticed that there is a ".2" version now posted for Mac and Linux, so maybe there is a fix in the works.
Bryan

Anthony Waters
Joined: 16 Jun 09
Posts: 85
Credit: 172,476
RAC: 0
Message 29883 - Posted: 28 Aug 2009, 19:30:29 UTC - in response to Message 29880.  

I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows

"If a client is requesting only GPU jobs at a point where it has no CPU jobs,
that's a client bug. Please ask them to set the <work_fetch_debug> flag in their cc_config.xml, and send me the resulting message log."

If possible attach a log of this happening so it can be resolved.
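
For anyone unsure what that flag looks like in practice, here is a minimal sketch that prints a cc_config.xml with work_fetch_debug switched on. The helper name build_cc_config is made up for illustration. It assumes the usual layout where log flags sit inside a log_flags element, and that the file lives in the BOINC data directory; check your own install before overwriting anything.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return the text of a minimal cc_config.xml enabling the given log flags.

    build_cc_config is a hypothetical helper, not part of BOINC.
    """
    lines = ["<cc_config>", "  <log_flags>"]
    lines += [f"    <{name}>1</{name}>" for name in flags]
    lines += ["  </log_flags>", "</cc_config>"]
    text = "\n".join(lines) + "\n"
    ET.fromstring(text)  # raises ParseError if the result is not well-formed
    return text


# Print the file contents; save them as cc_config.xml and restart the client
# (or re-read the config file) so the flag takes effect.
print(build_cc_config(["work_fetch_debug"]))
```

The script only prints the text, so nothing on disk is touched until you save it yourself.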

Bryan
Joined: 26 Jun 09
Posts: 47
Credit: 276,827,695
RAC: 0
Message 29884 - Posted: 28 Aug 2009, 19:34:20 UTC - in response to Message 29883.  

I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows

"If a client is requesting only GPU jobs at a point where it has no CPU jobs,
that's a client bug. Please ask them to set the <work_fetch_debug> flag in their cc_config.xml, and send me the resulting message log."

If possible attach a log of this happening so it can be resolved.


In my case it was the opposite, I had CPU work but it wouldn't request new work for the empty GPU. The system had worked excellently for over 3 hours filling the MW GPU cache continuously. 3 hours later (after installation) it quit requesting GPU work and I couldn't manually force it.
Bryan

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 29885 - Posted: 28 Aug 2009, 19:35:43 UTC

Holy crap...this might actually work! After installing it, then uninstalling it when it refused to ask for any MW work, I re-installed it and set all the debts to 0, then got frustrated that it still refused to ask for MW work. I let it be for 10 minutes, then the debts started rising and it asked for MW work....WOOOHOO! I'll now let it run and see what happens. It's a shame AQUA have stopped sending out new work, as I think with it running the scheduler may still behave incorrectly.

XP
BOINC 6.10.1
Aqua 24.83%
MW 50.34%
Seti 24.83%
Connect every 0.05 days
Additional Work Buffer 0.75 days
<MW>
avg_ncpus = 0.15
max_ncpus = 1.0
ATI 0.33
cmdline n3 f15 w0.8
</MW>

overall
ncpus=5 (on my quad)

currently all but 4 seti wu's suspended - no Aqua wu available.

Might go back to ncpus=4 and take the suspend off the cached seti wu's and see what happens a little later today.

But so far so good!

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 29886 - Posted: 28 Aug 2009, 19:46:21 UTC
Last modified: 28 Aug 2009, 19:55:17 UTC

Work scheduler is still FUBAR.

CPU scheduling Priority (aka short term debt) = 4,412

Down to 2 wu's running from a potential of 6 running at once (I have 2 cards installed) and I get

29/08/2009 5:40:12 AM Milkyway@home Sending scheduler request: To fetch work.
29/08/2009 5:40:12 AM Milkyway@home Reporting 6 completed tasks, not requesting new tasks
29/08/2009 5:40:17 AM Milkyway@home Scheduler request completed: got 0 new tasks
29/08/2009 5:41:22 AM Milkyway@home Sending scheduler request: To fetch work.
29/08/2009 5:41:22 AM Milkyway@home Not reporting or requesting tasks
29/08/2009 5:41:27 AM Milkyway@home Scheduler request completed: got 0 new tasks

Then when the last 2 MW wu's have completed I get
29/08/2009 5:42:32 AM Milkyway@home Sending scheduler request: To fetch work.
29/08/2009 5:42:32 AM Milkyway@home Reporting 2 completed tasks, not requesting new tasks
29/08/2009 5:42:37 AM Milkyway@home Scheduler request completed: got 0 new tasks

Absolutely FUBAR!

A minute later
29/08/2009 5:43:42 AM Milkyway@home Sending scheduler request: To fetch work.
29/08/2009 5:43:42 AM Milkyway@home Not reporting or requesting tasks
29/08/2009 5:43:47 AM Milkyway@home Scheduler request completed: got 0 new tasks

I checked the short term debt and it's been reset to 0. WTF!

How can we utilise the later versions of BOINC that have ATI support when this sort of crap occurs?

Back to 6.4.7 -> not perfect, but at least it will ask for MW work!

[edit]
And just as I hit the post button, I see

29/08/2009 5:44:52 AM Milkyway@home Sending scheduler request: To fetch work.
29/08/2009 5:44:52 AM Milkyway@home Requesting new tasks
29/08/2009 5:45:02 AM Milkyway@home Scheduler request completed: got 24 new tasks

Really...WTF! Can someone explain this to me, because I can't comprehend it and I know JM7 is too defensive about it.

[edit#2]
Also why aren't the ATI cards listed in the computer details?

[edit#3]
I've dropped the connect preference to 0.01 and put the work buffer up to 0.9.
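
A rough way to tally contacts like the ones pasted above: the sketch below groups message-log lines into scheduler contacts and records whether new tasks were requested and how many arrived. The function name summarize and the matching strings are assumptions taken only from the 6.10.1 lines in this post; other client versions word their messages differently.

```python
import re

# Matches the "got N new tasks" part of a completed scheduler request.
GOT_RE = re.compile(r"got (\d+) new tasks")


def summarize(log_lines):
    """Group log lines into scheduler contacts (hypothetical helper)."""
    contacts = []
    current = None
    for line in log_lines:
        if "Sending scheduler request" in line:
            # A new contact starts here.
            current = {"requested": False, "got": None}
            contacts.append(current)
        elif current is None:
            continue  # ignore lines before the first contact
        elif "Requesting new tasks" in line:
            # Case-sensitive, so "not requesting new tasks" does not match.
            current["requested"] = True
        else:
            match = GOT_RE.search(line)
            if match:
                current["got"] = int(match.group(1))
    return contacts


SAMPLE = [
    "29/08/2009 5:40:12 AM Milkyway@home Sending scheduler request: To fetch work.",
    "29/08/2009 5:40:12 AM Milkyway@home Reporting 6 completed tasks, not requesting new tasks",
    "29/08/2009 5:40:17 AM Milkyway@home Scheduler request completed: got 0 new tasks",
    "29/08/2009 5:44:52 AM Milkyway@home Sending scheduler request: To fetch work.",
    "29/08/2009 5:44:52 AM Milkyway@home Requesting new tasks",
    "29/08/2009 5:45:02 AM Milkyway@home Scheduler request completed: got 24 new tasks",
]

for contact in summarize(SAMPLE):
    print(contact)
```

Counting the contacts where "requested" is False gives a quick measure of how often the client contacted the server without asking for work.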

JockMacMad TSBT
Joined: 28 Jan 09
Posts: 31
Credit: 85,934,108
RAC: 0
Message 29887 - Posted: 28 Aug 2009, 20:05:16 UTC
Last modified: 28 Aug 2009, 20:05:38 UTC

The drivers I used are:-

ATI_Catalyst_Windows7_8.612_no_CCC.exe

these have an AMD green logo on the .exe, not a red ATI Catalyst logo. Anything with an ATI logo that I downloaded from the ATI site does not work.

I got them via a lot of wandering on the AMD site and avoiding the ATI Catalyst site. The URL I used is http://support.amd.com/us/gpudownload/windows/9-4/Pages/radeonaiw_vista64.aspx?&lang=English

Now go down until you see the 4th download button; it should say 'Driver only for Windows 7' and the version is 8.612.1

These work for me.

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 29888 - Posted: 28 Aug 2009, 20:10:35 UTC

Back to 6.4.7 and SNAFU!

29/08/2009 6:04:45 AM|Milkyway@home|Sending scheduler request: To fetch work. Requesting 346561 seconds of work, reporting 0 completed tasks

I can't wait until a real scheduler comes out!

Crunch3r
Volunteer developer
Joined: 17 Feb 08
Posts: 363
Credit: 258,227,990
RAC: 0
Message 29890 - Posted: 28 Aug 2009, 20:29:57 UTC - in response to Message 29883.  

I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows

"If a client is requesting only GPU jobs at a point where it has no CPU jobs,
that's a client bug. Please ask them to set the <work_fetch_debug> flag in their cc_config.xml, and send me the resulting message log."

If possible attach a log of this happening so it can be resolved.


Well, I just had a look at the work fetch code and there's a bug in how it logs work requests from ATI cards: instead of the ATI request it shows the CUDA crap, which of course will be ZERO... so the work_fetch_debug log is useless.

I've sent DA an email about it.

Support science! Join Team BOINC United now!

Paul D. Buck
Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 29892 - Posted: 28 Aug 2009, 21:01:37 UTC - in response to Message 29890.  

I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows

"If a client is requesting only GPU jobs at a point where it has no CPU jobs,
that's a client bug. Please ask them to set the <work_fetch_debug> flag in their cc_config.xml, and send me the resulting message log."

If possible attach a log of this happening so it can be resolved.


Well, I just had a look at the work fetch code and there's a bug in how it logs work requests from ATI cards: instead of the ATI request it shows the CUDA crap, which of course will be ZERO... so the work_fetch_debug log is useless.

I've sent DA an email about it.

It is the same bug. Over two months ago now I sent them logs on this issue. I had the problem with GPU requests, another user had the issue with CPU requests.

Part of the problem is still that no one wants to admit that the internal models used essentially assume that there is only one processing element. This is then used to calculate the "need". Of course when you have 10 or more processing elements perhaps of mixed capability (CPU and GPU) the simple model is likely to come up with wrong answers.

Complicating matters is the related issue that there are more than a couple bugs in the whole work fetch and work scheduling sections of the code. Richard H. was looking at one that I demonstrated was an initialization issue that he was seeing pretty consistently and as far as I know that issue remains (net result is that the wrong project is selected for work fetch).

uBronan
Joined: 9 Feb 09
Posts: 166
Credit: 27,520,813
RAC: 0
Message 29901 - Posted: 29 Aug 2009, 0:14:36 UTC - in response to Message 29892.  

I've sent an email to the boinc_projects mailing list about the issue of not receiving CPU work, referenced in the thread Ice posted earlier. The response that I received is as follows

"If a client is requesting only GPU jobs at a point where it has no CPU jobs,
that's a client bug. Please ask them to set the <work_fetch_debug> flag in their cc_config.xml, and send me the resulting message log."

If possible attach a log of this happening so it can be resolved.


Well, I just had a look at the work fetch code and there's a bug in how it logs work requests from ATI cards: instead of the ATI request it shows the CUDA crap, which of course will be ZERO... so the work_fetch_debug log is useless.

I've sent DA an email about it.

It is the same bug. Over two months ago now I sent them logs on this issue. I had the problem with GPU requests, another user had the issue with CPU requests.

Part of the problem is still that no one wants to admit that the internal models used essentially assume that there is only one processing element. This is then used to calculate the "need". Of course when you have 10 or more processing elements perhaps of mixed capability (CPU and GPU) the simple model is likely to come up with wrong answers.

Complicating matters is the related issue that there are more than a couple bugs in the whole work fetch and work scheduling sections of the code. Richard H. was looking at one that I demonstrated was an initialization issue that he was seeing pretty consistently and as far as I know that issue remains (net result is that the wrong project is selected for work fetch).


The problem with the fetch of units is also on all other versions; I have 3 versions running: 6.4.7, 6.6.36 and 6.10.1.
On all of these machines I had no work for some hours, so indeed you're right, guys, it's the work fetch bug and the multiple capability on machines.
The funny thing is when you reinstall BOINC it prolly gets new units immediately for a couple of hours, or not lol
Its new, its relative fast... my new bicycle

Zanth
Joined: 18 Feb 09
Posts: 158
Credit: 110,699,054
RAC: 0
Message 29902 - Posted: 29 Aug 2009, 0:51:41 UTC - in response to Message 29901.  

I've been running 6.10.1 for a few hours and it seems to be fetching work just fine. It's nice to see all four cores running WCG for a change while Milkyway is going. I did also modify the app_info.xml to run 3 instead of 4 WUs at a time tho. Good stuff. :)

Paul D. Buck
Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 29904 - Posted: 29 Aug 2009, 1:23:29 UTC - in response to Message 29901.  

The problem with the fetch of units is also on all other versions; I have 3 versions running: 6.4.7, 6.6.36 and 6.10.1.
On all of these machines I had no work for some hours, so indeed you're right, guys, it's the work fetch bug and the multiple capability on machines.
The funny thing is when you reinstall BOINC it prolly gets new units immediately for a couple of hours, or not lol

The problems with work fetch, exclusive of single vs. multiple core, started in the 6.x.y series. The version 5.x.y had a "better" version.

Sadly the emphasis has been on new features rather than trying to unbug the versions extant (shades of Microsoft).

Even more interesting is the way version changes and some of the "fixes" cover and uncover longstanding bugs rather than fixing the underlying causes.

The work fetch problem can sometimes be, as you noted, covered up by a reinstall or more simply by resetting debts. Odder still, some users don't seem to see the issues at all. Part of that may be from the project mix they have, phases of the moon, or they just don't notice that the problem happens at all.

In my case, for the moment, I am not going to see these issues as I am running all my computers on only two projects, WCG and GPU Grid: one to keep the GPUs busy and the other to run up my badge colors (I now have several Emerald, a couple Ruby, 3 Gold, even Beta is now Bronze (Yea!)) ... but when I have multiple projects I can see that BOINC no longer maintains a "balance" of work from the various attached projects ... what happens is that I get overloads from one project and then another ... granted it balances out, but it should, on a multi-CPU system, maintain a better balance of work...

Anyway, hopefully this weekend I can get my ATI card back into the fray with an install of Vista (I bought it with Snow Leopard which seems to be running well)...

zoom314
Joined: 29 Jul 08
Posts: 267
Credit: 188,848,188
RAC: 0
Message 29905 - Posted: 29 Aug 2009, 2:12:28 UTC - in response to Message 29904.  

The problem with the fetch of units is also on all other versions; I have 3 versions running: 6.4.7, 6.6.36 and 6.10.1.
On all of these machines I had no work for some hours, so indeed you're right, guys, it's the work fetch bug and the multiple capability on machines.
The funny thing is when you reinstall BOINC it prolly gets new units immediately for a couple of hours, or not lol

The problems with work fetch, exclusive of single vs. multiple core, started in the 6.x.y series. The version 5.x.y had a "better" version.

Sadly the emphasis has been on new features rather than trying to unbug the versions extant (shades of Microsoft).

Even more interesting is the way version changes and some of the "fixes" cover and uncover longstanding bugs rather than fixing the underlying causes.

The work fetch problem can sometimes be, as you noted, covered up by a reinstall or more simply by resetting debts. Odder still, some users don't seem to see the issues at all. Part of that may be from the project mix they have, phases of the moon, or they just don't notice that the problem happens at all.

In my case, for the moment, I am not going to see these issues as I am running all my computers on only two projects, WCG and GPU Grid: one to keep the GPUs busy and the other to run up my badge colors (I now have several Emerald, a couple Ruby, 3 Gold, even Beta is now Bronze (Yea!)) ... but when I have multiple projects I can see that BOINC no longer maintains a "balance" of work from the various attached projects ... what happens is that I get overloads from one project and then another ... granted it balances out, but it should, on a multi-CPU system, maintain a better balance of work...

Anyway, hopefully this weekend I can get my ATI card back into the fray with an install of Vista (I bought it with Snow Leopard which seems to be running well)...

I think Yer right Paul D. Buck, I'd stopped the MW gpu(Nvidia) work a few hours back and now the MW cpu has started up. I use XP x64 sp2, Boinc 6.10.1 and 190.62(WHQL).

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 29909 - Posted: 29 Aug 2009, 8:00:44 UTC

I am still getting the 24 units, crunch them, wait a minute and get another 24.

Lather, rinse, repeat.

Crunch3r
Volunteer developer
Joined: 17 Feb 08
Posts: 363
Credit: 258,227,990
RAC: 0
Message 29916 - Posted: 29 Aug 2009, 17:12:21 UTC - in response to Message 29909.  

I am still getting the 24 units, crunch them, wait a minute and get another 24.

Lather, rinse, repeat.


You're lucky then ;)

I had another look at the code today and found 6 more bugs and a whole codeblock for the ATIs is completely missing.

I guess that someone has some work to do on Monday...

Support science! Join Team BOINC United now!

Bryan
Joined: 26 Jun 09
Posts: 47
Credit: 276,827,695
RAC: 0
Message 29918 - Posted: 29 Aug 2009, 21:13:23 UTC - in response to Message 29916.  
Last modified: 29 Aug 2009, 21:26:43 UTC

The 1st night I ran 6.10.1 w/ AQUA / MW the GPU went dry for 6 hours and didn't do any work. I reinstalled yesterday and things are more or less working. For quite a while (while I had AQUA work) it would download a new wu whenever it finished one. Since last night, when I started running Collatz, it has been doing what someone else reported: it runs MW dry, waits 3 minutes, and then downloads 48 units.

Even with a 5% hit on MW throughput I would be willing to use it because I've never been able to play w/ the CPU running 100% and the GPU running 99% loading and not have to babysit the system!

The problem does appear to be the STD. It goes up to 30,000 over the 48 units, resets to zero, and then it will download new units.

BTW, thank you Crunch3r!!!!
Bryan

Orakk
Joined: 28 Feb 09
Posts: 5
Credit: 10,708,368
RAC: 0
Message 29928 - Posted: 30 Aug 2009, 4:48:19 UTC - in response to Message 29739.  
Last modified: 30 Aug 2009, 4:49:51 UTC


<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <name>astronomy_0.19_ATI_x64f.exe</name>
    <executable/>
  </file_info>
  <file_info>
    <name>brook.dll</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>19</version_num>
    <max_ncpus>1.0</max_ncpus>
    <avg_ncpus>0.05</avg_ncpus>
    <coproc>
      <type>ATI</type>
      <count>0.25</count>
    </coproc>
    <flops>1.0e11</flops>
    <cmdline>n4</cmdline>
    <file_ref>
      <file_name>astronomy_0.19_ATI_x64f.exe</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>brook.dll</file_name>
    </file_ref>
  </app_version>
</app_info>




Thanks Crunch3r,

x64 all - 6.10.1/W7/8.612 driver

[Before and after images not preserved]

At the end of the current queue I'll likely try the Red Pill and see if that helps with stalled downloads of new MW-WUs.
SeriousCrunchers@Home

Paul D. Buck
Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 29929 - Posted: 30 Aug 2009, 6:20:34 UTC

Ah, my problem exactly ...

Thanks for the hints...

Though I opted for a different set of numbers to get what I wanted, 4 WCG tasks running on the 4 cores and one task running on the ATI GPU. So my parameters:

<max_ncpus>1.0</max_ncpus>
<avg_ncpus>0.05</avg_ncpus>
<coproc>
<type>ATI</type>
<count>1.0</count>
</coproc>
<cmdline>n1</cmdline>

I did not add the line: "<flops>1.0e11</flops>" and it still seems to work as I would expect, with the status message of:

"Running (0.05 CPUs + 1.0 ATI GPUs)" or "Waiting to run (yada yada)"

Of course, only time will tell if this works long term, which has been the bane of many a "fix" that works for a short while but fails after some time...

I suspect, that for those that want to run with 3 tasks on the GPU the numbers would be:

<max_ncpus>1.0</max_ncpus>
<avg_ncpus>0.05</avg_ncpus>
<coproc>
<type>ATI</type>
<count>0.33</count>
</coproc>
<cmdline>n3</cmdline>

But I have not tried this setting set...
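
The pattern in the numbers above (count 1.0 with n1, 0.33 with n3, and 0.25 with n4 in the earlier post) looks like count = 1/n rounded down to two decimals. Here is a small sketch that prints the fragment for a chosen number of tasks. The helper name gpu_share_settings is invented for illustration, and the 1/n rule is only inferred from the examples in this thread, not from BOINC documentation.

```python
def gpu_share_settings(tasks_per_gpu, avg_ncpus=0.05):
    """Return the app_version fragment for running several tasks per GPU.

    Hypothetical helper; assumes <count> should be 1/n rounded down so that
    n shares never add up to more than one GPU.
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    # Integer division keeps two decimals and always rounds down.
    count = (100 // tasks_per_gpu) / 100
    return "\n".join([
        "<max_ncpus>1.0</max_ncpus>",
        f"<avg_ncpus>{avg_ncpus:.2f}</avg_ncpus>",
        "<coproc>",
        "<type>ATI</type>",
        f"<count>{count:.2f}</count>",
        "</coproc>",
        f"<cmdline>n{tasks_per_gpu}</cmdline>",
    ])


print(gpu_share_settings(3))
```

Rounding down rather than to nearest matters for values like 6 tasks, where 0.17 times 6 would exceed one GPU while 0.16 times 6 does not.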

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 29930 - Posted: 30 Aug 2009, 7:48:51 UTC - in response to Message 29929.  



I suspect, that for those that want to run with 3 tasks on the GPU the numbers would be:

<max_ncpus>1.0</max_ncpus>
<avg_ncpus>0.05</avg_ncpus>
<coproc>
<type>ATI</type>
<count>0.33</count>
</coproc>
<cmdline>n3</cmdline>

But I have not tried this setting set...


That is exactly how I have mine set, except that <avg_ncpus> is at 0.10 on mine.

©2024 Astroinformatics Group