Welcome to MilkyWay@home

quorum down to 2

Message boards : News : quorum down to 2

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 38165 - Posted: 6 Apr 2010, 2:11:10 UTC

The database is having a bit of trouble keeping up with all the new results due to a quorum of 3, so for the time being I'm dropping it to a quorum of 2.


On another note, we should have source code for the new application available tomorrow.
ID: 38165 · Rating: 0
Brian Silvers

Joined: 21 Aug 08
Posts: 625
Credit: 558,425
RAC: 0
Message 38197 - Posted: 6 Apr 2010, 15:41:20 UTC - in response to Message 38165.  
Last modified: 6 Apr 2010, 15:52:10 UTC

> The database is having a bit of trouble keeping up with all the new results due to a quorum of 3, so for the time being I'm dropping it to a quorum of 2.
>
> On another note, we should have source code for the new application available tomorrow.


If things are not better soon, could you consider figuring out a way to utilize "Homogeneous Redundancy"-like capability to group GPUs away from CPUs? I know I'm not adding much in comparison to people with GPUs anymore, but now that you're grouping me with GPUs that are having problems validating, I'm burning electricity for nothing sometimes.

Thanks...

Edit: Oh, and my Pentium 4 got grouped with three 5800 series GPUs, which in another thread you stated aren't matching up to other architectures. So my result, which is probably the one you should've accepted, got dumped as invalid simply because their results matched each other and formed a quorum...

WU 90248204
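
A rough sketch of what a "Homogeneous Redundancy"-like split between CPUs and GPUs could look like. This is an illustration in Python, not BOINC's actual mechanism; the two classes and the function names are assumptions made up for the example.

```python
from typing import Optional


def host_class(uses_gpu: bool) -> str:
    """Hypothetical two-way split; real Homogeneous Redundancy classes differ."""
    return "gpu" if uses_gpu else "cpu"


def can_take_replica(workunit_class: Optional[str], uses_gpu: bool) -> bool:
    """An unclaimed workunit accepts any host; afterwards only the same class."""
    return workunit_class is None or workunit_class == host_class(uses_gpu)


def claim(workunit_class: Optional[str], uses_gpu: bool) -> str:
    """Return the workunit's class after sending a replica to this host."""
    if not can_take_replica(workunit_class, uses_gpu):
        raise ValueError("host class does not match workunit class")
    return host_class(uses_gpu)
```

With this rule, a workunit first sent to a CPU host would only be replicated to other CPU hosts, so a CPU result would never be compared against a GPU result.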
ID: 38197 · Rating: 0
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 38206 - Posted: 6 Apr 2010, 19:33:17 UTC - in response to Message 38197.  


> If things are not better soon, could you consider figuring out a way to utilize "Homogeneous Redundancy"-like capability to group GPUs away from CPUs? I know I'm not adding much in comparison to people with GPUs anymore, but now that you're grouping me with GPUs that are having problems validating, I'm burning electricity for nothing sometimes.


That won't fix the problem (at least on our end). We want the results to be accurate. If we're getting certain results from certain OS/architectures and certain results from others how do we know which ones are the right ones?

I'm really hoping we have this sorted out in the next couple days. I know having invalid results is really frustrating.
ID: 38206 · Rating: 0
Crunch3r
Volunteer developer
Joined: 17 Feb 08
Posts: 363
Credit: 258,227,990
RAC: 0
Message 38208 - Posted: 6 Apr 2010, 19:58:59 UTC - in response to Message 38206.  
Last modified: 6 Apr 2010, 20:16:48 UTC


>> If things are not better soon, could you consider figuring out a way to utilize "Homogeneous Redundancy"-like capability to group GPUs away from CPUs? I know I'm not adding much in comparison to people with GPUs anymore, but now that you're grouping me with GPUs that are having problems validating, I'm burning electricity for nothing sometimes.
>
> That won't fix the problem (at least on our end). We want the results to be accurate. If we're getting certain results from certain OS/architectures and certain results from others how do we know which ones are the right ones?
>
> I'm really hoping we have this sorted out in the next couple days. I know having invalid results is really frustrating.


FWIW... isn't it possible to stop sending WUs to 58xx cards for the moment till the problem is fixed?

It should be possible to include a check for the bold part below in sched_customize.cpp, since that information is included in each work request....

<coproc_ati>
<count>1</count>
<name>ATI Radeon HD5800 series (Cypress)</name>
<req_secs>0.000000</req_secs>
<req_instances>0.000000</req_instances>
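
The kind of check meant here could be sketched as follows. This is Python for illustration only; the real scheduler code is C++ and this does not use any BOINC API. The marker strings and the function name are assumptions, and the quoted fragment is given a closing tag so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical name substrings to hold back; not taken from the project.
BLOCKED_GPU_MARKERS = ("HD5800", "HD 5800")


def should_send_work(coproc_xml: str) -> bool:
    """Return False when the reported ATI coprocessor name matches a blocked series."""
    root = ET.fromstring(coproc_xml.strip())
    name = root.findtext("name", default="")
    return not any(marker in name for marker in BLOCKED_GPU_MARKERS)


# The fragment from the work request, closed here so it is well-formed.
REQUEST_FRAGMENT = """
<coproc_ati>
<count>1</count>
<name>ATI Radeon HD5800 series (Cypress)</name>
<req_secs>0.000000</req_secs>
<req_instances>0.000000</req_instances>
</coproc_ati>
"""

if __name__ == "__main__":
    print(should_send_work(REQUEST_FRAGMENT))
```

Matching on the reported name string is crude, but it needs nothing beyond what the client already sends in the request.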

ID: 38208 · Rating: 0
John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 38209 - Posted: 6 Apr 2010, 20:15:59 UTC
Last modified: 6 Apr 2010, 20:28:55 UTC

My HD3850 and HD4850 have been swapped from Collatz to Milkyway. ATM they are running down the Collatz caches, and have Milkyway WUs ready to start crunching.

I took my HD5850 off Milkyway and returned it to Collatz.

I will continue running this way until a resolution has been found, and give their results as my contribution to that resolution and the science of the project.
Go away, I was asleep


ID: 38209 · Rating: 0
BarryAZ

Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38221 - Posted: 6 Apr 2010, 21:28:55 UTC - in response to Message 38209.  

John - my 4850 cards also have work unit validation failures....
ID: 38221 · Rating: 0
Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 38223 - Posted: 6 Apr 2010, 21:33:13 UTC - in response to Message 38221.  

> John - my 4850 cards also have work unit validation failures....

If you look at your invalid WUs on your 4850 I think you will find that they were invalidated because a quorum was achieved by 58xx / 59xx cards, a quorum of incorrect results I might add.
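
A toy illustration of that outcome, not the project's validator: results are compared for exact equality here, which is a simplification, and the host names and result labels are invented placeholders.

```python
from collections import Counter


def validate(results, quorum):
    """Return the agreed value once `quorum` results match, else None (inconclusive)."""
    if not results:
        return None
    value, count = Counter(results.values()).most_common(1)[0]
    return value if count >= quorum else None


def invalid_hosts(results, quorum):
    """List hosts whose result differs from the agreed value, if one exists."""
    canonical = validate(results, quorum)
    if canonical is None:
        return []
    return sorted(host for host, value in results.items() if value != canonical)


if __name__ == "__main__":
    # Two 58xx cards agree on a wrong value; the 4850 holds the right one.
    results = {"hd5870": "wrong", "hd5850": "wrong", "hd4850": "right"}
    print(validate(results, quorum=2), invalid_hosts(results, quorum=2))
```

The validator only sees agreement, not correctness, so two matching wrong results are enough to mark the correct one invalid.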
ID: 38223 · Rating: 0
BarryAZ

Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38224 - Posted: 6 Apr 2010, 21:44:49 UTC - in response to Message 38223.  

That could be -- though that means as long as 58xx cards are completing work, doing work with 48xx cards suffers as well.

I also get the 'inconclusive' result - which (I believe) tosses the result into limbo -- just as bad I'd guess.

In any event, until the code and validation efforts yield a solution, it seems to make sense to simply go to 'hover mode' regarding MW work.



>> John - my 4850 cards also have work unit validation failures....
>
> If you look at your invalid WUs on your 4850 I think you will find that they were invalidated because a quorum was achieved by 58xx / 59xx cards, a quorum of incorrect results I might add.


ID: 38224 · Rating: 0
John Clark

Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 38227 - Posted: 6 Apr 2010, 22:06:51 UTC

It was my impression that the HD58xx problem has tentatively been identified as THE issue, and that if people move their HD58xx cards to other ATI GPU projects it will help with identification, as mentioned here. That is why I've moved my GPUs around.

I appreciate what Barry is saying, as 58xx GPUs will remain in the MW pool and I can get invalidated results as a consequence.

But if enough transferred their 58xx then the remainder may result in fewer inconclusive/invalid results. Win-win?
Go away, I was asleep


ID: 38227 · Rating: 0
Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 38229 - Posted: 6 Apr 2010, 22:24:28 UTC - in response to Message 38224.  

> That could be -- though that means as long as 58xx cards are completing work, doing work with 48xx cards suffers as well.
>
> I also get the 'inconclusive' result - which (I believe) tosses the result into limbo -- just as bad I'd guess.

That just means that a quorum has not yet been reached. Seems to me that everyone should leave their 48xx and 38xx cards here so that most of the invalid 58xx/59xx results will be overwhelmed by correct results.

ID: 38229 · Rating: 0
Brian Silvers

Joined: 21 Aug 08
Posts: 625
Credit: 558,425
RAC: 0
Message 38230 - Posted: 6 Apr 2010, 22:35:47 UTC - in response to Message 38206.  


>> If things are not better soon, could you consider figuring out a way to utilize "Homogeneous Redundancy"-like capability to group GPUs away from CPUs? I know I'm not adding much in comparison to people with GPUs anymore, but now that you're grouping me with GPUs that are having problems validating, I'm burning electricity for nothing sometimes.
>
> That won't fix the problem (at least on our end). We want the results to be accurate. If we're getting certain results from certain OS/architectures and certain results from others how do we know which ones are the right ones?
>
> I'm really hoping we have this sorted out in the next couple days. I know having invalid results is really frustrating.


I can appreciate your issue. It, however, is not my issue. I do hope you understand that. I know I'm not adding much very often compared to GPUs, but it has been stated in other threads that the CPU results are definitely more accurate at this point than the ATI 5800 series cards.

As Crunch3r has said, you could restrict 5800 series participation until the issue is sorted with them.

In the meantime, I'm setting to no new tasks...

ID: 38230 · Rating: 0
Gary Roberts
Joined: 1 Mar 09
Posts: 56
Credit: 1,984,937,499
RAC: 0
Message 38235 - Posted: 6 Apr 2010, 23:08:05 UTC

I agree that something along the lines of crunch3r's suggestion should be adopted until CP announces a corrected app.

Then the real fun starts. How do you guarantee that all 5800 series owners actually start using the corrected app? Those using stock apps should be converted automatically. Those running under AP (and not paying attention) will continue to pollute the result stream. If there are enough of those they may still be able to cause incorrect results to be validated, particularly with a quorum of only 2.

Maybe it's possible to discriminate based on both GPU series and app version?
Cheers,
Gary.
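
Something like the following is what I have in mind, as a sketch only. It is Python rather than scheduler code, and the marker string, the version tuple, and the first fixed version are all made-up values for illustration.

```python
from typing import NamedTuple, Tuple


class Host(NamedTuple):
    gpu_name: str
    app_version: Tuple[int, int]  # (major, minor); values used here are made up


def eligible(host: Host,
             affected_marker: str = "HD5800",
             first_fixed: Tuple[int, int] = (0, 23)) -> bool:
    """Unaffected GPUs always get work; affected ones only with a fixed app."""
    if affected_marker not in host.gpu_name:
        return True
    return host.app_version >= first_fixed


if __name__ == "__main__":
    old = Host("ATI Radeon HD5800 series (Cypress)", (0, 20))
    new = Host("ATI Radeon HD5800 series (Cypress)", (0, 23))
    print(eligible(old), eligible(new))
```

This only works if the version the client reports can be trusted, which is exactly the open question for hosts running under AP.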
ID: 38235 · Rating: 0
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 38238 - Posted: 6 Apr 2010, 23:47:06 UTC - in response to Message 38235.  

> I agree that something along the lines of crunch3r's suggestion should be adopted until CP announces a corrected app.
>
> Then the real fun starts. How do you guarantee that all 5800 series owners actually start using the corrected app? Those using stock apps should be converted automatically. Those running under AP (and not paying attention) will continue to pollute the result stream. If there are enough of those they may still be able to cause incorrect results to be validated, particularly with a quorum of only 2.


What we're going to need is enough people using non-bad 5800 series applications to make the results from the bad ones mostly invalid (so they get upgraded).

But we're also moving over to a new application (new name, etc.), so people won't be able to simply keep using their optimized code: the WUs for it are going to dry up once we have all the stock applications compiled and a GPU version available for the new application.
ID: 38238 · Rating: 0


©2024 Astroinformatics Group