testing new validator
Author | Message |
---|---|
Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
I've started up the new validator, so please be patient as I get all the kinks worked out over the next few days. Validation will now work as follows: Every result that could improve one of our searches will be validated with a minimum quorum of 3, and the fitness reported must be within 10e-11 of the quorum results (this means that single precision GPU results will be flagged invalid). Results that won't improve a search will be validated 50% of the time until the error rates of hosts stabilize in the database (this will probably take a couple of weeks). Afterwards, for the results that don't improve our searches, we'll be using BOINC's adaptive validation based on hosts' error rates (the validation rate will be between 10% and 100%, depending on how many errors the host typically has). On a side note, we'll also be updating the applications this week. We've made new background models for the Milky Way that we want to test. Additionally, there are some server-related performance improvements that should help the server response time. I'm hoping to have the source code available by Tuesday so people can compile their own applications, then make the full swap over to the new application sometime early next week. |
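A minimal sketch of the two rules described above, in Python. It assumes the tolerance is an absolute difference between fitness values (the post does not say whether it is absolute or relative). The function names are hypothetical and are not taken from the project's validator.

```python
import random

FITNESS_TOLERANCE = 10e-11  # written as in the post; numerically equal to 1e-10
MIN_QUORUM = 3              # minimum quorum stated in the post


def fitness_matches(a, b, tol=FITNESS_TOLERANCE):
    """Assumption: two fitness values agree if their absolute difference is within tol."""
    return abs(a - b) <= tol


def should_validate(improves_search, rng=random.random, sample_rate=0.5):
    """Results that could improve a search are always validated;
    the rest are sampled at sample_rate (50% during the initial period)."""
    if improves_search:
        return True
    return rng() < sample_rate
```

The later adaptive stage (between 10% and 100% depending on host error rate) is left out, because the post does not give the mapping from error rate to validation rate.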
Joined: 19 Jul 08 Posts: 67 Credit: 272,086,462 RAC: 0 |
Are these only test units just to verify the validator? I've had 2 issues with them: first "Completed, validation inconclusive", and then, when the consensus does come in, "Completed, can't validate" due to the settings of max # of error/total/success tasks 1, 6, 1 (errors: Too many success results). |
Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
"Are these only test units just to verify the validator? I've had 2 issues with them: first 'Completed, validation inconclusive', and then, when the consensus does come in, 'Completed, can't validate' due to the settings of max # of error/total/success tasks 1, 6, 1 (errors: Too many success results)." Ahh, that's the issue. I was wondering why WUs weren't coming back for additional validation. This should be fixed with new WUs. |
Joined: 19 Jul 08 Posts: 67 Credit: 272,086,462 RAC: 0 |
"Are these only test units just to verify the validator? I've had 2 issues with them: first 'Completed, validation inconclusive', and then, when the consensus does come in, 'Completed, can't validate' due to the settings of max # of error/total/success tasks 1, 6, 1 (errors: Too many success results)." Thanks, I'll give 'em a try after I quit banging on Slicker's server. |
Joined: 29 Aug 07 Posts: 115 Credit: 501,600,687 RAC: 622 |
|
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
So what about the WUs that say "Checked, but no consensus yet" and also come up as pending? Will they be granted credit eventually? Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
"Validation will now work as follows: Every result that could improve one of our searches will be validated (with a min quorum of 3[...]" Depending on how the validation goes, I'll probably bring it down to 2. But right now I want to flush out all the clients returning bad results, which means a higher quorum, so there's less chance of two bad clients returning results for the same WU and getting credit. |
Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
"So what about the wus that say 'Checked, but no consensus yet' also come up as pending. Will they be granted credit eventually?" They'll be granted credit when there's a quorum of 3. "Checked, but no consensus yet" means that the result was looked at, but there weren't 2 other similar results to validate it. |
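The "no consensus yet" state can be sketched as a search for a group of at least three results that agree within the tolerance. This is a simplified illustration and not the project's validator. `find_quorum` is a hypothetical name, and comparing every result against a single reference result is an assumption about how "similar" is decided.

```python
def find_quorum(fitnesses, min_quorum=3, tol=10e-11):
    """Return the indices of the first group of at least min_quorum results
    that lie within tol of a reference result, or None if no group exists."""
    for ref in fitnesses:
        group = [j for j, f in enumerate(fitnesses) if abs(f - ref) <= tol]
        if len(group) >= min_quorum:
            return group
    return None


# Two matching results are not enough for a quorum of 3, so this stays pending.
print(find_quorum([1.0, 1.0]))
# A third matching result completes the quorum; the outlier is left out.
print(find_quorum([1.0, 1.0 + 1e-8, 1.0, 1.0]))
```

Under this model, a result that differs by about 1e-8 never joins a group formed by results that agree to 10e-11, whichever group fills up first.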
Joined: 29 Aug 07 Posts: 115 Credit: 501,600,687 RAC: 622 |
Most of mine (edit: many, not most) are ending up with "Completed, validation inconclusive": 4 results, with a max of 4, and then no one gets any credit. Something doesn't seem right with this scheme. http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=82607813 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=87913982 |
Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
"Most of mine (edit: many, not most) are ending up with 'Completed, validation inconclusive'. 4 results, with a max of 4, and then no one gets any credit. Something doesn't seem right with this scheme." The max results should be 6. These must be from some of the older workunits (from the old validator); I'll update the database, so hopefully they'll be fixed. |
Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
There also seems to be a set of applications out there that are giving close, but not close enough, results: accurate to about 10e-8, when we really need 10e-11 or better. I'm not sure if this is due to overclocking, single precision GPUs, or older optimized versions of the application that need to be updated. I'm not quite sure what to do about this, as it's going to throw off validation of some results. In the meantime, I'm hoping people running the offending applications will update to more recent versions. As a better fix, I'll be putting out updated application code this week, and we're going to release it as a different application. So people are going to have to update to use that new application (either the stock version or a new optimized application), which should fix these validation issues. |
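One of the possible causes named above, single precision, can be illustrated by rounding a double precision number to single precision and measuring the error. This is only a sketch: the value 0.1 is an arbitrary example and not a real fitness, and `to_single` is a hypothetical helper built on Python's standard `struct` module.

```python
import struct

TOLERANCE = 10e-11  # tolerance quoted in the thread


def to_single(x):
    """Round a Python float (double precision) to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


value = 0.1  # illustrative value only
error = abs(to_single(value) - value)
print(f"rounding error: {error:.3e}, within tolerance: {error <= TOLERANCE}")
```

A single rounding step already misses the tolerance here. A calculation carried out entirely in single precision would typically accumulate more error than that, which is in line with the roughly 10e-8 agreement reported above, although the thread does not confirm which cause applies.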
Joined: 29 Aug 07 Posts: 115 Credit: 501,600,687 RAC: 622 |
Okay, I've run across a few with: minimum quorum 3, initial replication 5, max # of error/total/success tasks 1, 6, 6. http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=90198132 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=90151888 Max # of error 1... is that really what you want? Also, why is my machine marked as invalid? It is using the stock app, with a 5870. |
Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0 |
Looks to be more hardware-related than application-related to me. Many of the results marked invalid are using the stock application that is automatically downloaded. Cypress-class 58xx cards validate against other Cypress cards if they are in the majority reported first, and 48xx cards are then marked invalid. If 48xx cards are the majority reported first, they validate against other 48xx cards and 58xx cards are marked invalid. I'm uncertain about NVIDIA, but I think they validate against 48xx but not 58xx. I'm not sure this is exactly correct in all details, but it's something like that. |
Joined: 29 Aug 07 Posts: 115 Credit: 501,600,687 RAC: 622 |
|
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
"Looks to be more hardware related than application related to me. Many of the results marked invalid are using the stock application that is automatically downloaded." Now that you mention it... I think I read somewhere that the 58xx cards give incorrect results with the latest SDK.
Source -> http://setiathome.berkeley.edu/forum_thread.php?id=59506&nowrap=true#986347 Since OpenCL is just some sort of wrapper for CAL/Brook... Seems to me that someone should do a standalone test and compare results between 48xx and 58xx again, just to make sure everything works properly. Join Support science! Join Team BOINC United now! |
Joined: 29 Aug 07 Posts: 115 Credit: 501,600,687 RAC: 622 |
|
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
"Seems to me that someone should do a standalone test and compare results between 48xx and 58xx again, just to make sure everything works properly." I'd guess comparing a few numbers against each other shouldn't be that hard to do... But you never know ;) Join Support science! Join Team BOINC United now! |
Joined: 27 Nov 09 Posts: 108 Credit: 430,760,953 RAC: 0 |
As stated in the corresponding thread, the 5xxx cards' ability to work is very questionable right now, because of bugs in ATI's OpenCL SDK implementation. They promised to fix those in a new SDK release; we'll see... I recall GPUGRID was saying that ATI OpenCL was completely unusable. It kept locking up the machine at random. There were also major problems with 4xxx performance that rendered them useless for any purpose. |
Joined: 1 Mar 09 Posts: 56 Credit: 1,984,937,499 RAC: 0 |
"Looks to be more hardware related than application related to me. Many of the results marked invalid are using the stock application that is automatically downloaded." My own observations seem to tie in pretty much with the above. I've checked a number of my results (48xx series, stock app), and every time so far that I'm teamed up with non-58xx GPUs or even CPUs, the results are valid. If there are three 58xx GPUs, my result is always invalid. I've not yet seen a quorum where both 48xx and 58xx GPUs validate against each other. It does take time to check, so I haven't looked at enough quorums yet to be absolutely sure. Here's a quorum that is a bit strange. There are two 48xx results that validate against each other and there are three 58xx results that have been declared invalid. These three did come in last, but how did the two manage to trump them when there are supposed to be three for a quorum? Also, the use of 1,6,6 for the error/total/success numbers is a bit strange. If the min quorum is 3, then the max errors should really be 3 also, since you could still get 3 successful results and form a quorum. By leaving the errors at 1, a second error will immediately junk an otherwise potentially successful quorum. EDIT: Does anyone know if this is the bit of the returned data that is used for validation purposes? "probability calculation (stars) Calculated about 3.34818e+009 floatingpoint ops on FPU." If not, what exactly is used? Cheers, Gary. |
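The effect of the error/total/success limits discussed above can be sketched with a simplified model. It is not BOINC's actual server logic. `workunit_status` and its return strings are hypothetical, and the model assumes a workunit fails once a count exceeds its limit, which matches the description that a second error junks a workunit whose max errors is 1.

```python
def workunit_status(n_error, n_success, has_consensus,
                    max_error=1, max_total=6, max_success=6):
    """Simplified model: a limit is broken when a count exceeds it."""
    if has_consensus:
        return "validated"
    if n_error > max_error:
        return "too many errors"
    if n_success > max_success:
        return "too many success results"
    if n_error + n_success > max_total:
        return "too many total results"
    return "in progress"


# With limits 1, 6, 6, a second error ends the workunit before a quorum can form.
print(workunit_status(n_error=2, n_success=2, has_consensus=False))
# Raising max errors to 3 would let the same workunit keep going.
print(workunit_status(n_error=2, n_success=2, has_consensus=False, max_error=3))
# With the earlier limits 1, 6, 1, two unmatched successes already break the limit.
print(workunit_status(n_error=0, n_success=2, has_consensus=False, max_success=1))
```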
Joined: 7 Feb 09 Posts: 9 Credit: 25,983,618 RAC: 0 |
HD5870 running the ati13ati app, factory OC (875, 1250). Getting a 1 invalid/1 pending/1 valid split at the moment (roughly). Also noticing the 48xx/CPU relationship. Is the validator looking at time taken? |
©2024 Astroinformatics Group