validator strictness
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 38106 - Posted: 5 Apr 2010, 16:35:39 UTC
Last modified: 5 Apr 2010, 16:35:49 UTC

I've lowered the strictness of the validator from 10e-11 to 10e-10. I'm hoping this will significantly reduce the number of WUs flagged invalid. If the issue persists I might have to lower it further to 10e-9. The new application will have the strictness back at 10e-11, so keep that in mind if you're compiling your own versions.


The issue we're having seems to be that the ATI 48xx GPUs and the ATI 58xx GPUs are returning different results, and if too many of either make it into the quorum they will invalidate the other results (including stock results). I'm still trying to determine if the 58xx GPU or the 48xx GPU is the one correctly validating against the stock application.


I've also updated the validator so if you check your tasks they will show what fitness they reported, so you can compare vs other tasks for the same workunit.


I'm hoping we should have this issue straightened out shortly, and thanks for your patience.
____________

SkyeHunter
Joined: 6 Mar 09
Posts: 41
Credit: 38,856,291
RAC: 0

Message 38136 - Posted: 5 Apr 2010, 20:45:47 UTC

I don't see any decrease in invalids on my 4870's...

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0

Message 38137 - Posted: 5 Apr 2010, 20:54:05 UTC

I think the services of Cluster Physik are needed!

You need to add the ATI 38XX series as also being different to the 58XX series.

Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0

Message 38141 - Posted: 5 Apr 2010, 22:05:23 UTC - in response to Message 38137.
Last modified: 5 Apr 2010, 22:07:43 UTC

I think the services of Cluster Physik are needed!

You need to add the ATI 38XX series as also being different to the 58XX series.

The HD38xx GPUs return the exact same results as the HD47xx/48xx GPUs (I have tested both series extensively before making the applications available), and those also validate against CPU and CUDA. Only the HD5800 series GPUs appear to deviate significantly. I have no idea why, as they execute the exact same code as the other GPUs. Maybe it's a driver/compiler hiccup of some sort. I don't have an HD5800 series GPU, so I have no way to test this myself. But I found a WU that was calculated with several different applications.

Fitness values:
-3.19087277379105500000 (CPU 0.20, SSE3, x64)
-3.19087277379105500000 (HD4870/HD38xx, 0.21)
-3.19087277379125100000 (CUDA 0.24)
-3.19087286725516700000 (HD5870 0.21)

As you can see, the HD38xx/48xx GPUs deliver the exact same result as the CPU (all my versions starting with 0.20 return the exact same values). The CUDA application arrives at a slightly different one, but the deviation is in the 10^-13 range, which is completely acceptable (the stock CPU application typically deviates by about the same amount from the CUDA and my versions; I put some special stuff in to "guard" the calculations against the differences between the architectures).
But what goes on with the HD5800 GPUs I've no idea right now. Can anyone check if this depends on the driver version?
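
For anyone who wants to check the numbers, here is a minimal Python sketch that compares the four fitness values quoted above against the CPU result. It assumes the validator compares absolute differences and that thresholds such as "10e-10" are meant literally (Python reads 10e-10 as 1e-09); neither assumption is confirmed in this thread, and the variable names are only illustrative.

```python
# Fitness values copied from the post above (application version in the key).
fitness = {
    "CPU 0.20 (SSE3, x64)": -3.19087277379105500000,
    "HD4870/HD38xx 0.21": -3.19087277379105500000,
    "CUDA 0.24": -3.19087277379125100000,
    "HD5870 0.21": -3.19087286725516700000,
}

# ASSUMPTION: absolute-difference comparison, thresholds taken literally.
# The real validator may compare differently (for example relatively).
thresholds = {"10e-11": 10e-11, "10e-10": 10e-10, "10e-9": 10e-9}

# Use the CPU result as the reference, as the post does.
reference = fitness["CPU 0.20 (SSE3, x64)"]
deviation = {name: abs(value - reference) for name, value in fitness.items()}

for name, dev in deviation.items():
    verdicts = ", ".join(
        f"{label}: {'ok' if dev <= tol else 'FAIL'}"
        for label, tol in thresholds.items()
    )
    print(f"{name:24s} deviation {dev:.3e}  ({verdicts})")
```

Under these assumptions the CUDA deviation (about 2e-13) passes every threshold, while the HD5870 deviation (about 9e-8) fails even the loosest one mentioned so far, 10e-9. That would fit the reports that loosening the validator has not reduced the invalids.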

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0

Message 38153 - Posted: 5 Apr 2010, 23:05:59 UTC

Surely ATI wouldn't have played 'funny buggers' with the compiler, so that unless you use a particular flag that triggers DP during compiling, it compiles for SP?

Arif Mert Kapicioglu
Joined: 14 Dec 09
Posts: 159
Credit: 576,661,687
RAC: 21,704

Message 38160 - Posted: 6 Apr 2010, 0:33:05 UTC

I'm shifting my 2x5970 from Collatz to MilkyWay now. Hope it helps to fix the issue.

Gary Roberts
Joined: 1 Mar 09
Posts: 56
Credit: 1,984,937,272
RAC: 0

Message 38161 - Posted: 6 Apr 2010, 0:33:14 UTC - in response to Message 38106.

The issue we're having seems to be that the ATI 48xx GPUs and the ATI 58xx GPUs are returning different results, and if too many of either make it into the quorum they will invalidate the other results (including stock results).

Yes, precisely. Here is some info from WU 90332871, which I've just chosen at random and which shows a couple of things. Five tasks were needed to get a quorum, and the three 5800 series results ended up clobbering the two from non-5800 cards.

The two that were clobbered had precisely identical results, and I doubt you would expect that if their results were tainted by overclocking or some other error-causing condition.

Two of the 5800 results had no visible fitness value listed in the taskID output, but since they all validated I guess they must have been pretty much identical to the one value that was visible. I've highlighted in red the non-agreement between the 5800 value and the 4800/3800 values so you can easily see that the mismatch was quite woeful (around the e-07 level). I've done this as a separate block below the main block since color tags don't seem to work inside code tags. The three that validated are marked with *.

Fitness value returned -- GPU series -- application used
========================================================
-3.16907880276272100000 - 4800 series - v0.21 (ati13ati)
-3.16907889603708600000 - 5800 series - anon v0.20b (Win64, CAL 1.4) by Gipsel*
-3.16907880276272100000 - 3800 series - v0.21 (ati13ati)
??????????????????????? - 5800 series - v0.21 (ati13ati)*
??????????????????????? - 5800 series - v0.21 (ati13ati)*

-3.16907880276272100000 - 4800 series
-3.16907889603708600000 - 5800 series
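
To make the "clobbering" concrete, here is a toy Python sketch of a majority-agreement check run on this workunit. It is not the project's validator: the grouping rule, the absolute tolerance of 1e-10, and the function name `largest_agreeing_group` are all assumptions for illustration. The two hidden 5800 values are also assumed to equal the visible one, following the guess made above.

```python
def largest_agreeing_group(values, tol):
    """Return the indices of the largest set of results that all lie
    within `tol` of one common candidate result (toy majority rule)."""
    best = []
    for candidate in values:
        group = [j for j, v in enumerate(values) if abs(v - candidate) <= tol]
        if len(group) > len(best):
            best = group
    return best


results = [
    ("4800 series", -3.16907880276272100000),
    ("5800 series", -3.16907889603708600000),
    ("3800 series", -3.16907880276272100000),
    ("5800 series", -3.16907889603708600000),  # ASSUMED: value not shown in task output
    ("5800 series", -3.16907889603708600000),  # ASSUMED: value not shown in task output
]
values = [value for _, value in results]

gap = abs(values[0] - values[1])                   # 4800 vs 5800 disagreement
valid = largest_agreeing_group(values, tol=1e-10)  # hypothetical tolerance
invalid = [i for i in range(len(values)) if i not in valid]

print(f"4800 vs 5800 gap: {gap:.3e}")
print("validated:", [results[i][0] for i in valid])
print("invalid:  ", [results[i][0] for i in invalid])
```

With three results on the 5800 side and two on the other, any majority-style rule picks the 5800 group, even though the 4800 and 3800 results agree with each other to the last digit.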

I'm still trying to determine if the 58xx GPU or the 48xx GPU is the one correctly validating against the stock application.

Most of the above are the stock 0.21 app. I seem to recall that initially CP did the 0.21 version to take advantage of special features of 5800 series cards that gave a decent speedup. I also recall that he said it was OK for older cards to use this app - they would get the same answers but just not have the extra speedup of the 5800 series. I remember testing the app on my 4800 cards and finding that there was a minor speedup so I did use 0.21 under AP until it was made the stock app. Now it would seem that 0.21 doesn't give the same answers on 4800/3800 series cards as it does on 5800 series. Maybe CP can throw some light on this.

I've also updated the validator so if you check your tasks they will show what fitness they reported, so you can compare vs other tasks for the same workunit.

Thanks very much for that, it's really useful. Can you comment on why two of the above five didn't actually show a fitness value?

PS: I composed the above before I had seen CP's response so now we have the answer that the 5800 series are giving the wrong answer. Even though the explanation has been given, I'm still going to post what I've been composing because it does highlight the major difference of the 5800 answers using a current 'in the wild' quorum.

Until you can find out why this is happening and then rectify the problem, it would be rather unfair to keep penalising the 3800/4700/4800/CUDA/CPU owners who happen to get teamed up with three 5800 series crunchers. Maybe you should go back to single result validation until this gets sorted. After all, it looks like any 5800 results that get into the database might be rather useless anyway.
____________
Cheers,
Gary.

Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0

Message 38162 - Posted: 6 Apr 2010, 0:40:01 UTC - in response to Message 38153.

Surely ATI wouldn't have played 'funny buggers' with the compiler, so that unless you use a particular flag that triggers DP during compiling, it compiles for SP?

No, it doesn't work this way. All ATI cards use the exact same code (specifically using DP), but it gets JIT compiled for the specific GPU by the driver at runtime. Currently I suspect a bug somewhere in this step. Later today I will try to get a disassembly of the GPU ISA code and look for differences between HD3800, HD4700/4800 and HD5800 (as said, the code before compiling is exactly the same).

There are some slight differences between the different GPU series, but they are not larger than those between CPUs and GPUs, so I would not expect them to be causing this. Also, the CUDA application returns very similar results to the ATI version on HD3800 and HD4800 GPUs, and the architectural differences between those are actually larger than between HD4800 and HD5800 GPUs. So I don't think this is caused by these differences. Some strange bug somewhere is more likely.

Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0

Message 38163 - Posted: 6 Apr 2010, 0:55:45 UTC - in response to Message 38161.

Most of the above are the stock 0.21 app. I seem to recall that initially CP did the 0.21 version to take advantage of special features of 5800 series cards that gave a decent speedup. ...

Actually, version 0.20b and the "stock" v0.21 for ATI are identical. Furthermore, no HD5000 specific code is used at all (that was Collatz, where a specific path for HD5000 series GPUs yielded a decent speedup). The (IL) code for all ATI GPUs is exactly the same (but gets compiled to slightly different GPU specific ISA code by the driver). For some reason I still have to figure out, it only returns different values on HD5800 GPUs.

Gary Roberts
Joined: 1 Mar 09
Posts: 56
Credit: 1,984,937,272
RAC: 0

Message 38168 - Posted: 6 Apr 2010, 3:08:16 UTC - in response to Message 38163.

... no HD5000 specific code is used at all (that was Collatz, where a specific path for HD5000 series GPUs yielded a decent speedup).

Yes, my apologies for the misinformation and my confusion between MW and Collatz. Having thought about it more carefully, I do remember the speedup on my HD4850s from around 17-18 mins to around 15 mins per task - obviously Collatz and not MW from the times.

I wish you every success with your disassembly activities mentioned in your other response. It's really great that you are willing to spend time chasing the source of the problem. I'm sure all participants (let alone the Devs :-) ) must be (as I am) very grateful for your continuing support of this project.

____________
Cheers,
Gary.

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 38172 - Posted: 6 Apr 2010, 4:17:40 UTC - in response to Message 38168.

... no HD5000 specific code is used at all (that was Collatz, where a specific path for HD5000 series GPUs yielded a decent speedup).

Yes, my apologies for the misinformation and my confusion between MW and Collatz. Having thought about it more carefully, I do remember the speedup on my HD4850s from around 17-18 mins to around 15 mins per task - obviously Collatz and not MW from the times.

I wish you every success with your disassembly activities mentioned in your other response. It's really great that you are willing to spend time chasing the source of the problem. I'm sure all participants (let alone the Devs :-) ) must be (as I am) very grateful for your continuing support of this project.



We're very thankful for everyone's help :) Anthony is also looking into the GPU issue right now. As soon as John gives me the okay on the new astronomy code, I'll be making the source available for the new application as well.
____________

TJ
Joined: 12 Aug 09
Posts: 262
Credit: 92,132,140
RAC: 4,402

Message 38174 - Posted: 6 Apr 2010, 6:23:35 UTC

I am running an nVidia GTX 285 and have had a lot of invalids since 5 April. Before that it was crunching well. The RAC is dropping.
What I see is that when mine are invalid, the other results come from ATI cards and are invalid as well.
Perhaps you can use this information, Travis.
____________
Greetings from,
TJ

loeakaodas
Joined: 2 Jan 09
Posts: 34
Credit: 93,631,891
RAC: 0

Message 38175 - Posted: 6 Apr 2010, 7:17:17 UTC

I was browsing the ATI/AMD Developer KB and ran across this, might it have something to do with the apparent problems with 58xx series cards?
____________

Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0

Message 38177 - Posted: 6 Apr 2010, 9:24:05 UTC - in response to Message 38175.

I was browsing the ATI/AMD Developer KB and ran across this, might it have something to do with the apparent problems with 58xx series cards?

Nice catch!
This could really be the culprit. I hope I can build a modified version later today (should be easy, without the need to search for other issues).

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0

Message 38178 - Posted: 6 Apr 2010, 10:14:18 UTC - in response to Message 38177.

I was browsing the ATI/AMD Developer KB and ran across this, might it have something to do with the apparent problems with 58xx series cards?

Nice catch!
This could be really the culprit. I hope I can build a modified version later today (should be easy, without the need to search for other issues).

Ahh...good to see I wasn't too far off the mark and it was to do with SP vs DP, but just at the coding level. Sounds like we have a winner!

Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0

Message 38179 - Posted: 6 Apr 2010, 10:27:14 UTC - in response to Message 38178.

I was browsing the ATI/AMD Developer KB and ran across this, might it have something to do with the apparent problems with 58xx series cards?

Nice catch!
This could be really the culprit. I hope I can build a modified version later today (should be easy, without the need to search for other issues).

Ahh...good to see I wasn't too far off the mark and it was to do with SP vs DP, but just at the coding level. Sounds like we have a winner!

In fact, it is a difference in how the texture units of the HD5000 GPUs work. The new GPUs have additional circuitry to ensure that the values loaded from memory are valid encodings of numbers; for instance, it normalizes floating point numbers. That's why it matters in which format they are declared. With older GPUs no such checks were done and the format declaration was basically a placeholder. As the ATI application was developed with CAL 1.3, it still uses the buffer formats recommended back then, which simply leads to erratic behaviour on newer GPUs.
I'm astonished that Collatz doesn't suffer from this. I guess the reason is that here I use a texture sampler with point sampling (i.e. no filtering applied), which obviously enables the checks described above, whereas over at Collatz I simply load values from a texture (without a sampler). The difference is basically just the indexing: a texture sampler takes float values as coordinates, while a load instruction uses integer values to index into the texture array. Obviously that also bypasses the checks.
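
As a rough illustration of why the declared format can matter, the Python sketch below splits a double into its two 32-bit words and then applies a made-up "sanitising" step to each word as if it were a single-precision float. The thread does not say what the HD5000 texture units actually do; flushing float32 denormals to zero is only an assumed stand-in, and `flush_denormal_float32` is a hypothetical helper. The sketch does not try to reproduce the size of the error seen on real hardware.

```python
import struct


def split_double(x):
    """Return the (low, high) 32-bit words of a 64-bit double."""
    lo, hi = struct.unpack("<II", struct.pack("<d", x))
    return lo, hi


def join_double(lo, hi):
    """Rebuild a 64-bit double from its (low, high) 32-bit words."""
    return struct.unpack("<d", struct.pack("<II", lo, hi))[0]


def flush_denormal_float32(word):
    """HYPOTHETICAL check: interpret `word` as a float32 bit pattern and
    flush denormals (exponent 0, non-zero mantissa) to a signed zero."""
    exponent = (word >> 23) & 0xFF
    mantissa = word & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        return word & 0x80000000
    return word


x = 1.0 + 2.0 ** -30      # low word is 0x00400000, a float32 denormal pattern
lo, hi = split_double(x)

# Words passed through untouched (buffer treated as raw integers).
raw = join_double(lo, hi)

# Words passed through the assumed float32 sanitising step.
sanitised = join_double(flush_denormal_float32(lo), flush_denormal_float32(hi))

print("raw round trip exact:", raw == x)
print("sanitised value:     ", repr(sanitised))
print("lost:                ", x - sanitised)
```

The point is only that half of a double is not a meaningful float on its own, so any hardware step that "repairs" it as one can silently change the double, while a raw integer load leaves the bits alone.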

David Glogau*
Joined: 12 Aug 09
Posts: 172
Credit: 645,240,165
RAC: 0

Message 38231 - Posted: 6 Apr 2010, 22:35:52 UTC - in response to Message 38177.

So should I take my 5970's down for a rest, or has this problem been fixed yet?


I was browsing the ATI/AMD Developer KB and ran across this, might it have something to do with the apparent problems with 58xx series cards?

Nice catch!
This could be really the culprit. I hope I can build a modified version later today (should be easy, without the need to search for other issues).


____________

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 38233 - Posted: 6 Apr 2010, 22:37:35 UTC - in response to Message 38231.

So should I take my 5970's down for a rest, or has this problem been fixed yet?


I was browsing the ATI/AMD Developer KB and ran across this, might it have something to do with the apparent problems with 58xx series cards?

Nice catch!
This could be really the culprit. I hope I can build a modified version later today (should be easy, without the need to search for other issues).




I think the application is going to need to be updated before the problem gets fixed. I'll make a news post as soon as we have new applications for the 58x0 series.
____________

Gary Roberts
Joined: 1 Mar 09
Posts: 56
Credit: 1,984,937,272
RAC: 0

Message 38236 - Posted: 6 Apr 2010, 23:28:33 UTC - in response to Message 38233.

I think the application is going to need to be updated before the problem gets fixed. I'll make a news post as soon as we have new applications for the 58x0 series.

Just to clarify things a bit, if CP comes out with a corrected 'current generation' app before you release your new source code, that would provide an immediate solution to the 'invalids' problem if all 5800 series owners were to immediately adopt the new app.

You have mentioned several times about 'releasing the new code' and 'allowing people to compile their own apps' but I don't think you actually spelled out exactly what precompiled apps you would be releasing as well. I might be wrong but I got the impression at one point that you might be building for CPU and CUDA but perhaps not for ATI? In other words, we would need to rely on the continuing services of CP or someone else to port the new code and build the appropriate ATI apps. Is this how things will work when you release the new code?

____________
Cheers,
Gary.

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 38237 - Posted: 6 Apr 2010, 23:45:17 UTC - in response to Message 38236.

Just to clarify things a bit, if CP comes out with a corrected 'current generation' app before you release your new source code, that would provide an immediate solution to the 'invalids' problem if all 5800 series owners were to immediately adopt the new app.


Yeah, it seems like CP (and Anthony) are working on a new version of the 58x0 application, which will solve this problem. Hopefully it'll be out soon.



You have mentioned several times about 'releasing the new code' and 'allowing people to compile their own apps' but I don't think you actually spelled out exactly what precompiled apps you would be releasing as well. I might be wrong but I got the impression at one point that you might be building for CPU and CUDA but perhaps not for ATI? In other words, we would need to rely on the continuing services of CP or someone else to port the new code and build the appropriate ATI apps. Is this how things will work when you release the new code?


Right now I've compiled some OS X applications, and they're on the server as milkyway3 (technically what people are currently running is milkyway2, as before that it was just astronomy). I don't have the hardware to compile the Windows applications, so I'm going to need Anthony to do that. But for the time being I'm going to test it on OS X, which should affect the fewest of our users and which I have direct control over upgrading if there are any problems.
____________



Copyright © 2018 AstroInformatics Group