Recompiled Linux 32/64 apps
Author | Message |
---|---|
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
The SSE3 code runs very well on AMD LE-1600, like C2D with the same clock. At run time the binary probes the processor and, if it's made by AMD, runs degraded code instead of the SSE3 code. See http://techreport.com/discussions.x/8547 for a snippet. HTH |
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
And here are flags: You can omit "fast-transcendentals" as this is the default when specifying "-fp-model fast" (or even fast=2). I don't use -unroll-aggressive and -opt-multi-version-aggressive, does it help the performance? I would think that it doesn't bring much to the table. |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
And here are flags: Right, leaving them out doesn't make a difference. Any suggestions to squeeze out some more? mic. |
Send message Joined: 9 Jul 08 Posts: 7 Credit: 11,070,991 RAC: 0 |
The SSE3 code runs very well on AMD LE-1600, like C2D with the same clock. 1. This article is very old: 11:58 AM on July 13, 2005. 2. I've got much better performance with SSE2 than without it (a 20% boost), and slightly better performance with SSE3 than with SSE2 (another 1% boost). I'm talking about an AMD chip and the milkyway app, of course. |
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
1. This article is very old: 11:58 AM on July 13, 2005 1 - Yet, it's still true. It's been known in the open source community and Intel's response was that they cannot guarantee their compiler except on their processors, fair enough. Is this new enough for you? 2 - 1% is too close to noise to call a boost. |
Send message Joined: 9 Jul 08 Posts: 7 Credit: 11,070,991 RAC: 0 |
1 - It's been known in the open source community and Intel's response was that they cannot guarantee their compiler except on their processors, fair enough. Is this new enough for you? 1. It seems you are right. 2. But I've made more tests. Averaged boost in calculation times over 126 runs of the milkyway app on an idle machine: SSE3 app: 121.86%; SSE2 app: 119.03%; base app: 100.00%. I don't think this is 'noise' only... I'm confused now... Maybe this is milkyway-specific... |
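To put the quoted averages side by side, the sketch below computes the gain of each build over the others. It assumes the three percentages are performance relative to the base app (base = 100%); the dictionary keys and the `relative_gain` helper are illustrative names, not part of the milkyway app.

```python
# Averages quoted in the post above, assumed to be performance
# relative to the base app (base = 100%).
results = {"base": 100.00, "sse2": 119.03, "sse3": 121.86}


def relative_gain(new: float, old: float) -> float:
    """Return the gain of `new` over `old` in percent."""
    return (new / old - 1.0) * 100.0


sse2_vs_base = relative_gain(results["sse2"], results["base"])
sse3_vs_base = relative_gain(results["sse3"], results["base"])
sse3_vs_sse2 = relative_gain(results["sse3"], results["sse2"])

print(f"SSE2 over base: {sse2_vs_base:.2f}%")
print(f"SSE3 over base: {sse3_vs_base:.2f}%")
print(f"SSE3 over SSE2: {sse3_vs_sse2:.2f}%")  # about 2.38%
```

Under that assumption the SSE3 build is roughly 2.4% ahead of the SSE2 build. Whether that exceeds run-to-run variation depends on the spread of the 126 runs, which the post does not give.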
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
Hard to explain why. Maybe even though the processor doesn't get to run SSE3 code, the code it does run is different, though still SSE2, and the outcome is better, perhaps because of something as mundane as some branches getting aligned favorably. Regardless, I agree that it's more than noise. Thanks. |
Send message Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
SSE3 might be doing some other optimizations (or have some changes in optimizations) which are better than what was in SSE2, because it's newer. |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
Travis, please take a look at this host, everything coming in after 20:45 UTC is done with the new recompiled v14. mic. |
Send message Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
Travis, please take a look at this host, everything coming in after 20:45 UTC is done with the new recompiled v14. I'll let you know as soon as I get some more results from it. But it should be OK if it was returning the same results for the test workunits. |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
Travis, please take a look at this host, everything coming in after 20:45 UTC is done with the new recompiled v14. The results of the test units are exactly the same as with my v12. I'll post the v14 as soon as you give the OK. :) mic. |
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
SSE3 might be doing some other optimizations (or have some changes in optimizations) which are better than what was in SSE2, because it's newer. Yes, but the Intel compiler checks if the code is running on an Intel CPU and, if it's not, it runs an alternative SSE2 code instead. It'll run SSE3 or later only on Intel processors. As these results are on an AMD CPU, it's not benefiting from the SSE3 optimizations. HTH |
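The dispatch behaviour described in this post can be sketched as a small decision function. This is only an illustration of the logic as the poster describes it, not Intel's actual dispatcher: the function names and return labels are hypothetical, and the only real-world values used are the CPUID vendor strings `GenuineIntel` and `AuthenticAMD`.

```python
# Illustration of vendor-gated dispatch as described in the post above.
# Function names and path labels are hypothetical, not Intel's code.

def select_by_vendor(vendor: str, has_sse3: bool, has_sse2: bool) -> str:
    """SSE3 path only when the CPUID vendor string is Intel's."""
    if vendor == "GenuineIntel" and has_sse3:
        return "sse3"
    if has_sse2:
        return "sse2"
    return "generic"


def select_by_feature(has_sse3: bool, has_sse2: bool) -> str:
    """For contrast: dispatch on the feature flags alone."""
    if has_sse3:
        return "sse3"
    if has_sse2:
        return "sse2"
    return "generic"


# An SSE3-capable AMD CPU under each policy.
amd_vendor_gated = select_by_vendor("AuthenticAMD", True, True)
amd_feature_gated = select_by_feature(True, True)
intel_vendor_gated = select_by_vendor("GenuineIntel", True, True)

print("AMD, vendor-gated: ", amd_vendor_gated)
print("AMD, feature-gated:", amd_feature_gated)
print("Intel, vendor-gated:", intel_vendor_gated)
```

With vendor gating, the SSE3-capable AMD CPU ends up on the SSE2 path, which is the situation the post describes. Feature-based dispatch would pick the SSE3 path on the same hardware.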
Send message Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
Travis, please take a look at this host, everything coming in after 20:45 UTC is done with the new recompiled v14. Looks to me like it's generating good results, so I'd go ahead and release it. *edit* scratch that. Looking at some results, the stock app and other newly compiled apps are still having the same issue (though not as frequently). No point in updating it until this whole thing is fixed. |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
|
Send message Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
Then again, these are running a bit faster and not erroring as frequently as before, so it's no big deal that they're released :D |
Send message Joined: 9 Nov 08 Posts: 44 Credit: 128,043,914 RAC: 0 |
I'll take that as a yes :-) I just downloaded the SSE version. Unfortunately only my 1.3 GHz Celerons can take advantage of these apps (much to their pleasure I might add). The .12 version saw an increase from 2:02 to 1:37 CPU time as compared to the stock version. My AMD 3800+ x2 and AMD 5600+ x2 can't handle them (as expected). I only tried the 64 bit SSE3 app tho. Maybe the 32 bit SSE2 will work? It's much too late tonight to embark on what may be a major task. I'll try it tomorrow unless someone knows it's futile. |
Send message Joined: 9 Nov 08 Posts: 44 Credit: 128,043,914 RAC: 0 |
The .12 version saw an increase from 2:02 to 1:37 CPU time as compared to the stock version. I meant decrease ...sigh. Stupid computer sends what I type rather than what I meant! |
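For reference, the drop from 2:02 to 1:37 quoted above works out as follows. The post does not say whether the figures are minutes:seconds or hours:minutes; the ratio is the same either way, so the hypothetical `to_units` helper simply converts to the smaller unit.

```python
def to_units(stamp: str) -> int:
    """Convert 'A:BB' to a count of the smaller unit (60 per larger unit)."""
    major, minor = stamp.split(":")
    return int(major) * 60 + int(minor)


stock = to_units("2:02")      # 122
optimized = to_units("1:37")  # 97

reduction = (stock - optimized) / stock * 100.0
speedup = stock / optimized

print(f"CPU time reduced by {reduction:.1f}%, speedup {speedup:.2f}x")
# prints: CPU time reduced by 20.5%, speedup 1.26x
```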
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
Travis, please take a look at this host, everything coming in now is done with the new recompiled v15. mic. |
Send message Joined: 17 Jan 09 Posts: 98 Credit: 72,182,367 RAC: 0 |
You people obviously put a lot of hard work in and I thank you for that, but is there a newbie's guide to installing these? I have just started playing with Linux and have 2 quad cores that I'd love to try these out on. I have tried searching, but to no avail. Thanks again, Neal |
Send message Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
Travis, please take a look at this host, everything coming in now is done with the new recompiled v15. It looked good except for maybe a couple of workunits which were bad... However, the stock app is STILL doing the same thing ;( I have no clue what's up. |