Message boards : Application Code Discussion : Recompiled Linux 32/64 apps
| Author | Message |
|---|---|
|
I thought it would be better to have them in one place... | |
| ID: 8664 | Rating: 0 | rate:
| |
|
What kind of speedup did you get on these compared to the stock app? And did you use any specific compiler flags? | |
| ID: 8671 | Rating: 0 | rate:
| |
All compiled with Intel icpc from the original code. Can you make them AMD compatible? I just tried it out and it didn't run of course. :-\ ____________ Member of BOINC@Heidelberg and ATA! My BOINCstats | |
| ID: 8672 | Rating: 0 | rate:
| |
What kind of speedup did you get on these compared to the stock app? And did you use any specific compiler flags? This host is running the SSE41_64 version. But I don't have numbers for stock... Temujin posted numbers for the 32bit versions. And here are the flags:
CXX_i686 = icpc
CXXFLAGS_i686 = -xSSE3 -O3 -ipo -no-prec-div -static -fp-model fast=2 -fp-speculation=fast -opt-calloc -unroll-aggressive -opt-multi-version-aggressive -fast-transcendentals
CXX_x86_64 = icpc
CXXFLAGS_x86_64 = -xSSE4.1 -O3 -ipo -no-prec-div -static -fp-model fast=2 -fp-speculation=fast -opt-calloc -unroll-aggressive -opt-multi-version-aggressive -fast-transcendentals
____________ mic. | |
| ID: 8673 | Rating: 0 | rate:
| |
|
Due to the v11 release I removed the apps from the server. | |
| ID: 8676 | Rating: 0 | rate:
| |
|
Stickying this because i think it's something our linux users would like. Also, the linux optimized apps seem to have been returning good results :) | |
| ID: 8677 | Rating: 0 | rate:
| |
Stickying this because i think it's something our linux users would like. Also, the linux optimized apps seem to have been returning good results :) Good to hear, I hope the V11/12 still does. ;) The SSE41_64 is already running on my Q9550; I'll post it (and the other SSE levels) if it still gets credits tomorrow. ____________ mic. | |
| ID: 8687 | Rating: 0 | rate:
| |
Stickying this because i think it's something our linux users would like. Also, the linux optimized apps seem to have been returning good results :) Things look fine on my end, it will be getting credits tomorrow. ____________ | |
| ID: 8720 | Rating: 0 | rate:
| |
|
I suppose the question then has to be asked. | |
| ID: 8722 | Rating: 0 | rate:
| |
I suppose the question then has to be asked. No code change from my side, just compiler and flags (posted below). ____________ mic. | |
| ID: 8726 | Rating: 0 | rate:
| |
|
New recompiled apps for Linux64 on Intel: | |
| ID: 8730 | Rating: 0 | rate:
| |
New recompiled apps for Linux64 on Intel: Just completed and validated my first WU with the SSE4.1 App. Result here. Running Linux-Ubuntu 8.10 64-bit, Q9450@3.4GHz, WU took 683 secs. Looking good Mic | |
| ID: 8731 | Rating: 0 | rate:
| |
|
Hi speedimic, | |
| ID: 8733 | Rating: 0 | rate:
| |
|
New recompiled apps for Linux32 on Intel: | |
| ID: 8744 | Rating: 0 | rate:
| |
|
As I told here, I have some problem with some wus only with SSE4.1 X86_64 app ! | |
| ID: 8753 | Rating: 0 | rate:
| |
It seems they don't run well on AMD, so it might be better for AMD users to try a testfile in standalone mode --> look here Well, of course not. The Intel compiler runs degraded code when run on AMD processors. The highest SSE level that it runs on AMD processors is SSE2. HTH ____________ | |
| ID: 8756 | Rating: 0 | rate:
| |
I suppose the question then has to be asked. Linux and OSX seem to be returning correct results (ie. those in line with the stock app). For windows, it seems there are 2-3 different strains of bad applications out there. ____________ | |
| ID: 8762 | Rating: 0 | rate:
| |
|
How do you install these clients in Ubuntu64? | |
| ID: 8831 | Rating: 0 | rate:
| |
The Intel compiler runs degraded code when run on AMD processors. The highest SSE level that it runs on AMD processors is SSE2. The SSE3 code runs very well on AMD LE-1600, like C2D with the same clock. I use icpc -xO for this. | |
| ID: 8838 | Rating: 0 | rate:
| |
How do you install these clients in Ubuntu64? First stop the service ( /etc/init.d/boinc-client stop ), then put the files from the zip (app_info.xml, milkyway_0.12_SSEwhatever...) to /var/lib/boinc-client/projects/milky..., then restart the client. ____________ mic. | |
| ID: 8842 | Rating: 0 | rate:
| |
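The install steps described above (stop the client, put the files from the zip into the project folder, restart the client) can be sketched in Python. This is only a minimal sketch, not an official installer: the helper name `install_optimized_app` and both of its arguments are assumptions made for illustration, and stopping and restarting the BOINC client is left to the user exactly as in the post.

```python
import zipfile
from pathlib import Path


def install_optimized_app(zip_path, project_dir):
    """Extract every file from the app zip into the BOINC project folder.

    Hypothetical helper: stop the BOINC client before calling it and
    restart the client afterwards, as described in the post above.
    """
    project = Path(project_dir)
    if not project.is_dir():
        raise FileNotFoundError(f"project folder not found: {project}")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(project)
        # Return the extracted names so the caller can confirm that
        # app_info.xml and the application binary both arrived.
        return sorted(archive.namelist())
```

Note that `ZipFile.extractall` does not restore Unix permission bits, so the application binary may need to be made executable by hand after extraction.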
The SSE3 code runs very well on AMD LE-1600, like C2D with the same clock. At run-time the processor is probed and if it's by AMD, then degraded code is run instead of the SSE3 code. See http://techreport.com/discussions.x/8547 for a snippet. HTH ____________ | |
| ID: 8850 | Rating: 0 | rate:
| |
And here are flags: You can omit "fast-transcendentals" as this is the default when specifying "-fp-model fast" (or even fast=2). I don't use -unroll-aggressive and -opt-multi-version-aggressive, does it help the performance? I would think that it doesn't bring much to the table. | |
| ID: 8851 | Rating: 0 | rate:
| |
And here are flags: Right, leaving them out doesn't make a difference. Any suggestions to squeeze out some more? ____________ mic. | |
| ID: 8854 | Rating: 0 | rate:
| |
The SSE3 code runs very well on AMD LE-1600, like C2D with the same clock. 1. This article is very old: 11:58 AM on July 13, 2005 2. I've got a lot better performance with SSE2 (20% boost) than without it, and slightly better performance with SSE3 than SSE2 (another 1% boost) and I'm talking about AMD chip and milkyway app of course. | |
| ID: 8898 | Rating: 0 | rate:
| |
1. This article is very old: 11:58 AM on July 13, 2005 1 - Yet, it's still true. It's been known in the open source community and Intel's response was that they cannot guarantee their compiler except on their processors, fair enough. Is this new enough for you? 2 - 1% is too close to noise to call a boost. ____________ | |
| ID: 8903 | Rating: 0 | rate:
| |
1 - It's been known in the open source community and Intel's response was that they cannot guarantee their compiler except on their processors, fair enough. Is this new enough for you? 1. It seems you are right. 2. But I've made more tests: averaged boost in calculation times for 126 runs of milkyway app on idle machine SSE3 app: 121.86% SSE2 app: 119.03% base app: 100.00% I don't think this is 'noise' only... I'm confused now... Maybe this is milkyway specific... | |
| ID: 8922 | Rating: 0 | rate:
| |
Hard to explain why. Maybe even though the processor doesn't get to run SSE3 code, the code is different, though SSE2, and the outcome is better, perhaps because of something as mundane as some branches getting aligned favorably. Regardless, I agree that it's more than noise. Thanks. ____________ | |
| ID: 8923 | Rating: 0 | rate:
| |
SSE3 might be doing some other optimizations (or have some changes in optimizations) which are better than what was in SSE2, because it's newer. ____________ | |
| ID: 8933 | Rating: 0 | rate:
| |
|
Travis, please take a look at this host, everything coming in after 20:45 UTC is done with the new recompiled v14. | |
| ID: 8952 | Rating: 0 | rate:
| |
Travis, please take a look at this host, everything coming in after 20:45 UTC is done with the new recompiled v14. I'll let you know as soon as I get some more results from it. But it should be OK if it was returning the same results for the test workunits. ____________ | |
| ID: 8956 | Rating: 0 | rate:
| |
Travis, please take a look at this host, everything coming in after 20:45 UTC is done with the new recompiled v14. The results of the test units are exactly the same as with my v12. I'll post the v14 as soon as you give the ok. :) ____________ mic. | |
| ID: 8960 | Rating: 0 | rate:
| |
SSE3 might be doing some other optimizations (or have some changes in optimizations) which are better than what was in SSE2, because it's newer. Yes, but the Intel compiler checks if the code is running on an Intel CPU and, if it's not, it runs an alternative SSE2 code instead. It'll run SSE3 or later only on Intel processors. As these results are on an AMD CPU, it's not benefiting from the SSE3 optimizations. HTH ____________ | |
| ID: 8976 | Rating: 0 | rate:
| |
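The run-time behaviour claimed in the post above (the requested SSE level runs on Intel CPUs, while other vendors fall back to at most SSE2) can be modelled in a few lines of Python. This is a sketch of the poster's description only, not Intel's actual dispatcher; the function name `effective_sse_level` and the level list are assumptions introduced for illustration.

```python
# Ordered from lowest to highest, covering the levels named in this thread.
SSE_LEVELS = ["sse", "sse2", "sse3", "ssse3", "sse4.1"]


def effective_sse_level(vendor_id, compiled_level):
    """Return the code path that would run under the behaviour described above.

    vendor_id is the CPU vendor string, e.g. "GenuineIntel" or "AuthenticAMD".
    """
    if compiled_level not in SSE_LEVELS:
        raise ValueError(f"unknown SSE level: {compiled_level}")
    if vendor_id == "GenuineIntel":
        return compiled_level
    # Non-Intel CPU: the claim is that nothing above SSE2 is selected.
    cap = SSE_LEVELS.index("sse2")
    return SSE_LEVELS[min(SSE_LEVELS.index(compiled_level), cap)]
```

Under this model an SSE3 build on an AMD CPU would execute an SSE2 path, which is the scenario being debated in the surrounding posts.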
Travis, please take a look at this host, everything coming in after 20:45 UTC is done with the new recompiled v14. Looks to me like it's generating good results, so I'd go ahead and release it. *edit* scratch that. Looking at some results, the stock app and other new compiled apps are still having the same issue (however not as frequently). No point in updating it until this whole thing is fixed. ____________ | |
| ID: 9017 | Rating: 0 | rate:
| |
Looks to me like it's generating good results, so I'd go ahead and release it. Ok, new recompiled v14 apps for Linux on Intel CPUs: Linux32 SSE3_32 SSE2_32 SSE_32 Linux64 SSE3_64 SSSE3_64 SSE41_64 Please report errors (or success) here. ____________ mic. | |
| ID: 9023 | Rating: 0 | rate:
| |
|
Although these are running a bit faster and not erroring as frequently as before so it's no big deal that they're released :D | |
| ID: 9025 | Rating: 0 | rate:
| |
|
I'll take that as a yes :-) I just downloaded the SSE version. Unfortunately only my 1.3 GHz Celerons can take advantage of these apps (much to their pleasure I might add). The .12 version saw an increase from 2:02 to 1:37 CPU time as compared to the stock version. | |
| ID: 9028 | Rating: 0 | rate:
| |
The .12 version saw an increase from 2:02 to 1:37 CPU time as compared to the stock version. I meant decrease ...sigh. Stupid computer sends what I type rather than what I meant! | |
| ID: 9050 | Rating: 0 | rate:
| |
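For anyone comparing run times like the 2:02 versus 1:37 figures above, the speedup can be worked out as follows. This is a small sketch; the helper names are assumptions, and the times are treated as two-field clock values (60 minor units per major unit), so the ratio is the same whether the post meant hours:minutes or minutes:seconds.

```python
def to_minor_units(clock):
    """Convert a two-field clock string such as "2:02" into minor units."""
    major, minor = clock.split(":")
    return int(major) * 60 + int(minor)


def speedup(stock, optimized):
    """How many times faster the optimized app is than the stock app."""
    return to_minor_units(stock) / to_minor_units(optimized)


def time_saved_percent(stock, optimized):
    """Run-time reduction relative to the stock app, in percent."""
    base = to_minor_units(stock)
    return 100.0 * (base - to_minor_units(optimized)) / base
```

With the figures from the post, `speedup("2:02", "1:37")` is 122/97, roughly 1.26, and `time_saved_percent("2:02", "1:37")` is a little over 20 percent.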
|
Travis, please take a look at this host, everything coming in now is done with the new recompiled v15. | |
| ID: 9088 | Rating: 0 | rate:
| |
|
You people obviously put a lot of hard work in and I thank you for that, but is there a newbies guide to installing these? I have just started playing with linux and have 2 quad cores that I'd love to try these out on. | |
| ID: 9095 | Rating: 0 | rate:
| |
Travis, please take a look at this host, everything coming in now is done with the new recompiled v15. It looked good until maybe the couple workunits which were bad... However the stock app is STILL doing the same thing ;( I have no clue what's up. ____________ | |
| ID: 9107 | Rating: 0 | rate:
| |
It looked good until maybe the couple workunits which were bad... However the stock app is STILL doing the same thing ;( I have no clue what's up. Ok, as they are not worse than stock, new recompiled v15 apps for Linux on Intel CPUs: Linux32 SSE3_32 SSE2_32 SSE_32 Linux64 SSE3_64 SSSE3_64 SSE41_64 For the AMD users I got two new apps to try: AMD SSE3_64 AMD SSE2_32 I only had a chance to test the AMD SSE2_32 on my Athlon64 3200+, so the rest of the testing is up to you... ____________ mic. | |
| ID: 9113 | Rating: 0 | rate:
| |
You people obviously put a lot of hard work in and I thank you for that, but is there a newbies guide to installing these? I have just started playing with linux and have 2 quad cores that I'd love to try these out on. Basically you shut down BOINC, put the files from the zip into your project folder and restart BOINC. Did you get BOINC from the website, or did you install it with the package manager? ____________ mic. | |
| ID: 9121 | Rating: 0 | rate:
| |
|
Travis, please take a look at this host, everything coming in now is done with the new recompiled v16. | |
| ID: 9144 | Rating: 0 | rate:
| |
Travis, please take a look at this host, everything coming in now is done with the new recompiled v16. Looks like they're doing well. Haven't had a bad result come back with any compiled v0.16 app yet :) ____________ | |
| ID: 9173 | Rating: 0 | rate:
| |
|
I'll post the new apps when I get back from work, around 17.00 UTC. | |
| ID: 9199 | Rating: 0 | rate:
| |
For the AMD users I got two new apps to try: Just wanted to try this one out, but it gave me a "Not found"-message... ____________ Member of BOINC@Heidelberg and ATA! My BOINCstats | |
| ID: 9200 | Rating: 0 | rate:
| |
For the AMD users I got two new apps to try: new version will be up soon - stay tuned ;) ____________ mic. | |
| ID: 9203 | Rating: 0 | rate:
| |
Looks like they're doing well. Haven't had a bad result come back with any compiled v0.16 app yet :) Now then, the new recompiled v16 apps for Linux: Linux32 on Intel SSE3_32 SSE2_32 SSE_32 Linux64 on Intel SSE3_64 SSSE3_64 SSE41_64 For AMD users: AMD SSE3_64 AMD SSE2_32 I only had the chance to test the AMD SSE2_32 on my Athlon64 3200+, so the rest of the testing is up to you... Please report! ____________ mic. | |
| ID: 9208 | Rating: 0 | rate:
| |
AMD SSE3_64 Hi speedimic. Just finished the first two results with the 64-bit version. Here's one example. Unfortunately it takes about 24 minutes, which is about 2 1/2 times longer than the Windows one. Gives a good result anyway, but it's not worth the 64-bit and so gives less credits. ____________ Member of BOINC@Heidelberg and ATA! My BOINCstats | |
| ID: 9212 | Rating: 0 | rate:
| |
|
Hi | |
| ID: 9213 | Rating: 0 | rate:
| |
|
Since x86-64 guarantees that at least SSE2 is available, did you make sure to enable vectorization through GCC's -ftree-vectorize option (implied by -O3 in versions 4.3 and later)? For that matter, any SSE build could benefit from vectorization. | |
| ID: 9219 | Rating: 0 | rate:
| |
Since x86-64 guarantees that at least SSE2 is available, did you make sure to enable vectorization through GCC's -ftree-vectorize option (implied by -O3 in versions 4.3 and later)? For that matter, any SSE build could benefit from vectorization. -O3 is on, but I use the Intel compiler. ;) ____________ mic. | |
| ID: 9222 | Rating: 0 | rate:
| |
... You can't compare apples with pears. :) How long is the stock app running? Gives a good result anyway, but it's not worth the 64-bit and so gives less credits. If it gives less credit, you might be running into the 108 credit barrier. What do you mean by 'it's not worth the 64bit'? ____________ mic. | |
| ID: 9224 | Rating: 0 | rate:
| |
|
Here's apples to apples. SSE_32 and AMD SSE3_64 are both winners. First of all they run! Here's what I've seen so far (remember sample size is 1 for the new app). | |
| ID: 9230 | Rating: 0 | rate:
| |
|
Many thanks from me too! | |
| ID: 9240 | Rating: 0 | rate:
| |
How long is the stock app running? Erm, sorry, I didn't try the 64-bit 0.16 stock app yet, I automatically thought your one would be faster... *shame-on-me* *rolleyes* If it gives less credit, you might be running into the 108 credit barrier. Erm, no, with that run time it gave me about 66 credits/hour (on an AMD X2 64 5200 with Suse 11 64-Bit). No limit reached, but it should normally... What do you mean by 'it's not worth the 64bit'? No harm intended; I meant that a 64-Bit app should normally be faster than a 32-Bit app. And that is not the case. ;-) I don't know how Crunch3r managed that, but as he made his optimized app way earlier he also made a Linux one. Compared to his Windows app (5-6 minutes on my X2) it took then 4-5 minutes... ____________ Member of BOINC@Heidelberg and ATA! My BOINCstats | |
| ID: 9260 | Rating: 0 | rate:
| |
|
Here are some results from this host:
| |
| ID: 9265 | Rating: 0 | rate:
| |
|
Wow, that's fast!!! | |
| ID: 9267 | Rating: 0 | rate:
| |
I don't know how Crunch3r managed that but as he made his optimized app way earlier he made also a Linux one. Compared to his Windows app (5-6 minutes on my X2) it took then 4-5 minutes... I think Crunch3r has never given his optimized app to anyone. And you can't call the old 1.24 Linux version an optimized version, it was just using a better compiler enabling auto vectorization (SSE2) on 64Bit systems afaik. And you should also remember the current WUs are 4 to 4.2 times the length of the old 260credit WUs for the 1.22 version. | |
| ID: 9268 | Rating: 0 | rate:
| |
What do you mean by 'it's not worth the 64bit'? Normally the 64bit (stock) apps are compiled with SSE2 enabled, because a 64bit-capable cpu is also capable of at least SSE2. That's not the case for all 32bit cpus, so the 32bit apps are usually compiled without SSE2 and thus slower. ____________ mic. | |
| ID: 9270 | Rating: 0 | rate:
| |
Normally the 64bit (stock) apps are compiled with SSE2 enabled, because a 64bit-capable cpu is also capable of at least SSE2. And, if the compiler is capable of auto-vectorization, it should always be enabled for x86-64. For GCC, the option is -ftree-vectorize, which is implied by -O3 on versions 4.3 and later. Unfortunately, MS VS does not support auto-vectorization. For Windows the Intel compiler could be used instead. HTH ____________ | |
| ID: 9271 | Rating: 0 | rate:
| |
I found out that PNI = SSE3 According to this intel document, my Intel Xeon L5420 supports SSE4.1 cat /proc/cpuinfo flags shows - fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr I know pni = SSE3 but how is SSE4.1 indicated? | |
| ID: 9297 | Rating: 0 | rate:
| |
|
Looks like it is indicated by sse4_1 | |
| ID: 9300 | Rating: 0 | rate:
| |
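The flag names discussed above (`pni` for SSE3, `ssse3`, `sse4_1`) can be checked programmatically. The following is a minimal sketch, assuming the usual Linux `/proc/cpuinfo` layout with a `flags : ...` line; the function name `highest_sse_level` and the lookup table are introduced here for illustration only.

```python
# Kernel flag name -> SSE level, checked from highest to lowest.
FLAG_TO_SSE = [
    ("sse4_1", "SSE4.1"),
    ("ssse3", "SSSE3"),
    ("pni", "SSE3"),  # "Prescott New Instructions" is the kernel's name for SSE3
    ("sse2", "SSE2"),
    ("sse", "SSE"),
]


def highest_sse_level(cpuinfo_text):
    """Return the highest SSE level listed in the first 'flags' line, or None."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            for flag, level in FLAG_TO_SSE:
                if flag in flags:
                    return level
            return None
    return None
```

On a Linux box it could be called as `highest_sse_level(open("/proc/cpuinfo").read())`. An older kernel may not report newer flags even when the CPU supports them, which is the situation the posts here describe.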
cat /proc/cpuinfo flags shows - Update your kernel. HTH ____________ | |
| ID: 9302 | Rating: 0 | rate:
| |
|
Hi mic, Looks like they're doing well. Haven't had a bad result come back with any compiled v0.16 app yet :) ____________ | |
| ID: 9704 | Rating: 0 | rate:
| |
Hi mic, Sure. Always good to have everything in one place. ____________ mic. | |
| ID: 9852 | Rating: 0 | rate:
| |
Hi mic, cool ;) ____________ | |
| ID: 9856 | Rating: 0 | rate:
| |
|
If someone feels like crunching full speed on linux, here are the new v18d apps. | |
| ID: 11020 | Rating: 0 | rate:
| |
|
Thanks speedimic! Just dumped it onto my linux32 laptop. | |
| ID: 11022 | Rating: 0 | rate:
| |
|
Yeah, 32bit SSSE3 Linux dropped from 22 to 9 minutes. | |
| ID: 11029 | Rating: 0 | rate:
| |
|
Thanks mic, I now have it running on my Fedora9_64 boxes | |
| ID: 11040 | Rating: 0 | rate:
| |
|
My laptop is now running around 22-25 minutes with the optimised Linux app. | |
| ID: 11043 | Rating: 0 | rate:
| |
If someone feels like crunching full speed on linux, here are the new v18d apps. The above has been updated to zslip, thanks speedimic ;) ____________ | |
| ID: 11046 | Rating: 0 | rate:
| |
|
If there's any other Linux32/64 - SSE-level combination needed, just tell me! | |
| ID: 11052 | Rating: 0 | rate:
| |
|
Just got myself the shiny new version of the Intel compiler... and made new apps. | |
| ID: 11508 | Rating: 0 | rate:
| |
What has changed is that everything up to and including sse3 is AMD compatible, and I made all sse-levels the compiler offered me. If that's an ICC 11 version, then SSE is just a synonym for x87 (no enhanced instruction set). It won't put any SSE instructions in the code. The difference from the stock app is just the change gcc -> ICC. But maybe that gains some speed for older machines, too. | |
| ID: 11514 | Rating: 0 | rate:
| |
If that's an ICC 11 version, then SSE is just a synonym for x87 (no enhanced instruction set). It won't put any SSE instructions in the code. Someone called for that, so I made it... I didn't get any feedback on the crunch time and I didn't try it. ____________ mic. | |
| ID: 11518 | Rating: 0 | rate:
| |
Just got myself the shiny new version of the Intel compiler... and made new apps. SSE3 AMD is a really useful improvement, thank you! | |
| ID: 11535 | Rating: 0 | rate:
| |
|
What about Intel Linux 64 Bit SSE2 ? I have a couple of Q6600's on SSE2 PC's | |
| ID: 11973 | Rating: 0 | rate:
| |
|
Q6600 has SSE3 and SSSE3, doesn't it? SSE3 is denoted by the code pni. | |
| ID: 11989 | Rating: 0 | rate:
| |
|
OK on one Linux PC it just said SSE2 but on the windows using CPU-Z it says SSE3 so I guess I am fine and do not need a SSE2 version. I am not up on all this stuff :) | |
| ID: 11995 | Rating: 0 | rate:
| |
|
No worries, judging by your stats for many projects you're struggling along quite well. | |
| ID: 12017 | Rating: 0 | rate:
| |
|
Turns out that Ubuntu (cat /proc/cpuinfo) shows it as ssse3, and the optimized ssse3 works fine, well so far anyway. | |
| ID: 12021 | Rating: 0 | rate:
| |
|
Yes that's fine, SSSE3 is different to SSE3 though, it has some extra instructions and an extra "S". :) | |
| ID: 12034 | Rating: 0 | rate:
| |
Not sure of the difference in speed between SSE3 and SSSE3 versions... It should be zilch, since it's unlikely that the compiler will find opportunities in MW code to fit the multimedia-like SSSE3 instructions. HTH ____________ | |
| ID: 12403 | Rating: 0 | rate:
| |
Not sure of the difference in speed between SSE3 and SSSE3 versions... right, not much difference - but a higher SSE-level surely looks faster. :D ____________ mic. | |
| ID: 12411 | Rating: 0 | rate:
| |
|
Is there any plan for a linux app for ATI/GPU in the pipeline quad or i7 | |
| ID: 13746 | Rating: 0 | rate:
| |
|
Yes, release ATI app for Linux, please. CUDA for all systems is under construction now, but it makes sense to make app for stronger cards first and ATI is the one. I hate running my machine under Win just to be able to run Milky on GPU. :-( And I do not think there are more guys with NVidia GTS/GTX 200 cards than Linux ones with ATI 38xx/48xx. | |
| ID: 16864 | Rating: 0 | rate:
| |
|
Still a little disappointed with the linux apps. | |
| ID: 17624 | Rating: 0 | rate:
| |
|
After reading that Intel compiler flag "-fp-model fast=2" was being used on these applications, I figured I would run a test, and results are as expected.
[admin@ntellx4 test_files]$ ./milkyway_0.18_SSSE3_x86_64-pc-linux-gnu
[admin@ntellx4 test_files]$ more out
searchname
parameters [8]: 0.733171635575244 14.657212876628332 -1.705465347395041 16.911711745343634 28.077212666463502 -1.203290851581461 3.527360643924728 2.224821450587501
metadata: this is the metadata
fitness: -3.027909854710229
speedimic_SSSE3_64: 0.18
Correct output: 86:
searchname
parameters [8]: 0.733171635575244 14.657212876628332 -1.705465347395041 16.911711745343634 28.077212666463502 -1.203290851581461 3.527360643924728 2.224821450587501
metadata: this is the metadata
fitness: -3.027909854710189
your_app_name: 0.18
Travis, i'm starting to think a quorum should be used as in other BOINC projects, since the code is open source. ____________ | |
| ID: 17633 | Rating: 0 | rate:
| |
The Linux 64bit SSSE3 0.18d mkII app downloaded from zslip.com gives invalid results. No. Travis has given a range for an allowed deviation from the result you quoted. And speedimics results are within those requirements. You also get small deviations between the results using the x87 FPU, SSE2, PowerPC FPU or AltiVec. As long they are small enough it is okay. | |
| ID: 17634 | Rating: 0 | rate:
| |
|
If that is the case, then why does Travis have a thread about testing custom apps, saying "The results should look like the following:" instead of "should be within a deviation of the following:"? | |
| ID: 17637 | Rating: 0 | rate:
| |
|
He did post awhile ago that the results must be to the x place, I believe it was the 10th. Those results are equal to the 12th. | |
| ID: 17639 | Rating: 0 | rate:
| |
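The claim that the two fitness values agree to the 12th decimal place can be checked with a short script. This is a sketch only: the project's actual validator is not shown in this thread, and the helper name `matching_decimal_places` is an assumption. It compares the printed digits as strings so that no binary rounding gets in the way.

```python
def matching_decimal_places(a, b):
    """Count the leading decimal places on which two decimal strings agree."""
    int_a, _, frac_a = a.partition(".")
    int_b, _, frac_b = b.partition(".")
    if int_a != int_b:
        return 0
    count = 0
    for digit_a, digit_b in zip(frac_a, frac_b):
        if digit_a != digit_b:
            break
        count += 1
    return count


places = matching_decimal_places("-3.027909854710229", "-3.027909854710189")
```

For the two fitness values quoted in this thread, `places` is 12. A digit-by-digit comparison is easy to read but can undercount when two values straddle a rounding boundary (for example 0.1999 and 0.2000), so a check on the absolute difference would be the more robust choice for a real validator.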
Still a little disappointed with the linux apps... It might be because Linux manages power differently from Windows, running BOINC applications at a slow CPU frequency in order to save energy. See more details here. HTH ____________ | |
| ID: 17665 | Rating: 0 | rate:
| |
It might be because Linux manages power differently from Windows, running BOINC applications at a slow CPU frequency in order to save energy. See more details here. Afraid not, speedstep, thermal throttling are all disabled in the BIOS. Solid overclock applied and powernow/acpi daemons are not running once booted. The linux client is simply inefficient compared to win32. ____________ | |
| ID: 17668 | Rating: 0 | rate:
| |
The linux client is simply inefficient compared to win32. Different compilers perhaps? ____________ | |
| ID: 17677 | Rating: 0 | rate:
| |
The linux client is simply inefficient compared to win32. I thought the fastest version of Speedimic is quite close. Just looked it up, speedimics own Q9550 (45nm, 2.83GHz) is taking about 1030 seconds for a 27.77 credit WU under Linux. He has another host (Q6600, 65nm, appears to be overclocked to 2.7-2.8GHz from the benchmark values) completing the same tasks in about 1150 seconds under Windows. It is a somehow bad comparison because of the different clockspeed and that the 45nm CPUs should be faster per clock than their 65nm counterparts, but nonetheless it is clear that his Linux version can't be that bad. | |
| ID: 17678 | Rating: 0 | rate:
| |
The linux client is simply inefficient compared to win32. Here are my tasks .... Laptop win32 : http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=39511370 Linux : http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=39523720 Thanks for the pointers I'll recheck if I'm using the wrong client ____________ | |
| ID: 17680 | Rating: 0 | rate:
| |
OK the q6600 is closer to arch than the q9550. He's using SSE4.1 on the q9550. Odd - He has a different client to me on his q6600. Any ideas where I can get it from ? ____________ | |
| ID: 17681 | Rating: 0 | rate:
| |
Here are my tasks .... You could use the SSSE3 version (instead of only SSE3) on your Core2. The SSE3 binary retains compatibility with AMD CPUs and may be a bit slower. | |
| ID: 17683 | Rating: 0 | rate:
| |
You could use the SSSE3 version (instead of only SSE3) on your Core2. The SSE3 binary retains compatibilty to AMD CPUs and may be a bit slower. OK thanks - I don't think it'll make much difference - I'll give that a try... Seems also that he's using version 0.19.... This is looking more and more like me needing to compile 0.19 from source unless there are any binaries laying around for the lazy ;) ____________ | |
| ID: 17684 | Rating: 0 | rate:
| |
The linux client is simply inefficient compared to win32. To be honest, the win-apps by ClusterPhysik are a bit faster. Those Q6600ers aren't OC'd (the benchmark results might be a little off because of the 6.1 BOINC client). If I recall it right, Cluster made some more modifications which didn't make it into stock code... As stated before my apps are compiled from stock source and approved by Travis. @mfl0p: The compiler flag "-fp-model fast=2" doesn't make any difference, neither in speed nor in accuracy (at least on my test box). ____________ mic. | |
| ID: 17734 | Rating: 0 | rate:
| |
@mfl0p: With Intel compiler 11.0 the default -fp-model is fast=1. Using -fp-model precise on my linux machine gives the exact same results as expected in the test files forum thread. Changing to -fp-model fast (or any variation of fast) drops about 60 seconds off the test parameters 86 runtime, at the expense of the deviated fitness result, like your app. But it's irrelevant anyway, since it still produces a "close enough" answer, as mentioned in a few posts in this thread. Figured I would mention it, for accuracy's sake, from Intel docs: Recommendation: /fp:precise /fp:source (-fp-model precise -fp-model source) is the recommended form for the majority of situations where enhanced floating point consistency and reproducibility are needed. Re: fpus on different architectures giving different results, I have observed this on the PPC platform: without any math shortcuts in the compiler, all of the test file results match x86 except parameters 20, which has a deviation of 1 at the 15th decimal place. | |
| ID: 17767 | Rating: 0 | rate:
| |
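One general reason why relaxed floating-point settings can shift the last digits of a result is that floating-point addition is not associative, so any reordering of operations may round differently. Whether reordering is what actually changed the fitness value here is not established in this thread; the following is only a minimal Python illustration with made-up inputs and hypothetical function names.

```python
def sum_left_to_right(values):
    """Add the values in the order given."""
    total = 0.0
    for value in values:
        total += value
    return total


def sum_right_to_left(values):
    """Add the same values in reverse order."""
    total = 0.0
    for value in reversed(values):
        total += value
    return total


values = [0.1, 0.2, 0.3]
forward = sum_left_to_right(values)
backward = sum_right_to_left(values)
# The two sums differ only in the last bits of the double.
difference = abs(forward - backward)
```

The two orderings give results that differ by roughly 1e-16, the same kind of low-order deviation as the fitness values compared earlier in the thread, just at a smaller scale.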
Yes, release ATI app for Linux, please. CUDA for all systems is under construction now, but it makes sense to make app for stronger cards first and ATI is the one. I hate running my machine under Win just to be able to run Milky on GPU. :-( And I do not think there are more guys with NVidia GTS/GTX 200 cards than Linux ones with ATI 38xx/48xx. Count me in on the Linux 64b (Ubuntu) w/ HD3870 tally. ____________ - da shu @ HeliOS, "A child's exposure to technology should never be predicated on an ability to afford it." | |
| ID: 24914 | Rating: 0 | rate:
| |
|
PLEASE, PLEASE, PLEASE!!! make ATI GPU 0.20 app for linux x64. I'm pissed off using windows... Right now I'm gonna build a rig for MW, but if there is a chance to avoid this - it will be perfect :-) | |
| ID: 36136 | Rating: 0 | rate:
| |
PLEASE, PLEASE, PLEASE!!! make ATI GPU 0.20 app for linux x64. I'm pissed off using windows... Right now I'm gonna build a rig for MW, but if there is a chance to avoid this - it will be perfect :-) Please see http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=1475, once the code is available I plan on attempting to compile a version for GNU/Linux. | |
| ID: 36184 | Rating: 0 | rate:
| |
|
WOW, that's great news :-) Hopefully u'll get code soon | |
| ID: 36239 | Rating: 0 | rate:
| |
|
how about a linux ati app. | |
| ID: 36665 | Rating: 0 | rate:
| |
how about a linux ati app. +1. guys, what's the news? if there is any chance to get it done? PLEASE, I wanna switch my 2nd rig to linux ____________ | |
| ID: 37425 | Rating: 0 | rate:
| |
|
I'll add my request in for a Linux ATI GPU application. Hasn't the source code been available for a while now? | |
| ID: 39351 | Rating: 0 | rate:
| |
|
The Linux x86-64 SSE4.1 app consistently crashed on me about 30% of the way through, with a SIGSEGV (segfault) error. I have a C2Q 9400 that should support it. | |
| ID: 39592 | Rating: 0 | rate:
| |
|
Looks like no one really cares :( That's weird... | |
| ID: 40316 | Rating: 0 | rate:
| |
|
i also have a hd5870 just asking to crunch under linux64... is this being worked on or is my GPU doomed to only crunch collatz? | |
| ID: 40814 | Rating: 0 | rate:
| |
PLEASE, PLEASE, PLEASE!!! make ATI GPU 0.20 app for linux x64. I'm pissed off using windows... Right now I'm gonna build a rig for MW, but if there is a chance to avoid this - it will be perfect :-) 7 months later ... any change? My linux64/ati boxes have been parked on Collatz for what seems like forever... | |
| ID: 41394 | Rating: 0 | rate:
| |
7 months later ... any change? It would be nice to get an update on this since so much time has gone by. I'd still love to be able to use the ATI GPU in my Linux system. | |
| ID: 41569 | Rating: 0 | rate:
| |
7 months later ... any change? I'm working on a new OpenCL one which should work everywhere(Linux/OS X/Windows/Nvidia/ATI) right now. I hope to have it ready to release in 1-2 weeks. | |
| ID: 41570 | Rating: 0 | rate:
| |
I'm working on a new OpenCL one which should work everywhere(Linux/OS X/Windows/Nvidia/ATI) right now. I hope to have it ready to release in 1-2 weeks. That's very nice to hear, Matt! There are a lot of people looking forward to testing it out when it's ready. | |
| ID: 41573 | Rating: 0 | rate:
| |
|
Hi Matt, | |
| ID: 42014 | Rating: 0 | rate:
| |
|
I am thinking of going Linux with Intel and ATI/AMD GPU farm. How is this project shaping up? Any advice for set up? | |
| ID: 43363 | Rating: 0 | rate:
| |