Recompiled Linux 32/64 apps
Author | Message |
---|---|
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
Normally the 64bit (stock) apps are compiled with SSE2 enabled, because a 64bit-capable cpu is also capable of at least SSE2. And, if the compiler is capable of auto-vectorization, it should always be enabled for x86-64. For GCC, the option is -ftree-vectorize, which is implied by -O3 on versions 4.3 and later. Unfortunately, MS VS does not support auto-vectorization. For Windows the Intel compiler could be used instead. HTH |
Send message Joined: 12 Oct 07 Posts: 77 Credit: 404,471,187 RAC: 0 |
I found out that PNI = SSE3. According to this Intel document, my Intel Xeon L5420 supports SSE4.1. cat /proc/cpuinfo flags shows: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr. I know pni = SSE3, but how is SSE4.1 indicated? |
Send message Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0 |
Looks like it is indicated by sse4_1. My Xeon X3350 cat /proc/cpuinfo flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm. Thank you for your SSE41_64 version, speedimic. It is working well on my computer @ 3.3GHz. |
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
"cat /proc/cpuinfo flags shows -" Update your kernel; older kernels do not report the sse4_1 flag in /proc/cpuinfo even when the CPU supports it. HTH |
Send message Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0 |
Hi mic, I've added these to zslip. I hope that's OK. Looks like they're doing well. Haven't had a bad result come back with any compiled v0.16 app yet :) |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
Hi mic, Sure. Always good to have everything in one place. mic. |
Send message Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0 |
Hi mic, cool ;) |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
If someone feels like crunching full speed on Linux, here are the new v18d apps. Crunch time is cut in half compared to the prior version (on my quad at least...). Downloads: Linux32 Pack (with new SSSE3 version), Linux64 Pack, LinuxAMD Pack. Report problems & errors here... mic. |
Send message Joined: 25 Nov 07 Posts: 25 Credit: 54,443,893 RAC: 0 |
Thanks speedimic! Just dumped it onto my linux32 laptop. |
Send message Joined: 11 Nov 07 Posts: 41 Credit: 1,000,181 RAC: 0 |
Yeah, 32bit SSSE3 Linux dropped from 22 to 9 minutes. 64bit SSSE3 runs them in 8.5 minutes compared to 18 with the stock app. Both on Intel Q6600. Very nice :) |
Send message Joined: 12 Oct 07 Posts: 77 Credit: 404,471,187 RAC: 0 |
Thanks mic, I now have it running on my Fedora9_64 boxes |
Send message Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
My laptop is now running around 22-25 minutes with the optimised Linux app. |
Send message Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0 |
"If someone feels like crunching full speed on linux, here are the new v18d apps." The above has been updated to zslip, thanks speedimic ;) |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
If there's any other Linux32/64 - SSE-level combination needed, just tell me! mic. |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
Just got myself the shiny new version of the Intel compiler... and made new apps. Codebase is still 18d. Due to the varying WUs I can't really tell if those are any faster - that's up to you to decide... ;) So I would call updating optional! What has changed is that everything up to and including SSE3 is AMD compatible, and I made all SSE levels the compiler offered me: Linux32-SSE, Linux32-SSE2, Linux32-SSE3, Linux32-SSSE3, Linux32-SSE4.1, Linux32-SSE4.2, Linux64-SSE3, Linux64-SSSE3, Linux64-SSE4.1, Linux64-SSE4.2. mic. |
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
"What has changed is that everything up to and including sse3 is AMD compatible, and I made all sse-levels the compiler offered me." If that's an ICC 11 version, then SSE is just a synonym for x87 (no enhanced instruction set). It won't put any SSE instructions in the code. The only difference from the stock app is the change gcc -> ICC. But maybe that gains some speed for older machines, too. |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
"If that's an ICC 11 version, then SSE is just a synonym for x87 (no enhanced instruction set). It won't put any SSE instructions in the code." Someone called for that, so I made it... I didn't get any feedback on the crunch time and I didn't try it. mic. |
Send message Joined: 13 Feb 08 Posts: 1124 Credit: 46,740 RAC: 0 |
"Just got myself the shiny new version of the Intel compiler... and made new apps." SSE3 AMD is a really useful improvement, thank you! |
Send message Joined: 28 Aug 07 Posts: 35 Credit: 88,736,663 RAC: 1,858 |
What about Intel Linux 64-bit SSE2? I have a couple of Q6600s on SSE2 PCs. |
Send message Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0 |
Q6600 has SSE3 and SSSE3, doesn't it? SSE3 is denoted by the code pni. PNI (Prescott New Instructions) was the original engineering code name for SSE3. |
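
The flag names discussed in this thread (pni for SSE3, sse4_1 for SSE4.1) can be turned into a small helper that suggests which recompiled pack to try. This is only a minimal sketch and not part of any poster's apps. The function names `detect_sse_levels`, `pick_build` and `read_flags` are hypothetical, the build names follow the Linux32/Linux64 list posted above, and capping AMD at SSE3 is an assumption based solely on the compatibility note in that post.

```python
# Map /proc/cpuinfo flag names to SSE levels, ordered lowest to highest.
# "pni" (Prescott New Instructions) is how the Linux kernel reports SSE3.
SSE_FLAGS = [
    ("sse", "SSE"),
    ("sse2", "SSE2"),
    ("pni", "SSE3"),
    ("ssse3", "SSSE3"),
    ("sse4_1", "SSE4.1"),
    ("sse4_2", "SSE4.2"),
]

# Builds listed in the thread; no Linux64-SSE or Linux64-SSE2 pack was posted.
BUILDS = {
    32: ["SSE", "SSE2", "SSE3", "SSSE3", "SSE4.1", "SSE4.2"],
    64: ["SSE3", "SSSE3", "SSE4.1", "SSE4.2"],
}

# Assumption taken from the thread: only builds up to SSE3 are AMD compatible.
AMD_MAX_LEVEL = "SSE3"


def detect_sse_levels(flags_text):
    """Return the SSE levels found in a cpuinfo flags string, lowest first."""
    flags = set(flags_text.split())  # whole tokens, so "ss" never matches "sse"
    return [level for flag, level in SSE_FLAGS if flag in flags]


def pick_build(flags_text, bits, vendor="intel"):
    """Return the highest listed build name the flags allow, or None."""
    order = [level for _, level in SSE_FLAGS]
    supported = set(detect_sse_levels(flags_text))
    candidates = [lvl for lvl in BUILDS[bits] if lvl in supported]
    if vendor == "amd":
        cap = order.index(AMD_MAX_LEVEL)
        candidates = [lvl for lvl in candidates if order.index(lvl) <= cap]
    if not candidates:
        return None
    return "Linux{}-{}".format(bits, candidates[-1])


def read_flags(path="/proc/cpuinfo"):
    """Read the first 'flags' line from cpuinfo (Linux only)."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""
```

On a Linux host, `pick_build(read_flags(), 64)` would suggest a pack for the local CPU. Fed the L5420 flags posted above (from a kernel that does not list ssse3 or sse4_1), the sketch falls back to Linux64-SSE3, while the X3350 flags give Linux64-SSE4.1. Because the result depends entirely on what the kernel reports, an outdated kernel leads to a lower suggestion than the CPU could actually run.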