Welcome to MilkyWay@home

Recompiled Linux 32/64 apps


ebahapo
Joined: 6 Sep 07
Posts: 66
Credit: 586,206
RAC: 157
Message 9271 - Posted: 27 Jan 2009, 19:51:25 UTC - in response to Message 9270.  

Normally the 64-bit (stock) apps are compiled with SSE2 enabled, because any 64-bit-capable CPU also supports at least SSE2.

And, if the compiler is capable of auto-vectorization, it should always be enabled for x86-64. For GCC, the option is -ftree-vectorize, which is implied by -O3 on versions 4.3 and later. Unfortunately, MS VS does not support auto-vectorization. For Windows the Intel compiler could be used instead.

HTH

Temujin
Joined: 12 Oct 07
Posts: 77
Credit: 404,471,187
RAC: 0
Message 9297 - Posted: 28 Jan 2009, 11:54:30 UTC - in response to Message 9240.  

I found out that PNI = SSE3

According to this Intel document, my Intel Xeon L5420 supports SSE4.1.
The flags line from cat /proc/cpuinfo shows:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr

I know pni = SSE3 but how is SSE4.1 indicated?

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 148,905,504
RAC: 0
Message 9300 - Posted: 28 Jan 2009, 14:13:55 UTC

Looks like it is indicated by sse4_1

My Xeon X3350 cat /proc/cpuinfo flags:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm

Thank you for your SSE41_64 version speedimic. It is working well on my computer @ 3.3GHz.
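
For anyone who wants to check this programmatically, here is a minimal Python sketch (not part of the project's tooling) that maps the cpuinfo flag names to SSE levels. The names sse_levels and read_cpuinfo_flags are made up for illustration. The flag spellings are the ones that appear in the listings in this thread, plus sse4_2, which is the kernel's name for SSE4.2.

```python
# Map /proc/cpuinfo flag names to the SSE level they indicate.
# "pni" (Prescott New Instructions) is the kernel's name for SSE3.
SSE_FLAGS = [
    ("sse", "SSE"),
    ("sse2", "SSE2"),
    ("pni", "SSE3"),
    ("ssse3", "SSSE3"),
    ("sse4_1", "SSE4.1"),
    ("sse4_2", "SSE4.2"),
]


def sse_levels(flags_line):
    """Return the SSE levels named in a cpuinfo 'flags' string, lowest first."""
    flags = set(flags_line.split())
    return [name for flag, name in SSE_FLAGS if flag in flags]


def read_cpuinfo_flags(path="/proc/cpuinfo"):
    """Return the flags of the first CPU listed in /proc/cpuinfo (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return line.split(":", 1)[1].strip()
    return ""


# A shortened version of the Xeon X3350 flags posted above.
example = "fpu mmx fxsr sse sse2 ht nx lm pni ssse3 cx16 sse4_1 lahf_lm"
print(sse_levels(example))  # ['SSE', 'SSE2', 'SSE3', 'SSSE3', 'SSE4.1']
```

On a real machine, sse_levels(read_cpuinfo_flags()) gives the same kind of list. Note that an older kernel may leave the newer flags out of the flags line even if the CPU has them, so a result that stops at SSE3 is not conclusive.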

ebahapo
Joined: 6 Sep 07
Posts: 66
Credit: 586,206
RAC: 157
Message 9302 - Posted: 28 Jan 2009, 15:51:51 UTC - in response to Message 9297.  

The flags line from cat /proc/cpuinfo shows:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr

I know pni = SSE3 but how is SSE4.1 indicated?

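Update your kernel. Older kernels don't list the newer flags such as ssse3 and sse4_1 in /proc/cpuinfo, even when the CPU supports them.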

HTH

GalaxyIce
Joined: 6 Apr 08
Posts: 2018
Credit: 100,142,856
RAC: 0
Message 9704 - Posted: 5 Feb 2009, 3:31:22 UTC - in response to Message 9208.  

Hi mic,

I've added these to zslip

I hope that's OK


Looks like they're doing well. Haven't had a bad result come back with any compiled v0.16 app yet :)

Now then, the new recompiled v16 apps for Linux:

Linux32 on Intel

SSE3_32
SSE2_32
SSE_32

Linux64 on Intel

SSE3_64
SSSE3_64
SSE41_64

For AMD users:

AMD SSE3_64
AMD SSE2_32

I only had the chance to test the AMD SSE2_32 on my Athlon64 3200+, so the rest of the testing is up to you... Please report!



speedimic
Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 9852 - Posted: 7 Feb 2009, 8:04:03 UTC - in response to Message 9704.  

Hi mic,

I've added these to zslip

I hope that's OK


Sure.
Always good to have everything in one place.

mic.


GalaxyIce
Joined: 6 Apr 08
Posts: 2018
Credit: 100,142,856
RAC: 0
Message 9856 - Posted: 7 Feb 2009, 8:35:57 UTC - in response to Message 9852.  

Hi mic,

I've added these to zslip

I hope that's OK


Sure.
Always good to have everything in one place.

cool ;)


speedimic
Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 11020 - Posted: 16 Feb 2009, 14:59:23 UTC

If someone feels like crunching at full speed on Linux, here are the new v18d apps.
Crunch time is cut in half compared to the prior version (on my quad at least...).

Linux32 Pack
(with new SSSE3-version)

Linux64 Pack

LinuxAMD Pack

Report problems & errors here...

mic.


Daniel
Joined: 25 Nov 07
Posts: 25
Credit: 54,173,252
RAC: 0
Message 11022 - Posted: 16 Feb 2009, 15:07:26 UTC

Thanks speedimic! Just dumped it onto my linux32 laptop.

cwhyl
Joined: 11 Nov 07
Posts: 41
Credit: 1,000,181
RAC: 0
Message 11029 - Posted: 16 Feb 2009, 16:23:19 UTC
Last modified: 16 Feb 2009, 16:28:32 UTC

Yeah, 32bit SSSE3 Linux dropped from 22 to 9 minutes.
64bit SSSE3 runs them in 8.5 minutes compared to 18 with the stock app.
Both on Intel Q6600.
Very nice :)
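
To put those timings in perspective, here is a small Python sketch of the arithmetic. The function name is just illustrative; the numbers are the ones quoted above.

```python
def speedup(old_minutes, new_minutes):
    """How many times faster the new run is than the old one."""
    return old_minutes / new_minutes


def time_saved_percent(old_minutes, new_minutes):
    """Percentage of the old run time that the new app saves."""
    return 100 * (1 - new_minutes / old_minutes)


# (label, old minutes, new minutes) as reported for the Q6600
timings = [("32-bit SSSE3", 22, 9), ("64-bit SSSE3", 18, 8.5)]

for label, old, new in timings:
    print(f"{label}: {speedup(old, new):.2f}x faster, "
          f"{time_saved_percent(old, new):.0f}% less time")
# 32-bit SSSE3: 2.44x faster, 59% less time
# 64-bit SSSE3: 2.12x faster, 53% less time
```

Both cases come out at a bit better than the "cut in half" estimate, though work units vary in length, so single timings like these are only a rough guide.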

Temujin
Joined: 12 Oct 07
Posts: 77
Credit: 404,471,187
RAC: 0
Message 11040 - Posted: 16 Feb 2009, 19:34:42 UTC - in response to Message 11029.  

Thanks mic, I now have it running on my Fedora9_64 boxes

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 11043 - Posted: 16 Feb 2009, 20:54:17 UTC

My laptop is now running around 22-25 minutes with the optimised Linux app.

GalaxyIce
Joined: 6 Apr 08
Posts: 2018
Credit: 100,142,856
RAC: 0
Message 11046 - Posted: 16 Feb 2009, 21:20:14 UTC - in response to Message 11020.  
Last modified: 16 Feb 2009, 21:20:41 UTC

If someone feels like crunching at full speed on Linux, here are the new v18d apps.
Crunch time is cut in half compared to the prior version (on my quad at least...).

Linux32 Pack
(with new SSSE3-version)

Linux64 Pack

LinuxAMD Pack

Report problems & errors here...

The above has been updated to zslip, thanks speedimic ;)

speedimic
Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 11052 - Posted: 16 Feb 2009, 21:58:30 UTC

If there's any other Linux32/64 - SSE-level combination needed, just tell me!

mic.


speedimic
Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 11508 - Posted: 18 Feb 2009, 21:59:57 UTC

Just got myself the shiny new version of the Intel compiler... and made new apps.

Codebase is still 18d.
Due to the varying WUs I can't really tell if those are any faster - that's up to you to decide... ;)
So I would call updating optional!

What has changed is that everything up to and including sse3 is AMD compatible, and I made all sse-levels the compiler offered me.

Linux32-SSE
Linux32-SSE2
Linux32-SSE3
Linux32-SSSE3
Linux32-SSE4.1
Linux32-SSE4.2

Linux64-SSE3
Linux64-SSSE3
Linux64-SSE4.1
Linux64-SSE4.2
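
As a rough guide to choosing among these, here is a Python sketch that picks the highest build from the list above that a given /proc/cpuinfo flags line supports. It assumes the SSE levels are cumulative, and it reads the AMD remark as "do not go above SSE3 on AMD". Both are assumptions, and the names pick_build, BUILDS, FLAG_FOR and AMD_MAX_LEVEL are made up for illustration.

```python
# Builds listed in the post above, lowest SSE level first.
BUILDS = {
    32: ["SSE", "SSE2", "SSE3", "SSSE3", "SSE4.1", "SSE4.2"],
    64: ["SSE3", "SSSE3", "SSE4.1", "SSE4.2"],
}

# /proc/cpuinfo flag that indicates each level ("pni" is SSE3).
FLAG_FOR = {
    "SSE": "sse",
    "SSE2": "sse2",
    "SSE3": "pni",
    "SSSE3": "ssse3",
    "SSE4.1": "sse4_1",
    "SSE4.2": "sse4_2",
}

AMD_MAX_LEVEL = "SSE3"  # assumption: do not go above SSE3 on AMD


def pick_build(flags_line, bits, amd=False):
    """Return the highest listed build the flags support, or None."""
    flags = set(flags_line.split())
    best = None
    for level in BUILDS[bits]:
        if FLAG_FOR[level] not in flags:
            break  # levels are assumed cumulative, so stop at the first gap
        best = level
        if amd and level == AMD_MAX_LEVEL:
            break
    return f"Linux{bits}-{best}" if best else None


print(pick_build("sse sse2 pni ssse3 sse4_1", 64))  # Linux64-SSE4.1
print(pick_build("sse sse2 pni", 32, amd=True))     # Linux32-SSE3
print(pick_build("sse sse2", 64))                   # None
```

The last case returns None because the 64-bit list has nothing below SSE3.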



mic.


Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11514 - Posted: 18 Feb 2009, 22:33:38 UTC - in response to Message 11508.  

What has changed is that everything up to and including sse3 is AMD compatible, and I made all sse-levels the compiler offered me.

Linux32-SSE

If that's an ICC 11 version, then SSE is just a synonym for x87 (no enhanced instruction set). It won't put any SSE instructions in the code.
The only difference from the stock app is the switch from gcc to ICC. But maybe that gains some speed on older machines, too.

speedimic
Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 11518 - Posted: 18 Feb 2009, 22:54:58 UTC - in response to Message 11514.  

If that's an ICC 11 version, then SSE is just a synonym for x87 (no enhanced instruction set). It won't put any SSE instructions in the code.
The only difference from the stock app is the switch from gcc to ICC. But maybe that gains some speed on older machines, too.


Someone asked for that, so I made it... I didn't get any feedback on the crunch time, and I didn't try it myself.
mic.


Phil
Joined: 13 Feb 08
Posts: 1124
Credit: 46,740
RAC: 0
Message 11535 - Posted: 19 Feb 2009, 0:33:15 UTC - in response to Message 11508.  

Just got myself the shiny new version of the Intel compiler... and made new apps.

Codebase is still 18d.
Due to the varying WUs I can't really tell if those are any faster - that's up to you to decide... ;)
So I would call updating optional!

What has changed is that everything up to and including sse3 is AMD compatible, and I made all sse-levels the compiler offered me.

SSE3 AMD is a really useful improvement, thank you!

Dingo
Joined: 28 Aug 07
Posts: 35
Credit: 64,596,920
RAC: 16,292
Message 11973 - Posted: 21 Feb 2009, 2:44:09 UTC - in response to Message 11052.  
Last modified: 21 Feb 2009, 2:47:41 UTC

What about Intel Linux 64-bit SSE2? I have a couple of Q6600s on SSE2 PCs.

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 148,905,504
RAC: 0
Message 11989 - Posted: 21 Feb 2009, 3:12:37 UTC - in response to Message 11973.  

Q6600 has SSE3 and SSSE3, doesn't it? SSE3 is denoted by the code pni.

PNI (Prescott New Instructions) was the original engineering code name for SSE3.

©2019 Astroinformatics Group