Welcome to MilkyWay@home

Windows and Linux compiler optimization differences

Message boards : Number crunching : Windows and Linux compiler optimization differences

kotenok2000
Joined: 22 May 11
Posts: 71
Credit: 5,685,114
RAC: 0
Message 76992 - Posted: 2 Apr 2024, 15:08:07 UTC
Last modified: 2 Apr 2024, 15:10:25 UTC

Why does milkyway_nbody_1.83_windows_x86_64__mt.exe have only the BMI, CMOV, MODE64, SSE1 and SSE2 instruction set extensions, while milkyway_nbody_1.83_x86_64-pc-linux-gnu__mt has AVX, AVX2, AVX512, BMI, CMOV, FMA4, MODE64, NOT64BITMODE, NOVLX, RTM, SSE1, SSE2, SSE3, SSE41 and SSE42?
Windows users with modern CPUs aren't using them to their full potential.
Link
Joined: 19 Jul 10
Posts: 624
Credit: 19,299,762
RAC: 2,614
Message 76993 - Posted: 2 Apr 2024, 16:30:11 UTC - in response to Message 76992.  

kotenok2000 wrote: "Windows users with modern CPUs aren't using them to their full potential."
Apparently the application can't even use the full potential of my Core 2 Duo, which I don't consider modern anymore. But have you checked the current version, i.e. 1.86 with Orbit Fitting?
kotenok2000
Joined: 22 May 11
Posts: 71
Credit: 5,685,114
RAC: 0
Message 76994 - Posted: 2 Apr 2024, 16:35:47 UTC - in response to Message 76993.  

milkyway_nbody_orbit_fitting_1.86_windows_x86_64__mt.exe shows BMI, CMOV, MODE64, SSE1, SSE2
xii5ku

Joined: 1 Jan 17
Posts: 37
Credit: 111,041,580
RAC: 35,436
Message 76995 - Posted: 2 Apr 2024, 18:46:54 UTC
Last modified: 2 Apr 2024, 18:48:40 UTC

I only have Linux, so I can't compare Linux and Windows performance myself. But I am pretty sure that vector math, at least, is not used to any extent that would matter. Two observations point that way:

1.) Zen 4, Zen 2, and Broadwell-EP computers pull rather little electric power when running nbody or nbody with orbit fitting, compared to when they run other projects.

2.) Broadwell-EP does not engage the negative AVX2 clock rate offset when it runs nbody or nbody with orbit fitting.
ahorek's team

Joined: 8 Sep 07
Posts: 7
Credit: 2,357,594
RAC: 104
Message 77136 - Posted: 14 May 2024, 23:24:13 UTC

elfx86exts is a handy indicator of what's in the binary, but its output is easy to misinterpret. Reporting an AVX instruction doesn't mean the binary is optimized for AVX. It means there's at least one AVX instruction somewhere, but it could be one instruction out of millions...
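To see how much of a binary actually uses a given extension, rather than whether it appears at all, one could count instructions in a disassembly. The sketch below is only an illustration and is not how elfx86exts works: it assumes tab-separated `objdump -d` style text in AT&T syntax, and the helper names (`classify`, `instruction_mix`, `shares`) and the `SAMPLE` listing are made up for this example. The bucketing is a crude heuristic based on which vector registers an instruction names.

```python
from collections import Counter


def classify(mnemonic, operands):
    """Crude bucket for one instruction, based on the vector registers it names."""
    if "zmm" in operands:
        return "avx512"
    if "ymm" in operands:
        return "avx"
    if "xmm" in operands:
        # VEX-encoded 128-bit forms are spelled with a leading "v" (e.g. vaddsd).
        return "avx" if mnemonic.startswith("v") else "sse"
    return "other"


def instruction_mix(disassembly):
    """Count instructions per bucket in objdump-style disassembly text."""
    counts = Counter()
    for line in disassembly.splitlines():
        # Assumed layout of an instruction line: "addr:<TAB>bytes<TAB>mnemonic operands"
        parts = line.split("\t")
        if len(parts) < 3 or not parts[0].strip().endswith(":"):
            continue  # section headers, labels, continuation byte lines
        text = parts[2].strip()
        if not text:
            continue
        fields = text.split(None, 1)
        mnemonic = fields[0]
        operands = fields[1] if len(fields) > 1 else ""
        counts[classify(mnemonic, operands)] += 1
    return counts


def shares(counts):
    """Convert raw counts into fractions of all counted instructions."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


# Hypothetical four-instruction listing, for illustration only.
SAMPLE = (
    "  401000:\t55                   \tpush   %rbp\n"
    "  401001:\tf2 0f 58 c1          \taddsd  %xmm1,%xmm0\n"
    "  401005:\tc5 fd 58 c1          \tvaddpd %ymm1,%ymm0,%ymm0\n"
    "  401009:\tc3                   \tret\n"
)

mix = instruction_mix(SAMPLE)
print(shares(mix))
```

Even a share computed this way is static: it says nothing about how often those instructions execute, so a profiler is still needed to tell whether the hot loops use them.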

The current app has no manual SIMD optimizations. It uses SSE2 (which is a requirement for x86_64) thanks to auto-vectorization done by the compiler. Recompiling it with AVX+ makes no difference.

The additional instructions on Linux are likely from dependent libraries like glibc (for memcpy etc.). To make the app faster, the algorithm (where the app spends its time) must use SIMD efficiently, and that usually requires much more work than just enabling an AVX flag :)
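The point about where the app spends its time can be put in numbers with Amdahl's law. This is a minimal sketch: the function name `overall_speedup` is made up here, and the fractions and speedups are hypothetical values chosen for illustration, not measurements of the nbody app.

```python
def overall_speedup(vector_fraction, vector_speedup):
    """Amdahl's law: whole-run speedup when only part of the runtime gets faster.

    vector_fraction: share of the original runtime spent in code that benefits
                     from wider SIMD (between 0 and 1).
    vector_speedup:  how much faster that part becomes (e.g. 2.0 for twice as fast).
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    if vector_speedup <= 0:
        raise ValueError("vector_speedup must be positive")
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / vector_speedup)


# Hypothetical case 1: only 5% of the runtime (say, library memcpy) doubles in speed.
print(overall_speedup(0.05, 2.0))

# Hypothetical case 2: 90% of the runtime (the main algorithm) doubles in speed.
print(overall_speedup(0.90, 2.0))
```

With these assumed numbers the first case gains under 3% overall, while the second approaches a 1.8x gain, which is why AVX instructions in library code alone would not be expected to change the run time noticeably.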
