Windows and Linux compiler optimization differences.

Author	Message
kotenok2000 Joined: 22 May 11 Posts: 75 Credit: 5,755,726 RAC: 306	Message 76992 - Posted: 2 Apr 2024, 15:08:07 UTC Last modified: 2 Apr 2024, 15:10:25 UTC Why does milkyway_nbody_1.83_windows_x86_64__mt.exe have only the BMI, CMOV, MODE64, SSE1 and SSE2 instruction set extensions, while milkyway_nbody_1.83_x86_64-pc-linux-gnu__mt has the AVX, AVX2, AVX512, BMI, CMOV, FMA4, MODE64, NOT64BITMODE, NOVLX, RTM, SSE1, SSE2, SSE3, SSE41 and SSE42 extensions? Users with modern CPUs running Windows aren't using their CPUs to their full potential.

Link Joined: 19 Jul 10 Posts: 819 Credit: 21,101,158 RAC: 5,585	Message 76993 - Posted: 2 Apr 2024, 16:30:11 UTC - in response to Message 76992. "Modern cpu users running Windows aren't using their cpu to full potential." Apparently the application can't even use the full potential of my Core 2 Duo, which I no longer consider modern. But have you checked the current version, i.e. 1.86 with Orbit Fitting?

kotenok2000 Joined: 22 May 11 Posts: 75 Credit: 5,755,726 RAC: 306	Message 76994 - Posted: 2 Apr 2024, 16:35:47 UTC - in response to Message 76993. milkyway_nbody_orbit_fitting_1.86_windows_x86_64__mt.exe shows BMI, CMOV, MODE64, SSE1, SSE2

xii5ku Joined: 1 Jan 17 Posts: 39 Credit: 122,936,686 RAC: 56,827	Message 76995 - Posted: 2 Apr 2024, 18:46:54 UTC Last modified: 2 Apr 2024, 18:48:40 UTC I only have Linux, so I can't compare Linux and Windows performance myself. But I am pretty sure that vector math, at least, is not used to any extent that would matter. 1.) Zen 4, Zen 2, and Broadwell-EP computers draw rather little electric power when running nbody or nbody with orbit fitting, compared to when they run other projects. 2.) Broadwell-EP does not engage its negative AVX2 clock rate offset when it runs nbody or nbody with orbit fitting.

ahorek's team Joined: 8 Sep 07 Posts: 8 Credit: 2,556,084 RAC: 39	Message 77136 - Posted: 14 May 2024, 23:24:13 UTC elfx86exts is a handy indicator of what's in the binary, but its output can be misinterpreted. Reporting an AVX instruction doesn't mean the binary is optimized for AVX. It means there's at least one AVX instruction somewhere, but it could be one instruction out of millions. The current app has no manual SIMD optimizations. It uses SSE2 (which is a requirement for x86_64) thanks to auto-vectorization done by the compiler. Recompiling it with AVX+ makes no difference. The additional instructions on Linux are likely from dependent libraries like glibc (for memcpy etc.). To make the app faster, the algorithm (where the app spends its time) must use SIMD efficiently, and that usually requires much more work than just enabling an AVX flag :)
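
To illustrate the "one instruction out of millions" point, here is a rough sketch that counts how many instructions in a disassembly listing fall into each family, instead of only reporting whether a family is present. Everything in it is an assumption made for illustration: the sample listing is made up in the style of `objdump -d` output, the helper names `classify` and `tally` are hypothetical, and the classification rule (a `v`-prefixed mnemonic with an xmm/ymm/zmm operand counts as AVX-family, any other instruction touching those registers counts as SSE-family) is a crude heuristic, not how elfx86exts works.

```python
import re
from collections import Counter

# Made-up excerpt in the style of `objdump -d` output (AT&T syntax).
# Fields are tab-separated: address, raw bytes, instruction text.
SAMPLE_DISASSEMBLY = """\
  401000:\t55                   \tpush   %rbp
  401001:\t48 89 e5             \tmov    %rsp,%rbp
  401004:\tf2 0f 58 c1          \taddsd  %xmm1,%xmm0
  401008:\tf2 0f 59 c2          \tmulsd  %xmm2,%xmm0
  40100c:\t66 0f 28 c8          \tmovapd %xmm0,%xmm1
  401010:\tc5 fd 58 c1          \tvaddpd %ymm1,%ymm0,%ymm0
  401014:\t5d                   \tpop    %rbp
  401015:\tc3                   \tret
"""

# Matches %xmm0, %ymm3, %zmm17, but not the MMX registers (%mm0).
VECTOR_REG = re.compile(r"%[xyz]mm\d+")


def classify(instruction: str) -> str:
    """Heuristically assign one instruction to a coarse family."""
    mnemonic = instruction.split()[0]
    if not VECTOR_REG.search(instruction):
        return "scalar/other"
    # Assumption: VEX/EVEX-encoded instructions print with a 'v' prefix.
    if mnemonic.startswith("v"):
        return "AVX-family"
    return "SSE-family"


def tally(disassembly: str) -> Counter:
    """Count instructions per family in objdump-style text."""
    counts = Counter()
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Skip headers, labels, and byte-only continuation lines.
        if len(fields) < 3 or not fields[2].strip():
            continue
        counts[classify(fields[2].strip())] += 1
    return counts


if __name__ == "__main__":
    counts = tally(SAMPLE_DISASSEMBLY)
    total = sum(counts.values())
    for family, n in counts.most_common():
        print(f"{family:12s} {n:3d}  {100 * n / total:5.1f}%")
```

Even a share like this only describes the static contents of the file. It says nothing about which instructions sit in the hot loops where the app actually spends its time, so profiling would still be needed to judge whether SIMD matters for performance.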