Welcome to MilkyWay@home

app v0.9

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 8636 - Posted: 19 Jan 2009, 4:08:08 UTC - in response to Message 8621.  

osx, x86_64:
-O3 -mfpmath=sse -mtune=nocona -msse3 -ftree-vectorize -funroll-loops
...


Yeah, the OS X flags look fine to me (where I specialize ;-) ).


Please use -march=nocona instead of -mtune=nocona.
-mtune only tunes the generated code for a specific processor, whereas -march actually lets the compiler use that processor's instructions (and implicitly sets -mtune for the same architecture).


Hm, this might be why the x86_64 version for OS X isn't running as fast as it should. (I bumped it up from SSE2 to SSE3.)
Travis

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 8639 - Posted: 19 Jan 2009, 4:19:36 UTC - in response to Message 8636.  

osx, x86_64:
-O3 -mfpmath=sse -mtune=nocona -msse3 -ftree-vectorize -funroll-loops
...


Yeah, the OS X flags look fine to me (where I specialize ;-) ).


Please use -march=nocona instead of -mtune=nocona.
-mtune only tunes the generated code for a specific processor, whereas -march actually lets the compiler use that processor's instructions (and implicitly sets -mtune for the same architecture).


Hm, this might be why the x86_64 version for OS X isn't running as fast as it should. (I bumped it up from SSE2 to SSE3.)


I updated the x86_64 OS X app, so hopefully it now runs as fast as it should.
jedirock
Joined: 8 Nov 08
Posts: 178
Credit: 6,140,854
RAC: 0
Message 8640 - Posted: 19 Jan 2009, 4:59:37 UTC - in response to Message 8636.  

Please use -march=nocona instead of -mtune=nocona.
-mtune only tunes the generated code for a specific processor, whereas -march actually lets the compiler use that processor's instructions (and implicitly sets -mtune for the same architecture).


Hm, this might be why the x86_64 version for OS X isn't running as fast as it should. (I bumped it up from SSE2 to SSE3.)

As I remember, SSE2 is automatically enabled for x86_64 processors, so you don't need to request it explicitly. As for the -mtune and -march arguments, I keep forgetting the difference between the two, but that explains why -march would give better results.
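
For anyone else who keeps mixing the two up, here is a rough sketch of the rule as described above. It is only a simplified model written for illustration, not GCC's real option handling; the helper name effective_targets and the two flag lists are made up for this example, and None just stands for "whatever the compiler defaults to".

```python
def effective_targets(flags):
    """Return (arch, tune) for a list of GCC-style flags.

    Simplified model of the rule discussed in this thread (an assumption,
    not GCC's actual parser): -march=X selects the instruction set and,
    unless -mtune is also given, the tuning target; -mtune=X changes
    only the tuning target.
    """
    arch = None  # None means "compiler default instruction set"
    tune = None  # None means "compiler default tuning"
    for flag in flags:
        if flag.startswith("-march="):
            arch = flag.split("=", 1)[1]
        elif flag.startswith("-mtune="):
            tune = flag.split("=", 1)[1]
    if tune is None:
        tune = arch  # -march implicitly sets -mtune
    return arch, tune


# The OS X x86_64 flags quoted earlier in the thread.
old_flags = "-O3 -mfpmath=sse -mtune=nocona -msse3 -ftree-vectorize -funroll-loops".split()
# The same flags with the suggested -mtune -> -march swap.
new_flags = [f.replace("-mtune=", "-march=") for f in old_flags]

print(effective_targets(old_flags))  # (None, 'nocona')
print(effective_targets(new_flags))  # ('nocona', 'nocona')
```

With the original flags the model reports no explicit architecture, only tuning for nocona; after the swap both the architecture and the tuning target are nocona. Flags such as -msse3 are ignored here on purpose, since they enable individual instruction sets separately from -march.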

©2024 Astroinformatics Group