Need specific instructions on how to optimize AMD Phenom for SSE4a, and Milkyway@home for X64.
Author | Message |
---|---|
Joined: 11 Feb 11 Posts: 5 Credit: 21,146,417 RAC: 0 |
Hello, I'm looking to optimize BOINC manager in a way that will improve processing performance without affecting BOINC stability. My computer is equipped with a Phenom 9850BE processor clocked to 2.9 GHz, and has support for SSE1,2,3, and 4A. My video processor is an HD4850 that is working fine with BOINC at the moment. My BOINC client version is X64, 6.10.58. What I would like to improve: Milkyway@home processes are running in *32 (32-bit) mode, and in SSE2. Is there a way that I could have these processes run in X64, and SSE4a or SSE3? The BOINC client itself is running fine as an X64 process. Thanks in advance! =) |
Joined: 11 Feb 11 Posts: 5 Credit: 21,146,417 RAC: 0 |
I understand that there are many helpful guides on this website. I just want to know if there is a safe way to optimize Milkyway@home for SSE4 or SSE3, without affecting the quality of data received on the Milkyway@home end? |
Joined: 10 Dec 09 Posts: 18 Credit: 9,456,111 RAC: 0 |
I'm not entirely sure about 32- vs 64-bit performance for this project, but what I can tell you is that the SSE versions are instruction sets, not optimizations in themselves (although the instructions can execute an operation much faster than code without them). I don't believe our WUs would benefit from SSE3 or SSE4/4a: the new instructions would have to match what the application is actually computing before you'd see a noticeable change in WU completion time. Just enabling them won't optimize anything unless an instruction found only in SSE3 or SSE4/4a greatly speeds up a necessary calculation. Compared to SSE2, SSE3 and SSE4 are minor updates to the SSE instruction set, so they have far less impact on speed than the step from no SSE2 to SSE2. |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
For non N-Body apps, it's already been done. See this thread. |
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
"For non N-Body apps, it's already been done. See this thread." Not any more, the CPU apps have all been deprecated. |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
"For non N-Body apps, it's already been done. See this thread." Hmm, I see that now. Dang. I thought the optimised CPU apps for the 'standard' MW app were still valid? |
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
They were all giving invalid results after they updated the core apps. |
Joined: 11 Feb 11 Posts: 5 Credit: 21,146,417 RAC: 0 |
Thanks for the information, gentlemen. I have decided to keep the official software; I can't risk compromising the computation results. I'll hold out until SSE3 or SSE4a support comes integrated in the official releases, if it ever does. I have one more question though. My GPU is computing de_seperation tasks. Are these tasks GPU-only, or can the CPU process them too? |
Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
Your task manager screenshot shows you were running: 3 nbody WUs on CPU; 1 separation WU on CPU; 1 separation WU on GPU. |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"...has support for SSE1,2,3, and 4A. My video processor is an HD4850 that is working fine with BOINC at the moment. My BOINC client version is X64, 6.10.58." These instruction sets only help if the compiler can find ways to use them, or if someone writes the code by hand. Current compilers usually aren't particularly good at using all of the special instructions available to them. I can see how some things in SSE3 could help if done by hand (I'm looking at haddpd and hsubpd), but I don't think the others are particularly useful. The jump from the antique x87 FPU to SSE2 is huge, which is part of why you get SSE2 applications by default (the N-body app requires SSE2, since with x87 it's a pain, and in some cases impossible, to get consistent results). A while ago I added an easy way to the build system to rebuild everything at every SSE level, and building with SSE3, SSE4*, etc. didn't really show any improvement with GCC or clang. |
Joined: 11 Feb 11 Posts: 5 Credit: 21,146,417 RAC: 0 |
Great, thank you very much! |
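
For readers wondering what the haddpd and hsubpd instructions mentioned in the thread actually do, here is a minimal sketch that models their lane behaviour in plain Python. It is not Milkyway@home code and uses no real SIMD; the function names (`haddpd`, `hsubpd`, `mulpd`, `dot2_pair`) are hypothetical stand-ins chosen for illustration, and each 128-bit register is modelled as a `(low, high)` pair of doubles. It shows why these instructions would only pay off where a calculation needs to add or subtract the two halves of the same register, such as finishing a small dot product.

```python
def mulpd(a, b):
    """Model of SSE2 MULPD: multiply the two lanes independently."""
    return (a[0] * b[0], a[1] * b[1])


def haddpd(a, b):
    """Model of SSE3 HADDPD: horizontal add.

    Low lane of the result  = a.low + a.high
    High lane of the result = b.low + b.high
    """
    return (a[0] + a[1], b[0] + b[1])


def hsubpd(a, b):
    """Model of SSE3 HSUBPD: horizontal subtract (low minus high).

    Low lane of the result  = a.low - a.high
    High lane of the result = b.low - b.high
    """
    return (a[0] - a[1], b[0] - b[1])


def dot2_pair(x0, y0, x1, y1):
    """Compute two 2-element dot products at once.

    Two lane-wise multiplies followed by a single horizontal add;
    with SSE2 alone, the final step would need a shuffle plus an add.
    """
    return haddpd(mulpd(x0, y0), mulpd(x1, y1))


if __name__ == "__main__":
    print(dot2_pair((1.0, 2.0), (3.0, 4.0), (5.0, 6.0), (7.0, 8.0)))
    print(hsubpd((5.0, 2.0), (1.0, 4.0)))
```

If an application's inner loop never combines the two halves of one register this way, building for SSE3 gives the compiler nothing new to use, which is consistent with the rebuild results reported above.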
©2024 Astroinformatics Group