Latest Stock Apps - Optimized or Not
Author | Message |
---|---|
Joined: 30 Mar 09 Posts: 15 Credit: 15,856,582 RAC: 0 |
I have just restarted MW after the last optimized apps were dropped around April. I have reattached to MW and am using the stock apps on Win7 64-bit with a GTX460. Are the stock apps optimized versions, or are new optimized versions coming soon, especially for CUDA? I want to run 2 WUs on my GTX460, but the current stock app runs only one WU at a time on it. Can someone provide an app_info.xml file for the stock apps so I can modify the CUDA count to allow 2 WUs? Thanks |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
"I have just restarted MW after the last optimized apps were dropped around April." Just this past week an 'optimized' stock app was released. I am down to 9-10 hours on my P4 XP for the de_separation_13_3s tasks. That is a large improvement over the previous app, though it still seems slightly slower than the old opti apps. I know it was posted somewhere. Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
<app_info> <app> <name>milkyway</name> </app> <file_info> <name>milkyway_separation_0.82_windows_intelx86__cuda_opencl.exe</name> <executable /> </file_info> <app_version> <app_name>milkyway</app_name> <version_num>82</version_num> <flops>1.0e11</flops> <avg_ncpus>0.05</avg_ncpus> <max_ncpus>1</max_ncpus> <plan_class>cuda</plan_class> <coproc> <type>CUDA</type> <count>1</count> </coproc> <cmdline></cmdline> <file_ref> <file_name>milkyway_separation_0.82_windows_intelx86__cuda_opencl.exe</file_name> <main_program/> </file_ref> </app_version> </app_info> |
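On the original question of running two WUs at once: in a BOINC app_info.xml, the `<count>` inside `<coproc>` is the fraction of a GPU that each task reserves, so setting it to 0.5 lets the client schedule two tasks per GPU. The following is a minimal sketch of the same file with that one value changed. It assumes the 0.82 CUDA/OpenCL executable named in the post above is present in the project directory, and it makes no claim that a GTX460 actually gains throughput from running two tasks.

```xml
<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <!-- Assumption: this executable already sits in the project folder -->
    <name>milkyway_separation_0.82_windows_intelx86__cuda_opencl.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>82</version_num>
    <flops>1.0e11</flops>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>1</max_ncpus>
    <plan_class>cuda</plan_class>
    <coproc>
      <type>CUDA</type>
      <!-- 0.5 GPU per task, so two tasks can share one GPU (was 1) -->
      <count>0.5</count>
    </coproc>
    <cmdline></cmdline>
    <file_ref>
      <file_name>milkyway_separation_0.82_windows_intelx86__cuda_opencl.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
```

BOINC reads app_info.xml at startup, so the client needs a restart after the file is edited.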
Joined: 30 Mar 09 Posts: 15 Credit: 15,856,582 RAC: 0 |
Thanks, after a bit of searching I found these opti apps on your site. I did notice a mistake in the downloaded Win 64-bit CPU files: in the app_info for the CPU version, the closing name tag for the first opti app is missing its <. See below. When BOINC started, it complained about the missing app and deleted it; after I added the < to the file, it worked. <app_info> <app> <name>milkyway</name> </app> <file_info> <name>milkyway_separation_0.88_windows_x86_64.exe/name> <executable /> </file_info> <app_version> <app_name>milkyway</app_name> <version_num>88</version_num> <cmdline></cmdline> <file_ref> <file_name>milkyway_separation_0.88_windows_x86_64.exe</file_name> <main_program/> </file_ref> </app_version> </app_info> <app_info> <app> <name>milkyway</name> </app> <file_info> <name>milkyway_separation_0.82_windows_intelx86__cuda_opencl.exe</name> <executable /> </file_info> <app_version> <app_name>milkyway</app_name> <version_num>82</version_num> <flops>1.0e11</flops> <avg_ncpus>0.05</avg_ncpus> <max_ncpus>1</max_ncpus> <plan_class>cuda</plan_class> <coproc> <type>CUDA</type> <count>1</count> </coproc> <cmdline></cmdline> <file_ref> <file_name>milkyway_separation_0.82_windows_intelx86__cuda_opencl.exe</file_name> <main_program/> </file_ref> </app_version> </app_info> |
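For reference, the corrected `<file_info>` element for the CPU app, with the closing tag restored, would read roughly as follows. Only the `</name>` tag differs from the downloaded file quoted in the post above; the rest of that file is unchanged.

```xml
<file_info>
  <!-- The downloaded file had ".exe/name>" here; the "<" is restored -->
  <name>milkyway_separation_0.88_windows_x86_64.exe</name>
  <executable/>
</file_info>
```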
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
That is what happens when you are copying and pasting names quickly. |
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
Yes, the new stock opti apps are slower. The stock app uses a dispatcher that chooses a code path (SSE level) supported by your CPU, and the rest of the code is not optimized at all. (The whole new build system is also preventing me from releasing some more tuned binaries using the Intel compiler etc... it's a real pain in the ass and I hate that cmake crap!!) A major part of the optimizations is still missing; hopefully Matt will find the time to integrate it into the stock app. That will boost performance again and should outperform the old optimized CPU apps by a few percent. Join Support science! Joinc Team BOINC United now! |
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
Here's an 'optimized' CPU app compiled with the Intel(R) C++ Compiler XE 12.0.4.196 for Windows. An SSE2-compatible CPU is required (AMD & Intel)! The difference is that we're using Intel's LibM, especially its exp (e^x) function, which is faster than the 'stock' SSE2 polynomial evaluation. Download -> MilkyWay Separation SSE2 Intel&AMD |
Joined: 28 Feb 10 Posts: 120 Credit: 109,840,492 RAC: 0 |
When I use the app_info.xml, must there also be an entry for the N-body in it? |
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
"When I use the app_info.xml, must there also be an entry for the N-body in it?" You don't have to run n-body at all (why isn't it possible in the user prefs to disable n-body ???). The included app_info.xml doesn't have an entry for n-body, so you're only going to run separation WUs. |
Joined: 28 Feb 10 Posts: 120 Credit: 109,840,492 RAC: 0 |
"You don't have to run n-body at all (why isn't it possible in the user prefs to disable n-body ???)." Um, I'm a little confused about that. Why shouldn't I do n-body WUs? If they pay less, that's not a reason; somebody has to do them. |
Joined: 8 Aug 08 Posts: 30 Credit: 74,566,409 RAC: 0 |
"Here's an 'optimized' CPU app which was compiled using the Intel(R) C++ Compiler XE 12.0.4.196 for Windows." I am getting a 404 when I try to download that opti app. |
Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
"Here's an 'optimized' CPU app which was compiled using the Intel(R) C++ Compiler XE 12.0.4.196 for Windows." Can you do a Win 32-bit ATI version from the current code version too? It would be great to have the working initial wait, which has been in the code since v0.88. |
Joined: 28 Feb 10 Posts: 120 Credit: 109,840,492 RAC: 0 |
The new opti app (0.91) performs well. I saw a performance increase of up to 8% on my old Xeons. Thanks a lot. But today the download link is empty. Why? Greetings, Franz |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
"The new opti app (0.91) performs well. I saw a performance increase of up to 8% on my old Xeons." Downloaded too many times? |
Joined: 28 Feb 10 Posts: 120 Credit: 109,840,492 RAC: 0 |
I think I know!! They all come back invalid :-( I'm returning to the stock app. |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"(The whole new build system is also preventing me from releasing some more tuned binaries using the Intel compiler etc... it's a real pain in the ass and I hate that cmake crap!!)" It shouldn't be. If you have a problem building with ICC, I'll fix it. |
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
"I think I know!!" Yes, that's why I pulled the download (and sorry for the long time it took me to reply). Anyway, without a proper solution file for MS VS, and having to dig through useless cmake txt CRAP files that can't even generate proper makefiles without crashing or erroring out with the latest cmake downloads, it was just a guessing game, and I was missing some vital config parameters (TG Math... yeah, we get that with ICC, and MW_SINCOS computes CRAP with ICC...). Anyway, while digging through this unnecessary cmake txt file BS, I finally got it working, linking and validating. (If anyone wants to compile the code using a proper VS solution project (2005 or 2010) without the hassle of digging through useless cmake txt files, let me know. I'll upload them to ease the pain I went through to get everything compiled and linked.) So, now that it works, we have a few new apps supporting Intel SSE4.1, Intel SSE3 and Intel/AMD SSE2 (PENTIUM4_SSE2 and AMD SSE2). Those who know code can take a look at -> http://board.mpits.net/viewtopic.php?f=32&t=77 ; that one includes source code that replaces the stock 0.91 source from GitHub (it should hopefully be integrated into the next stock source code! Matt, that's your part :p). All new optimized apps are linked and downloadable at http://www.mpits.net/opt_mw.php (do not hotlink the zip files or modify them without permission!!!). Optimized apps for Pentium4 (SSE2/SSE3) and AMD CPUs using AMD_SSE2 tuned ops (K8, K10) will be added tomorrow. Stay tuned and look at http://www.mpits.net/opt_mw.php or http://board.mpits.net/viewtopic.php?f=32&t=77 for updates!!! Changelog: - NEW using GROMACS exp_pd function for SSE2 and SSE4.1 (an additional 5% faster) (see code) - NEW using _mm_fsqrt_pd (SSE approx. converting to SSE1 (RCP_SRQT) and SSE2 Newton-Raphson stuff... up to 52-bit precision) (see code) - NEW using PENTIUM4 _mm_div_pd replacement function (see code) - NEW, faster AMD_SSE (K8, K9, K10) _mm_div_pd replacement (see code) JOIN BOINC United to get exclusive access to new prerelease optimized GPU & CPU apps! |
Joined: 17 Feb 08 Posts: 363 Credit: 258,227,990 RAC: 0 |
My own machines: Dual Quad Xeon 5365 ES (8 cores) -> SSE3 app -> http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=62008 ; Core i3 @ 2.13 GHz / HT enabled (4 threads) -> SSE4.1 app -> http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=281287 |
Joined: 25 Mar 09 Posts: 65 Credit: 53,099,671 RAC: 0 |
I too get a 404 error for the optimized link for 4.1 and 4.2 compatible chipsets. I downloaded the SSE3 version for now until it is fixed. Or should I just keep the optimized v0.88 for now? Well... it rendered about 8 unusable. I reverted to 0.88 for now until it finishes the rest. I have an NVidia GTX 285 and a Q9550 processor. I also plan on reusing my 2nd computer, which has an ATI 4870 and ran the old 0.19 apps. It did quite well, but I disconnected it a while ago for a basement renovation. Can I just restart it and run it again, or is this all obsolete stuff now? Martin |
Joined: 28 Feb 10 Posts: 120 Credit: 109,840,492 RAC: 0 |
I'll test it on one of my machines and then give feedback in a few hours. Greetings, Franz |
©2024 Astroinformatics Group