source v0.14 released

Author	Message
Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 8938 - Posted: 24 Jan 2009, 19:22:01 UTC Last modified: 24 Jan 2009, 19:37:43 UTC

I've released the source for v0.14 in the code release directory. This should fix the checkpoint errors as discussed in this thread.

Additionally, I was able to remove a multiply and a divide from the inner loop of calculate_integrals, which gave about a 4% performance increase for the application. The change is from:

    ir[i] = ((next_r * next_r * next_r) - (r * r * r))/3.0;

to (line 401):

    irv[i] = (((next_r * next_r * next_r) - (r * r * r))/3.0) * ia->mu_step_size / deg;

and from:

    V = ir[ia->r_step_current] * ia->mu_step_size / deg;

to (line 477):

    V = irv[ia->r_step_current] * id;

I also added two new #defines, NEW_FORMULA and WEDGE_ALLOW_ZERO, which removed some of the conditionals from the inner loop, and split calculate_integral into calculate_integral_convolved and calculate_integral_unconvolved (calculate_integral_unconvolved is pretty much deprecated). I made a similar change in calculate_likelihood as well. This removed the warnings evaluation_optimized.c was throwing and gave a slight performance improvement as well.

Again, when compiling the new version please use:

    APP_VERSION = 0.14
    APP_NAME = your_app_name

and the flags

    -DBOINC_APP_VERSION=$(APP_VERSION)
    -DBOINC_APP_NAME='"$(APP_NAME)"'

to make sure your binary gets credit.
I've also updated the_parameters.sh script, so to test a WU you just need to run:

    ./set_parameters

The results should be the same as in the last post:

79: searchname
parameters [8]: 0.342173733203920 25.951791084662300 -2.170941473882660 38.272511356953906 30.225190442596112 2.214906001337289 0.323161690642917 2.774024471628528
metadata: this is the metadata
fitness: -2.946683357256020
your_app_name: 0.14

82: searchname
parameters [8]: 0.405879611547422 17.529961843393409 -1.857514527214484 29.360893891378243 31.228263575178566 -1.551741065334000 0.064096152599308 2.554282099127810
metadata: this is the metadata
fitness: -2.985569777902147
your_app_name: 0.14

86: searchname
parameters [8]: 0.733171635575244 14.657212876628332 -1.705465347395041 16.911711745343634 28.077212666463502 -1.203290851581461 3.527360643924728 2.224821450587501
metadata: this is the metadata
fitness: -3.027909854710189
your_app_name: 0.14
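The two -D flags in the post above only have an effect if the source reads the macros. How the application consumes them is not shown in this thread, so the following is only a sketch of one plausible use, producing a line like the `your_app_name: 0.14` seen in the test output. The helper name `format_app_tag` and the fallback defaults are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Fallbacks so this file still compiles without the -D flags.
 * In a real build the values would come from the makefile:
 *   -DBOINC_APP_VERSION=$(APP_VERSION) -DBOINC_APP_NAME='"$(APP_NAME)"'
 * The inner double quotes make BOINC_APP_NAME expand to a C string literal. */
#ifndef BOINC_APP_VERSION
#define BOINC_APP_VERSION 0.14
#endif
#ifndef BOINC_APP_NAME
#define BOINC_APP_NAME "your_app_name"
#endif

/* Hypothetical helper: writes "<name>: <version>" into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_app_tag(char *buf, size_t size)
{
    assert(buf != NULL);
    int n = snprintf(buf, size, "%s: %.2f",
                     BOINC_APP_NAME, (double)BOINC_APP_VERSION);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Without the single quotes around `"$(APP_NAME)"` the shell would strip the double quotes and the macro would expand to a bare identifier instead of a string.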

ebahapo Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0	Message 8973 - Posted: 24 Jan 2009, 22:44:37 UTC - in response to Message 8938.

>     ir[i] = ((next_r * next_r * next_r) - (r * r * r))/3.0;
>
> to line 401:
>
>     irv[i] = (((next_r * next_r * next_r) - (r * r * r))/3.0) * ia->mu_step_size / deg;

You could remove yet another division by changing line 401 to:

    irv[i] = ((next_r * next_r * next_r) - (r * r * r)) * ia->mu_step_size / (3.0 * deg);

Since a division is typically 10x slower than a multiplication, it could improve the performance of this line alone by about 40%.

HTH
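The rewrite suggested above folds the constant 3.0 into the existing divisor, which saves one division. Below is a minimal sketch comparing the two forms side by side. The function names are made up for illustration, and `deg` is treated as a plain double because its definition is not given in the thread. The two results can differ in the last bits, since the operations are rounded in a different order.

```c
#include <assert.h>

/* v0.14 form of line 401: two divisions and one multiplication. */
double shell_two_div(double r, double next_r, double mu_step_size, double deg)
{
    return (((next_r * next_r * next_r) - (r * r * r)) / 3.0) * mu_step_size / deg;
}

/* Suggested form: the constant is folded into the divisor, one division. */
double shell_one_div(double r, double next_r, double mu_step_size, double deg)
{
    return ((next_r * next_r * next_r) - (r * r * r)) * mu_step_size / (3.0 * deg);
}

/* Relative difference between the two forms (requires a non-zero result). */
double shell_rel_diff(double r, double next_r, double mu_step_size, double deg)
{
    double a = shell_two_div(r, next_r, mu_step_size, deg);
    double b = shell_one_div(r, next_r, mu_step_size, deg);
    assert(a != 0.0);
    double d = (a - b) / a;
    return d < 0.0 ? -d : d;
}
```

Calling `shell_rel_diff` with representative step values is a quick way to check that a hand optimization of this kind has not changed the result beyond rounding noise.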

Cluster Physik Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0	Message 8975 - Posted: 24 Jan 2009, 22:58:57 UTC - in response to Message 8973.

> You could remove yet another division by changing line 401 to:
>
>     irv[i] = ((next_r * next_r * next_r) - (r * r * r)) * ia->mu_step_size / (3.0 * deg);
>
> Since a division is typically 10x slower than a multiplication, it could improve the performance of this line alone by about 40%.

As it is divided by a constant that is known at compile time (the 3.0 is hardcoded), any decent compiler will exchange it with a multiplication by 1/3 (calculated at compile time) either way. These kinds of changes are only necessary if one uses an ancient compiler or turns off optimizations. With -O2 or even -O3 I bet it won't make a difference. And even if it were not a literal, with the fastmath option the compiler should do this kind of optimization even for variables. Compilers have become quite clever at such things.

But from my old days, when compilers didn't optimize that well, I also prefer doing such things by hand. One never knows ;)

ebahapo Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0	Message 8977 - Posted: 24 Jan 2009, 23:23:43 UTC - in response to Message 8975.

> As it is divided by a constant that is known at compile time (the 3.0 is hardcoded), any decent compiler will exchange it with a multiplication by 1/3 (calculated at compile time) either way. These kinds of changes are only necessary if one uses an ancient compiler or turns off optimizations.

Actually, for floating-point data, only with -ffast-math would this optimization be automatically performed by the compiler. And, since this option cannot be used for this project, tipping the scale for the compiler is a good rule of thumb.

Moreover, since the compiler does not change the order of floating-point computations, this code should have an edge too:

    irv[i] = ((next_r * next_r * next_r * ia->mu_step_size) - (r * r * r * ia->mu_step_size)) / (3.0 * deg);

Reducing the dependency sequence on out-of-order processors reduces the latency of long operations like these.

HTH
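The reason a compiler needs something like -ffast-math for this rewrite is that 1/3 has no exact binary representation, so `x / 3.0` and `x * (1.0 / 3.0)` are not guaranteed to round to the same double. The sketch below, which assumes IEEE 754 doubles and a build without -ffast-math, simply counts how often the two disagree over a range of integers (5.0 is one such input under round-to-nearest). The function name is made up for illustration.

```c
#include <assert.h>

/* Counts the integers x in [1, limit] for which dividing by 3.0 and
 * multiplying by the rounded constant 1.0/3.0 give different doubles. */
int count_reciprocal_mismatches(int limit)
{
    const double third = 1.0 / 3.0;  /* rounded: 1/3 is not exact in binary */
    int mismatches = 0;

    assert(limit >= 1);
    for (int i = 1; i <= limit; i++) {
        double x = (double)i;
        if (x / 3.0 != x * third)
            mismatches++;
    }
    return mismatches;
}
```

Any non-zero count means the substitution changes results in the last bit for some inputs, which is why a standards-conforming compiler leaves the division in place unless told otherwise.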

John Clark Joined: 4 Oct 08 Posts: 1734 Credit: 64,228,409 RAC: 0	Message 8983 - Posted: 25 Jan 2009, 0:22:14 UTC Last modified: 25 Jan 2009, 0:32:23 UTC

Is this a release that will download on a detach and reattach, or will 0.13 for Windows be pulled down again?

No need to answer, as I see I have downloaded 0.14 WUs. So I assume the latest Windows MW client has already been downloaded by BOINC.

Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 9015 - Posted: 25 Jan 2009, 2:08:22 UTC - in response to Message 8977. Last modified: 25 Jan 2009, 2:08:57 UTC

>> As it is divided by a constant that is known at compile time (the 3.0 is hardcoded), any decent compiler will exchange it with a multiplication by 1/3 (calculated at compile time) either way. These kinds of changes are only necessary if one uses an ancient compiler or turns off optimizations.
>
> Actually, for floating-point data, only with -ffast-math would this optimization be automatically performed by the compiler. And, since this option cannot be used for this project, tipping the scale for the compiler is a good rule of thumb.
>
> Moreover, since the compiler does not change the order of floating-point computations, this code should have an edge too:
>
>     irv[i] = ((next_r * next_r * next_r * ia->mu_step_size) - (r * r * r * ia->mu_step_size)) / (3.0 * deg);
>
> Reducing the dependency sequence on out-of-order processors reduces the latency of long operations like these.
>
> HTH

The Linux and OS X binaries are compiled with -ffast-math. I'm pretty sure Dave is compiling Windows with it as well.

Also, the irv[i] = ... is being calculated before the 3 main integral loops, so it's only calculated 'ia->r_steps' times, not once per iteration of the interior loop, which runs ia->r_steps * ia->mu_steps * ia->nu_steps times. So this would probably not be noticeable.
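The point about loop counts can be sketched as follows. This is not the project's code: the struct is a hypothetical stand-in that only borrows the field names mentioned in the thread, the loop order is assumed, and `id` is treated as an opaque per-cell factor. It shows the table being built `r_steps` times up front while the triple loop only performs a lookup and a multiply.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the integral area description. */
typedef struct {
    int r_steps, mu_steps, nu_steps;
    double r_min, r_step_size, mu_step_size;
} area_t;

/* Builds the per-r table once: r_steps evaluations of the volume term.
 * Returns NULL on allocation failure; the caller frees the table. */
double *build_irv(const area_t *ia, double deg)
{
    assert(ia != NULL && ia->r_steps > 0);
    double *irv = malloc((size_t)ia->r_steps * sizeof *irv);
    if (irv == NULL)
        return NULL;
    for (int i = 0; i < ia->r_steps; i++) {
        double r = ia->r_min + i * ia->r_step_size;
        double next_r = r + ia->r_step_size;
        irv[i] = (((next_r * next_r * next_r) - (r * r * r)) / 3.0)
                 * ia->mu_step_size / deg;
    }
    return irv;
}

/* Triple loop that only reads the table. The body runs
 * r_steps * mu_steps * nu_steps times; the count is reported back. */
double sum_volume(const area_t *ia, const double *irv, double id,
                  long *inner_iterations)
{
    double total = 0.0;
    long n = 0;
    for (int mu = 0; mu < ia->mu_steps; mu++)
        for (int nu = 0; nu < ia->nu_steps; nu++)
            for (int r = 0; r < ia->r_steps; r++) {
                total += irv[r] * id;  /* one multiply per cell */
                n++;
            }
    *inner_iterations = n;
    return total;
}
```

With 7 r-steps, 4 mu-steps and 3 nu-steps, the table costs 7 evaluations while the inner body runs 84 times, so shaving a division off the table setup matters far less than shaving one off the inner body.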

speedimic Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0	Message 9018 - Posted: 25 Jan 2009, 2:16:21 UTC - in response to Message 9015.

> The Linux and OS X binaries are compiled with -ffast-math. [...]

Hmm, no signs of that in the makefiles... What about your post from November, where you told us not to use it?

mic.

Logan Joined: 15 Aug 08 Posts: 163 Credit: 3,876,869 RAC: 0	Message 9019 - Posted: 25 Jan 2009, 2:19:58 UTC - in response to Message 9018. Last modified: 25 Jan 2009, 2:20:38 UTC

>> The Linux and OS X binaries are compiled with -ffast-math. [...]
>
> Hmm, no signs of that in the makefiles... What about your post from November, where you told us not to use it?

:D :D :D :D :D.....

Logan.
BOINC FAQ Service (Ahora, también disponible en Español/Now available in Spanish)

Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 9020 - Posted: 25 Jan 2009, 2:21:10 UTC - in response to Message 9019.

>>> The Linux and OS X binaries are compiled with -ffast-math. [...]
>>
>> Hmm, no signs of that in the makefiles... What about your post from November, where you told us not to use it?
>
> :) :) :) :) :).....

Lol, my bad, I think I was thinking of -funroll-loops. Either way, that line is only executed ~700 times, so I don't think optimizing it will have much effect ;P