Posts by Thierry Godefroy

1) Message boards : Number crunching : new workunit limit (Message 6972) Posted 29 Nov 2008 by Thierry Godefroy Post: This is pure nonsense... Even with 20 WUs per core the queue was stalling unless I manually requested more work. The problem is that when BOINC gets the reply "Reached CPU limit" several times in a row, it starts delaying the work request, and in the end the request gets delayed by over 3 hours... And as 20 WUs are crunched in under 110 minutes, you get a queue stall (not to mention it's a hell of a nightmare to get just a few more WUs at the next request). The solution is simple: make it so that the optimized apps need 60 minutes or so to crunch each WU (multiply the work to do per WU by 12). As it is, I would rather crunch for another project than let the queue stall and leave the computer powered on for nothing at all...
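The figures in this post can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted above; the 3-hour back-off is taken as a lower bound for "over 3 hours", and all variable names are made up for the illustration.

```python
# Numbers quoted in the post; names are illustrative only.
wus_per_core = 20          # work units allowed per core
queue_minutes = 110        # time to crunch those 20 WUs ("under 110 minutes")
backoff_minutes = 3 * 60   # work-request delay ("over 3 hours", lower bound)
scale = 12                 # proposed multiplier for the work per WU

minutes_per_wu = queue_minutes / wus_per_core      # current time per WU
idle_minutes = backoff_minutes - queue_minutes     # core idle time per back-off
scaled_minutes_per_wu = minutes_per_wu * scale     # time per WU after scaling
scaled_queue_hours = scaled_minutes_per_wu * wus_per_core / 60

print(f"{minutes_per_wu} min per WU now, {scaled_minutes_per_wu} min after scaling")
print(f"at least {idle_minutes} min idle per back-off")
print(f"a 20-WU queue would last about {scaled_queue_hours} h after scaling")
```

With these inputs, a 20-WU queue would outlast the back-off by a wide margin, which is the point of the proposal.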
2) Message boards : Application Code Discussion : compiler optimization flags (Message 6877) Posted 27 Nov 2008 by Thierry Godefroy Post: You may be right, that in the general case the output may change (most probably very slightly and unnoticeably, one really needs some special cases to see real changes), but I would regard such an algorithm as close to numerically unstable. And believe me, Milkyway isn't at that point. Hell, the bug with the number of the integration points didn't change the (decimal) output! It is noticeable enough in Seti that it makes -ffast-math a no-no for it and does give INVALID results. See: http://www.pperry.f2s.com/boinc-compile-seti.htm (look at the very bottom, two paragraphs before the end). What you guys don't seem to understand is that an error, even on the 15th decimal in one operation, can spread (especially during multiplications) until it becomes quite significant (on the 3rd decimal of the final result, for example). If you still don't want to understand and insist on using -ffast-math, then my guess is that the project admins will end up turning on validation by results comparison (like SETI does), meaning the throughput for the project will be divided by three (as they will need to have each WU calculated by at least three different computers and then compare the results, only returning the ones that are close enough to each other to denote a non-crippled result). This is my last post on this topic.
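The kind of validation by comparison described above can be sketched as follows. This is a hypothetical illustration, not SETI's or BOINC's actual validator: the function name `validate_quorum`, the agreement rule and the tolerance are all assumptions made for the example.

```python
import math

def validate_quorum(results, rel_tol=1e-9):
    """Keep each result that agrees with at least one other result.

    Hypothetical rule: two results "agree" when they are within rel_tol
    of each other (relative tolerance).
    """
    valid = []
    for i, value in enumerate(results):
        others = (r for j, r in enumerate(results) if j != i)
        if any(math.isclose(value, other, rel_tol=rel_tol) for other in others):
            valid.append(value)
    return valid

# Three hosts return the same WU; the third result is off in the 3rd decimal.
results = [-3.1234567890123, -3.1234567890124, -3.1250000000000]
print(validate_quorum(results))
```

Each WU is computed three times here to produce one accepted result, which is where the "divided by three" throughput estimate comes from.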
3) Message boards : Application Code Discussion : compiler optimization flags (Message 6827) Posted 26 Nov 2008 by Thierry Godefroy Post: This would be true (if it were not more a problem of truncation than of 5/5 rounding) for addition, but this is no longer true with multiplication. The error goes both ways for multiplications too. As I said before, given that the finite math of floating-point calculations on computers implies an error of 0.5 bit of the mantissa, by your rationale all calculations done on a computer would be increasingly wrong, which is patently incorrect. You obviously haven't studied numerical analysis... I did (even if it was looong ago). We are not speaking about benchmarking here, but about science applications where accuracy does matter. I mentioned SPEC because it's a benchmark that REQUIRES AND VERIFIES correct results. So, if -ffast-math doesn't affect the results of over 20 scientific applications from SPEC CPU2006, I doubt that it would affect Milkyway. I think the admins said it in this very thread: they WANT maximum accuracy. Deal with it. But if you prefer to live with your preconceptions and ignore the facts, fine. These are not preconceptions but actual facts I could verify several times by myself. If you don't trust me, perhaps you will trust someone else: here is one of the many examples (here for Seti) of -ffast-math problems (Google a bit and you'll find many others): http://www.pperry.f2s.com/boinc-compile-seti.htm (look at the very bottom, two paragraphs before the end). Enough said. Indeed!
4) Message boards : Application Code Discussion : compiler optimization flags (Message 6815) Posted 26 Nov 2008 by Thierry Godefroy Post: An error is an error in both directions, sometimes up, sometimes down. So it stays in the 15th digit. This would be true (if it were not more a problem of truncation than of 5/5 rounding) for addition, but this is no longer true with multiplication. -ffast-math is used in SPEC benchmarks, which validate the results for acceptance, without any issue. We are not speaking about benchmarking here, but about science applications where accuracy does matter. Running optimized apps is a good thing, because you can compute more results in less time, meaning also that you will consume less power for each result (good for the planet). But this must not be at the cost of invalid results, as then the calculations you did are a pure waste for the project itself. Worse: unlike Seti and many other BOINC projects, Milkyway does not compare your results with those of others, meaning that if you send a slightly wrong result, their validator will not be able to notice the problem and this actually invalid result will pollute the science project. Optimization must not be done at the cost of poorer science results. Period.
5) Message boards : Application Code Discussion : compiler optimization flags (Message 6798) Posted 26 Nov 2008 by Thierry Godefroy Post: We're talking about a difference smaller than 15 decimal digits! If the output of the application is truncated to the default 5 digits, it'll never even show up. You don't seem to understand that in a chain of many operations (or worse: in a loop where the same operation uses the result of the previous iteration, as in recurrence sequences), your 15th decimal error will grow to the 14th, then the 13th, etc., and this with every dozen or so operations. In the end, the error might show on the 5th, 4th or even 3rd decimal, depending on how many loops you went through... and, precisely, applications such as BOINC's all rely on complex calculations done within numerous loops. Don't use -ffast-math. Period.
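The growth of a tiny error through a loop that feeds each result into the next iteration can be illustrated with a deliberately unstable recurrence. This is a toy example chosen for the illustration, not Milkyway code: in exact arithmetic 1/3 is a fixed point of x -> 4*x - 1, but the rounding error in the double-precision value of 1/3 is multiplied by 4 at every step, which is far more aggressive than the "one digit per dozen operations" estimate above.

```python
def iterate(x, steps):
    """Apply x -> 4*x - 1 repeatedly; 1/3 is a fixed point in exact arithmetic."""
    for _ in range(steps):
        x = 4.0 * x - 1.0
    return x

start = 1.0 / 3.0  # already off by a tiny rounding error in double precision
for steps in (0, 10, 20, 30):
    value = iterate(start, steps)
    print(steps, value, abs(value - start))
```

How fast a real application loses digits depends on its own arithmetic; the sketch only shows that an error in the last digits does not have to stay there.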
6) Message boards : Application Code Discussion : compiler optimization flags (Message 6785) Posted 26 Nov 2008 by Thierry Godefroy Post: And here's what they mean:
-fno-math-errno: don't bother to set the global variable errno in case of a math error (diagnostic purposes).
-funsafe-math-optimizations: IEEE754 calls for following the order in the source code, ruling out commutative operations, such as two multiplies, or associative operations, such as multiplication of a sum by a factor; this option does away with this restriction.
-fno-trapping-math: some math operations can result in math errors, such as log (-1), so compilers tip-toe around such operations at the expense of performance; this option assumes that such invalid operations won't occur (diagnostic purposes).
-ffinite-math-only: assumes that intermediary results will always be normal numbers (e.g., never infinite).
-fno-rounding-math: this is the default anyway.
-fno-signaling-nans: assumes that invalid results will not raise an exception (diagnostic purposes).
As you can see, most options are for error reporting purposes and the only option that causes different results is -funsafe-math-optimizations.
Here is what 'man gcc' says:
-funsafe-math-optimizations
Allow optimizations for floating-point arithmetic that (a) assume that arguments and results are valid and (b) may violate IEEE or ANSI standards. When used at link-time, it may include libraries or startup files that change the default FPU control word or other similar optimizations. This option is not turned on by any -O option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications for math functions. It may, however, yield faster code for programs that do not require the guarantees of these specifications. Enables -fno-signed-zeros, -fno-trapping-math, -fassociative-math and -freciprocal-math. The default is -fno-unsafe-math-optimizations.
I can assure you that using -ffast-math in optimized apps such as Seti's can lead to INVALID results (i.e. results considered as not precise enough when SETI@home validates your results by comparing them with others). but different by one or two bits in the mantissa, which never shows up when outputting the values in decimal format. One or two bits of mantissa, perhaps, but for each operation: the result after many consecutive ops can be quite significant. Let me give you an example. Let's assume we only have 7 decimal places of precision in an FPU (there are many more in modern FPUs, but that's just to make it easier in this example), and take this simple operation: 15 * 10 / 1000000000 = 0.00000015 (truncated to 0.0000001 because of our 7-decimal limitation). Should it be optimized (for example, because of out-of-order ops optimizations) as: 10 / 1000000000 * 15, then you get 10 / 1000000000 = 0.00000001 = 0.0000000 (7 decimals) and 0.0000000 * 15 = 0.0000000 in the end... Believe me, the above effect is far from negligible...
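The 7-decimal example can be reproduced with Python's decimal module. This is a sketch of the toy model only: truncating every intermediate result to 7 decimal places is the simplification used in the post, not how a real binary FPU rounds, and the helper names `trunc7`, `mul` and `div` are made up for the illustration.

```python
from decimal import Decimal, ROUND_DOWN

SEVEN_PLACES = Decimal("0.0000001")

def trunc7(value):
    """Truncate a value to 7 decimal places, like the toy FPU in the post."""
    return Decimal(value).quantize(SEVEN_PLACES, rounding=ROUND_DOWN)

def mul(a, b):
    """Multiply, then truncate the result to 7 decimal places."""
    return trunc7(Decimal(a) * Decimal(b))

def div(a, b):
    """Divide, then truncate the result to 7 decimal places."""
    return trunc7(Decimal(a) / Decimal(b))

# Order written in the source: (15 * 10) / 1000000000
source_order = div(mul(15, 10), 1000000000)
# Order after the hypothetical reordering: (10 / 1000000000) * 15
reordered = mul(div(10, 1000000000), 15)

print(format(source_order, "f"))
print(format(reordered, "f"))
```

The two orders are equal in exact arithmetic; with truncation after each step, the second one loses the result entirely.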
7) Message boards : Application Code Discussion : milkyway code releases (Message 6746) Posted 26 Nov 2008 by Thierry Godefroy Post: Could just be that there isn't much work at the moment. There are over 1200 WUs available for download according to the server status page, and they won't download either when I set Milkyway back to the old app (same "No work sent" report, indicating there are WUs but that they don't match the app version). I'd rather say that there are no old app WUs left and that I didn't get the version and/or app name right in my app_info.xml file...
8) Message boards : Application Code Discussion : compiler optimization flags (Message 6745) Posted 26 Nov 2008 by Thierry Godefroy Post: Here are the opt flags I'm using on a Core2 Duo in 32-bit Linux (gcc 4.1): -O2 -fomit-frame-pointer -frename-registers -fweb -fexpensive-optimizations -fno-strict-aliasing -march=i686 -msse3 -mfpmath=sse Note about flags seen above in this thread: -ffast-math shall not be used in projects where strict IEEE math is required (it can cause problems because it skips a lot of validity tests and math exceptions, and may also lead to bad rounding (inferior precision on decimals): a no-no for Seti, for instance. I don't know about Milkyway).
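One reason strict IEEE math matters is that floating-point addition is not associative, so a compiler that is allowed to regroup operations can change the result. The sketch below regroups the operations by hand in Python (which does not reorder them itself); it illustrates the effect only and says nothing about what gcc actually does to a given program.

```python
# Same three numbers, two groupings.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right, left, right)

# A more extreme case: a small term absorbed by a large one.
big, one = 1e16, 1.0
absorbed = (big + one) - big    # the 1.0 is lost when added to 1e16 first
preserved = (big - big) + one   # regrouped, the 1.0 survives
print(absorbed, preserved)
```

Both groupings are valid evaluations of the same mathematical expression, which is why a flag that permits regrouping can change results without any bug in the source.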
9) Message boards : Application Code Discussion : milkyway code releases (Message 6743) Posted 26 Nov 2008 by Thierry Godefroy Post: I managed to compile the code (with BOINC compiled in /usr/src/rpm/BUILD/BOINC) for 32-bit Linux. Here is my Makefile:

#-- Makefile --
#
# usage:
#
# make -f make.linux app_x86_64
#
# make -f make.linux app_i686
#
#
# change BOINC_DIR locally or define it on the make command line
#

APP_VERSION = 0.3

BOINC_DIR = /usr/src/rpm/BUILD/boinc
BOINC_API_DIR = $(BOINC_DIR)/api
BOINC_LIB_DIR = $(BOINC_DIR)/lib

BOINC_LINK_DIR = /usr/src/rpm/BUILD/boinc
BOINC_LIB_LINK_DIR = $(BOINC_LINK_DIR)/lib
BOINC_API_LINK_DIR = $(BOINC_LINK_DIR)/api

VARIANTFLAGS = -DGMLE_BOINC

CXXFLAGS_ALL = $(VARIANTFLAGS) \
	-O2 -fomit-frame-pointer -frename-registers -fweb -fexpensive-optimizations -fno-strict-aliasing -march=i686 -msse3 -mfpmath=sse -pipe \
	-I$(BOINC_DIR) -I$(BOINC_LIB_DIR) -I$(BOINC_API_DIR)

CXX_i686 = g++
CXXFLAGS_i686 = -m32 $(CXXFLAGS_ALL)
LDFLAGS_i686 = -L/usr/X11R6/lib -L$(BOINC_LIB_LINK_DIR) -L$(BOINC_API_LINK_DIR)

CXX_x86_64 = g++
CXXFLAGS_x86_64 = $(CXXFLAGS_ALL)
LDFLAGS_x86_64 = -L/usr/local/lib
#LDFLAGS_x86_64 = -L/usr/local/lib -L/usr/X11R6/lib
#LDFLAGS_x86_64 = -L$(BOINC_LIB_LINK_DIR) -L$(BOINC_API_LINK_DIR)

APP_DIR = ../astronomy
FGDO_DIR = ..
SCHED_DIR = $(FGDO_DIR)/evaluation
SEARCH_DIR = $(FGDO_DIR)/searches
UTIL_DIR = $(FGDO_DIR)/util

APP_OBJS = \
	$(APP_DIR)/boinc_astronomy.o \
	$(APP_DIR)/atSurveyGeometry.o \
	$(APP_DIR)/numericalIntegration.o \
	$(APP_DIR)/parameters.o \
	$(APP_DIR)/probability.o \
	$(APP_DIR)/stCoords.o \
	$(APP_DIR)/stCnum.o \
	$(APP_DIR)/stMath.o \
	$(APP_DIR)/stVector.o \
	$(APP_DIR)/star_points.o \
	$(APP_DIR)/evaluation_optimized.o \
	$(APP_DIR)/evaluation_state.o

SEARCH_OBJS = \
	$(SEARCH_DIR)/search_parameters.o

UTIL_OBJS = \
	$(UTIL_DIR)/io_util.o \
	$(UTIL_DIR)/settings.o

PROGS = milkyway_$(APP_VERSION)_i686-pc-linux-gnu milkyway_$(APP_VERSION)_x86_64-pc-linux-gnu

all: $(PROGS)

app_i686: OBJ_CXX = $(CXX_i686)
app_i686: OBJ_CXXFLAGS = $(CXXFLAGS_i686)
app_i686: $(APP_OBJS) $(SEARCH_OBJS) $(UTIL_OBJS) $(BOINC_API_LINK_DIR)/libboinc_api.a $(BOINC_LIB_LINK_DIR)/libboinc.a
	$(CXX_i686) $(LDFLAGS_i686) $(CXXFLAGS_i686) -Wl --export_dynamic -o milkyway_$(APP_VERSION)_i686-pc-linux-gnu $(APP_OBJS) $(SEARCH_OBJS) $(UTIL_OBJS) -lm -lboinc_api -lboinc -pthread

app_x86_64: OBJ_CXX = $(CXX_x86_64)
app_x86_64: OBJ_CXXFLAGS = $(CXXFLAGS_x86_64)
app_x86_64: $(APP_OBJS) $(SEARCH_OBJS) $(UTIL_OBJS) $(BOINC_API_LINK_DIR)/libboinc_api.a $(BOINC_LIB_LINK_DIR)/libboinc.a
	$(CXX_x86_64) $(LDFLAGS_x86_64) $(CXXFLAGS_x86_64) -Wl --export_dynamic -o milkyway_$(APP_VERSION)_x86_64-pc-linux-gnu $(APP_OBJS) $(SEARCH_OBJS) $(UTIL_OBJS) -lm -lboinc_api -lboinc -pthread

.C.o:
	$(OBJ_CXX) $(OBJ_CXXFLAGS) $(INC) -Wall -x c++ -c $< -o $@

.c.o:
	$(OBJ_CXX) $(OBJ_CXXFLAGS) $(INC) -Wall -x c++ -c $< -o $@

clean:
	rm -f $(APP_OBJS) $(UTIL_OBJS) $(SEARCH_OBJS);

clean_all:
	rm -f $(PROGS) $(APP_OBJS) $(UTIL_OBJS) $(SEARCH_OBJS);

However, it looks like I can't get BOINC to download WUs for the new app. It reports:

mer. 26 nov. 2008 01:53:35 CET|Milkyway@home|Sending scheduler request: To fetch work. Requesting 227032 seconds of work, reporting 0 completed tasks
mer. 26 nov. 2008 01:53:40 CET|Milkyway@home|Scheduler request succeeded: got 0 new tasks
mer. 26 nov. 2008 01:53:40 CET|Milkyway@home|Message from server: No work sent

while there is work waiting on the servers (the "Message from server: No work sent" is usually an indication that there is no WU matching the application version). I guess I have something wrong in my app_info.xml file, but I can't find out what exactly. Here it is:

<app_info>
    <app>
        <name>milkyway</name>
    </app>
    <file_info>
        <name>milkyway_0.3_i686-pc-linux-gnu</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway</app_name>
        <version_num>0.3</version_num>
        <file_ref>
            <file_name>milkyway_0.3_i686-pc-linux-gnu</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>

I tried to change name & app_name to "astronomy" and version_num to 2, 3, 0.2, 0.3 & 1.22, but all to no avail... What am I doing wrong, please?
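Before hunting for a name or version mismatch, it can be worth ruling out a plain XML syntax problem in app_info.xml, such as a closing tag typed as an opening one. A minimal sketch, assuming Python is at hand; the two sample strings are hypothetical and only illustrate the check, which tests well-formedness and nothing BOINC-specific.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples: the second one ends with an opening tag by mistake.
GOOD = "<app_info><app><name>milkyway</name></app></app_info>"
BAD = "<app_info><app><name>milkyway</name></app><app_info>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

To check a real file, the same function can be given the file's contents, for example `is_well_formed(open("app_info.xml").read())`.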