N-body updated to 0.40
Author | Message |
---|---|
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
The N-body simulation has been updated to 0.40. All systems now use OpenMP for threading. The old static JSON configuration has been replaced with Lua, so we can have totally arbitrary initial distributions of particles. We'll now be fitting dwarf models with multiple components (e.g. a dark matter shell around the dwarf galaxy). The old applications won't work, and neither will any old search workunits still left in the system.
Since some people want download links, here are the source and binaries:
Source:
http://milkyway.cs.rpi.edu/milkyway/download/src/milkyway_nbody_0.40.tar.xz
Linux:
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_x86_64-pc-linux-gnu__mt
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_i686-pc-linux-gnu__mt
OS X:
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_x86_64-apple-darwin__mt
Windows (the DLLs are needed as well):
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_windows_intelx86__mt.exe
http://milkyway.cs.rpi.edu/milkyway/download/libgomp-1.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2.dll
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_windows_x86_64__mt.exe
http://milkyway.cs.rpi.edu/milkyway/download/libgomp_64-1.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2_64.dll |
Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
My first work unit with the new engine, work unit 261601923, errored out. The same happened to every other cruncher who tried it. Could you please check what is going on? Judging by the stderr output of the returned results, my guess is that your server is sending out old JSON work units to the new Lua engine. |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"My initial work unit with this new engine, work unit 261601923, errored out. This also happened to all other crunchers who tried to handle this. Could you please check to see what is going on? My guess is that your server is trying to send out old JSON work units with the new Lua engine, based on what is being written in the stderr out belonging to returned results."
A lot of old units are still being sent out for some reason even though the searches are stopped. |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
I just restarted the search (actually 3 times). There are still some junk workunits left in the system which don't work correctly. The de_nbody_orphan_test_2model_4_* workunits are the current good search, and seem to be working fine for me. Some of the new junk workunits fail instantly when they can't find the input file. Some of the others run way too quickly and don't appear to have progress bars because the order of some of the parameters was switched from what it should be. |
Joined: 5 Apr 09 Posts: 71 Credit: 6,120,786 RAC: 0 |
Hello, all my N-body 0.40 units end in error... a pity. <core_client_version>6.12.19</core_client_version> <![CDATA[ <message> - exit code -1073741515 (0xc0000135) </message> ]]> Team Alliance francophone, boinc: 7.0.18 GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470 |
Joined: 13 Mar 11 Posts: 1 Credit: 269,865 RAC: 0 |
I keep getting errors on the milkyway_nbody_0.40_windows_x86_64__mt.exe WU... I shouldn't have to download that separately, should I? And if so, where would I even install it to? |
Joined: 29 Sep 10 Posts: 54 Credit: 1,386,559 RAC: 0 |
WAIT ONE MINUTE! How do I set the number of CPUs that the nbody 0.40 app uses inside my app_info.xml file? Each process is using 4 CPUs, but BOINC doesn't seem aware of this. Edit: Scratch that; each process uses ALL my CPUs without regard to any BOINC preferences about MAX CPUs! How do I limit how many threads this new nBody app can use ?!? |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
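"WAIT ONE MINUTE!"
BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things. |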
Joined: 29 Sep 10 Posts: 54 Credit: 1,386,559 RAC: 0 |
"BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things."
<avg_ncpus> and <max_ncpus> only TELL BOINC how many CPUs a science app uses; they do NOT dictate anything to the app itself. I tried the parameter --nthreads=4 and set both ncpus params to 4. Hard-coding the number of threads seems to work. Thanks for the tip. |
Joined: 25 Jun 10 Posts: 3 Credit: 20,219,110 RAC: 0 |
My milkyway_nbody_0.40_windows_x86_64__mt.... keeps failing and "has stopped working" (there were 32 of those "stopped working" message windows for me to close this morning). BOINC shows that de_nbody_orphan_test_2model_4_57647_1301567303_0 is what was being processed and that it had a computation error (I don't know about the other 31 "stopped working" windows). This has been happening since the release of the 0.40 version. My OS is Windows Vista x64 on a quad-core Intel processor. |
Joined: 16 Jun 10 Posts: 6 Credit: 7,402,186 RAC: 0 |
I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue? |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"my milkyway_nbody_0.40_windows_x86_64__mt.... keeps failing and "has stopped working" (32 of those stopped working message windows for me to close this morning)."
From looking at your tasks, the only one left that I can check had the "failed to load DLL" error. Did you install it manually and not get the DLLs? Try resetting the project. |
Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue?"
It's a workunit issue. For some reason there are still some lingering old workunits, and the new version rejects the old format. They should have stopped completely, but at least the percentage of them is decreasing. |
Joined: 29 Sep 10 Posts: 54 Credit: 1,386,559 RAC: 0 |
"I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue?"
Funnily enough, I tried to keep both versions in my app_info.xml file so I could process both old and new WUs, but BOINC deletes the old version from disk upon startup! I hope there are still a few people out there running the old version long enough to empty the system of the older WUs. |
Joined: 14 Dec 07 Posts: 9 Credit: 10,455,243 RAC: 1,227 |
I seem to have received one that works! The previous one errored out fast.
Name: de_nbody_orphan_test_2model_4_44832_1301540627_1
Workunit: 263052631
Created: 31 Mar 2011 17:14:49 UTC
Sent: 31 Mar 2011 17:25:20 UTC
Received: 31 Mar 2011 17:36:51 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 205494
Report deadline: 8 Apr 2011 17:25:20 UTC
Run time: 276.415709
CPU time: 2744.775
stderr out:
<core_client_version>6.10.60</core_client_version>
<![CDATA[
<stderr_txt>
<search_application>milkywayathome nbody 0.40 Windows x86_64 double OpenMP Crlibm</search_application>
13:25:06: Using OpenMP 18 max threads on a system with 12 processors
13:30:28: Making final checkpoint
13:30:28: Simulation complete
<search_likelihood>-469.276081432873</search_likelihood>
13:30:28 (6080): called boinc_finish
</stderr_txt>
]]>
Validate state: Checked, but no consensus yet
Claimed credit: 19.9490033746374
Granted credit: 0
Application version: MilkyWay@Home N-Body Simulation v0.40 (mt) |
Joined: 5 Apr 09 Posts: 71 Credit: 6,120,786 RAC: 0 |
"BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things."
OK.
Team Alliance francophone, boinc: 7.0.18 GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470 |
Joined: 7 Oct 07 Posts: 25 Credit: 35,904,118 RAC: 2,744 |
Can someone please post a working app_info.xml for the new application? When life gives you lemons, make lemonade! |
Joined: 29 Sep 10 Posts: 54 Credit: 1,386,559 RAC: 0 |
"can someone please post a working app_info.xml for the new application?"
See my first post: http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2301 |
Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
On Windows you have to include the DLL names in app_info.xml too. I only made a quick test, but it seems to basically work on my win32 system.
Add that to your app_info.xml and watch out for the bold marked stuff. The plan_class line is not really necessary. For win64 you have to use milkyway_nbody_0.40_windows_x86_64__mt.exe, libgomp_64-1.dll and pthreadGC2_64.dll. If you want to change the number of threads used, you have to change the cmdline param and let BOINC know about it with the two ncpus params. Example 1: <avg_ncpus>2</avg_ncpus> <max_ncpus>2</max_ncpus> <cmdline>--nthreads=2</cmdline> This runs a WU with 2 threads and tells BOINC that 2 cores per WU are used. Example 2: <avg_ncpus>1</avg_ncpus> <max_ncpus>1</max_ncpus> <cmdline>--nthreads=1</cmdline> This runs 1 WU per core with no multithreading. This one should not need the two ncpus lines. Please test it yourself. Be warned: there seem to be problems running nbody and a GPU app at the same time when using app_info settings for both. My current guess is that it has to do with the ncpus counting of all active apps that BOINC does. I saw situations with 1 core unused (when using fewer threads than the number of cores I have) or even the GPU unused. I never saw that before, when the CPU apps did not need any ncpus settings. Solutions? (Besides switching off multithreading and not using the ncpus lines?) Thoughts, additions and corrections welcome :) |
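For readers on 64-bit Windows, the file names listed above can be dropped into the same layout as the 32-bit block quoted further down this thread. The following is only a sketch under assumptions: the element layout is copied from that 32-bit block, the enclosing `<app_info>` root is what BOINC's anonymous-platform mechanism expects, and the thread count of 2 simply mirrors Example 1. It has not been tested against a live client.

```xml
<app_info>
    <app>
        <name>milkyway_nbody</name>
        <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
    </app>
    <!-- The executable and both DLLs must be listed, using the win64 names. -->
    <file_info>
        <name>milkyway_nbody_0.40_windows_x86_64__mt.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libgomp_64-1.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>pthreadGC2_64.dll</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway_nbody</app_name>
        <version_num>40</version_num>
        <!-- plan_class is described in the post as not really necessary -->
        <plan_class>mt</plan_class>
        <!-- Tell BOINC how many cores one WU occupies... -->
        <avg_ncpus>2</avg_ncpus>
        <max_ncpus>2</max_ncpus>
        <!-- ...and tell the application to actually use that many threads. -->
        <cmdline>--nthreads=2</cmdline>
        <file_ref>
            <file_name>milkyway_nbody_0.40_windows_x86_64__mt.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>libgomp_64-1.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>pthreadGC2_64.dll</file_name>
        </file_ref>
    </app_version>
</app_info>
```

Keeping the two ncpus values and the `--nthreads` value identical follows the advice in the post: the first two only inform BOINC's scheduler, while the command-line option is what limits the application.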
Joined: 10 Nov 09 Posts: 20 Credit: 30,395,419 RAC: 0 |
After I make an app_info file and copy this into it:
<app><!-- std app for N-Body 0.40 mt 32bit -->
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
</app>
<file_info>
    <name>milkyway_nbody_0.40_windows_intelx86__mt.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>libgomp-1.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>pthreadGC2.dll</name>
    <executable/>
</file_info>
<app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>40</version_num>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
        <file_name>milkyway_nbody_0.40_windows_intelx86__mt.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>libgomp-1.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>pthreadGC2.dll</file_name>
    </file_ref>
</app_version>
I keep getting a "parse error in app_info.xml; check XML syntax" in red in my messages. |
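One possible cause, offered as an assumption rather than a diagnosis: the pasted text has several top-level elements and no single root, whereas BOINC's anonymous-platform file is expected to be wrapped in one `<app_info>` element. If the file contains nothing but the pasted text, a wrapper along these lines may be what is missing. The skeleton below reuses only names from the post above and marks where the remaining blocks would go.

```xml
<app_info>
    <app>
        <name>milkyway_nbody</name>
        <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
    </app>
    <!-- the three <file_info> blocks from the post go here, unchanged -->
    <!-- the <app_version> block from the post goes here, unchanged -->
</app_info>
```

If the wrapper is already present, the next things worth checking are stray characters introduced by copying from the forum page and a file name other than exactly app_info.xml.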
©2024 Astroinformatics Group