N-body updated to 0.40

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0

Message 46796 - Posted: 28 Mar 2011, 22:45:24 UTC
Last modified: 29 Mar 2011, 3:27:02 UTC

The N-body simulation has been updated to 0.40. All systems are now using OpenMP for threading. The old static JSON configuration we were using has been replaced with Lua (so we can have totally arbitrary initial distributions of particles). Now we'll be fitting dwarf models with multiple components (e.g. a dark matter shell around the dwarf galaxy).

The old applications won't work. Any old search workunits still left also won't work.

Since some people sometimes want download links, here's the source and binaries:
Source:
http://milkyway.cs.rpi.edu/milkyway/download/src/milkyway_nbody_0.40.tar.xz

Linux:
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_x86_64-pc-linux-gnu__mt
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_i686-pc-linux-gnu__mt

OS X:
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_x86_64-apple-darwin__mt

Windows (the DLLs are also needed):
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_windows_intelx86__mt.exe
http://milkyway.cs.rpi.edu/milkyway/download/libgomp-1.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2.dll

http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_windows_x86_64__mt.exe
http://milkyway.cs.rpi.edu/milkyway/download/libgomp_64-1.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2_64.dll

Jesse Viviano
Joined: 4 Feb 11
Posts: 82
Credit: 36,682,582
RAC: 15,561

Message 46800 - Posted: 28 Mar 2011, 23:46:34 UTC

My initial work unit with this new engine, work unit 261601923, errored out. This also happened to all the other crunchers who tried it. Could you please check what is going on? My guess, based on the stderr output of the returned results, is that your server is sending out old JSON work units to the new Lua engine.

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0

Message 46801 - Posted: 28 Mar 2011, 23:58:37 UTC - in response to Message 46800.

My initial work unit with this new engine, work unit 261601923, errored out. This also happened to all the other crunchers who tried it. Could you please check what is going on? My guess, based on the stderr output of the returned results, is that your server is sending out old JSON work units to the new Lua engine.
A lot of old units are still being sent out for some reason even though the searches are stopped.

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0

Message 46807 - Posted: 29 Mar 2011, 2:29:34 UTC - in response to Message 46796.
Last modified: 29 Mar 2011, 2:31:17 UTC

I just restarted the search (actually 3 times). There are still some junk workunits left in the system which don't work correctly. The de_nbody_orphan_test_2model_4_* workunits are the current good search, and seem to be working fine for me. Some of the new junk workunits fail instantly when they can't find the input file. Some of the others run way too quickly and don't appear to have progress bars because the order of some of the parameters was switched from what it should be.

[AF>EDLS] Polynesia
Joined: 5 Apr 09
Posts: 71
Credit: 6,120,786
RAC: 0

Message 46824 - Posted: 29 Mar 2011, 18:51:14 UTC

Hello,

All my N-body 0.40 units end in errors... too bad.

<core_client_version>6.12.19</core_client_version>
<![CDATA[
<message>
- exit code -1073741515 (0xc0000135)
</message>
]]>
____________
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470

Retslag1
Joined: 13 Mar 11
Posts: 1
Credit: 269,865
RAC: 0

Message 46855 - Posted: 31 Mar 2011, 1:50:28 UTC
Last modified: 31 Mar 2011, 1:55:27 UTC

I keep getting errors on the milkyway_nbody_0.40_windows_x86_64__mt.exe WU... I shouldn't have to download that separately, should I? And if so, where would I even install it to?

DJStarfox
Joined: 29 Sep 10
Posts: 53
Credit: 924,662
RAC: 0

Message 46858 - Posted: 31 Mar 2011, 5:43:02 UTC - in response to Message 46796.
Last modified: 31 Mar 2011, 6:11:22 UTC

WAIT ONE MINUTE!

How do I set the number of CPUs that the nbody 0.40 app uses inside my app_info.xml file? Each process is using 4 CPUs, but BOINC doesn't seem aware of this.

Edit: Scratch that; each process uses ALL my CPUs without regard to any BOINC preferences about MAX CPUs!

How do I limit how many threads this new nBody app can use ?!?

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0

Message 46860 - Posted: 31 Mar 2011, 13:32:43 UTC - in response to Message 46858.

WAIT ONE MINUTE!

How do I set the number of CPUs that the nbody 0.40 app uses inside my app_info.xml file? Each process is using 4 CPUs, but BOINC doesn't seem aware of this.

Edit: Scratch that; each process uses ALL my CPUs without regard to any BOINC preferences about MAX CPUs!

How do I limit how many threads this new nBody app can use ?!?
BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things.

DJStarfox
Joined: 29 Sep 10
Posts: 53
Credit: 924,662
RAC: 0

Message 46861 - Posted: 31 Mar 2011, 14:34:54 UTC - in response to Message 46860.

BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things.


<avg_ncpus> and <max_ncpus> only TELL boinc how many CPUs a sci app uses; it does NOT dictate to the sci app. I tried param --nthreads=4 and set the ncpus params both to 4. Hard-coding the number of threads seems to work. Thanks for the tip.

Vicki
Joined: 25 Jun 10
Posts: 3
Credit: 19,202,309
RAC: 34,879

Message 46863 - Posted: 31 Mar 2011, 15:55:59 UTC

My milkyway_nbody_0.40_windows_x86_64__mt.... keeps failing and "has stopped working" (32 of those "stopped working" message windows for me to close this morning).

Looking in BOINC, it shows that de_nbody_orphan_test_2model_4_57647_1301567303_0 is what was being processed, and it had a computation error (I don't know about the other 31 "stopped working" windows).

This has been happening since the release of the 0.40 version.

My OS is Windows Vista x64 on a quad core Intel processor.

pstehno
Joined: 16 Jun 10
Posts: 6
Credit: 7,402,186
RAC: 0

Message 46864 - Posted: 31 Mar 2011, 16:42:22 UTC

I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue?

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0

Message 46865 - Posted: 31 Mar 2011, 16:44:27 UTC - in response to Message 46863.

My milkyway_nbody_0.40_windows_x86_64__mt.... keeps failing and "has stopped working" (32 of those "stopped working" message windows for me to close this morning).

Looking in BOINC, it shows that de_nbody_orphan_test_2model_4_57647_1301567303_0 is what was being processed, and it had a computation error (I don't know about the other 31 "stopped working" windows).

This has been happening since the release of the 0.40 version.

My OS is Windows Vista x64 on a quad core Intel processor.
From looking at your tasks, the only one left that I can check had the "failed to load DLL" error. Did you try installing it manually and not get the DLLs? Try resetting the project.

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0

Message 46866 - Posted: 31 Mar 2011, 16:46:48 UTC - in response to Message 46864.

I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue?
For some reason there are still some lingering old workunits (though they should have stopped completely, the percentage of them is decreasing), and the new version rejects the old format, so it's a workunit issue.

DJStarfox
Joined: 29 Sep 10
Posts: 53
Credit: 924,662
RAC: 0

Message 46868 - Posted: 31 Mar 2011, 17:43:38 UTC - in response to Message 46866.

I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue?

For some reason there are still some lingering old workunits (though they should have stopped completely, the percentage of them is decreasing), and the new version rejects the old format, so it's a workunit issue.


Funnily enough, I tried to keep both versions in my app_info.xml file so I could process both old and new WUs. But BOINC deletes the old version from disk upon startup!

I hope there are still a few people out there running the old version long enough to empty the system of the older WUs.

grumpy
Joined: 14 Dec 07
Posts: 6
Credit: 2,816,104
RAC: 3

Message 46869 - Posted: 31 Mar 2011, 17:54:13 UTC

I seem to have received one that works!
The previous one errored out fast.





Name: de_nbody_orphan_test_2model_4_44832_1301540627_1
Workunit: 263052631
Created: 31 Mar 2011 17:14:49 UTC
Sent: 31 Mar 2011 17:25:20 UTC
Received: 31 Mar 2011 17:36:51 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 205494
Report deadline: 8 Apr 2011 17:25:20 UTC
Run time: 276.415709
CPU time: 2744.775
Stderr output:

<core_client_version>6.10.60</core_client_version>
<![CDATA[
<stderr_txt>
<search_application>milkywayathome nbody 0.40 Windows x86_64 double OpenMP Crlibm</search_application>
13:25:06: Using OpenMP 18 max threads on a system with 12 processors
13:30:28: Making final checkpoint
13:30:28: Simulation complete
<search_likelihood>-469.276081432873</search_likelihood>
13:30:28 (6080): called boinc_finish

</stderr_txt>
]]>

Validate state: Checked, but no consensus yet
Claimed credit: 19.9490033746374
Granted credit: 0
Application version: MilkyWay@Home N-Body Simulation v0.40 (mt)

[AF>EDLS] Polynesia
Joined: 5 Apr 09
Posts: 71
Credit: 6,120,786
RAC: 0

Message 46873 - Posted: 31 Mar 2011, 20:42:54 UTC - in response to Message 46861.
Last modified: 31 Mar 2011, 20:43:58 UTC

BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things.


<avg_ncpus> and <max_ncpus> only TELL boinc how many CPUs a sci app uses; it does NOT dictate to the sci app. I tried param --nthreads=4 and set the ncpus params both to 4. Hard-coding the number of threads seems to work. Thanks for the tip.

ok
____________
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470

Raimund Barbeln
Joined: 7 Oct 07
Posts: 25
Credit: 29,063,967
RAC: 0

Message 46880 - Posted: 1 Apr 2011, 6:55:45 UTC - in response to Message 46861.

can someone please post a working app_info.xml for the new application?
____________
When life gives you lemons, make lemonade!

DJStarfox
Joined: 29 Sep 10
Posts: 53
Credit: 924,662
RAC: 0

Message 46886 - Posted: 1 Apr 2011, 13:56:20 UTC - in response to Message 46880.

can someone please post a working app_info.xml for the new application?


See my first post:
http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2301

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0

Message 46895 - Posted: 2 Apr 2011, 0:01:32 UTC
Last modified: 2 Apr 2011, 0:04:17 UTC

On Windows you have to include the dll names in the app_info.xml too.

Only made a quick test, but it seems to basically work on my win32 system.


<app><!-- std app for N-Body 0.40 mt 32bit -->
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
</app>
<file_info>
    <name>milkyway_nbody_0.40_windows_intelx86__mt.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>libgomp-1.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>pthreadGC2.dll</name>
    <executable/>
</file_info>

<app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>40</version_num>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
        <file_name>milkyway_nbody_0.40_windows_intelx86__mt.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>libgomp-1.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>pthreadGC2.dll</file_name>
    </file_ref>
</app_version>


Add that to your app_info.xml and watch out for the bold marked stuff.
The plan_class line is not really necessary.

For win64 you have to use:
milkyway_nbody_0.40_windows_x86_64__mt.exe
libgomp_64-1.dll
pthreadGC2_64.dll
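
Written out in full, the win64 substitution might look like the sketch below. It simply mirrors the 32-bit fragment with the three 64-bit file names swapped in. Like the 32-bit version it is a fragment meant to sit inside an existing app_info.xml, the thread count of 4 is only an example value, and this variant has not been tested in this thread.

```xml
<app>
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
</app>
<!-- 64-bit executable plus the two 64-bit DLLs it needs -->
<file_info>
    <name>milkyway_nbody_0.40_windows_x86_64__mt.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>libgomp_64-1.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>pthreadGC2_64.dll</name>
    <executable/>
</file_info>

<app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>40</version_num>
    <plan_class>mt</plan_class>
    <!-- Example value: keep these two in step with the thread count in cmdline -->
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
        <file_name>milkyway_nbody_0.40_windows_x86_64__mt.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>libgomp_64-1.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>pthreadGC2_64.dll</file_name>
    </file_ref>
</app_version>
```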


If you want to change the number of threads used, you have to change the cmdline param, and let boinc know about it with the 2 ncpu params.
Example 1:
<avg_ncpus>2</avg_ncpus>
<max_ncpus>2</max_ncpus>
<cmdline>--nthreads=2</cmdline>
This runs a WU with 2 threads and tells BOINC that 2 cores per WU are used.
Example 2:
<avg_ncpus>1</avg_ncpus>
<max_ncpus>1</max_ncpus>
<cmdline>--nthreads=1</cmdline>
This runs 1 WU per core; no multithreading is used. This one should not need the 2 ncpu lines. Please test it yourself.


Be warned: There seem to be problems running nbody and a GPU app at the same time when using app_info settings for both. My current guess is that it has to do with the ncpu counting that BOINC does across all active apps. I saw situations with 1 core unused (when using fewer threads than the number of cores I have) or even the GPU unused. I never saw that before, when the CPU apps did not need any ncpu settings.
Solutions? (Besides switching off multithreading and not using the ncpu lines?)


Thoughts, additions and corrections welcome :)

Who Know's
Joined: 10 Nov 09
Posts: 20
Credit: 30,395,419
RAC: 0

Message 46905 - Posted: 2 Apr 2011, 2:16:16 UTC - in response to Message 46895.

After I make an app_info file and then copy this

<app><!-- std app for N-Body 0.40 mt 32bit -->
<name>milkyway_nbody</name>
<user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
</app>
<file_info>
<name>milkyway_nbody_0.40_windows_intelx86__mt.exe</name>
<executable/>
</file_info>
<file_info>
<name>libgomp-1.dll</name>
<executable/>
</file_info>
<file_info>
<name>pthreadGC2.dll</name>
<executable/>
</file_info>

<app_version>
<app_name>milkyway_nbody</app_name>
<version_num>40</version_num>
<plan_class>mt</plan_class>
<avg_ncpus>4</avg_ncpus>
<max_ncpus>4</max_ncpus>
<cmdline>--nthreads=4</cmdline>
<file_ref>
<file_name>milkyway_nbody_0.40_windows_intelx86__mt.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>libgomp-1.dll</file_name>
</file_ref>
<file_ref>
<file_name>pthreadGC2.dll</file_name>
</file_ref>
</app_version>

into it, I keep getting a "parse error in app_info.xml; check XML syntax" in red in my messages?
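
One possible cause, which nothing in this thread confirms: the block above is only a fragment, and BOINC expects the whole app_info.xml to be enclosed in a single <app_info> root element. Assuming the file contained nothing but the pasted fragment, a complete file for the 32-bit Windows binaries might look like the sketch below. The 4 threads are just the example value from the fragment.

```xml
<app_info>
    <app>
        <name>milkyway_nbody</name>
        <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
    </app>
    <file_info>
        <name>milkyway_nbody_0.40_windows_intelx86__mt.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libgomp-1.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>pthreadGC2.dll</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway_nbody</app_name>
        <version_num>40</version_num>
        <plan_class>mt</plan_class>
        <avg_ncpus>4</avg_ncpus>
        <max_ncpus>4</max_ncpus>
        <cmdline>--nthreads=4</cmdline>
        <file_ref>
            <file_name>milkyway_nbody_0.40_windows_intelx86__mt.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>libgomp-1.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>pthreadGC2.dll</file_name>
        </file_ref>
    </app_version>
</app_info>
```

If the file already holds entries for other applications, the fragment would instead go inside the existing <app_info> element rather than in a second one.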



Copyright © 2018 AstroInformatics Group