Welcome to MilkyWay@home

N-body updated to 0.40


Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46796 - Posted: 28 Mar 2011, 22:45:24 UTC
Last modified: 29 Mar 2011, 3:27:02 UTC

The N-body simulation has been updated to 0.40. All systems are now using OpenMP for threading. The old static JSON configuration we were using has been replaced with Lua (so we can have totally arbitrary initial distributions of particles). Now we'll be fitting dwarf models with multiple components (e.g. a dark matter shell around the dwarf galaxy).

The old applications won't work. Any old search workunits still left also won't work.

Since some people sometimes want download links, here's the source and binaries:
Source:
http://milkyway.cs.rpi.edu/milkyway/download/src/milkyway_nbody_0.40.tar.xz

Linux:
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_x86_64-pc-linux-gnu__mt
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_i686-pc-linux-gnu__mt

OS X:
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_x86_64-apple-darwin__mt

Windows: (also need the dlls)
http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_windows_intelx86__mt.exe
http://milkyway.cs.rpi.edu/milkyway/download/libgomp-1.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2.dll

http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_0.40_windows_x86_64__mt.exe
http://milkyway.cs.rpi.edu/milkyway/download/libgomp_64-1.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2_64.dll
ID: 46796 · Rating: 0
Jesse Viviano

Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 46800 - Posted: 28 Mar 2011, 23:46:34 UTC

My initial work unit with this new engine, work unit 261601923, errored out. This also happened to all other crunchers who tried to handle this. Could you please check to see what is going on? My guess is that your server is trying to send out old JSON work units with the new Lua engine, based on what is being written in the stderr output of the returned results.
ID: 46800 · Rating: 0
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46801 - Posted: 28 Mar 2011, 23:58:37 UTC - in response to Message 46800.  

> My initial work unit with this new engine, work unit 261601923, errored out. This also happened to all other crunchers who tried to handle this. Could you please check to see what is going on? My guess is that your server is trying to send out old JSON work units with the new Lua engine, based on what is being written in the stderr output of the returned results.
A lot of old units are still being sent out for some reason even though the searches are stopped.
ID: 46801 · Rating: 0
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46807 - Posted: 29 Mar 2011, 2:29:34 UTC - in response to Message 46796.  
Last modified: 29 Mar 2011, 2:31:17 UTC

I just restarted the search (actually 3 times). There are still some junk workunits left in the system which don't work correctly. The de_nbody_orphan_test_2model_4_* workunits are the current good search, and seem to be working fine for me. Some of the new junk workunits fail instantly when they can't find the input file. Some of the others run way too quickly and don't appear to have progress bars because the order of some of the parameters was switched from what it should be.
ID: 46807 · Rating: 0
[AF>EDLS] Polynesia
Joined: 5 Apr 09
Posts: 71
Credit: 6,120,786
RAC: 0
Message 46824 - Posted: 29 Mar 2011, 18:51:14 UTC

Hello,

All my N-body 0.40 units end in an error ... a pity.

<core_client_version>6.12.19</core_client_version>
<![CDATA[
<message>
- exit code -1073741515 (0xc0000135)
</message>
]]>
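
The two numbers in that message are one 32-bit value written two ways. 0xC0000135 is the Windows status code STATUS_DLL_NOT_FOUND, so a missing DLL is a plausible cause here, though the log alone does not say which file. A minimal Python sketch of the conversion (the function names and the lookup table are illustrative, not part of BOINC):

```python
# Known NTSTATUS values; only the one seen in this thread is listed.
KNOWN_STATUS = {0xC0000135: "STATUS_DLL_NOT_FOUND"}


def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned hex."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


def describe(exit_code: int) -> str:
    """Return the hex form plus a name if the status is in the table."""
    unsigned = exit_code & 0xFFFFFFFF
    name = KNOWN_STATUS.get(unsigned, "unknown status")
    return f"{to_ntstatus(exit_code)} ({name})"


print(describe(-1073741515))
```
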
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470
ID: 46824 · Rating: 0
Retslag1

Joined: 13 Mar 11
Posts: 1
Credit: 269,865
RAC: 0
Message 46855 - Posted: 31 Mar 2011, 1:50:28 UTC
Last modified: 31 Mar 2011, 1:55:27 UTC

I keep getting errors on milkyway_nbody_0.40_windows_x86_64__mt.exe WUs... I shouldn't have to download that separately, should I? And if so, where would I even install it to?
ID: 46855 · Rating: 0
DJStarfox

Joined: 29 Sep 10
Posts: 54
Credit: 1,342,886
RAC: 0
Message 46858 - Posted: 31 Mar 2011, 5:43:02 UTC - in response to Message 46796.  
Last modified: 31 Mar 2011, 6:11:22 UTC

WAIT ONE MINUTE!

How do I set the number of CPUs that the nbody 0.40 app uses inside my app_info.xml file? Each process is using 4 CPUs, but BOINC doesn't seem aware of this.

Edit: Scratch that; each process uses ALL my CPUs without regard to any BOINC preferences about MAX CPUs!

How do I limit how many threads this new nBody app can use ?!?
ID: 46858 · Rating: 0
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46860 - Posted: 31 Mar 2011, 13:32:43 UTC - in response to Message 46858.  

> WAIT ONE MINUTE!
>
> How do I set the number of CPUs that the nbody 0.40 app uses inside my app_info.xml file? Each process is using 4 CPUs, but BOINC doesn't seem aware of this.
>
> Edit: Scratch that; each process uses ALL my CPUs without regard to any BOINC preferences about MAX CPUs!
>
> How do I limit how many threads this new nBody app can use ?!?
BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things.
ID: 46860 · Rating: 0
DJStarfox

Joined: 29 Sep 10
Posts: 54
Credit: 1,342,886
RAC: 0
Message 46861 - Posted: 31 Mar 2011, 14:34:54 UTC - in response to Message 46860.  

> BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things.


<avg_ncpus> and <max_ncpus> only TELL boinc how many CPUs a sci app uses; it does NOT dictate to the sci app. I tried param --nthreads=4 and set the ncpus params both to 4. Hard-coding the number of threads seems to work. Thanks for the tip.
ID: 46861 · Rating: 0
Vicki

Joined: 25 Jun 10
Posts: 3
Credit: 20,219,110
RAC: 0
Message 46863 - Posted: 31 Mar 2011, 15:55:59 UTC

My milkyway_nbody_0.40_windows_x86_64__mt.... keeps failing and "has stopped working" (32 of those "stopped working" message windows for me to close this morning).

Looking in BOINC, it shows that de_nbody_orphan_test_2model_4_57647_1301567303_0 is what was being processed and that it had a computation error (I don't know about the other 31 "stopped working" windows).

This has been happening since the release of the 0.40 version.

My OS is Windows Vista x64 on a quad core Intel processor.
ID: 46863 · Rating: 0
pstehno
Joined: 16 Jun 10
Posts: 6
Credit: 7,402,186
RAC: 0
Message 46864 - Posted: 31 Mar 2011, 16:42:22 UTC

I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue?
ID: 46864 · Rating: 0
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46865 - Posted: 31 Mar 2011, 16:44:27 UTC - in response to Message 46863.  

> My milkyway_nbody_0.40_windows_x86_64__mt.... keeps failing and "has stopped working" (32 of those "stopped working" message windows for me to close this morning).
>
> Looking in BOINC, it shows that de_nbody_orphan_test_2model_4_57647_1301567303_0 is what was being processed and that it had a computation error (I don't know about the other 31 "stopped working" windows).
>
> This has been happening since the release of the 0.40 version.
>
> My OS is Windows Vista x64 on a quad core Intel processor.
From looking at your tasks, the only one left that I can check had the "failed to load DLL" error. Did you install the application manually without also getting the DLLs? Try resetting the project.
ID: 46865 · Rating: 0
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 46866 - Posted: 31 Mar 2011, 16:46:48 UTC - in response to Message 46864.  

> I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue?
For some reason there are still some lingering old workunits (though they should have stopped completely, the percentage of them is decreasing), and the new version rejects the old format, so it's a workunit issue.
ID: 46866 · Rating: 0
DJStarfox

Joined: 29 Sep 10
Posts: 54
Credit: 1,342,886
RAC: 0
Message 46868 - Posted: 31 Mar 2011, 17:43:38 UTC - in response to Message 46866.  

> > I noticed that the workunits I've been getting run for about 2 seconds before erroring out. Is it a 0.40 issue or a workunit issue?
>
> For some reason there are still some lingering old workunits (though they should have stopped completely, the percentage of them is decreasing), and the new version rejects the old format, so it's a workunit issue.


Funnily enough, I tried to keep both versions in my app_info.xml file so I could process both old and new WUs. But BOINC deletes the old version from disk upon startup!

I hope there are still a few people out there running the old version long enough to empty the system of the older WU.
ID: 46868 · Rating: 0
grumpy

Joined: 14 Dec 07
Posts: 9
Credit: 10,208,774
RAC: 556
Message 46869 - Posted: 31 Mar 2011, 17:54:13 UTC

I seem to have received one that works!
The previous one errored out fast.

Name: de_nbody_orphan_test_2model_4_44832_1301540627_1
Workunit: 263052631
Created: 31 Mar 2011 17:14:49 UTC
Sent: 31 Mar 2011 17:25:20 UTC
Received: 31 Mar 2011 17:36:51 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 205494
Report deadline: 8 Apr 2011 17:25:20 UTC
Run time: 276.415709
CPU time: 2744.775
stderr out:

<core_client_version>6.10.60</core_client_version>
<![CDATA[
<stderr_txt>
<search_application>milkywayathome nbody 0.40 Windows x86_64 double OpenMP Crlibm</search_application>
13:25:06: Using OpenMP 18 max threads on a system with 12 processors
13:30:28: Making final checkpoint
13:30:28: Simulation complete
<search_likelihood>-469.276081432873</search_likelihood>
13:30:28 (6080): called boinc_finish

</stderr_txt>
]]>

Validate state: Checked, but no consensus yet
Claimed credit: 19.9490033746374
Granted credit: 0
Application version: MilkyWay@Home N-Body Simulation v0.40 (mt)
ID: 46869 · Rating: 0
[AF>EDLS] Polynesia
Joined: 5 Apr 09
Posts: 71
Credit: 6,120,786
RAC: 0
Message 46873 - Posted: 31 Mar 2011, 20:42:54 UTC - in response to Message 46861.  
Last modified: 31 Mar 2011, 20:43:58 UTC

> > BOINC controls the number of threads used. If you want to set the number, you use something like <avg_ncpus> and <max_ncpus> in app_info.xml. You also might be able to control it more directly with the command line control that I think exists in app_info.xml by adding something like <cmdline>--nthreads 4</cmdline>, though I'm not sure how that will play with how BOINC wants to schedule things.
>
> <avg_ncpus> and <max_ncpus> only TELL boinc how many CPUs a sci app uses; it does NOT dictate to the sci app. I tried param --nthreads=4 and set the ncpus params both to 4. Hard-coding the number of threads seems to work. Thanks for the tip.

ok
Team Alliance francophone, boinc: 7.0.18

GA-P55-UD5, i7 860, Win 7 64 bits, 8g DDR3, GTX 470
ID: 46873 · Rating: 0
Raimund Barbeln

Joined: 7 Oct 07
Posts: 25
Credit: 35,401,003
RAC: 5,711
Message 46880 - Posted: 1 Apr 2011, 6:55:45 UTC - in response to Message 46861.  

can someone please post a working app_info.xml for the new application?
When life gives you lemons, make lemonade!
ID: 46880 · Rating: 0
DJStarfox

Joined: 29 Sep 10
Posts: 54
Credit: 1,342,886
RAC: 0
Message 46886 - Posted: 1 Apr 2011, 13:56:20 UTC - in response to Message 46880.  

> can someone please post a working app_info.xml for the new application?


See my first post:
http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2301
ID: 46886 · Rating: 0
Len LE/GE

Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 46895 - Posted: 2 Apr 2011, 0:01:32 UTC
Last modified: 2 Apr 2011, 0:04:17 UTC

On Windows you have to include the dll names in the app_info.xml too.

Only made a quick test, but it seems to basically work on my win32 system.


<app><!-- std app for N-Body 0.40 mt 32bit -->
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
</app>
<file_info>
    <name>milkyway_nbody_0.40_windows_intelx86__mt.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>libgomp-1.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>pthreadGC2.dll</name>
    <executable/>
</file_info>

<app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>40</version_num>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
        <file_name>milkyway_nbody_0.40_windows_intelx86__mt.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>libgomp-1.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>pthreadGC2.dll</file_name>
    </file_ref>
</app_version>


Add that to your app_info.xml and watch out for the bold marked stuff.
The plan_class line is not really necessary.

For win64 you have to use:
milkyway_nbody_0.40_windows_x86_64__mt.exe
libgomp_64-1.dll
pthreadGC2_64.dll


If you want to change the number of threads used, you have to change the cmdline param, and let boinc know about it with the 2 ncpu params.
Example 1:
<avg_ncpus>2</avg_ncpus>
<max_ncpus>2</max_ncpus>
<cmdline>--nthreads=2</cmdline>
This runs a WU with 2 threads and tells BOINC that 2 cores per WU are used.
Example 2:
<avg_ncpus>1</avg_ncpus>
<max_ncpus>1</max_ncpus>
<cmdline>--nthreads=1</cmdline>
This runs 1 WU per core; no multithreading is used. This one should not need the 2 ncpu lines. Please test it yourself.
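
The point of both examples is that the three values have to agree. A small Python sketch that generates the three lines from a single thread count, so they cannot drift apart (the helper name thread_settings is made up for illustration; the element names and the --nthreads=N form are the ones used in this thread):

```python
import xml.etree.ElementTree as ET


def thread_settings(nthreads: int) -> str:
    """Build the avg_ncpus, max_ncpus and cmdline lines for one thread count."""
    if nthreads < 1:
        raise ValueError("nthreads must be at least 1")
    lines = []
    for tag, text in (
        ("avg_ncpus", str(nthreads)),
        ("max_ncpus", str(nthreads)),
        ("cmdline", f"--nthreads={nthreads}"),
    ):
        element = ET.Element(tag)
        element.text = text
        # Serialize each element to a plain string such as <avg_ncpus>2</avg_ncpus>
        lines.append(ET.tostring(element, encoding="unicode"))
    return "\n".join(lines)


print(thread_settings(2))
```

The output is meant to be pasted into the <app_version> block in place of the existing three lines.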


Be warned: There seem to be problems running nbody and a GPU app at the same time when using app_info settings for both. My current guess is that it has to do with the ncpu counting that BOINC does across all active apps. I saw situations with 1 core unused (when using fewer threads than the number of cores I have) or even the GPU unused. I never saw that before, when the CPU apps did not need any ncpu settings.
Solutions? (Besides switching off multithreading and not using the ncpu lines?)


Thoughts, additions and corrections welcome :)
ID: 46895 · Rating: 0
Who Know's

Joined: 10 Nov 09
Posts: 20
Credit: 30,395,419
RAC: 0
Message 46905 - Posted: 2 Apr 2011, 2:16:16 UTC - in response to Message 46895.  

After I make an app_info file and then copy this

<app><!-- std app for N-Body 0.40 mt 32bit -->
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
</app>
<file_info>
    <name>milkyway_nbody_0.40_windows_intelx86__mt.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>libgomp-1.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>pthreadGC2.dll</name>
    <executable/>
</file_info>

<app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>40</version_num>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
        <file_name>milkyway_nbody_0.40_windows_intelx86__mt.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>libgomp-1.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>pthreadGC2.dll</file_name>
    </file_ref>
</app_version>

into it, I keep getting a "parse error in app_info.xml; check XML syntax" in red in my messages?
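
One possible cause, stated as an assumption because the complete file is not shown: BOINC's anonymous-platform app_info.xml expects everything to sit inside a single <app_info> ... </app_info> root element, and the block above has several top-level elements with no root. The Python sketch below only tests generic XML well-formedness with the standard library (BOINC uses its own parser, so passing this check does not guarantee BOINC accepts the file), and the fragment is shortened for illustration:

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the pasted block: several top-level elements, no root.
FRAGMENT = """
<app>
    <name>milkyway_nbody</name>
</app>
<file_info>
    <name>milkyway_nbody_0.40_windows_intelx86__mt.exe</name>
    <executable/>
</file_info>
"""


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as a single-rooted XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


bare = is_well_formed(FRAGMENT)
wrapped = is_well_formed("<app_info>" + FRAGMENT + "</app_info>")
print("without root:", bare)
print("with <app_info> root:", wrapped)
```

If the file already has the root element, the same check can still help locate an unclosed or mistyped tag.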
ID: 46905 · Rating: 0

©2024 Astroinformatics Group