New NBody test searches

Message boards : News : New NBody test searches

Steve
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 2 Jul 12
Posts: 4
Credit: 0
RAC: 0
Message 54980 - Posted: 3 Jul 2012 | 20:37:14 UTC

Hello Everyone,

My name is Steve. I am a physics/math undergrad at RPI and have taken on the task of handling the nbody code for the summer. For the past few weeks I have been working on the comparison software that measures how well the simulations match the known data.

I just uploaded two searches, de_plum_slice_EMD_NEW_100K and ps_plum_slice_EMD_NEW_100K. Hopefully these will run smoothly, but let me know if anyone has any problems.

Other than that, if anyone has any questions about me, what I do, or about Milky Way at Home, please feel free to ask. I would be happy to answer them.

Penguin5540
Joined: 4 Mar 12
Posts: 33
Credit: 16,757,408
RAC: 22,375
Message 54985 - Posted: 3 Jul 2012 | 22:01:23 UTC

Getting computation errors on workunits named de_plum_slice_EMD_NEW_100K... all failing after 100% work done.

Penguin5540
Joined: 4 Mar 12
Posts: 33
Credit: 16,757,408
RAC: 22,375
Message 54986 - Posted: 3 Jul 2012 | 22:03:33 UTC - in response to Message 54985.

Also on ps_plum_slice workunits. Same thing: it computes, then comes back with a computation error at 100%.

Corsair
Joined: 13 Aug 09
Posts: 7
Credit: 24,197,936
RAC: 6,086
Message 54987 - Posted: 3 Jul 2012 | 22:06:10 UTC

All ps & de workunits crunched are getting an error when they reach 100% of computation.
____________
Corsair

over the sailors grave never grows grass.

Tag
Joined: 22 Mar 12
Posts: 1
Credit: 1,786,387
RAC: 398
Message 54989 - Posted: 3 Jul 2012 | 23:03:29 UTC - in response to Message 54985.

Have just uploaded de_plum_slice_EMD_NEW_100K & ps_plum_slice_EMD_NEW_100K
All have finished and ended with computation error.

TAG

JZD
Joined: 31 Dec 11
Posts: 3
Credit: 2,536,069
RAC: 6,797
Message 54990 - Posted: 4 Jul 2012 | 0:07:11 UTC - in response to Message 54989.
Last modified: 4 Jul 2012 | 0:12:42 UTC

All have finished and ended with computation error.
Workunits:
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=248916712
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=248828315
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=248917607

Overtonesinger
Joined: 15 Feb 10
Posts: 52
Credit: 1,494,904
RAC: 1,287
Message 54995 - Posted: 4 Jul 2012 | 7:05:47 UTC

All WUs from this set ended with computation error at 100 percent of completion:
ps_plum_slice_EMD_NEW_100K_*

example WU:
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=195811396

P.S. dear Steve, please, tell us a bit about Thyself! :)
What is Thy greatest hobby? And Thy second hobby?
(mine are: overtonesinging and computers). What is Thy current job or what does Thou study? :)

Overtonesinger
Joined: 15 Feb 10
Posts: 52
Credit: 1,494,904
RAC: 1,287
Message 54996 - Posted: 4 Jul 2012 | 7:27:17 UTC - in response to Message 54995.

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=195808791 - WU name: de_plum_slice_EMD_NEW_100K_1341346802_15268

exit errorcode:

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
System nemuze nalezt uvedenou jednotku. (0xf) - exit code 15 (0xf)
</message>
<stderr_txt>
<search_application> milkyway_nbody_0.80_windows_x86_64__mt.exe 0.80 Windows x86 double OpenMP, Crlibm </search_application>
Using OpenMP 2 max threads on a system with 2 processors
Could not load Ktm32.dll (1815): (null)Error reading histogram line 37: massPerParticle = 0.000100
09:14:29 (2428): called boinc_finish

</stderr_txt>
]]>


* "System nemuze nalezt uvedenou jednotku." is Czech, so I think it is an error message from the Czech edition of Windows XP SP3 running on that computer.

- It means: "The (operating) system cannot find the specified unit." (A disk drive???)
I guess there is some problem with the PATH to some file. :)
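
A note on the numbers in the <message> line: exit status 15 and 0xf are the same value, and on Windows the BOINC client appears to look that number up as a system error code, which would explain why an application-level failure is reported as a missing drive. The sketch below only checks the arithmetic. The negative form (-241, as in the Linux log later in this thread) is assumed to be the same status with every bit above the low byte set, and the helper name is hypothetical.

```python
def describe_exit_status(status: int) -> str:
    """Render an exit status in the 'code (hex, signed)' style seen in this thread."""
    # Assumption: the signed form is the status with all bits above the low byte set.
    signed = status | ~0xFF
    return f"{status} ({status:#x}, {signed})"


print(describe_exit_status(15))
```

If that assumption holds, the drive message is a side effect of how the status is displayed, and the real cause is the histogram error in stderr.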

Overtonesinger
Joined: 15 Feb 10
Posts: 52
Credit: 1,494,904
RAC: 1,287
Message 54998 - Posted: 4 Jul 2012 | 9:54:28 UTC

Very interesting and weird work unit! It had a veeeeery long run time for an NBody unit! :O What a pity it also errored out at 100 percent:

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=195808492

TOTEM
Joined: 22 Jun 12
Posts: 2
Credit: 377,594
RAC: 0
Message 54999 - Posted: 4 Jul 2012 | 10:11:19 UTC

MW@H 1.02 opencl_and n-body sim 0.80 showing computational error at 100%. Definitely weird.

readingdancer
Joined: 11 May 11
Posts: 2
Credit: 1,991,722
RAC: 0
Message 55000 - Posted: 4 Jul 2012 | 10:35:18 UTC

Hi Steve,

Unfortunately, these are all producing a computation error when they reach 100% for me too.

Cheers,

Chris

Cartoonman
Joined: 10 Dec 09
Posts: 18
Credit: 6,518,358
RAC: 0
Message 55001 - Posted: 4 Jul 2012 | 10:38:00 UTC

It appears, based on reviewing the stderr outputs of all of the users that have seen errors, that the application fails when reaching Histogram number 37

Error reading histogram line 37: massPerParticle = 0.000100


Another variant:


Could not load Ktm32.dll (1815): (null)Error reading histogram line 37: massPerParticle = 0.000100
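
Both variants quoted above contain the same failure line, with or without the Ktm32.dll notice in front of it. A minimal sketch of how a batch of stderr outputs could be grouped by that line follows. The regular expression is an assumption based only on the samples in this thread, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the stderr samples quoted in this thread (an assumption).
HISTOGRAM_ERROR = re.compile(r"Error reading histogram line (\d+): (.+)")


def failure_signature(stderr_text: str):
    """Return (line_number, offending_text) for the first histogram error, or None."""
    match = HISTOGRAM_ERROR.search(stderr_text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


samples = [
    "Error reading histogram line 37: massPerParticle = 0.000100",
    "Could not load Ktm32.dll (1815): (null)Error reading histogram line 37: massPerParticle = 0.000100",
]
counts = Counter(failure_signature(text) for text in samples)
print(counts)
```

Grouping this way treats the DLL notice as noise, so both variants count as one failure.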

greg_be
Joined: 18 Aug 09
Posts: 44
Credit: 951,344
RAC: 987
Message 55004 - Posted: 4 Jul 2012 | 12:53:46 UTC - in response to Message 55001.

It appears, based on reviewing the stderr outputs of all of the users that have seen errors, that the application fails when reaching Histogram number 37

Error reading histogram line 37: massPerParticle = 0.000100


Another variant:


Could not load Ktm32.dll (1815): (null)Error reading histogram line 37: massPerParticle = 0.000100



Another variant on the Ktm stuff: Could not load Ktm32.dll (126): The specified module could not be found.

Saikrishna
Joined: 3 Apr 12
Posts: 1
Credit: 60,145
RAC: 206
Message 55005 - Posted: 4 Jul 2012 | 12:56:37 UTC
Last modified: 4 Jul 2012 | 12:58:50 UTC

Same for me; all WUs related to plum_slice failed. Error was the histogram line 37 error, along with

The system cannot find the drive specified. (0xf) - exit code 15 (0xf)


http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=195678535
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=195678534
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=195678533
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=195678532
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=195673796

TJ
Joined: 12 Aug 09
Posts: 251
Credit: 77,776,909
RAC: 1,061
Message 55006 - Posted: 4 Jul 2012 | 13:32:16 UTC

Hi Steve,

These "things" (ps_plum_slice_EMD_NEW_100K_1341346802_25338) error out, not only on my machine but also for my wingman.
So you have to study a wee bit more on the code...
____________
Greetings from,
TJ

Prroto
Joined: 21 Jun 12
Posts: 1
Credit: 1,122,762
RAC: 0
Message 55013 - Posted: 4 Jul 2012 | 18:45:17 UTC
Last modified: 4 Jul 2012 | 18:45:46 UTC

Error, error, error... OMG!

http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=454572&offset=0&show_names=0&state=5&appid=

Penguin5540
Joined: 4 Mar 12
Posts: 33
Credit: 16,757,408
RAC: 22,375
Message 55014 - Posted: 4 Jul 2012 | 21:24:05 UTC - in response to Message 55013.

errors continuing...

Blurf
Volunteer moderator
Project administrator
Joined: 13 Mar 08
Posts: 617
Credit: 25,447,954
RAC: 0
Message 55017 - Posted: 4 Jul 2012 | 21:36:29 UTC
Last modified: 4 Jul 2012 | 21:37:25 UTC

It's the 4th of July Holiday in the US....please don't expect any work to be done today.

Stephan Volkmann
Joined: 17 Mar 09
Posts: 7
Credit: 752,811
RAC: 1
Message 55019 - Posted: 4 Jul 2012 | 23:30:48 UTC - in response to Message 55017.

Workunit 195954817
Name: de_plum_slice_EMD_NEW_100K_1341346802_24224
Application: MilkyWay@Home N-Body Simulation
Created: 4 Jul 2012 | 12:34:45 UTC
Minimum quorum: 1
Initial replication: 1
Max # of error/total/success tasks: 3, 9, 6
Errors: Too many errors (may have bug)

Task       Computer  Sent                       Time reported or deadline  Status                 Run time (sec)  CPU time (sec)  Credit  Application
249287862  454178    4 Jul 2012 | 12:40:07 UTC  4 Jul 2012 | 21:23:38 UTC  Error while computing  9.39            68.39           ---     MilkyWay@Home N-Body Simulation v0.84 (mt)
249553467  450029    4 Jul 2012 | 21:23:53 UTC  4 Jul 2012 | 22:24:38 UTC  Error while computing  8.03            76.51           ---     MilkyWay@Home N-Body Simulation v0.88 (mt)
249584683  275574    4 Jul 2012 | 22:24:43 UTC  4 Jul 2012 | 23:25:16 UTC  Error while computing  7.18            75.49           ---     MilkyWay@Home N-Body Simulation v0.84 (mt)
249613873  454203    4 Jul 2012 | 23:25:21 UTC  4 Jul 2012 | 23:26:53 UTC  Error while computing  19.04           33.69           ---     MilkyWay@Home N-Body Simulation v0.88 (mt)

Stephan Volkmann
Joined: 17 Mar 09
Posts: 7
Credit: 752,811
RAC: 1
Message 55020 - Posted: 4 Jul 2012 | 23:32:07 UTC - in response to Message 55019.

Task       Computer  Sent                       Time reported or deadline  Status                 Run time (sec)  CPU time (sec)  Credit  Application
248966335  439343    4 Jul 2012 | 1:28:43 UTC   4 Jul 2012 | 2:29:10 UTC   Error while computing  256.37          909.62          ---     MilkyWay@Home N-Body Simulation v0.88 (mt)
248999555  295489    4 Jul 2012 | 2:35:31 UTC   4 Jul 2012 | 3:45:32 UTC   Error while computing  167.39          764.86          ---     MilkyWay@Home N-Body Simulation v0.80 (mt)
249035935  432507    4 Jul 2012 | 3:52:18 UTC   4 Jul 2012 | 23:25:13 UTC  Error while computing  302.74          1,114.24        ---     MilkyWay@Home N-Body Simulation v0.80 (mt)
249613868  454203    4 Jul 2012 | 23:25:21 UTC  4 Jul 2012 | 23:29:39 UTC  Error while computing  224.41          440.12          ---     MilkyWay@Home N-Body Simulation v0.88 (mt)

TimeRanger
Joined: 31 Oct 10
Posts: 19
Credit: 959,222
RAC: 1,504
Message 55023 - Posted: 5 Jul 2012 | 9:22:25 UTC

6 of the EMD_NEW_100K units - 3 of the "de" and 3 "ps" .. all failed.

Steve
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 2 Jul 12
Posts: 4
Credit: 0
RAC: 0
Message 55026 - Posted: 5 Jul 2012 | 16:40:58 UTC
Last modified: 5 Jul 2012 | 16:49:09 UTC

Sorry for the late response, everyone. I was out of town yesterday for Fourth of July shenanigans.

It looks like the simulator is mostly running fine, but the program crashes at 100% when trying to load the new format of the output histograms. This should be a quick fix and we should have some proper simulations running soon. I'll post a message when I fix this problem and put up some more simulations. Thank you all for being patient.
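
To illustrate the kind of failure described above, here is a minimal sketch of a reader that accepts both numeric rows and "key = value" lines such as the massPerParticle line from the error reports. The real MilkyWay@Home histogram layout is not shown in this thread, so the file format, the column values, and the function name here are all assumptions, and this is not the project's actual code.

```python
def read_histogram(lines):
    """Parse a hypothetical histogram: 'key = value' metadata plus numeric rows."""
    metadata, rows = {}, []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=" in line:
            # Metadata line, e.g. "massPerParticle = 0.000100"
            key, _, value = line.partition("=")
            metadata[key.strip()] = float(value)
            continue
        try:
            rows.append([float(field) for field in line.split()])
        except ValueError as exc:
            # Report the 1-based line number, as the error in this thread does
            raise ValueError(f"Error reading histogram line {number}: {line}") from exc
    return metadata, rows


meta, rows = read_histogram(["1 -2.5 0.10 0.02", "massPerParticle = 0.000100"])
print(meta, rows)
```

A reader that expects only numeric rows would stop at the first metadata line, which matches the shape of the reported error.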

@Overtonesinger
My greatest hobby would have to be skiing; it's pretty much all I do during the winter. Come the summertime I busy myself with rock climbing, juggling, and an online game called League of Legends. I would like to add piano playing to this list, but sadly I don't really have access to one right now. My job for the summer is to get these simulations working again, and at RPI I study physics and mathematics of operations research.

Overtonesinger
Joined: 15 Feb 10
Posts: 52
Credit: 1,494,904
RAC: 1,287
Message 55032 - Posted: 6 Jul 2012 | 6:43:59 UTC

Steve has written:

I just uploaded two searches, de_plum_slice_EMD_NEW_100K and ps_plum_slice_EMD_NEW_100K. Hopefully these will run smoothly but let me know if anyone has any problems.


Dear Steve,

well, as You see, they run pretty smuuuuthly. :)))

Now seriously: it has been three days. Please tell us about some progress, we deserve it. :)
I have 4 small questions:

1) Can You estimate when they will be fixed or deleted?

2) Because they replicate over and over on error, can You please set a lower limit on their max. replication rate, so that the last of them will error out on its LAST TRY sooner? :)

3) Tell us what to do with them. Compute until they all error out on their last try? Or shall we kill them on sight? Shall we wait and compute something else for the time being until they are fixed?

4) Can You please create a new set of WUs that will work, to compute between the returning error ones (there is already not enough work for all)?

I have a suggestion, as a professional programmer/developer: now that You know for sure from all the posts what is wrong with the NEW ones, I would recommend comparing the config of the "output generator" for the old ones with the config for the new ones and making them perfectly identical (search for hidden whitespace too! No extra tab or space should be around line 37, just to be sure), so that it really is the same format/syntax as the old ones
and thus would run OK like the old WUs.
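
The whitespace check suggested above can be sketched with Python's standard difflib module. This is only an illustration: the two sample strings are hypothetical stand-ins for an old and a new output file, and the function name is made up. Each line is passed through repr() so that tabs and trailing spaces become visible in the diff.

```python
import difflib


def show_hidden_differences(old_text: str, new_text: str):
    """Return unified-diff lines with repr() applied so whitespace is visible."""
    old_lines = [repr(line) for line in old_text.splitlines()]
    new_lines = [repr(line) for line in new_text.splitlines()]
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))


# Hypothetical samples: the new line carries a trailing tab that is invisible on screen.
old = "massPerParticle = 0.000100\n"
new = "massPerParticle = 0.000100\t\n"
for line in show_hidden_differences(old, new):
    print(line)
```

An empty result means the two files match character for character, which is the condition the suggestion asks for.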

Steve, could You try it, please? We need some new work for CPUs (only NBody is really for the CPU; the separation runs 800 times faster on GPUs, so it is better run on them).

Thanx a lot for all answers to come, Steve.

*Namaste*
Filip

Overtonesinger
Joined: 15 Feb 10
Posts: 52
Credit: 1,494,904
RAC: 1,287
Message 55033 - Posted: 6 Jul 2012 | 6:57:23 UTC

Why not use the old format of the output histograms, just for now?

POPSIE
Joined: 25 Jan 11
Posts: 12
Credit: 3,643,137
RAC: 0
Message 55085 - Posted: 11 Jul 2012 | 18:21:20 UTC
Last modified: 11 Jul 2012 | 18:27:30 UTC

After the NBody server was restarted following its last stop at the beginning of June, I got many WUs.
For the first 3 days all worked well.
On the third day, all WUs ended with calculation errors, I think error number 15.
After the fourth day the NBody server was stopped and nobody got new WUs.
The status today: no WUs and no running NBody server.
Over 6 weeks and no properly working NBody. WOW!?

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
process exited with code 15 (0xf, -241)
</message>
<stderr_txt>
<search_application> milkyway_nbody 0.88 Linux x86_64 double OpenMP, Crlibm </search_application>
Using OpenMP 7 max threads on a system with 4 processors
Error reading histogram line 37: massPerParticle = 0.000100
18:53:54 (5221): called boinc_finish

</stderr_txt>

[AF>france>pas-de-calais]symaski62
Joined: 7 Jul 09
Posts: 11
Credit: 4,406
RAC: 0
Message 55086 - Posted: 11 Jul 2012 | 20:04:02 UTC
Last modified: 11 Jul 2012 | 20:19:13 UTC

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=253993090

2.5% by 2.5% :)


TLSI2000
Joined: 15 Mar 10
Posts: 15
Credit: 71,761,717
RAC: 165,684
Message 55327 - Posted: 10 Aug 2012 | 22:43:32 UTC

When the outstanding N-Body work units finally get down to zero, will there be another series?

The count is now at 2.


Copyright © 2013 AstroInformatics Group