Nbody Release 1.02
Author Message
Jeffery M. Thompson
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 23 Sep 12
Posts: 149
Credit: 12,935,212
RAC: 4,584

Message 56232 - Posted: 19 Nov 2012, 19:32:15 UTC

We have updated the binaries for Nbody. Currently the Windows 64-bit and Apple Macintosh 64-bit versions are testing successfully. We are releasing a test run to test the Windows and Apple variants on the BOINC system. We are monitoring this run and will watch the data as it comes back, so please post any error details. Thank you.

Arif Mert Kapicioglu
Joined: 14 Dec 09
Posts: 159
Credit: 576,589,374
RAC: 32,373

Message 56233 - Posted: 19 Nov 2012, 19:43:45 UTC
Last modified: 19 Nov 2012, 19:44:57 UTC

Hello,

Immediate errors.

Exit status: -1073741515 (0xffffffffc0000135) Unknown error number.

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=345135096
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=345135092
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=345134892
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=345134537
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=345134536
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=345134303

Jeffery M. Thompson
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 23 Sep 12
Posts: 149
Credit: 12,935,212
RAC: 4,584

Message 56234 - Posted: 19 Nov 2012, 20:35:17 UTC - in response to Message 56233.

Thank you. We are leaving the run up for a short time interval to gather some statistics on the systems seeing errors versus the systems that are not. We will post details in here as we proceed and work on resolving all these issues.

Milky Way Ice Cream
Joined: 14 Nov 12
Posts: 1
Credit: 2,078,076
RAC: 0

Message 56237 - Posted: 19 Nov 2012, 22:13:55 UTC - in response to Message 56232.

I'm new to this project and am having problems with this crashing all over the place.

Faulting application milkyway_nbody_1.02_windows_x86_64__mt.exe, version 0.0.0.0, time stamp 0x50a94316, faulting module libgomp_64-1.dll, version 6.0.6002.18541, time stamp 0x4ec3e855, exception code 0xc0000135, fault offset 0x00000000000b6fc8, process id 0x4ffc, application start time 0x01cdc6a15be48670.

When I try and run milkyway_nbody_1.02_windows_x86_64__mt.exe manually it says libgomp_64-1.dll was not found and I can't find it anywhere on my computer.

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 56238 - Posted: 19 Nov 2012, 22:28:15 UTC

The 0xc0000135 errors are probably caused by the missing libgomp_64-1.dll and pthreadGC2_64.dll files - you're still not specifying them in

<app_version>
<app_name>milkyway_nbody</app_name>
<version_num>102</version_num>
<platform>windows_x86_64</platform>
<avg_ncpus>1.000000</avg_ncpus>
<max_ncpus>1.000000</max_ncpus>
<flops>4412417118.161846</flops>
<api_version>6.13.0</api_version>
<file_ref>
<file_name>milkyway_nbody_1.02_windows_x86_64__mt.exe</file_name>
<main_program/>
</file_ref>
</app_version>

With the files downloaded manually and in place, I'm getting exit code -1073740940 (0xc0000374) like last time:

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=345208264
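
For anyone puzzled by the negative numbers: the exit codes quoted in this thread are just 32-bit Windows NTSTATUS values printed as signed integers. Below is a minimal Python sketch of the conversion. The helper names (`to_ntstatus`, `describe`) are made up for illustration and are not part of BOINC; the two status names are the standard Windows meanings of these codes.

```python
# Map the signed exit codes shown on the result pages to NTSTATUS values.
# Only the two codes seen in this thread are listed; the names are the
# standard Windows NTSTATUS names for those values.
KNOWN_STATUS = {
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}


def to_ntstatus(exit_code):
    """Reinterpret a signed (or sign-extended) exit code as unsigned 32-bit."""
    return exit_code & 0xFFFFFFFF


def describe(exit_code):
    """Return a readable line such as '-1 -> 0xFFFFFFFF (unknown)'."""
    status = to_ntstatus(exit_code)
    name = KNOWN_STATUS.get(status, "unknown")
    return f"{exit_code} -> 0x{status:08X} ({name})"


if __name__ == "__main__":
    for code in (-1073741515, -1073740940):
        print(describe(code))
```

The same mask also handles the sign-extended 64-bit form (0xffffffffc0000135) that the result pages print.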

Annette Thompson
Joined: 20 Sep 09
Posts: 1
Credit: 2,604,382
RAC: 8,066

Message 56239 - Posted: 19 Nov 2012, 23:22:09 UTC

Thank you, I have pulled down the jobs.

The Macintosh release and the 0.94 64-bit Linux release were returning valid results. The bulk of our errors are coming from the Windows clients, though not exclusively.

Thanks for the DLL information. I will check the linking. I believe these libraries should have been statically compiled into the executable, and this may be part of the problem we are seeing, though I need to look into it more deeply.

Jeffery M. Thompson
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 23 Sep 12
Posts: 149
Credit: 12,935,212
RAC: 4,584

Message 56240 - Posted: 19 Nov 2012, 23:35:10 UTC - in response to Message 56239.

As you may have noted, that was posted as Annette Thompson and not Jeffery M. Thompson. My user account was set up on my wife's machine when MilkyWay@Home first came out. I noticed she had received some nbody work and wanted to look at the work unit IDs quickly. So, oops, I posted as my wife; sorry for that. But what she said does apply.

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 56241 - Posted: 20 Nov 2012, 0:03:20 UTC - in response to Message 56239.

Unfortunately, it requires at least one of the files as an external DLL - this is the same as I was seeing with v0.84

Amauri
Joined: 30 Jan 09
Posts: 21
Credit: 3,316,609
RAC: 12,415

Message 56244 - Posted: 20 Nov 2012, 16:16:57 UTC - in response to Message 56241.

Four tasks successfully completed (Linux & NVidia), but can't validate...

Matthias Breimann
Joined: 10 Dec 10
Posts: 1
Credit: 62,763
RAC: 0

Message 56246 - Posted: 20 Nov 2012, 19:07:19 UTC

Hello!

calculation error:

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=345364405

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0

Message 56251 - Posted: 21 Nov 2012, 0:27:28 UTC
Last modified: 21 Nov 2012, 0:36:30 UTC

The app_info for nbody v0.84 64bit looked something like


    <app_info>

    <app><!-- CPU app for N-Body 0.84 mt 64bit -->
    <name>milkyway_nbody</name>
    <user_friendly_name>MilkyWay@Home nbody</user_friendly_name>
    </app>

    <file_info>
    <name>milkyway_nbody_0.84_windows_x86_64__mt.exe</name>
    <executable/>
    </file_info>
    <file_info>
    <name>libgomp_64-1_nbody_0.84.dll</name>
    <executable/>
    </file_info>
    <file_info>
    <name>pthreadGC2_64_nbody_0.84.dll</name>
    <executable/>
    </file_info>

    <app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>84</version_num>

    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4</cmdline>

    <file_ref>
    <file_name>milkyway_nbody_0.84_windows_x86_64__mt.exe</file_name>
    <main_program/>
    </file_ref>
    <file_ref>
    <file_name>libgomp_64-1_nbody_0.84.dll</file_name>
    <open_name>libgomp_64-1.dll</open_name>
    <copy_file/>
    </file_ref>
    <file_ref>
    <file_name>pthreadGC2_64_nbody_0.84.dll</file_name>
    <open_name>pthreadGC2_64.dll</open_name>
    </file_ref>

    </app_version>

    </app_info>



Replacing v0.84 with v1.02 shouldn't be too hard.
Milkyway_nbody mt needs libgomp_64-1, which needs pthreadGC2_64.
Every nbody mt exe before v0.94 came with its own DLL versions.
The problem is that there are no DLL files for v0.94/v1.00/v1.02; the newest ones in the download directory seem to be for v0.84.
The DLL files without a version number seem to belong to v0.60 or v0.66, maybe even earlier.

So the questions are:
a) Are DLL files downloaded with the exe? Which version are they, and do they have the proper name (without version) when downloaded?
b) Are the v0.84 DLLs compatible with the new exe, or did the successful tests (Message 56232) use newer DLL versions that are missing from the download directory?

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 56255 - Posted: 21 Nov 2012, 10:05:41 UTC

If you're going to the bother of creating an app_info.xml, it's probably easier to download the DLLs under their 'real' names, rather than going for the versioned aliases and renaming them back again.

http://milkyway.cs.rpi.edu/milkyway/download/libgomp_64-1.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2_64.dll

Then you can do away with the <open_name> and <copy_file/> lines entirely - you forgot the copy on the second file, anyway.
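
For anyone hand-editing an app_info.xml, the kind of slip mentioned here can be caught with a quick script. This is only a rough sketch: `check_app_info` is a made-up helper, not a BOINC tool, and the two rules it applies (every `<file_ref>` should have a matching `<file_info>`, and a file opened under another name via `<open_name>` should also carry `<copy_file/>`) are assumptions drawn from this discussion rather than from the BOINC documentation.

```python
import xml.etree.ElementTree as ET

# Cut-down sample modelled on the v0.84 app_info posted earlier in the thread.
SAMPLE = """
<app_info>
  <file_info><name>milkyway_nbody_0.84_windows_x86_64__mt.exe</name><executable/></file_info>
  <file_info><name>libgomp_64-1_nbody_0.84.dll</name><executable/></file_info>
  <file_info><name>pthreadGC2_64_nbody_0.84.dll</name><executable/></file_info>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>84</version_num>
    <file_ref>
      <file_name>milkyway_nbody_0.84_windows_x86_64__mt.exe</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>libgomp_64-1_nbody_0.84.dll</file_name>
      <open_name>libgomp_64-1.dll</open_name>
      <copy_file/>
    </file_ref>
    <file_ref>
      <file_name>pthreadGC2_64_nbody_0.84.dll</file_name>
      <open_name>pthreadGC2_64.dll</open_name>
    </file_ref>
  </app_version>
</app_info>
"""


def check_app_info(xml_text):
    """Return a list of human-readable problems found in an app_info document."""
    # ET.fromstring raises ET.ParseError if the XML is not well-formed,
    # e.g. when a closing </app_version> tag is missing.
    root = ET.fromstring(xml_text.strip())
    declared = {fi.findtext("name", "").strip() for fi in root.iter("file_info")}
    problems = []
    for ref in root.iter("file_ref"):
        name = ref.findtext("file_name", "").strip()
        if name not in declared:
            problems.append(f"{name}: no matching <file_info>")
        if ref.find("open_name") is not None and ref.find("copy_file") is None:
            problems.append(f"{name}: <open_name> without <copy_file/>")
    return problems


if __name__ == "__main__":
    for problem in check_app_info(SAMPLE):
        print(problem)
```

Run against the sample, it flags only the pthreadGC2 entry, which is the one missing its `<copy_file/>` line.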

While I've got the download directory open, we may as well link

http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_1.02_windows_x86_64__mt.exe

But I got nothing but errors with this version too - always 0xc0000374

The best explanation I've found for that one is at codeguru - there's a lot of off-topic waffling in the thread, but if you persevere to page 2 (#22), you'll see that the very first reply contains the correct diagnosis.

The symptoms here match that description - the application crashes at the end, after reaching 100% (sometimes several seconds after reaching 100%), which is when you would expect memory to be freed and heap corruption (if any) discovered.

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0

Message 56258 - Posted: 21 Nov 2012, 13:58:56 UTC - in response to Message 56255.

If you're going to the bother of creating an app_info.xml, it's probably easier to download the DLLs under their 'real' names, rather than going for the versioned aliases and renaming them back again.

http://milkyway.cs.rpi.edu/milkyway/download/libgomp_64-1.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2_64.dll

Then you can do away with the <open_name> and <copy_file/> lines entirely - you forgot the copy on the second file, anyway.


Good catch on the missing <copy_file/> line.
I had not run nbody for a long time, so it was more a matter of quickly putting some fragments together. :) I could not test it without nbody WUs.

The point is that you need to get the versioned DLLs from the download directory when downloading manually; whether you rename them locally is your choice. Those without a version number are very old (used for nbody v0.40 or v0.60). AFAIR Matt moved to versioning them at that time and renaming them while downloading to the users, so he could keep the versions online without conflicts. That's why I chose to keep the numbers locally too and used open/copy for runtime. Less confusion for me when keeping the proper versions together.

I did read the explanation on codeguru. It goes basically in the same direction I was thinking; maybe my English wasn't good enough to make it clear.
You are building (and testing) an exe with a new set of external DLLs and then trying to run it with far older DLLs. This can lead to a whole set of errors because of critical changes between those DLL versions; heap corruption and out-of-bounds memory access would be high on that list.

That's why I am saying: first make sure to use the same dynamically linked DLLs the exe was built and internally tested with, then see what errors are still left. Note (Message 56239) that the statically linked exes (Mac and Linux) are returning mostly valid results, while the bulk of errors are coming from Windows clients with the dynamically linked DLLs.

Only my 2¢ and I hope they find the root of the problem soon.

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 56259 - Posted: 21 Nov 2012, 15:32:49 UTC - in response to Message 56258.

OK, my turn to say 'good catch'.

When I joined the project (primarily out of interest to see how well the development BOINC v7.0.38 coped with scheduling multi_threaded apps), I found Matt's post at message 53919, and from it assumed - wrongly, as it turns out - that the only reason for versioning the libgomp and pthread DLLs was to comply with BOINC's stratagem for managing multifile applications. As you point out (and FC confirms), there are in fact binary differences too.
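
Anyone who wants to check for binary differences between two DLL copies themselves can compare file hashes. The sketch below is just one way to do it in Python; `sha256_of` and `same_binary` are illustrative names, and the two paths are whatever local copies you pass on the command line.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(first, second):
    """True if both files have identical contents (by SHA-256)."""
    return sha256_of(first) == sha256_of(second)


if __name__ == "__main__":
    # Usage: python compare_dlls.py <first.dll> <second.dll>
    if len(sys.argv) == 3:
        verdict = "identical" if same_binary(sys.argv[1], sys.argv[2]) else "different"
        print(f"{sys.argv[1]} vs {sys.argv[2]}: {verdict}")
```

A "different" verdict only says the bytes differ; it does not say whether the two builds are compatible with a given exe.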

The DLLs haven't been recompiled (or at least placed in the download folder) with either the 0.94 or 1.02 releases, so the newest versions available for download are still

http://milkyway.cs.rpi.edu/milkyway/download/libgomp_64-1_nbody_0.84.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2_64_nbody_0.84.dll

and we'll have to use either file renaming or the <copy_file/> construct until Jeffery comes back with an explanation of what the app really needs.

Talking of MT, here's an example of an app_info that I was using at AQUA until they went off the air about 18 months ago - ignore the file names and parameters, but it shows the sort of extra tags that will be needed when NBody is ready to go fully MT.

<app_version>
<app_name>Fokker_Planck</app_name>
<version_num>210</version_num>
<avg_ncpus>3.100000</avg_ncpus>
<max_ncpus>3.100000</max_ncpus>
<flops>10000000000</flops>
<plan_class>fpmt</plan_class>
<api_version>6.11.1</api_version>
<cmdline>--nthreads 4</cmdline>
<file_ref>
<file_name>fokker_planck_2.10_windows_intelx86__fpmt.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>vcomp90_32bit</file_name>
<open_name>vcomp90.dll</open_name>
<copy_file/>
</file_ref>
</app_version>

At the least, some analogue of the lines I've picked out in red (which will be familiar to GPU users here, I'm sure) will be needed.

johnnymc
Joined: 10 Mar 11
Posts: 2
Credit: 2,116,386
RAC: 0

Message 56260 - Posted: 21 Nov 2012, 16:26:14 UTC
Last modified: 21 Nov 2012, 16:28:03 UTC

Greetings Crunchers!

My machine was cranking out N-Body tasks quite well - and at one point I was doing them exclusively, when there was lots of work to do and I felt my 8-core machine should practice working as a team.

Recently, though, I have noticed 5 errors with MilkyWay@Home N-Body Simulation since the project started offering them to my system again.

Looking at the task ID numbers I notice my box isn't the only one choking on these bits and bytes.

Cheers to all!

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 56261 - Posted: 21 Nov 2012, 18:03:05 UTC

Even with the .84 DLLs (renamed), I still seem to get 0xC0000374 errors:

task 346291094

Nicklw
Joined: 16 Aug 09
Posts: 11
Credit: 26,847,645
RAC: 41,912

Message 56268 - Posted: 22 Nov 2012, 3:15:40 UTC

Hi everyone, I have been running MilkyWay for a year or two now with no real problems, but for the last two days my computer has been turning itself off after running the program for about five minutes. If I suspend the activity, there are no problems. Is anyone else having problems, or does anyone know what's happening? Nick

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0

Message 56272 - Posted: 22 Nov 2012, 19:27:28 UTC - in response to Message 56268.

Hi everyone, I have been running MilkyWay for a year or two now with no real problems, but for the last two days my computer has been turning itself off after running the program for about five minutes. If I suspend the activity, there are no problems. Is anyone else having problems, or does anyone know what's happening? Nick


Sounds like a heat problem. Australian summer ...
Try a tool like HWMonitor to find out about the temps in your box.

POPSIE
Joined: 25 Jan 11
Posts: 12
Credit: 3,738,529
RAC: 2,009

Message 56277 - Posted: 23 Nov 2012, 4:45:55 UTC

On Windows 8


Name: ps_nbody_plus_slice_emd_2_1352203202_12105_3
Workunit: 269547725
Created: 22 Nov 2012 | 1:26:29 UTC
Sent: 22 Nov 2012 | 1:26:35 UTC
Received: 22 Nov 2012 | 1:28:38 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: -1073741515 (0xffffffffc0000135) Unknown error number
Computer ID: 472499
Report deadline: 4 Dec 2012 | 1:26:35 UTC
Run time: 0.00
CPU time: 0.00
Validate state: Invalid
Credit: 0.00
Application version: MilkyWay@Home N-Body Simulation v1.02
Stderr output:

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code -1073741515 (0xc0000135)
</message>
]]>


On Windows 7


Name: ps_nbody_plus_slice_emd_2_1352203202_13392_4
Workunit: 269559500
Created: 23 Nov 2012 | 0:38:00 UTC
Sent: 23 Nov 2012 | 0:38:05 UTC
Received: 23 Nov 2012 | 0:39:23 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: -1073741515 (0xffffffffc0000135) Unknown error number
Computer ID: 435673
Report deadline: 5 Dec 2012 | 0:38:05 UTC
Run time: 0.00
CPU time: 0.00
Validate state: Invalid
Credit: 0.00
Application version: MilkyWay@Home N-Body Simulation v1.02
Stderr output:

<core_client_version>7.0.25</core_client_version>
<![CDATA[
<message>
- exit code -1073741515 (0xc0000135)
</message>
]]>


On Linux OpenSuse


Name: ps_nbody_plus_slice_emd_2_1352203202_10142_3
Workunit: 269531818
Created: 22 Nov 2012 | 4:06:17 UTC
Sent: 22 Nov 2012 | 4:06:29 UTC
Received: 22 Nov 2012 | 5:01:45 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 400549
Report deadline: 4 Dec 2012 | 4:06:29 UTC
Run time: 3,316.00
CPU time: 5,399.97
Validate state: Workunit error - check skipped
Credit: 0.00
Application version: MilkyWay@Home N-Body Simulation v0.94
Stderr output:

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_nbody 0.94 Linux x86_64 double OpenMP, Crlibm </search_application>
Warning: not applying timestep correction for workunit with min version 0.80
Number of particles in bins is very small compared to total. (42 << 100000). Skipping distance calculation
<search_likelihood>-194.117647058823536</search_likelihood>
06:02:55 (19824): called boinc_finish

</stderr_txt>
]]>


[SG]ATA-Rolf
Joined: 18 Sep 07
Posts: 1
Credit: 12,317,775
RAC: 0

Message 56301 - Posted: 25 Nov 2012, 21:26:07 UTC - in response to Message 56277.
Last modified: 25 Nov 2012, 21:26:58 UTC

Everything was fine until 24.11.2012 11:56:49 UTC:
http://milkyway.cs.rpi.edu/milkyway/results.php?userid=339&offset=0&show_names=0&state=3&appid=

After that, nothing worked anymore:
http://milkyway.cs.rpi.edu/milkyway/results.php?userid=339&offset=0&show_names=0&state=5&appid=

What is going on here????



Copyright © 2018 AstroInformatics Group