Welcome to MilkyWay@home

New N-Body Release 1.42



Message boards : Number crunching : New N-Body Release 1.42
entigy
Joined: 10 Jun 09
Posts: 5
Credit: 7,022,515
RAC: 774
Message 62120 - Posted: 6 Aug 2014, 11:46:38 UTC

Hi.

I now have 7 work units from the new release that have ended in error.

This is one of the error logs in question:

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The system cannot find the drive specified.
(0xf) - exit code 15 (0xf)
</message>
<stderr_txt>
<search_application> milkyway_nbody 1.42 Windows x86_64 double OpenMP, Crlibm </search_application>
Using OpenMP 4 max threads on a system with 8 processors
RHO MAX IS 5.23656
5.23656Error reading histogram line 37: 1 -48.5294117647 0.0439655511 0.0013148967
12:36:36 (5352): called boinc_finish

</stderr_txt>
]]>

Any help or suggestions?
Cheers.
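When several tasks fail, it helps to pull the same few fields out of each stderr report so they can be compared side by side. The following is only a minimal sketch: the function name `summarize` and the choice of fields are illustrative assumptions, and the patterns are based solely on the log text pasted in this post, not on any documented BOINC format.

```python
import re

# Stderr text copied verbatim from the log in this post.
STDERR = """<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The system cannot find the drive specified.
(0xf) - exit code 15 (0xf)
</message>
<stderr_txt>
<search_application> milkyway_nbody 1.42 Windows x86_64 double OpenMP, Crlibm </search_application>
Using OpenMP 4 max threads on a system with 8 processors
RHO MAX IS 5.23656
5.23656Error reading histogram line 37: 1 -48.5294117647 0.0439655511 0.0013148967
12:36:36 (5352): called boinc_finish

</stderr_txt>
]]>
"""


def summarize(stderr: str) -> dict:
    """Extract a few comparable fields from one stderr report (hypothetical helper)."""
    app = re.search(r"<search_application>\s*(.*?)\s*</search_application>", stderr)
    code = re.search(r"exit code (-?\d+)", stderr)
    hist = re.search(r"Error reading histogram line (\d+)", stderr)
    return {
        "application": app.group(1) if app else None,
        "exit_code": int(code.group(1)) if code else None,
        "histogram_line": int(hist.group(1)) if hist else None,
        # This message shows up in some reports in this thread, but not in this one.
        "missing_ktm32": "Could not load Ktm32.dll" in stderr,
    }


summary = summarize(STDERR)
print(summary)
```

Running the same function over each failed task's stderr makes it easier to see whether they all stop at the same histogram line.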
ID: 62120 · Rating: 0
John Black
Joined: 3 May 10
Posts: 74
Credit: 1,532,760
RAC: 0
Message 62124 - Posted: 7 Aug 2014, 2:23:22 UTC

Hi Entigy,

I have four tasks that failed the same way with a similar error. I think I will suspend these until somebody works out what is happening.

John
ID: 62124 · Rating: 0
Dr Who Fan
Joined: 8 Aug 08
Posts: 21
Credit: 330,700
RAC: 0
Message 62125 - Posted: 7 Aug 2014, 7:32:40 UTC

Add another one to the error list, task 802961249:

Client state: Compute error
Exit status: 15 (0xf) Unknown error number
    Stderr output
    <core_client_version>7.2.42</core_client_version>
    <![CDATA[
    <message>
    The system cannot find the drive specified.
    (0xf) - exit code 15 (0xf)
    </message>
    <stderr_txt>
    <search_application> milkyway_nbody 1.42 Windows x86 double , Crlibm </search_application>
    RHO MAX IS 7.40308
    7.40308Could not load Ktm32.dll (126): The specified module could not be found.

    Could not load Ktm32.dll (126): The specified module could not be found.

    Error reading histogram line 37: 1 -48.5294117647 0.0439655511 0.0013148967
    02:22:26 (1932): called boinc_finish


    </stderr_txt>
    ]]>



ID: 62125 · Rating: 0
John Black
Joined: 3 May 10
Posts: 74
Credit: 1,532,760
RAC: 0
Message 62128 - Posted: 7 Aug 2014, 13:04:30 UTC

Hi,
I am still getting errors. Here is the stderr output.

Stderr output

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The system cannot find the drive specified.
(0xf) - exit code 15 (0xf)
</message>
<stderr_txt>
<search_application> milkyway_nbody 1.42 Windows x86 double , Crlibm </search_application>
RHO MAX IS 6.34097
6.34097Error reading histogram line 37: 1 -48.5294117647 0.0439655511 0.0013148967
02:41:56 (6076): called boinc_finish

</stderr_txt>
]]>

Does anybody have any ideas why this is happening?

John
ID: 62128 · Rating: 0
Jake Bauer
Project developer
Project tester
Project scientist
Joined: 20 Aug 12
Posts: 66
Credit: 406,916
RAC: 0
Message 62132 - Posted: 7 Aug 2014, 21:08:45 UTC - in response to Message 62128.  

The histogram format was changed. If you are crunching units posted earlier than 08_06, you will see this error. These should stop as soon as the queue clears.

Jake
ID: 62132 · Rating: 0
Stick
Joined: 8 Oct 07
Posts: 39
Credit: 1,381,653
RAC: 5,136
Message 62140 - Posted: 9 Aug 2014, 17:20:49 UTC
Last modified: 9 Aug 2014, 17:21:57 UTC

Task 803429280 crashed on my computer with a different error:
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The handle is invalid.
(0x6) - exit code 6 (0x6)
</message>
<stderr_txt>
<search_application> milkyway_nbody 1.42 Windows x86 double , Crlibm </search_application>
RHO MAX IS 57.28266
57.28266Could not load Ktm32.dll (126): The specified module could not be found.

Failed to find end marker in checkpoint file.
14:20:42 (764): called boinc_finish
</stderr_txt>

That seems a little strange because, as far as I know, the unit was the only unit in cache at the time and should not have had any reason to revert to a checkpoint. That is, neither the computer nor BOINC was restarted during the timeframe. But the unit had run for a long time and was very close to finishing up.

OTOH, Task 803429273 finished up OK but its Stderr is also a little strange:
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_nbody 1.42 Windows x86 double , Crlibm </search_application>
RHO MAX IS 33.35565
33.35565Could not load Ktm32.dll (126): The specified module could not be found.

Poor likelihood. Returning worst case.
<search_likelihood>-9999999.900000000400000</search_likelihood>
22:14:21 (1440): called boinc_finish
</stderr_txt>

It is currently in "Checked, but no consensus yet" state because its wingman's unit crashed with a -185 (0xffffffffffffff47) ERR_RESULT_START error and the replacement is "Unsent".
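The long hex value next to -185 is the same number written as an unsigned 64-bit quantity in two's complement. As a quick check, here is a small sketch that converts the codes quoted in this post; `to_signed64` is a hypothetical helper name used only for illustration.

```python
def to_signed64(value: int) -> int:
    """Interpret an unsigned 64-bit value as a two's-complement signed integer."""
    value &= (1 << 64) - 1  # keep only the low 64 bits
    return value - (1 << 64) if value >= (1 << 63) else value


# Codes quoted in this thread so far: 0xf, 0x6 and the wingman's 0xffffffffffffff47.
for raw in (0xF, 0x6, 0xFFFFFFFFFFFFFF47):
    print(hex(raw), "->", to_signed64(raw))
```

Small positive codes such as 15 and 6 come through unchanged; only values with the top bit set are mapped to negative numbers.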
ID: 62140 · Rating: 0
Eric Findley
Joined: 1 Jan 14
Posts: 24
Credit: 4,277,349
RAC: 0
Message 62143 - Posted: 10 Aug 2014, 8:50:09 UTC

Name: ps_nbody_06_03_orphan_sim_3_1405680903_293700_2
Workunit: 599873095
Created: 6 Aug 2014, 18:23:12 UTC
Sent: 9 Aug 2014, 3:54:20 UTC
Received: 10 Aug 2014, 5:32:06 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 15 (0xf) Unknown error number
Computer ID: 554457
Report deadline: 21 Aug 2014, 3:54:20 UTC
Run time: 2,021.55
CPU time: 13,208.92
Validate state: Invalid
Credit: 0.00
Application version: MilkyWay@Home N-Body Simulation v1.42 (mt)
I have 3 of these in a row.
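Jake's reply earlier in the thread says that units posted earlier than 08_06 still use the old histogram format. The sketch below assumes, without confirmation from the project, that the two numbers right after `nbody_` in a work unit name are a month and day tag that can be compared with that cutoff. The helper names `date_tag` and `predates_format_change` are hypothetical, and the comparison ignores the year.

```python
import re

# "08_06" from the developer's reply, read here as (month, day). This reading is an assumption.
CUTOFF = (8, 6)


def date_tag(workunit_name: str):
    """Return the (month, day) tag that follows 'nbody_', or None if there is none."""
    match = re.search(r"nbody_(\d{2})_(\d{2})_", workunit_name)
    return (int(match.group(1)), int(match.group(2))) if match else None


def predates_format_change(workunit_name: str) -> bool:
    """True if the name's tag is earlier than the assumed 08_06 cutoff."""
    tag = date_tag(workunit_name)
    return tag is not None and tag < CUTOFF


# Work unit name taken from the task details in this post.
name = "ps_nbody_06_03_orphan_sim_3_1405680903_293700_2"
print(name, date_tag(name), predates_format_change(name))
```

If that reading of the name is right, this unit carries an 06_03 tag and would fall into the group expected to fail until the queue clears.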
ID: 62143 · Rating: 0
Eric Findley
Joined: 1 Jan 14
Posts: 24
Credit: 4,277,349
RAC: 0
Message 62145 - Posted: 10 Aug 2014, 14:29:54 UTC - in response to Message 62143.  

I have 2 more since I last posted.
Name: ps_nbody_06_03_orphan_sim_3_1405680903_336415_1
Workunit: 600253171
Created: 7 Aug 2014, 19:45:30 UTC
Sent: 10 Aug 2014, 1:01:50 UTC
Received: 10 Aug 2014, 7:42:17 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 15 (0xf) Unknown error number
Computer ID: 554457
Report deadline: 22 Aug 2014, 1:01:50 UTC
Run time: 1,278.67
CPU time: 8,254.84
Validate state: Invalid
Credit: 0.00
Application version: MilkyWay@Home N-Body Simulation v1.42 (mt)

ID: 62145 · Rating: 0
Eric Findley
Joined: 1 Jan 14
Posts: 24
Credit: 4,277,349
RAC: 0
Message 62163 - Posted: 13 Aug 2014, 22:10:17 UTC - in response to Message 62145.  

I have ten more over the last two days.
Name: de_nbody_06_11_orphan_sim_0_1405680903_256730_2
Workunit: 598074943
Created: 10 Aug 2014, 15:31:08 UTC
Sent: 12 Aug 2014, 11:41:15 UTC
Received: 13 Aug 2014, 7:40:16 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED
Computer ID: 554457
Report deadline: 24 Aug 2014, 11:41:15 UTC
Run time: 35,375.91
CPU time: 233,620.90
Validate state: Invalid
Credit: 0.00
Application version: MilkyWay@Home N-Body Simulation v1.42 (mt)

ID: 62163 · Rating: 0
Eric Findley
Joined: 1 Jan 14
Posts: 24
Credit: 4,277,349
RAC: 0
Message 62169 - Posted: 15 Aug 2014, 7:11:58 UTC - in response to Message 62163.  

Starting the day off badly: 3 more N-Body errors.
Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The system cannot find the drive specified.
(0xf) - exit code 15 (0xf)
</message>
<stderr_txt>
<search_application> milkyway_nbody 1.42 Windows x86_64 double OpenMP, Crlibm </search_application>
Using OpenMP 8 max threads on a system with 8 processors
RHO MAX IS 6.42550
6.42550Error reading histogram line 37: 1 -48.5294117647 0.0439655511 0.0013148967
02:53:03 (4440): called boinc_finish

</stderr_txt>
]]>


ID: 62169 · Rating: 0
SuperSluether
Joined: 2 Jul 14
Posts: 15
Credit: 20,982,404
RAC: 0
Message 62184 - Posted: 16 Aug 2014, 16:57:30 UTC - in response to Message 62120.  

I haven't looked at the error report, but I've been getting computation errors with N-Body Simulation 1.42 as well. They run on 8 CPUs and error out after about 15 minutes. Probably just a glitch. I heard that some computers trash loads of tasks every day. :)
ID: 62184 · Rating: 0
Eric Findley
Joined: 1 Jan 14
Posts: 24
Credit: 4,277,349
RAC: 0
Message 62560 - Posted: 13 Oct 2014, 21:33:19 UTC

Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
<stderr_txt>
<search_application> milkyway_nbody 1.44 Windows x86_64 double OpenMP, Crlibm </search_application>
Using OpenMP 8 max threads on a system with 8 processors
RHO MAX IS 1672.62217
1672.62217
</stderr_txt>
]]>

This is the latest N-Body task to error out.
ID: 62560 · Rating: 0
Brett Collins
Joined: 26 Oct 12
Posts: 2
Credit: 516,327
RAC: 0
Message 62567 - Posted: 14 Oct 2014, 22:43:30 UTC

Hi. Please advise what causes these glitches and what you are doing to fix them. The work I am getting is a waste of resources: only about 7 percent of my CPU time is earning credit (60k seconds out of 900k seconds), with the balance going to invalid, error, failed, etc. This has been going on for some weeks and is wasteful. Hoping for "the queue to clear" is nonsense.
ID: 62567 · Rating: 0

©2021 Astroinformatics Group