Welcome to MilkyWay@home

New N-Body Release

Message boards : News : New N-Body Release

Jake Bauer
Project developer
Project tester
Project scientist
Joined: 20 Aug 12
Posts: 66
Credit: 406,916
RAC: 0
Message 57165 - Posted: 4 Feb 2013, 21:34:06 UTC

Shortly, there will be a release of N-Body 1.07. Hopefully, many of the issues on the Windows clients will be resolved. I plan to upload a search with an improved likelihood calculation and a fixed simulation time. We are making a lot of progress with N-Body's development thanks to the excellent userbase. We really appreciate your feedback.

Jake
ID: 57165 · Rating: 0
Chaskiel Grundman
Joined: 24 Dec 12
Posts: 1
Credit: 8,426,039
RAC: 0
Message 57166 - Posted: 5 Feb 2013, 3:09:34 UTC

Are the Linux plan class issues going to be sorted out? I have a job using 8 threads in the ati_opencl plan class that BOINC thinks is only using 5% of a CPU.
ID: 57166 · Rating: 0
EdwardPF
Joined: 8 Apr 10
Posts: 25
Credit: 268,525
RAC: 0
Message 57171 - Posted: 5 Feb 2013, 16:44:53 UTC - in response to Message 57165.  
Last modified: 5 Feb 2013, 16:49:24 UTC

My first 1.06 finished last night successfully ... however I don't think the cobblestone Wh...s will be happy with the credit! (Workunit 303830201)

Wall time 379,018 sec (4d23h12m25s), CPU time: 1,315,527 sec, credit: 3,004.90 :-(

If this had been one of the "de_separation_15_sSgr_1" kind of runs, the credit would have been about 33,000 cobs.

Perhaps a tweak in the code is in order :-)

Ed F
ID: 57171 · Rating: 0
James L. Neill
Joined: 28 Aug 11
Posts: 7
Credit: 29,852,657
RAC: 0
Message 57174 - Posted: 5 Feb 2013, 19:01:07 UTC

Are there any other modifications on the Linux side? I have a work unit at the moment whose remaining time is increasing steadily. The completion estimates I did yesterday and today at 18:00 GMT both say that the unit will complete well after the deadline date. I suppose my question is: was there a timing issue, and why were 1.06 units behaving this way? I do not like aborting a unit, and for the first time I actually did. The unit I have now I am going to leave running and see what happens!

James
ID: 57174 · Rating: 0
EdwardPF
Joined: 8 Apr 10
Posts: 25
Credit: 268,525
RAC: 0
Message 57180 - Posted: 6 Feb 2013, 7:00:08 UTC - in response to Message 57171.  

By the way ... it sounds completely crazy, BUT ...

While the WU was running, my computer clock was 1 hr behind real time (switch time zones??) ... after the WU finished, my computer time was back to normal ...

Now that is strange!!

Ed F
ID: 57180 · Rating: 0
Overtonesinger
Joined: 15 Feb 10
Posts: 63
Credit: 1,836,010
RAC: 0
Message 57181 - Posted: 6 Feb 2013, 8:46:26 UTC
Last modified: 6 Feb 2013, 8:54:11 UTC

Shall I let it run, even after the deadline?
Will that help the science, to validate this WU id 291600433? :o)

It will take at least two times 1,800,000 seconds in total ... on my Core i7 720QM 1.6 GHz!
... even though I let it run alone sometimes (it Turbo Boosts to 2.8 GHz).

See the time it has taken for a Core i5 at 3.1 GHz here:

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=291600433

*Thanx*
Melwen - Child of the Fangorn Forest
Rig "BRISINGR" [ASUS G73-JH, i7 720QM 1.73, 4x2GB DDR3 1333 CL7, ATi HD5870M 1GB GDDR5],bought on 2011-02-24
ID: 57181 · Rating: 0
Overtonesinger
Joined: 15 Feb 10
Posts: 63
Credit: 1,836,010
RAC: 0
Message 57182 - Posted: 6 Feb 2013, 8:51:03 UTC

Will the server accept the finished RESULT (if no computation error appears) AFTER the deadline?

Thanx :)
ID: 57182 · Rating: 0
Link
Joined: 19 Jul 10
Posts: 578
Credit: 18,845,239
RAC: 856
Message 57184 - Posted: 6 Feb 2013, 10:52:48 UTC - in response to Message 57182.  

Will the server accept the finished RESULT (if no computation error appears) AFTER the deadline?

If you return it before the last wingman: yes.
ID: 57184 · Rating: 0
Jeffery M. Thompson
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 23 Sep 12
Posts: 159
Credit: 16,977,106
RAC: 0
Message 57188 - Posted: 6 Feb 2013, 14:15:42 UTC

I am looking into the calculation times and the long runs.

We did sweep the parameter space to shake out some errors on the code side.

We will be tightening up the parameters in future runs.

As for the new version, Jake is running some tests against work units to verify the math coming out.

Once I get the OK on that, I will test it across the platforms and get the new versions out.

We are shaking out the bugs that come out at the extreme ends of the data, so to speak.

I am trying to get everything in place shortly to update the binaries. I will put a new heading in here for that.

I will be pushing out the binaries and verifying the binaries have gone out before starting the science runs.

ID: 57188 · Rating: 0
Overtonesinger
Joined: 15 Feb 10
Posts: 63
Credit: 1,836,010
RAC: 0
Message 57193 - Posted: 7 Feb 2013, 15:16:40 UTC - in response to Message 57184.  

The last wingman???
Is it the one computing it using N-Body 1.06 NVIDIA OpenCL? I have no chance, I think. :O

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=291600433
ID: 57193 · Rating: 0
Richard Haselgrove
Joined: 4 Sep 12
Posts: 219
Credit: 456,474
RAC: 0
Message 57194 - Posted: 7 Feb 2013, 15:43:48 UTC - in response to Message 57193.  

The last wingman???
Is it the one computing it using N-Body 1.06 NVIDIA OpenCL? I have no chance, I think. :O

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=291600433

Well, he's managed to get quite a few valid results running non-GPU code on his GPU:

Valid MilkyWay@Home N-Body Simulation tasks for computer 496658

But I suspect your big task may be another to drop down the "Too many errors (may have bug)" plughole:

Task 291533448 (I'm the one with the 1.5 million seconds CPU time).
ID: 57194 · Rating: 0
M0CZY
Joined: 26 Jun 09
Posts: 16
Credit: 357,054
RAC: 241
Message 57198 - Posted: 8 Feb 2013, 15:13:28 UTC
Last modified: 8 Feb 2013, 15:15:00 UTC

Is there going to be a 32-bit Windows or Linux GPU version of the N-Body application released?
ID: 57198 · Rating: 0
Overtonesinger
Joined: 15 Feb 10
Posts: 63
Credit: 1,836,010
RAC: 0
Message 57281 - Posted: 19 Feb 2013, 6:08:13 UTC - in response to Message 57194.  

Thank You!

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=291600433

Update: my extreme unit is at 50.7% now, running at 2.67 GHz most of the time, as I leave it alone so the CPU can Turbo Boost.

The last wingman is now a Xeon at 2.5 GHz!
// Xeons are extremely efficient per clock cycle, but so is my Core i7.
ID: 57281 · Rating: 0
SandJ
Joined: 2 Jan 08
Posts: 17
Credit: 2,608,409
RAC: 0
Message 57282 - Posted: 19 Feb 2013, 6:44:30 UTC - in response to Message 57181.  
Last modified: 19 Feb 2013, 6:45:18 UTC

Shall I let it run, even after the deadline?
Will that help the science, to validate this WU id 291600433? :o)

It will take at least two times 1,800,000 seconds in total ... on my Core i7 720QM 1.6 GHz!
... even though I let it run alone sometimes (it Turbo Boosts to 2.8 GHz).


Ah, so I am not alone in having a mega-second WU. I had one:

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=3148&nowrap=true#57268
I seem to be getting a succession of "Completed, validation inconclusive" results.

Task: 399803703
Work unit: 291524669
Computer: 462901
Sent: 10 Feb 2013 | 12:36:32 UTC
Time reported or deadline: 18 Feb 2013 | 3:52:29 UTC
Status: Completed, validation inconclusive
Run time (sec): 659,757.00
CPU time (sec): 1,561,110.00
Credit: pending
Application: MilkyWay@Home N-Body Simulation v1.06 (opencl_nvidia)

I don't know if I ever got any credits for that. :-(
ID: 57282 · Rating: 0
GaryG
Joined: 29 Aug 12
Posts: 31
Credit: 40,781,945
RAC: 0
Message 57299 - Posted: 20 Feb 2013, 18:23:44 UTC

I have an extremely long one as well:

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=294575850

It has been running for 124 hours and is not quite 42% done. The deadline is the 23rd, which I won't make, but it has run to completion on one other system so it is possible.
ID: 57299 · Rating: 0
Zoffix Znet
Joined: 22 Jan 13
Posts: 10
Credit: 6,268,662
RAC: 0
Message 57301 - Posted: 21 Feb 2013, 1:53:22 UTC
Last modified: 21 Feb 2013, 1:54:46 UTC

Are long-running N-Body 1.06 tasks worth running at all, or do they all error out?

All the ones that promised to run for 75 hours usually errored out pretty quickly for me, but now I've got one that's been running for 80 hours already, and the estimated time has now changed to 112 hours and is still growing.

Despite the increasing estimated time, it managed to get to 33% completed so far.

With all that said, should I abort it or let it run? I'm guessing it will run for 160 more hours and it would be a shame if it just died eventually, like other 75-hour N-Body 1.06 runs.

This is the WU in question: http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=305973763

My computer is the 495319 one.
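
The "160 more hours" guess above can be roughly reproduced by extrapolating linearly from the progress so far. The sketch below is only an illustration: the function name is made up here, and it assumes the task progresses at a constant rate, which may not hold for these work units given that the client's own estimate keeps growing.

```python
def estimate_remaining_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linearly extrapolate the remaining run time of a task.

    Assumes (hypothetically) that progress is proportional to elapsed time.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in the range (0, 1]")
    # Projected total run time if the current rate holds.
    total_hours = elapsed_hours / fraction_done
    return total_hours - elapsed_hours


if __name__ == "__main__":
    # Figures from the post: 80 hours elapsed, 33% completed.
    remaining = estimate_remaining_hours(80, 0.33)
    print(f"about {remaining:.0f} more hours")
```

With 80 hours elapsed at 33% done, this gives a projected total of about 242 hours, so roughly 162 hours still to go, in line with the guess in the post.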
ID: 57301 · Rating: 0
Link
Joined: 19 Jul 10
Posts: 578
Credit: 18,845,239
RAC: 856
Message 57302 - Posted: 21 Feb 2013, 10:20:45 UTC - in response to Message 57301.  
Last modified: 21 Feb 2013, 10:21:48 UTC

This is the WU in question: http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=305973763

My computer is the 495319 one.

All errors on that WU happened with 0 run time, so you don't seem to be affected by that error. The first computer is also probably still running it; it has already timed out for him, but the last contact was on the 17th, so he is probably alive.

I mean, nobody can tell you whether it will error out or not; you have to try it. If everyone aborts all WUs that might eventually error out, how will anyone find out whether they really error out, and why? In science, something that didn't work might also be a valuable result.
ID: 57302 · Rating: 0
Zoffix Znet
Joined: 22 Jan 13
Posts: 10
Credit: 6,268,662
RAC: 0
Message 57303 - Posted: 21 Feb 2013, 12:24:04 UTC - in response to Message 57302.  

I mean, nobody can tell you whether it will error out or not; you have to try it. If everyone aborts all WUs that might eventually error out, how will anyone find out whether they really error out, and why? In science, something that didn't work might also be a valuable result.


Alright :) I'll keep it running :D (37.6% right now). Thanks for replying.
ID: 57303 · Rating: 0
Zoffix Znet
Joined: 22 Jan 13
Posts: 10
Credit: 6,268,662
RAC: 0
Message 57384 - Posted: 28 Feb 2013, 21:42:07 UTC - in response to Message 57302.  

All errors on that WU happened with 0 run time, so you don't seem to be affected by that error. The first computer is also probably still running it; it has already timed out for him, but the last contact was on the 17th, so he is probably alive.

I mean, nobody can tell you whether it will error out or not; you have to try it. If everyone aborts all WUs that might eventually error out, how will anyone find out whether they really error out, and why? In science, something that didn't work might also be a valuable result.


So now my run completed successfully, and another box tried to run it unsuccessfully. The message now reads "Too many errors (may have bug)" and "Completed, can't validate".

Will my result now be discarded, or is it somehow possible for me to validate it (or for someone with my type of box to run it)? I'm not sure why I'm the only one without errors.

Just curious.
ID: 57384 · Rating: 0
Tom*
Joined: 4 Oct 11
Posts: 38
Credit: 309,729,457
RAC: 0
Message 57385 - Posted: 28 Feb 2013, 22:15:19 UTC

I also have a 1.06 N-Body workunit that has run for 43.5 hours and is 37.8% complete.

Since there are 5 other failed tasks on this workunit, should I let it run, hoping to get another user to complete this workunit?

Someone mentioned that they do not need 1.06 workunits anymore?? If so, is the proper procedure here to abort 1.06 WUs in favor of 1.07?

Thanks

ID: 57385 · Rating: 0

©2024 Astroinformatics Group