Message boards : News : updated the nbody applications again
Author | Message |
---|---|
Send message Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
Now at v0.06. Let us know how they're running here. |
Send message Joined: 17 Oct 08 Posts: 36 Credit: 411,744 RAC: 0 |
I received no workunits for nbody 0.06, only for milkyway 0.19. However, I prefer to leave the latter to the ATI guys. *grin* So I set up an app_info.xml again and downloaded nbody 0.06 manually. This is the app_info part only for nbody 0.06 CPU tasks: <app_info> <app> <name>milkyway_nbody</name> </app> <file_info> <name>milkyway_nbody_0.06_windows_x86_64__sse2.exe</name> <executable/> </file_info> <app_version> <app_name>milkyway_nbody</app_name> <version_num>6</version_num> <file_ref> <file_name>milkyway_nbody_0.06_windows_x86_64__sse2.exe</file_name> <main_program/> </file_ref> </app_version> </app_info> This is for 64-bit Windows; for 32-bit, remove the _64 part from the download link and the app_info file references. Btw, I think it's a good idea to include the SSE2 requirement now; it should save people with old computers some frustration. |
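The 64-bit/32-bit switch described in the post above can be sketched in a few lines of Python. This is only an illustration: the XML layout and the 64-bit file name are copied from the post, while the 32-bit file name is an assumption derived from the instruction to "remove the _64 part" and should be checked against the actual download. The helper names `exe_name` and `app_info` are hypothetical.

```python
# Sketch: build the app_info.xml quoted in the post for 64-bit or 32-bit Windows.
# Assumption: the 32-bit executable name is the 64-bit name with "_64" removed.

TEMPLATE = """<app_info>
  <app>
    <name>milkyway_nbody</name>
  </app>
  <file_info>
    <name>{exe}</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>6</version_num>
    <file_ref>
      <file_name>{exe}</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""

# File name exactly as given in the post.
EXE_64 = "milkyway_nbody_0.06_windows_x86_64__sse2.exe"


def exe_name(bits: int) -> str:
    """Return the executable name for 64-bit or 32-bit Windows."""
    if bits == 64:
        return EXE_64
    if bits == 32:
        # Drop the first "_64", i.e. the one in "x86_64".
        return EXE_64.replace("_64", "", 1)
    raise ValueError("bits must be 32 or 64")


def app_info(bits: int) -> str:
    """Fill the template so both file references use the same name."""
    return TEMPLATE.format(exe=exe_name(bits))


if __name__ == "__main__":
    print(app_info(32))
```

Generating both references from one name avoids the easy mistake of editing the `<file_info>` entry but not the `<file_ref>` entry.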
Send message Joined: 17 Oct 08 Posts: 36 Credit: 411,744 RAC: 0 |
Another issue: is there (or was there) some glitch in the database concerning the nbody application name? For example, see here: http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=150916643 Above the task list it says: "This is displayed on the workunit pageDatabase Error", and the application name cell shows "v0.00", which is also repeated in the task itself. I've seen a number of instances. Maybe this is already resolved, because here is a workunit where a task with Linux app 0.06 has finished and the app version is shown correctly. EDIT: Checkpointing seems to work now. I shut down BOINC on purpose and work was resumed from the last checkpoint. This is Win 7 64-bit. Great. Checkpoint: tnow = 2.01929. time since last = 361.459s Regards Alex |
Send message Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
I have 3 crunched WUs, none of them validated. One example: Checked, but no consensus yet http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=195759632 Regards Alexander |
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"I have 3 crunched WUs, none of them validated. One example:" That doesn't look like the new release. That is an old one, with the checkpoint restarting issue on Windows. |
Send message Joined: 17 Oct 08 Posts: 36 Credit: 411,744 RAC: 0 |
Hmm... from my result list it seems that v0.06 will not validate against v0.04. Anyone else seeing this? My only valid v0.06 result (Win 64-bit) so far was against Linux v0.06. On the other hand, I've also seen some v0.04 results which did not validate against another v0.04. Guess we could use some more data here *g*. |
Send message Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
I've checked another result: http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=196316955 It contains the message: Number of bins does not match those in histogram file. Expected 34, got 0 Failed to calculate chisq <search_likelihood>1.#QNAN</search_likelihood> <search_application>milkywayathome nbody 0.06 Windows x86 double</search_application> 21:02:22 (5220): called boinc_finish HTH Alexander Hi Matt, I still have ~12 WUs in cache. Does it make sense to crunch them, or can they be killed? |
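A quick way to spot results like the one above is to pull the `<search_likelihood>` value out of the stderr text and check whether it is a usable number. The sketch below is a hypothetical helper, not part of the project's validator; it assumes the tag format shown in the post and treats anything that is not a finite float (such as the "1.#QNAN" string that older Microsoft C runtimes print for NaN) as unusable.

```python
import math
import re

# Matches the tag exactly as it appears in the stderr output quoted in the thread.
LIKELIHOOD_RE = re.compile(r"<search_likelihood>(.*?)</search_likelihood>")


def parse_likelihood(stderr_text: str):
    """Return the likelihood as a float, or None if it is missing or not finite."""
    match = LIKELIHOOD_RE.search(stderr_text)
    if match is None:
        return None
    raw = match.group(1).strip()
    try:
        value = float(raw)
    except ValueError:
        # Strings such as "1.#QNAN" are not accepted by float().
        return None
    return value if math.isfinite(value) else None


if __name__ == "__main__":
    bad = "<search_likelihood>1.#QNAN</search_likelihood>"
    print(parse_likelihood(bad))
```

Returning `None` for both a missing tag and a NaN keeps the caller's check simple: any result without a finite likelihood is worth a second look.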
Send message Joined: 17 Oct 08 Posts: 36 Credit: 411,744 RAC: 0 |
Yeah, seen this, too. On my only valid v0.06 result here, it was included in the output of both results. So the big question is whether this is really a valid result for the project. Btw, Guten Abend, Alexander! *g* |
Send message Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
Hi, nice to meet you again! You're still working too? Alexander |
Send message Joined: 17 Oct 08 Posts: 36 Credit: 411,744 RAC: 0 |
Depends on how you define 'working'. I'm not getting paid for what I do right now. But that's fine with me. :) Guess we shouldn't chat here too much. *g* The result I linked to below was just purged from the database... |
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
Turns out I had a stupid build system issue: the histogram file wasn't being resolved with BOINC, so it got opened as an empty file. I'll start making another release now. |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
Can confirm that checkpointing works. I see the histogram issue too. 7 WUs that ran between 1000 and 3000 seconds seem to claim about 5 credits per 1000 seconds. None validated, so none granted yet. I wonder what multiplier will be used when they are validated. 1 WU (150658380) was at 2 hrs and 20%; I canceled it as it seemed so far out of line with the other runtimes. Still, the checkpointing looked OK and showed slow progress of the WU. |
Send message Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
I updated to Matt's new release. Hopefully this is working better. |
Send message Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
There might still be a small problem: there is no % progress, and I see this in the stderr output. <core_client_version>6.11.7</core_client_version> <![CDATA[ <stderr_txt> shmget in attach_shmem: Invalid argument 16:10:20 (83546): Can't set up shared mem: -1. Will run in standalone mode. Starting fresh nbody run Starting nbody system <plummer_r> -4.0267114691334 14.424159068236 6.8061497609999 </plummer_r> <plummer_v> 199.37230954409 111.54102111951 -177.06111744164 </plummer_v> Checkpoint: tnow = 0.905146. time since last = 1.28451e+09s Checkpoint: tnow = 1.89042. time since last = 304.232s Checkpoint: tnow = 3.16037. time since last = 303.48s Making final checkpoint Simulation complete <search_likelihood>-88.370444765717294899</search_likelihood> <search_application>milkywayathome nbody 0.07 Darwin x86_64 double</search_application> 16:27:25 (83546): called boinc_finish </stderr_txt> http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=196437612 I have several model 1_1 units that have been running for over an hour and are still at 0%, but I will let them run just to see if they finish or error out. |
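Since the manager shows no progress here, the "Checkpoint:" lines in stderr are the only visible sign that the run is advancing. The sketch below is a hypothetical helper that extracts them, assuming the line format shown in the log above. It also flags implausible intervals: the first one in this log, 1.28451e+09 s, is roughly 40 years, which looks like a raw Unix timestamp (possibly the "last checkpoint" time started at zero), though that reading is a guess. The one-day threshold is an arbitrary assumption.

```python
import re

# Format assumed from the log: "Checkpoint: tnow = <t>. time since last = <dt>s"
CHECKPOINT_RE = re.compile(
    r"Checkpoint: tnow = ([0-9.eE+-]+?)\. time since last = ([0-9.eE+-]+)s"
)


def checkpoint_intervals(stderr_text: str, max_plausible: float = 86400.0):
    """Return (tnow, seconds_since_last, plausible) for each checkpoint line.

    max_plausible is an arbitrary cutoff (one day) used only to flag
    intervals that cannot be real wall-clock gaps between checkpoints.
    """
    rows = []
    for tnow, dt in CHECKPOINT_RE.findall(stderr_text):
        seconds = float(dt)
        rows.append((float(tnow), seconds, seconds <= max_plausible))
    return rows


if __name__ == "__main__":
    log = (
        "Checkpoint: tnow = 0.905146. time since last = 1.28451e+09s "
        "Checkpoint: tnow = 1.89042. time since last = 304.232s "
        "Checkpoint: tnow = 3.16037. time since last = 303.48s"
    )
    for row in checkpoint_intervals(log):
        print(row)
```

A steadily increasing `tnow` with intervals of about 300 s, as in the last two lines of this log, suggests the simulation is progressing even while the progress bar stays at 0%.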
Send message Joined: 17 Oct 08 Posts: 36 Credit: 411,744 RAC: 0 |
I've seen the same error on this wingman of mine, also a Mac running Darwin x86_64: http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=196251224 |
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"Might still be a small problem, there is no % progression and I see this in the stderr out." Yeah, I'm seeing that when I run it on OS X. From your log it worked, except for the "Will run in standalone mode" part, which is probably the problem. It didn't happen in the old releases. The only issue seems to be that the progress bars don't show up in the manager; the work is actually progressing when I manually inspect the checkpoints. I have no idea what I could have done to stop the progress bars from working, but I'll look into it. |
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
They also appear to work on Linux and Windows. |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
OK, 2 WUs with version 0.07 done and validated. The histogram message is gone now. That's on a Phenom II @ 2.8 GHz: 2,996.59 s, claimed 12.78, granted 12.77 -> 15.34 cr/h; 2,649.30 s, claimed 11.60, granted 11.80 -> 16.03 cr/h. Time to define a reasonable multiplier. |
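For anyone comparing their own numbers, the credit rate is just granted credit divided by run time in hours. A minimal sketch, using the run times and granted credits from the post above (the function name is hypothetical); computed this way, the two results come out at about 15.3 and 16.0 credits per hour.

```python
def credits_per_hour(granted_credit: float, run_time_seconds: float) -> float:
    """Convert granted credit and run time in seconds to credits per hour."""
    if run_time_seconds <= 0:
        raise ValueError("run time must be positive")
    return granted_credit * 3600.0 / run_time_seconds


if __name__ == "__main__":
    # (run time in seconds, granted credit) as reported in the post.
    results = [(2996.59, 12.77), (2649.30, 11.80)]
    for seconds, granted in results:
        print(f"{seconds:.2f} s -> {credits_per_hour(granted, seconds):.2f} cr/h")
```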
Send message Joined: 2 Mar 10 Posts: 5 Credit: 105,634,798 RAC: 0 |
This version is gobbling my processor 100%, ignoring the restrictions I have put in place. I'm not doing any more processing until you get this fixed. |
©2024 Astroinformatics Group