N-Body 1.08

Message boards : News : N-Body 1.08

Jake Bauer
Project developer
Project tester
Project scientist
Joined: 20 Aug 12
Posts: 66
Credit: 406,916
RAC: 0

Message 57574 - Posted: 20 Mar 2013, 16:26:50 UTC

Hello users,

I have just updated the N-Body binaries. Expect a new release tonight. Post errors here!

Thanks,

Jake

Jimmy Gondek
Joined: 28 Sep 11
Posts: 60
Credit: 22,764,173
RAC: 0

Message 57577 - Posted: 20 Mar 2013, 19:18:56 UTC - in response to Message 57574.

Hi Jake,

A couple of stderr outputs as examples below; there are more in my task list. Is this what you're looking for?

Task 424020748
Name ps_nbody_100K_EMD_32013_2_1358941502_230897_0
Workunit 326324298
Created 20 Mar 2013 | 18:12:21 UTC
Sent 20 Mar 2013 | 18:32:31 UTC
Received 20 Mar 2013 | 18:58:55 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 330727
Report deadline 1 Apr 2013 | 18:32:31 UTC
Run time 1,584.00
CPU time 3,518.23
Validate state Checked, but no consensus yet
Credit 0.00
Application version MilkyWay@Home N-Body Simulation v1.08
Stderr output

<core_client_version>6.12.43</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_nbody 1.08 Darwin x86_64 double OpenMP, Crlibm </search_application>
Number of particles in bins is very small compared to total. (7 << 100000). Skipping distance calculation
<search_likelihood>-9999999.900000000372529</search_likelihood>
14:58:48 (73108): called boinc_finish

</stderr_txt>
]]>


Task 423995278
Name de_nbody_100K_EMD_32013_1358941502_229457_1
Workunit 326299430
Created 20 Mar 2013 | 17:26:12 UTC
Sent 20 Mar 2013 | 17:42:41 UTC
Received 20 Mar 2013 | 18:12:50 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 330727
Report deadline 1 Apr 2013 | 17:42:41 UTC
Run time 1,809.00
CPU time 6,633.10
Validate state Checked, but no consensus yet
Credit 0.00
Application version MilkyWay@Home N-Body Simulation v1.08
Stderr output

<core_client_version>6.12.43</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_nbody 1.08 Darwin x86_64 double OpenMP, Crlibm </search_application>
<search_likelihood>-14517.243707935935163</search_likelihood>
14:12:44 (72750): called boinc_finish

</stderr_txt>
]]>
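
For anyone collecting these outputs in bulk, here is a minimal Python sketch of pulling the likelihood out of a stderr excerpt like the two above. It assumes, based only on these excerpts, that a value of about -9999999.9 goes together with the "Skipping distance calculation" warning; the function name and the comparison tolerance are illustrative and not part of the N-Body application.

```python
import re

# Assumed from the excerpts above: runs that print the "Skipping distance
# calculation" warning report a likelihood of about -9999999.9.
WORST_CASE = -9999999.9

def parse_likelihood(stderr_text):
    """Return (likelihood, is_worst_case), or None if no likelihood is present."""
    match = re.search(
        r"<search_likelihood>\s*(-?\d+(?:\.\d+)?)\s*</search_likelihood>",
        stderr_text,
    )
    if match is None:
        return None
    value = float(match.group(1))
    return value, abs(value - WORST_CASE) < 1e-3

print(parse_likelihood("<search_likelihood>-14517.243707935935163</search_likelihood>"))
```

The second element of the result makes it easy to separate runs that produced a real fit from runs that were skipped.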


Trambambaj
Joined: 13 Feb 13
Posts: 1
Credit: 436,515
RAC: 0

Message 57579 - Posted: 20 Mar 2013, 20:27:27 UTC

Hi,
I received a task that is estimated at 7915 hours on CPU.
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=326343808

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 57580 - Posted: 20 Mar 2013, 20:52:21 UTC
Last modified: 20 Mar 2013, 21:12:16 UTC

ARGH!

The Applications page is still showing plan classes (opencl_amd_ati) and (opencl_nvidia) for Linux, and *no* plan class (MT) for Windows.

For the record, can we confirm that N-Body 1.08 is still supposed to be a CPU-only, multi-threaded, application?

Never mind, I'll go grab a task and see what I can make of it.

Edit - yes, OpenMP is still reporting into stderr.txt "Using 1 max threads on a system with 4 processors", and Process Explorer confirms one worker thread using 24% CPU. I'll transfer the new executables down to my anonymous platform machine, and see how it looks multithreaded.

This task started with a 1156 hour (7 week) estimate, but completed 7.5% in the first 10 minutes.
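
To put the two figures side by side, here is a small Python sketch that extrapolates the total run time from the progress reported above. It assumes progress is linear in time, which may not hold for this application; the function name is made up for the example.

```python
def projected_runtime_hours(elapsed_minutes, fraction_done):
    """Linear extrapolation of total run time from progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_minutes / fraction_done / 60.0

# 7.5% done after 10 minutes, as reported in the post above.
print(round(projected_runtime_hours(10, 0.075), 2))
```

Under that assumption the task would finish in a little over two hours, against an initial estimate of 1156 hours.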

TLSI2000
Joined: 15 Mar 10
Posts: 17
Credit: 427,047,095
RAC: 204,150

Message 57584 - Posted: 21 Mar 2013, 3:07:59 UTC

These seem to be running fine, with run times coming in at 2 to 4 hours on an older 2.4 GHz AMD.

But the credit calculation seems a bit odd:

Run time (s)   CPU time (s)   Credit   Application
6,357.59       6,357.59       26.84    MilkyWay@Home N-Body Simulation v1.08
6,404.13       6,404.13       27.04    MilkyWay@Home N-Body Simulation v1.08
9,509.63       9,496.64       13.22    MilkyWay@Home N-Body Simulation v1.08
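
The oddity is easier to see as credit per hour of run time. A short Python sketch using only the three rows above (the helper name is illustrative):

```python
# (run time in seconds, credit) for the three tasks listed above
tasks = [(6357.59, 26.84), (6404.13, 27.04), (9509.63, 13.22)]

def credit_per_hour(run_time_s, credit):
    """Credit granted per hour of wall-clock run time."""
    return credit / run_time_s * 3600.0

rates = [credit_per_hour(t, c) for t, c in tasks]
for (t, c), rate in zip(tasks, rates):
    print(f"{t:>9.2f} s  {c:>6.2f} credits  {rate:5.2f} credits/hour")
```

The first two tasks come out at roughly 15.2 credits per hour, while the longest task earns only about 5.0, so the longer run was paid at about a third of the rate.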

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 57597 - Posted: 21 Mar 2013, 15:16:04 UTC

I've put a few tasks through on both hosts now:

Host 479865 - running 'au naturel', single threaded (roughly the same run time and CPU time)

Host 465695 - running under app_info.xml with --nthreads 3, and CPU times to match.

Observations:
The stock host got a grossly exaggerated runtime estimate for the very first task (only), but settled immediately to reasonable values thereafter. That makes sense - I was probably too early for app_version.pfc_scale to have been established for the first one.

The anonymous platform host is still getting distorted runtimes - the most recent one an initial estimate of 983 hours. That may be because for anonymous platform cases, the application details record isn't re-initialised for each new app_version: it appears the server thinks my i7 is much slower than it really is, which might be the case if a different base <rsc_fpops_est> has been used for this app/batch. That probably accounts for the extremely low credit scores for that host, too (apologies to wingmates).
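
As a rough illustration of why a wrong speed figure distorts the estimate: to a first approximation, the estimated run time is the workunit's <rsc_fpops_est> divided by the speed assumed for that host and application version. The sketch below is a simplification under that assumption, not BOINC's actual code, and every number in it is hypothetical.

```python
def estimated_runtime_hours(rsc_fpops_est, assumed_flops):
    """Simplified estimate: floating-point operations / assumed speed."""
    return rsc_fpops_est / assumed_flops / 3600.0

# Hypothetical workunit size and speeds, chosen only for illustration.
fpops_est = 3.6e13
print(estimated_runtime_hours(fpops_est, 5e9))  # plausible speed: 2 hours
print(estimated_runtime_hours(fpops_est, 1e7))  # speed 500x too low: 1000 hours
```

If the server believes the host is hundreds of times slower than it is, an estimate in the high hundreds of hours for a task that really takes a couple of hours is what this formula produces.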

Once this new task has finished, I'll force that machine to get a new HostID and thus reset the speed and usage data - see if that cures it.

One thing I haven't tested yet is restarting from checkpoints - I'll leave that to someone else.

DJStarfox
Joined: 29 Sep 10
Posts: 53
Credit: 924,393
RAC: 1,086

Message 57607 - Posted: 22 Mar 2013, 12:34:33 UTC - in response to Message 57574.
Last modified: 22 Mar 2013, 12:43:09 UTC

I'm getting SELinux errors because the nbody application is trying to access /home. No, I will not disable SELinux. Your application should not access my filesystem outside its working directory.

Edit:
The real problem is that it says it requires GLIBC 2.14. I have libc.so.6, but I guess it's not the right version.

./milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt)
linux-vdso.so.1 => (0x00007fff693ff000)
librt.so.1 => /lib64/librt.so.1 (0x0000003315200000)
libm.so.6 => /lib64/libm.so.6 (0x0000003088400000)
libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x000000326d800000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003315e00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003087800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003087400000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003088800000)

Jeffery M. Thompson
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 23 Sep 12
Posts: 145
Credit: 12,411,662
RAC: 4,379

Message 57617 - Posted: 22 Mar 2013, 19:24:38 UTC

I am seeing the Glibc errors coming across, but on older versions of the BOINC client.

I am guessing you are running BOINC 6.10 or 6.12, if it matches the pattern I have been observing. I don't have a resolution at this time, as I am still researching the error, but I would suggest updating to the latest BOINC client to see if the error persists.


Jeff Thompson

floyd
Joined: 13 Sep 11
Posts: 17
Credit: 3,251,778
RAC: 0

Message 57623 - Posted: 23 Mar 2013, 11:52:41 UTC - in response to Message 57617.

I am seeing the Glibc errors coming across, but on older versions of the BOINC client.

I am guessing you are running BOINC 6.10 or 6.12 if it matches the pattern I have been observing.


That's just because older clients are more likely to run on older systems with an older glibc. AFAICS the real reason is that the application is dynamically linked (BTW, the i686 binary is not) and it specifically depends on GLIBC_2.14, just as stated in the error message. To be more precise, it's just memcpy from that version. Whether this requirement is really necessary is up to the developers. If in doubt, I'd suggest a static binary.
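
For volunteers who want to check their own host, here is a hedged Python sketch that extracts the required version from the loader error quoted earlier in the thread and compares it with an installed version. The helper names are made up for the example; `platform.libc_ver()` is a standard-library call that reports the C library the Python interpreter is linked against, which is assumed here to be the system glibc.

```python
import platform
import re

def required_glibc(error_line):
    """Extract the version from a "version `GLIBC_x.y' not found" loader error."""
    match = re.search(r"version `GLIBC_(\d+(?:\.\d+)*)' not found", error_line)
    return match.group(1) if match else None

def as_tuple(version):
    """Turn "2.14" into (2, 14) so that 2.5 sorts below 2.14."""
    return tuple(int(part) for part in version.split("."))

def glibc_is_new_enough(required, installed):
    return as_tuple(installed) >= as_tuple(required)

error = ("./milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt: /lib64/libc.so.6: "
         "version `GLIBC_2.14' not found "
         "(required by ./milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt)")
needed = required_glibc(error)

# libc_ver() returns empty strings where it cannot determine the library.
lib, installed = platform.libc_ver()
if lib == "glibc" and installed:
    print(needed, installed, glibc_is_new_enough(needed, installed))
```

Comparing the versions as integer tuples matters: as plain strings, "2.5" would wrongly sort above "2.14".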

DJStarfox
Joined: 29 Sep 10
Posts: 53
Credit: 924,393
RAC: 1,086

Message 57626 - Posted: 23 Mar 2013, 17:29:01 UTC

I agree. Statically link it, or I'm not going to be able to run the latest n-body application.

Overtonesinger
Joined: 15 Feb 10
Posts: 63
Credit: 1,836,010
RAC: 0

Message 57647 - Posted: 25 Mar 2013, 22:01:19 UTC

unlikely result :) :) :)

- seems like only 2 stars from the given sample are nearly fitting the Sagittarius Dwarf stream? :O

<core_client_version>7.0.58</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_nbody 1.08 Windows x86_64 double OpenMP, Crlibm </search_application>
Using OpenMP 1 max threads on a system with 4 processors
Number of particles in bins is very small compared to total. (2 << 100000). Skipping distance calculation
<search_likelihood>-9999999.900000000400000</search_likelihood>
22:44:58 (4740): called boinc_finish
</stderr_txt>
]]>



http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=328674025
____________
Melwen - Child of the Fangorn Forest
Rig "BRISINGR" [ASUS G73-JH, i7 720QM 1.73, 4x2GB DDR3 1333 CL7, ATi HD5870M 1GB GDDR5],bought on 2011-02-24

Alinator
Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0

Message 57650 - Posted: 26 Mar 2013, 1:37:29 UTC

Looks like there are still some problems with checkpointing on Winboxes as well.

Although I haven't had the problem on any of mine, I don't have an "out of the box" standard installation of Windows on any of my machines.

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=327231097

lucas_smith
Joined: 23 Feb 13
Posts: 3
Credit: 18,695
RAC: 0

Message 57656 - Posted: 26 Mar 2013, 14:09:44 UTC

Hi everyone! Sorry but I'm new and don't know how to find the download for the new nbody release. Where should I look for such releases in the future? Can you assist me? Thank you!

Alinator
Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0

Message 57658 - Posted: 26 Mar 2013, 15:20:35 UTC - in response to Message 57656.

I checked your host, and at this point you don't need to do anything. You have already run 4 of them successfully.

lucas_smith
Joined: 23 Feb 13
Posts: 3
Credit: 18,695
RAC: 0

Message 57659 - Posted: 26 Mar 2013, 15:47:08 UTC - in response to Message 57658.

Thank you! Does this mean that it is downloaded automatically and that I will not have to take any action in the future? I appreciate the help!

jay_e
Joined: 24 Mar 13
Posts: 11
Credit: 14,823
RAC: 0

Message 57666 - Posted: 26 Mar 2013, 23:43:13 UTC
Last modified: 26 Mar 2013, 23:48:22 UTC

Greetings.

I have an 8-core CPU and set the CPU preferences to 50%.
4 CPU tasks are loaded and 1 GPU task.
Yet the CPU monitor shows all 8 cores running at 100%.
I set "no new tasks", let all tasks finish, stopped work from all other projects, and rebooted. The problem repeats.
Summary: Ubuntu Linux and de_nbody_100K_EMD_32013_2_1358941502_444444_0.

Details follow.

Tue 26 Mar 2013 07:03:52 PM EDT | | Starting BOINC client version 7.0.27 for x86_64-pc-linux-gnu
Tue 26 Mar 2013 07:03:52 PM EDT | | log flags: file_xfer, sched_ops, task
Tue 26 Mar 2013 07:03:52 PM EDT | | Libraries: libcurl/7.29.0 OpenSSL/1.0.1c zlib/1.2.7 libidn/1.25 librtmp/2.3
Tue 26 Mar 2013 07:03:52 PM EDT | | Data directory: /var/lib/boinc-client
Tue 26 Mar 2013 07:03:52 PM EDT | | Processor: 8 AuthenticAMD AMD FX(tm)-8150 Eight-Core Processor [Family 21 Model 1 Stepping 2]
Tue 26 Mar 2013 07:03:52 PM EDT | | Processor: 2.00 MB cache
Tue 26 Mar 2013 07:03:52 PM EDT | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
Tue 26 Mar 2013 07:03:52 PM EDT | | OS: Linux: 3.8.0-13-generic
Tue 26 Mar 2013 07:03:52 PM EDT | | Memory: 7.70 GB physical, 8.04 GB virtual
Tue 26 Mar 2013 07:03:52 PM EDT | | Disk: 18.33 GB total, 16.27 GB free
Tue 26 Mar 2013 07:03:52 PM EDT | | Local time is UTC -4 hours
Tue 26 Mar 2013 07:03:52 PM EDT | | ATI GPU 0: Capeverde (CAL version 1.4.1741, 2048MB, 1710MB available, 2048 GFLOPS peak)
Tue 26 Mar 2013 07:03:52 PM EDT | | OpenCL: ATI GPU 0: Capeverde (driver version 1084.4 (VM), device version OpenCL 1.2 AMD-APP (1084.4), 2048MB, 1710MB available)
Tue 26 Mar 2013 07:03:52 PM EDT | | Config: use all coprocessors
Tue 26 Mar 2013 07:03:52 PM EDT | | Config: GUI RPC allowed from:
Tue 26 Mar 2013 07:03:52 PM EDT | | A new version of BOINC is available. <a href=http://boinc.berkeley.edu/download.php>Download it.</a>
Tue 26 Mar 2013 07:03:52 PM EDT | malariacontrol.net | URL http://www.malariacontrol.net/; Computer ID 621946; resource share 20
Tue 26 Mar 2013 07:03:52 PM EDT | LHC@home 1.0 | URL http://lhcathomeclassic.cern.ch/sixtrack/; Computer ID 10282414; resource share 20
Tue 26 Mar 2013 07:03:52 PM EDT | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID 2325898; resource share 40
Tue 26 Mar 2013 07:03:52 PM EDT | Milkyway@Home | URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 508164; resource share 20
Tue 26 Mar 2013 07:03:52 PM EDT | | General prefs: from http://setiathome.berkeley.edu/ (last modified 24-Mar-2013 04:56:00)
Tue 26 Mar 2013 07:03:52 PM EDT | | Host location: none
Tue 26 Mar 2013 07:03:52 PM EDT | | General prefs: using your defaults
Tue 26 Mar 2013 07:03:52 PM EDT | | Preferences:
Tue 26 Mar 2013 07:03:52 PM EDT | | max memory usage when active: 7494.78MB
Tue 26 Mar 2013 07:03:52 PM EDT | | max memory usage when idle: 7494.78MB
Tue 26 Mar 2013 07:03:52 PM EDT | | max disk usage: 10.00GB
Tue 26 Mar 2013 07:03:52 PM EDT | | max CPUs used: 7
Tue 26 Mar 2013 07:03:52 PM EDT | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
Tue 26 Mar 2013 07:03:52 PM EDT | | Not using a proxy
Tue 26 Mar 2013 07:04:17 PM EDT | LHC@home 1.0 | update requested by user
Tue 26 Mar 2013 07:04:19 PM EDT | LHC@home 1.0 | Sending scheduler request: Requested by user.
Tue 26 Mar 2013 07:04:19 PM EDT | LHC@home 1.0 | Not reporting or requesting tasks
Tue 26 Mar 2013 07:04:20 PM EDT | LHC@home 1.0 | work fetch resumed by user
Tue 26 Mar 2013 07:04:21 PM EDT | LHC@home 1.0 | Scheduler request completed
Tue 26 Mar 2013 07:04:31 PM EDT | LHC@home 1.0 | Sending scheduler request: To fetch work.
Tue 26 Mar 2013 07:04:31 PM EDT | LHC@home 1.0 | Requesting new tasks for CPU
Tue 26 Mar 2013 07:04:33 PM EDT | LHC@home 1.0 | Scheduler request completed: got 0 new tasks
Tue 26 Mar 2013 07:04:33 PM EDT | LHC@home 1.0 | Project has no tasks available
Tue 26 Mar 2013 07:05:21 PM EDT | LHC@home 1.0 | work fetch suspended by user
Tue 26 Mar 2013 07:05:53 PM EDT | | General prefs: from http://setiathome.berkeley.edu/ (last modified 24-Mar-2013 04:56:00)
Tue 26 Mar 2013 07:05:53 PM EDT | | Host location: none
Tue 26 Mar 2013 07:05:53 PM EDT | | General prefs: using your defaults
Tue 26 Mar 2013 07:05:53 PM EDT | | Reading preferences override file
Tue 26 Mar 2013 07:05:53 PM EDT | | Preferences:
Tue 26 Mar 2013 07:05:53 PM EDT | | max memory usage when active: 7494.78MB
Tue 26 Mar 2013 07:05:53 PM EDT | | max memory usage when idle: 7494.78MB
Tue 26 Mar 2013 07:05:53 PM EDT | | max disk usage: 10.00GB
Tue 26 Mar 2013 07:05:53 PM EDT | | Number of usable CPUs has changed from 7 to 4.
(This is where I set preferences to 50% before allowing ANY work.)
Tue 26 Mar 2013 07:05:53 PM EDT | | max CPUs used: 4
Tue 26 Mar 2013 07:05:53 PM EDT | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
Tue 26 Mar 2013 07:05:59 PM EDT | Milkyway@Home | work fetch resumed by user
Tue 26 Mar 2013 07:06:53 PM EDT | Milkyway@Home | Sending scheduler request: To fetch work.
Tue 26 Mar 2013 07:06:53 PM EDT | Milkyway@Home | Requesting new tasks for CPU and ATI
Tue 26 Mar 2013 07:06:55 PM EDT | Milkyway@Home | Scheduler request completed: got 5 new tasks
Tue 26 Mar 2013 07:06:57 PM EDT | Milkyway@Home | Starting task de_nbody_100K_EMD_32013_2_1358941502_444444_0 using milkyway_nbody version 108 (opencl_amd_ati) in slot 0
Tue 26 Mar 2013 07:06:57 PM EDT | Milkyway@Home | Starting task ps_nbody_100K_EMD_32013_2_1358941502_274697_2 using milkyway_nbody version 108 in slot 1
Tue 26 Mar 2013 07:06:57 PM EDT | Milkyway@Home | Starting task de_separation_23_3s_sSgr_1_1358941502_28794660_0 using milkyway version 101 in slot 2
Tue 26 Mar 2013 07:06:57 PM EDT | Milkyway@Home | Starting task ps_nbody_100K_EMD_32013_2_1358941502_444422_0 using milkyway_nbody version 108 in slot 3
Tue 26 Mar 2013 07:06:57 PM EDT | Milkyway@Home | Starting task de_separation_23_3s_sSgr_1_1358941502_28794661_0 using milkyway version 101 in slot 4

app_config.xml and cc_config.xml:

<app_config>
  <app>
    <name>hcc1</name>
    <max_concurrent>2</max_concurrent>`
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_version>
  </app>
  <app>
    <name>milkyway</name>
    <max_concurrent>2</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
=================================================
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>


Need more data?
In comparison, when running a WCG Help Conquer Cancer GPU WU, the total CPU utilization is ~50%.

Is there a way to see if something is spinning in a loop?
It's not a reliable measurement, but the fans sound like they are running at 100% load.

One weirdness.
The BOINC Task page, on the line describing the GPU task, says:
"Running (0.05 CPUs + 1 ATI GPU)"
The xml file was not set to 0.05 for CPU.
And, it looks like FOUR CPUs are attached/linked/associated with the GPU task.
If I suspend the GPU task, the utilization goes to using 4 of the 8 CPUs at 100% and the other 4 are idle.
Resuming the single GPU task sets all 8 cores to 100%. This is not temporary, but lasts all the time the GPU is running.
I have not observed, yet, what happens when the WU finishes and uploads finished data and gets new data into the GPU.
I'll try to observe this and report later.

T H A N K S,
Jay

--edit - add fglrx versions --
Package fglrx:
i 2:9.010-0ubuntu2 raring 500

Package fglrx-amdcccle:
i A 2:9.010-0ubuntu2 raring 500

Ubuntu 13.04.
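
A side note on the app_config.xml posted above: as pasted, the hcc1 entry opens <gpu_versions> but closes it with </gpu_version>. The Python sketch below checks a file's text for that kind of mismatch with the standard-library XML parser. Whether the BOINC client tolerates such a file is not something this sketch establishes, and the two sample strings are shortened stand-ins for the posted file, not the file itself.

```python
import xml.etree.ElementTree as ET

def xml_problem(text):
    """Return None if text is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None

# Reduced versions of the posted file: the first closes <gpu_versions>
# with </gpu_version>, as in the hcc1 entry; the second closes it correctly.
mismatched = ("<app_config><app><gpu_versions><gpu_usage>0.5</gpu_usage>"
              "</gpu_version></app></app_config>")
matched = ("<app_config><app><gpu_versions><gpu_usage>0.5</gpu_usage>"
           "</gpu_versions></app></app_config>")

print(xml_problem(mismatched))
print(xml_problem(matched))
```

The first call reports a parse error and the second returns None, so running the same check on the real file would show quickly whether the pasted mismatch is also present on disk.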

jay_e
Joined: 24 Mar 13
Posts: 11
Credit: 14,823
RAC: 0

Message 57667 - Posted: 27 Mar 2013, 0:26:33 UTC - in response to Message 57607.

-additional data to previous post --

Data when GPU task changes.

The overload stopped when the GPU task finished.
System Monitor shows 5 of 8 cores at 100%.

But then, while I was writing this post, all 8 cores went back to 100%.

Tue 26 Mar 2013 07:50:18 PM EDT | Milkyway@Home | Computation for task de_nbody_100K_EMD_32013_2_1358941502_373121_1 finished

Hmmm. Not sure how BOINC and MW list the CPU task that handles the GPU loading/unloading.

I did check when I suspended the GPU task and the overload stopped. It *was* the GPU task - not the CPU task that I suspended on the BOINC Manager screen - when the overload previously stopped.
The overload lasted for the time that the GPU task ran - approx. 20 minutes.
The GPU is a Radeon HD 7750 with 2GB memory - slower - but less heat and watts.

Here is link to the 1st completed GPU task - no errors in stderr.
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=428297681

After the load went back to 100%, I ran ps -ef to get a task list, but not all of the parameters fit on the display.
boinc 2551 2473 60 19:06 ? 00:42:17 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt -f nbody_parameters.lua -h histogram.txt --seed 230516087
boinc 2552 2473 59 19:06 ? 00:41:49 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_separation_1.01_x86_64-pc-linux-gnu -np 20 -p 0.401795116392895 11.0570420466829 20 120 9.23
boinc 2553 2473 60 19:06 ? 00:42:02 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt -f nbody_parameters.lua -h histogram.txt --seed 244208335
boinc 2554 2473 60 19:06 ? 00:41:55 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_separation_1.01_x86_64-pc-linux-gnu -np 20 -p 0.917377772089099 1 20 218.173156674949 9.3681
root 2733 2 0 19:44 ? 00:00:00 [flush-8:0]
root 2797 2 0 19:52 ? 00:00:00 [kworker/4:2]
boinc 2830 2473 99 20:04 ? 01:04:40 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_nbody_1.08_x86_64-pc-linux-gnu_mt__opencl_amd_ati -f nbody_parameters.lua -h histogram.txt -


Enjoy!
Jay

jay_e
Joined: 24 Mar 13
Posts: 11
Credit: 14,823
RAC: 0

Message 57669 - Posted: 27 Mar 2013, 1:06:06 UTC
Last modified: 27 Mar 2013, 1:07:55 UTC

Second 'continued' posting on the CPU using 100% when 50% is specified.

OK.
It looks like the overload happens on every other GPU WU.

I checked to see if the problem was in the system monitor.

I went back to the sysstat package and ran sar -P ALL.
Here is what it reported during the overload:

08:45:35 PM  CPU   %user   %nice  %system  %iowait  %steal   %idle
08:45:45 PM  all    1.64   92.98     0.82     0.00    0.00    4.56
08:45:45 PM    0    0.70   96.72     0.70     0.00    0.00    1.89
08:45:45 PM    1    0.00   94.31     0.50     0.00    0.00    5.19
08:45:45 PM    2    1.00   94.41     0.70     0.00    0.00    3.90
08:45:45 PM    3    0.70   94.31     0.70     0.00    0.00    4.29
08:45:45 PM    4    5.32   87.56     1.71     0.00    0.00    5.42
08:45:45 PM    5    3.09   89.82     1.20     0.00    0.00    5.89
08:45:45 PM    6    1.71   92.38     0.50     0.00    0.00    5.42
08:45:45 PM    7    0.70   94.49     0.40     0.00    0.00    4.40

This shows that all 8 are indeed in use.
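
The same conclusion can be read off mechanically. A minimal Python sketch using the per-core %idle column from the sar sample above; the 10% threshold for calling a core busy is an arbitrary assumption, and the function name is illustrative.

```python
# Per-core %idle values copied from the sar -P ALL sample above (cores 0-7).
idle_percent = [1.89, 5.19, 3.90, 4.29, 5.42, 5.89, 5.42, 4.40]

def busy_cores(idle_values, idle_threshold=10.0):
    """Indices of cores whose idle share is below the threshold (in percent)."""
    return [core for core, idle in enumerate(idle_values) if idle < idle_threshold]

busy = busy_cores(idle_percent)
print(f"{len(busy)} of {len(idle_percent)} cores busy")
```

With the preference set to 50%, the expected result would be 4 of 8; this sample gives 8 of 8.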

The BOINC status only shows 4 CPU plus one CPU-GPU task.

ps -ef only shows the following (but there are 3 more running at 100% somewhere):

$ ps -ef | grep boinc
boinc 2473 1 0 19:03 ? 00:01:01 /usr/bin/boinc --check_all_logins --redirectio --dir /var/lib/boinc-client
jay 2514 1 1 19:04 ? 00:01:59 /usr/bin/boincmgr
boinc 2551 2473 69 19:06 ? 01:15:17 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt -f nbody_parameters.lua -h histogram.txt --seed 230516087 -np 6 -p 2.2613531307244 2.30756208857862 0.280775306084978 0.307090154017686 13.7441723154459 0.146058620134541
boinc 2552 2473 68 19:06 ? 01:15:02 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_separation_1.01_x86_64-pc-linux-gnu -np 20 -p 0.401795116392895 11.0570420466829 20 120 9.23227259465482 6.02408145224687 -4.65891254542505 13.49007896143 20 122.412517562509 2.3 0.569129901562 -6.28318530717959 2.69928641156317 20 244 2.4 4.0146428615553 6.28318530717959 0.984356404794499
boinc 2554 2473 68 19:06 ? 01:14:56 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_separation_1.01_x86_64-pc-linux-gnu -np 20 -p 0.917377772089099 1 20 218.173156674949 9.36815519919617 6.28318530717959 5.3483147091067 16.745039480715 4.81361567974091 151.560387347829 2.3 6.28318530717959 -6.28318530717959 3.30858427949722 20 244 2.4 5.39780165528236 -4.43232117875659 4.42432041794522
boinc 3038 2473 66 20:37 ? 00:12:06 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt -f nbody_parameters.lua -h histogram.txt --seed 25278614 -np 6 -p 1.5 1.5 0.5 0.5 15 0.128067006109071
boinc 3043 2473 99 20:44 ? 01:01:30 ../../projects/milkyway.cs.rpi.edu_milkyway/milkyway_nbody_1.08_x86_64-pc-linux-gnu_mt__opencl_amd_ati -f nbody_parameters.lua -h histogram.txt --seed 130617792 -np 6 -p 2.5 1.66034131823107 0.5 0.330835734494028 15 0.131445339787751 --device 0

I tried running ps -ef as root: same thing.

Ah-hah:
"htop" shows 9 tasks, with different PIDs, running opencl amd ati tasks.

Hmmmm

Anyone else see this?
Should I change to Beta fglrx drivers?

Thanks,
Jay
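
A possible explanation for the difference between the two tools, offered as an assumption rather than something confirmed in this thread: ps -ef prints one line per process, while htop by default also lists each thread with its own ID (ps -eLf does the same). On Linux the threads of a process appear as entries under /proc/<pid>/task, which the Python sketch below counts. The function name and the proc_root parameter are made up for the example.

```python
import os

def thread_ids(pid, proc_root="/proc"):
    """List the thread IDs of a process by reading <proc_root>/<pid>/task (Linux)."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())

if __name__ == "__main__" and os.path.isdir("/proc/self/task"):
    # Inspect this Python process as a stand-in; on the affected host one
    # would pass the PID of the opencl_amd_ati process from ps instead.
    print(len(thread_ids(os.getpid())), "thread(s)")
```

If the count for the opencl_amd_ati process is close to the number of entries htop shows, the extra load comes from threads of that one process and not from hidden processes.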

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 57672 - Posted: 27 Mar 2013, 9:30:01 UTC - in response to Message 57669.

Anyone else see this?
Should I change to Beta fglrx drivers?

No, don't change anything at your end.

The N-Body application has been designed and programmed to use every available CPU in your system. It is not a GPU application, and should be going nowhere near your ATI card.

Unfortunately, it has been deployed (repeatedly) on the Milkyway server as if it were a GPU program, and the server sends out resource settings (0.05 CPUs + 1 ATI GPU) which tell your computer to treat it as a GPU application.
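
To make the mismatch concrete, here is an illustrative Python sketch of the bookkeeping. It assumes four CPU tasks that each use one thread, plus one N-Body task that is budgeted at 0.05 CPUs but actually runs one thread per core on an 8-core host. These figures are a simplified reading of the reports in this thread, not measurements, and the field names are made up.

```python
def cpu_demand(tasks):
    """Sum the CPU share the client budgets and the threads the tasks really run."""
    budgeted = sum(task["budgeted_cpus"] for task in tasks)
    actual = sum(task["threads"] for task in tasks)
    return budgeted, actual

# Four ordinary CPU tasks plus one N-Body task mislabelled as a GPU job.
tasks = [{"budgeted_cpus": 1.0, "threads": 1}] * 4
tasks.append({"budgeted_cpus": 0.05, "threads": 8})

budgeted, actual = cpu_demand(tasks)
print(f"budgeted {budgeted:.2f} CPUs, actual {actual} runnable threads on 8 cores")
```

The client believes it has committed about 4 CPUs, which fits a 50% limit, while the host has more runnable threads than cores.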

We have pointed out this mistake many, many times since the N-Body project was restarted 6 months ago, but unfortunately nobody at the project seems to understand, or even to be listening.

Richard Haselgrove
Joined: 4 Sep 12
Posts: 218
Credit: 448,778
RAC: 0

Message 57673 - Posted: 27 Mar 2013, 10:12:49 UTC

@ admins,

"A person who won't read has no advantage over one who can't read."

Mark Twain


Copyright © 2018 AstroInformatics Group