Welcome to MilkyWay@home

Posts by Len LE/GE

21) Message boards : Number crunching : Feature request: Run CPU versions of applications for which GPU versions are available - yes/no (Message 60495)
Posted 1 Dec 2013 by Len LE/GE
Post:
Like Richard showed in his example, you need to list not only the exe but also the dlls it uses first, before you can refer to them in the app_version block.


<app_info>
...
<app>
<name>milkyway_nbody</name>
</app>
<file_info>
<name>milkyway_nbody_1.38_windows_x86_64.exe</name>
<executable/>
</file_info>
<file_info>
<name>libgomp_64-1_nbody_1.38.dll</name>
<executable/>
</file_info>
<file_info>
<name>pthreadGC2_64_nbody_1.38.dll</name>
<executable/>
</file_info>

<app_version>
...



Is that nbody exe really singlethreaded?
I thought nbody was all mt plan class.

The last time an app_info for nbody on Windows was discussed (with examples) was nearly a year ago.
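
As a rough sketch of the app_version part elided above: once the three files are declared with file_info, each one can be referenced by name with a file_ref. Only the file names are taken from the post. The remaining tags of the block (version number, plan class, command line) are left out on purpose, and a dll entry may need further tags depending on the app, which is not shown here.

```xml
<app_version>
    <app_name>milkyway_nbody</app_name>
    <!-- version_num, plan_class, cmdline etc. omitted in this sketch -->
    <file_ref>
        <file_name>milkyway_nbody_1.38_windows_x86_64.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>libgomp_64-1_nbody_1.38.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>pthreadGC2_64_nbody_1.38.dll</file_name>
    </file_ref>
</app_version>
```

Every file_name in a file_ref has to match the name in one of the file_info blocks exactly.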
22) Message boards : Number crunching : Frequency (in Hz) that should try to complete individual work chunks.- say what? (Message 60063)
Posted 30 Sep 2013 by Len LE/GE
Post:
Think of it like the time slices windows is using to switch between processes.

Here it is the work that is split into pieces.
Let's use the gpu-target-frequency with default 60 (= 60Hz) as an example:
The work is split into small pieces so that 60 pieces can be done per second.
This way the gpu gets 60 chances per second to do what it usually does.
The lower the frequency, the bigger the pieces are, and that means fewer times per second for the gpu to do something else. If the frequency is too low (the pieces are too big) the system starts to feel laggy; if the frequency is too high (small pieces) the gpu is wasting time waiting for the next piece of work.

Whether you need to change the frequency for optimum use depends on the gpu.
A HD79xx can work well with a far lower frequency than the default, while a HD38xx needs a higher frequency to keep the system responsive.
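
As rough arithmetic for the explanation above, assuming the target frequency f maps directly to the time per piece (the app may not hit the target exactly, and the value 10 is only an illustrative lower setting):

```latex
t_{\text{piece}} = \frac{1}{f}, \qquad
f = 60\,\text{Hz} \;\Rightarrow\; t_{\text{piece}} \approx 16.7\,\text{ms}, \qquad
f = 10\,\text{Hz} \;\Rightarrow\; t_{\text{piece}} = 100\,\text{ms}
```

So going from 60 down to 10 makes each piece about six times longer, which is why a too low setting shows up as lag on the desktop.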
23) Message boards : News : Separation Modified Fit v1.28 Release (Message 59945)
Posted 20 Sep 2013 by Len LE/GE
Post:
It's a known bug they are working on. See the messages above.
24) Message boards : Number crunching : Exclude entries in cc_config.xml and app_config.xml (Message 59413)
Posted 22 Jul 2013 by Len LE/GE
Post:
<!-- This is a comment and ignored while reading the app_info.xml -->
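
A small sketch of how a comment can switch an entry off without deleting it, shown here for cc_config.xml. The use_all_gpus option is only picked as an example entry, and the sketch assumes the client skips the whole comment as described above.

```xml
<cc_config>
    <options>
        <!-- <use_all_gpus>1</use_all_gpus> -->
    </options>
</cc_config>
```

XML comments cannot be nested and must not contain a double hyphen inside them.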
25) Message boards : Number crunching : Computational Error? (Message 59089)
Posted 25 Jun 2013 by Len LE/GE
Post:
host 521265

Right now you have like 40% of your tasks working (HD7970) and 60% error tasks (HD5770).

The error task logs are showing
for separation:
Found 2 CL devices
Device 'Juniper' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Driver version: 1124.2 (VM)
Version: OpenCL 1.2 AMD-APP (1124.2)
Compute capability: 0.0
Max compute units: 10
Clock frequency: 500 Mhz
Global mem size: 1073741824
Local mem size: 32768
Max const buf size: 65536
Double extension: (none)
Device doesn't support double precision
Failed to calculate likelihood


and for separation modified fit:
Found 2 CL devices
Device 'Juniper' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: ATI Radeon HD 5700 Series

Driver version: 1124.2 (VM)
Version: OpenCL 1.2 AMD-APP (1124.2)
Compute capability: 0.0
Max compute units: 10
Clock frequency: 500 Mhz
Global mem size: 1073741824
Local mem size: 32768
Max const buf size: 65536
Double extension: (none)
Device doesn't support double precision
Failed to calculate likelihood
26) Message boards : Number crunching : Computational Error? (Message 59066)
Posted 23 Jun 2013 by Len LE/GE
Post:
Just to clarify, it's the milkyway_seperation_1.02_windows_x86_64_opencl_amd_ati that was, and still is, causing the Computational Error on my machine.


Yes that work unit always errors out on me,but only on my x64 rig with an HD 5770 and HD 7970.
My 2 two HD 5870's in running on a x86 have no such problems.



You need to exclude your HD 5770 from being used for mw. It cannot do double precision calculations.
See Updated GPU Requirements
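
One possible way to do the exclusion, as a sketch assuming a BOINC 7 client (which supports exclude_gpu in cc_config.xml). The device number 1 is hypothetical; the real number of the HD 5770 is shown in the startup messages of the BOINC event log. The project URL is assumed from the forum address and has to match the URL the client is attached to.

```xml
<cc_config>
    <options>
        <exclude_gpu>
            <!-- project URL: assumed, must match the attached project -->
            <url>http://milkyway.cs.rpi.edu/milkyway/</url>
            <!-- device number of the HD 5770: hypothetical, check the event log -->
            <device_num>1</device_num>
            <type>ATI</type>
        </exclude_gpu>
    </options>
</cc_config>
```

The file goes into the BOINC data directory, and the client needs a restart afterwards.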
27) Message boards : Number crunching : process exited with code 22 (Message 58670)
Posted 11 Jun 2013 by Len LE/GE
Post:
Both of your gpus are able to run the newer openCL version 1.02 instead of the old CAL version 0.82.
28) Message boards : News : added applications for ati only GPUs (Message 58659)
Posted 10 Jun 2013 by Len LE/GE
Post:
Is there a chance to increase the 'max # errors' for those tasks that got trashed by 'too many errors' caused by that old app?
This would give us additional tasks and a chance to get those (if successfully crunched) still validated.
29) Message boards : Number crunching : Computational Error? (Message 58402)
Posted 25 May 2013 by Len LE/GE
Post:
@George Del Monte

Your mw error logs are showing the following reason for your 'Computation error':

../../projects/milkyway.cs.rpi.edu_milkyway/milkyway: error while loading shared libraries: libboinc_api.so.6: cannot open shared object file: No such file or directory
30) Message boards : Number crunching : Request help updating app_info.xml for Linux (Message 58338)
Posted 19 May 2013 by Len LE/GE
Post:
nbody exes are multithreaded but cpu only. Why there are still additional exes for linux linked that indicate they would make use of the gpu has been the question here for some time.

The app_config defines how many WU's of one app can run on the gpu at the same time. Trying to use it for a cpu only app can only mess things up.
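
For reference, a minimal app_config.xml sketch for a gpu app. The app name milkyway for separation and the cpu_usage value are assumptions; the real app name can be looked up in client_state.xml. A gpu_usage of 0.5 means each task claims half a gpu, so two tasks run on it at once.

```xml
<app_config>
    <app>
        <!-- app name: assumed, check client_state.xml -->
        <name>milkyway</name>
        <gpu_versions>
            <gpu_usage>0.5</gpu_usage>
            <!-- cpu share per gpu task: illustrative value -->
            <cpu_usage>0.05</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
```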

Since I think you can't use app_config and app_info at the same time (someone correct me if I am wrong), I fear you have to use the app_info again to make settings for separation and nbody.
The nbody part in your app_info needs some cleanup and correction:

<app>
<name>milkyway_nbody</name>
</app>
<file_info>
<name>milkyway_nbody_1.09_x86_64-pc-linux-gnu__mt</name>
<executable/>
</file_info>
<app_version>
<app_name>milkyway_nbody</app_name>
<version_num>1.09</version_num>
<plan_class>mt</plan_class>
<avg_ncpus>4</avg_ncpus>
<max_ncpus>4</max_ncpus>
<cmdline>--nthreads=4</cmdline>

<file_ref>
<file_name>milkyway_nbody_1.09_x86_64-pc-linux-gnu__mt</file_name>
<main_program/>
</file_ref>
</app_version>

The <avg_ncpus> and <max_ncpus> lines are for BOINC to know how many cpus to reserve, and the <cmdline> line tells nbody how many cpus to use. So the setting above would use 4 cpus for nbody.
If you reduce those three lines to <cmdline></cmdline> only, it will run single threaded (1 WU per cpu).
With separation and nbody properly defined in the app_info, you should be able to run 2 separation tasks on your gpu and at the same time nbody tasks single or multithreaded on your cpus.

Note 1: I am running Win and haven't crunched a nbody WU for some time.
Note 2: Maybe you need to change the commandline above
to <cmdline>--nthreads=4 --disable-opencl</cmdline>
31) Message boards : Number crunching : Request help updating app_info.xml for Linux (Message 58173)
Posted 8 May 2013 by Len LE/GE
Post:
In case you want or have to stick with BOINC 6

There is at least 1 obvious error in your app_info:

<cmdline></cmdline>
<file_ref>
<file_name>milkyway_separation_1.02_x86_64-pc-linux-gnu__opencl_nvidia</file_name>
<main_program/>
</file_ref>
</app>
<name>milkyway_nbody</name>
</app>
<file_info>
<name>milkyway_nbody_1.08_x86_64-pc-linux-gnu__mt</name>
<executable/>
</file_info>
<app_version>
<app_name>milkyway_nbody</app_name>
<version_num>1.08</version_num>

The red marked part (the first </app>, directly after </file_ref>) should be

</app_version>
<app>


Cleaning out old versions and structuring it a bit (e.g. empty lines between the different blocks) helps to find those wrong/missing tags quicker.
32) Message boards : Number crunching : Get Linux BOIC and Windows BOINC to use same work units? (Message 58109)
Posted 4 May 2013 by Len LE/GE
Post:
Not a linux guy, but how about using Wine or a virtual machine with windows?
33) Message boards : News : New Separation Runs (Message 58091)
Posted 2 May 2013 by Len LE/GE
Post:
Looks normal on my 5850 too.

@Nowi
For a 7950 at 900 MHz you need a lot of time per WU.
Either you are running 2 WUs at once or your target frequency of 10 does not fit those short WUs at all (even those with 106 credits).
Give it a try with 1 WU at once and the default target frequency of 60 to see how your times are then.
34) Message boards : News : New Separation Runs (Message 58077)
Posted 1 May 2013 by Len LE/GE
Post:
Same Game all get compute errors.
Greetings


All your WUs (those I checked) are erroring out with:

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x000007FEDD901DCD read attempt to address 0x00000010

after the integration (on gpu) is finished and the cpu takes over for the final calculations.

Did you try rebooting the system? What are the last changes you made on your system before those errors started? Which cat version are you using (13.x?)? Did you do a clean ccc install or did you only update without uninstalling the older one first?
Can you give us the first few lines of the BOINC message log (before the WUs are starting)?
35) Message boards : News : Separation Runs ps_p_82_1s_dr8_4 and de_p_82_1s_dr8_4 Started (Message 58026)
Posted 26 Apr 2013 by Len LE/GE
Post:
I had 1 single de_separation_13_3s_sscon_2 fail yesterday, but on closer look it was a resend from the day before, meaning it still had the old params.
Since then no more failing WUs.

Regarding the buffer size: Wouldn't it be helpful in situations like this to expand the error message to show how much buffer memory was needed and how much was available? Or even better, add this to the start info together with 'Global mem size', 'Local mem size', 'Max const buf size' etc.
36) Message boards : News : Separation Runs ps_p_82_1s_dr8_4 and de_p_82_1s_dr8_4 Started (Message 58006)
Posted 25 Apr 2013 by Len LE/GE
Post:
Was just running a few of the new de_separation_12_3s_sscon_2 and de_separation_12_3s_sscon_3 workunits and they seem to work now for me
(HD5850; XP)
37) Message boards : Number crunching : Rcent Errors on new work here (Message 58003)
Posted 25 Apr 2013 by Len LE/GE
Post:
The real error is a bit further down in the logs.

The gpu app for your HD38xx is failing.
It looks very similar to what I see on my own HD5850 with two other runs.

See http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=3205

Looks like while they are playing with the parameters for the new runs they are sometimes getting too close to the limit, and that is causing the runs to fail on some systems but still run on others (possibly older gpus vs. newer gpus).
38) Message boards : News : Separation Runs ps_p_82_1s_dr8_4 and de_p_82_1s_dr8_4 Started (Message 57988)
Posted 24 Apr 2013 by Len LE/GE
Post:
de_separation_12_3s_sscon_2

with the same error


Looking at the error logs, it looks like your BOINC client may be out of date. The error that you received is typical when running an outdated BOINC client, an outdated Milkyway@home app, or an old (or buggy) GPU driver. Try making sure all of those are up to date and see if you still get errors.

The overall error rates are currently low, so I think the new runs are doing well.


You must have looked into different error logs than I did.
I am talking about errors of WU type
de_separation_12_3s_sscon_2 and
de_separation_13_3s_sscon_2.
Those are the only WUs that I see in my error list.

1) The error comes from within the mw app (see setup_cl.c, separationCheckCutMemory), not from the BOINC client.
2) I am running milkyway_separation_1.02_windows_intelx86__opencl_amd_ati.exe (with app_info to use the command line params). If you have a newer internal version that I could try, I am willing to.
3) cat 12.1 (on Win XP) has never been a problem with any WU before or after
de_separation_12_3s_sscon_2 and
de_separation_13_3s_sscon_2

4) DrNoCDN gets the same error with those runs on Linux with BOINC 7 client, a newer cat version and the linux app.

There must be something special in those runs, making them error out on some systems but not on others.
My best guess for now is a WU parameter too close to the limit to work on all systems.

Sorry, I am not going to mess with a stable system because of 2 runs and an unknown cause of those errors. Sitting them out and living with the 2s per WU error is the better choice for the moment.
39) Message boards : News : Separation Runs ps_p_82_1s_dr8_4 and de_p_82_1s_dr8_4 Started (Message 57984)
Posted 23 Apr 2013 by Len LE/GE
Post:
de_separation_12_3s_sscon_2

with the same error
40) Message boards : News : Separation Runs ps_p_82_1s_dr8_4 and de_p_82_1s_dr8_4 Started (Message 57975)
Posted 23 Apr 2013 by Len LE/GE
Post:
24 of those last night:

de_separation_13_3s_sscon_2

An output buffer would exceed CL_DEVICE_MAX_MEM_ALLOC_SIZE
Capability check failed for cut 0
Failed to calculate likelihood



