Welcome to MilkyWay@home

Posts by Richard Haselgrove

61) Message boards : News : New N-Body Runs (Message 61911)
Posted 17 Jun 2014 by Richard Haselgrove
Post:
seems like a waste of everybody's time to run them then, Ya think?

Why not try posting the actual cause of the error for Jake to see and do something about?

Exit status 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED

The task is trying to use more disk space than is allowed for by the <rsc_disk_bound> value in the workunit template. They can fix that...
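
As a rough sketch of where that limit lives: in a BOINC workunit (input) template, the resource bounds sit inside the <workunit> element, and <rsc_disk_bound> is given in bytes. The numbers below are hypothetical placeholders, not MilkyWay's actual settings, and the rest of the template (file references, command line) is omitted.

```xml
<workunit>
    <!-- All values are illustrative assumptions, not the project's real figures. -->
    <rsc_fpops_est>1e12</rsc_fpops_est>        <!-- estimated floating-point operations -->
    <rsc_fpops_bound>1e14</rsc_fpops_bound>    <!-- abort if the task exceeds this many operations -->
    <rsc_memory_bound>5e8</rsc_memory_bound>   <!-- working-set limit, in bytes -->
    <rsc_disk_bound>1e9</rsc_disk_bound>       <!-- disk limit, in bytes; exceeding it gives exit status 196 -->
</workunit>
```

Raising <rsc_disk_bound> only affects workunits generated after the change; tasks already issued keep the bound they were created with.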
62) Message boards : Number crunching : Ever longer N-Bodies (Message 61901)
Posted 15 Jun 2014 by Richard Haselgrove
Post:
Yes, that looks like a workunit setup problem by the project: a lot get

Exit status 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED
63) Message boards : News : Users Auto-Aborting Work Units (Message 61705)
Posted 7 May 2014 by Richard Haselgrove
Post:
Hello,

Please look at this computer:
http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=573990&offset=0&show_names=0&state=6&appid=

Shouldn't he be "banned" for mass abort? Or at least banned from receiving GPU tasks?

Note that the error message (every task I've looked at) is

201 (0xc9) EXIT_MISSING_COPROC

The card is NVIDIA Quadro K1000M (2048MB) driver: 296.79, but OpenCL support isn't being reported by BOINC - though the card itself can run OpenCL 1.2.

I think the question is more - why does the project keep allocating OpenCL tasks to it?
64) Message boards : Application Code Discussion : MAC OS X Nvidia - MilkyWay Sepearation Modified Fit v1.30 (Message 61674)
Posted 1 May 2014 by Richard Haselgrove
Post:
I doubt there's anything you can do to change the card selection at your end. BOINC has properly detected the NVIDIA GeForce GTX 650 in host 541933: that MilkyWay can't see the same device might be a problem with MilkyWay's own programming, or it might be a problem with the 'BOINC API' libraries - the BOINC-supplied code which enables applications like MilkyWay's to communicate with the BOINC platform and ultimately with the project servers.

Some similar OpenCL device selection anomalies are being discussed at the moment at the Einstein project, and the API code is being updated in an attempt to resolve them. I've added your report to the list.

Can anyone explain this result?
65) Message boards : Number crunching : adequately supply or WUs: PLEASE: increase max_wus_in_progress or enable the Computing preference settings (Message 61620)
Posted 25 Apr 2014 by Richard Haselgrove
Post:
why dose WM do this?

that should be
why does WM do this?

As you were told in your other thread, WM (WayMilky? WilkyMay?) doesn't have a work fetch algorithm.

BOINC has a work fetch algorithm.
66) Message boards : Number crunching : Tesla K40 & System Upgrade Thoughts? (Message 61399)
Posted 15 Mar 2014 by Richard Haselgrove
Post:
To be honest, I have no experience using SLI, and there are no other posts in this forum about using SLI for Milky Way.

A quick Google search about BOINC and SLI brought me to this post:
http://boinc.berkeley.edu/dev/forum_thread.php?id=3592

They said it is better to disable SLI when doing SETI work using CUDA.
The same might be true for Milky Way.

That thread is five years old - technology has moved on since then.

Specifically, NVIDIA have made changes to their drivers and CUDA runtime support systems. Unless you are still using a five-year-old video driver, you will find that the SLI setting is for all practical purposes ignored when running distributed computing applications. SLI is a video technology used for display purposes, not computing purposes.

The only drawback I've found is that with SLI, the speeds of the cards are locked together. If one card encounters a computing problem and goes into protective downclock, then both cards are downclocked until you can reboot: if you disable SLI, one card can be stalled but the other run at full speed.

IMO - if you game, enable SLI, because you'll kick yourself if you forget to turn it on one day before a session. If you don't game, then there's no point in enabling it anyway, so you may as well leave it off.
67) Message boards : Number crunching : Tips and tricks to improve CPU and GPU crunching. (Message 61140)
Posted 16 Feb 2014 by Richard Haselgrove
Post:
I have one of those. The fans are ineffective and don't do anything to lower the CPU temps, but it definitely gives good circulation around the pc.

It certainly makes a difference to my laptop - so much so, that when I knocked the USB cable out of place by mistake, the machine first downclocked and then shut down completely.

Get to know your laptop. Find out where the air vents are: which ones blow, and which ones suck. If it's not obvious, hold a sheet of tissue paper nearby and see which way the wind takes it. [hint: there's usually a supply of suitable light-weight paper on a roll in the smallest room in the house]

The stand will only help if the fans are helping the air to move in the right direction. In my case, the stand pushes the air upwards towards the base of the laptop: the laptop itself sucks from underneath and blows hot air out of the side, so all is good. I don't actually see any reduction in temperature, but I can hear that the internal fan is working less hard, so the manufacturer's thermal control software is happy that I'm running within limits. And there's extra headroom available if the ambient temperature increases, or I run an application which places higher stress on the CPU or GPU.
68) Message boards : Number crunching : ATI GPU R600 (R38xx) does not support OpenCL (Message 60937)
Posted 3 Feb 2014 by Richard Haselgrove
Post:
I see version 7.2.39 is out now, you may want to try that too. BOTH are test versions, meaning they may work and they may not.

I've been running that for a while, and it works fine (after a brief hiccup with the 32-bit installer). The download link is http://boinc.berkeley.edu/download_all.php

But it won't change the behaviour of existing project applications or servers.

OK, so I may have to eat my words here.

The OP for this thread has posted at BOINC (using his alter ego 'PlaysGames11') that the MilkyWay server is now seeing his Radeon HD 7870/7950/7970/R9 280X series (Tahiti) as OpenCL capable and producing valid work here.

It's not entirely clear whether it's the upgrade to BOINC v7.2.39 that's done it, or "the latest" (unversioned, unfortunately) Catalyst driver. But host 557409 is producing validated work on that GPU.
69) Message boards : Number crunching : ATI GPU R600 (R38xx) does not support OpenCL (Message 60933)
Posted 3 Feb 2014 by Richard Haselgrove
Post:
I see version 7.2.39 is out now, you may want to try that too. BOTH are test versions, meaning they may work and they may not.

I've been running that for a while, and it works fine (after a brief hiccup with the 32-bit installer). The download link is http://boinc.berkeley.edu/download_all.php

But it won't change the behaviour of existing project applications or servers.
70) Message boards : Number crunching : All Milkyway@Home 1.02 tasks ending in computation error on HD6950. (Message 60877)
Posted 31 Jan 2014 by Richard Haselgrove
Post:
Those 3+1 now validated :).

I am glad it is now working!!!!!

As for Windows uninstallers they have been notoriously BAD ever since Windows 2! It's the sharing files concept MS uses that causes the problems.

Especially since those early shared support files had no version control, and the installers had to rely on datestamps when deciding whether a file was overwritable. Anybody can falsify a datestamp, and some otherwise reputable companies did.
71) Message boards : Number crunching : Home Separation (Modified Fit) v1.28 taking ages (Message 60798)
Posted 23 Jan 2014 by Richard Haselgrove
Post:
From the time differences I guess it was gpu (nonstop) vs. cpu (often suspended).
Since the WU is purged now, I can't verify.

Yeah it could be, I don't remember either.

The 816 seconds was done under an ATI GPU plan_class, but I didn't follow through to see the exact card model.
72) Message boards : Number crunching : Collatz offline (Message 60768)
Posted 19 Jan 2014 by Richard Haselgrove
Post:
There is a thread on BOINC's own message boards which can be used to consolidate information like this.

News on Project Outages

And I see you got an answer there with the explanation for an earlier Collatz outage.
73) Message boards : Number crunching : Home Separation (Modified Fit) v1.28 taking ages (Message 60758)
Posted 17 Jan 2014 by Richard Haselgrove
Post:
Gee, I don't get it. Now I come home and I see BOINC says one has crunched 33 hours, remaining 7 hours (67%), and the other 33 hours, remaining 14 hours (59%)... they are the same WUs as before. I can't see anything in the BOINC messages apart from the WU resuming at some point, and on my account page I only see the same 2 WUs sent on the 09/01!!

WTF ?

I think you need to figure out why it is 'suspending'. Since you said it is 'resuming', it must have been suspended earlier; figuring out why it suspended is the key, I think.

Enable the <sched_op_debug> log flag to see when (and possibly why) tasks are being suspended or pre-empted.

That information used to be in the default logs, but recent BOINCs have dumbed it down into the debug areas.
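
A minimal sketch of how such a log flag is switched on, assuming the usual cc_config.xml file in the BOINC data directory. The <sched_op_debug> flag is the one named in the post. The two commented-out flags are an assumption from memory of the client configuration documentation, which lists <cpu_sched> and <cpu_sched_debug> as the flags covering CPU scheduler actions such as pre-emption, so they may be worth checking too.

```xml
<cc_config>
    <log_flags>
        <!-- Flag named in the post: detailed scheduler request logging. -->
        <sched_op_debug>1</sched_op_debug>
        <!-- Assumption: these may also help when tracing suspensions and pre-emption.
        <cpu_sched>1</cpu_sched>
        <cpu_sched_debug>1</cpu_sched_debug>
        -->
    </log_flags>
</cc_config>
```

After saving the file, the client has to re-read its configuration (or be restarted) before the extra lines appear in the Event Log.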
74) Message boards : Number crunching : All ps_separation tasks are getting computation errors (Message 60734)
Posted 13 Jan 2014 by Richard Haselgrove
Post:
Do you have a preference?

Nothing to do with me. I'm just a volunteer, like you - pretty much an ex-volunteer, since they discontinued the multi-threading tests I was interested in.

Returning errors is no use to anybody.

My preference, if I had any say in the matter, would be a proper debugging session with the developers to find the cause of all these errors people keep reporting. But neither the project staff nor the volunteers seem to understand that concept.

Alternatively, you could simply de-select the ps_separation work units in your project preferences, so that your GPUs worked productively on ps_modfit tasks, and let your CPUs crunch for another project where they're needed and appropriate.
75) Message boards : Number crunching : All ps_separation tasks are getting computation errors (Message 60732)
Posted 13 Jan 2014 by Richard Haselgrove
Post:
There's nothing wrong with v7.2.37 for general use - just make sure you never drop "Use at most ... % CPU time" below 100. I'm running it on 4 machines - but the normal version, not VirtualBox. Uploads and downloads are fine.

I don't know why you're getting

Access Violation (0xc0000005) at address 0x000007FEDB2A5E00 read attempt to address 0x00000010

but I'd be 99.999% sure it has nothing to do with the version of BOINC you use.
76) Message boards : Number crunching : All ps_separation tasks are getting computation errors (Message 60725)
Posted 12 Jan 2014 by Richard Haselgrove
Post:
If you want to try 7.2.37, here's a link where you can get it.

I'd advise against using that one. It contains experimental code, which, on reflection, has now been withdrawn as unworkable. There is nothing in it, over and above the released v7.2.33, which could possibly have a bearing on computation errors.
77) Message boards : Number crunching : CPU not fully loaded (Message 60690)
Posted 6 Jan 2014 by Richard Haselgrove
Post:
Grab yourself a copy of Process Explorer and watch the utilisation of each thread in real time.
78) Message boards : Number crunching : Feature request: Run CPU versions of applications for which GPU versions are available - yes/no (Message 60498)
Posted 1 Dec 2013 by Richard Haselgrove
Post:
Is that nbody exe really singlethreaded?
I thought nbody was all mt plan class.

As you can see on the Applications page, there are currently both single- and multi-threaded (mt) versions of N-Body for all platforms.

But three weeks ago, Jake posted:

I am of the opinion that there is no inherent advantage to running MT tasks and plan to release only non-MT versions for the next n-body release.

Which for me removes the distinctive nature of crunching for this project, and I think I've got all the bugs I can out of both the project and BOINC's support for multi-threading, so I'll bid you all farewell.
79) Message boards : Number crunching : Feature request: Run CPU versions of applications for which GPU versions are available - yes/no (Message 60493)
Posted 30 Nov 2013 by Richard Haselgrove
Post:
I'm surprised that nobody posted a working app_info section for nbody. Hasn't anybody succeeded or does nobody care? ;)

Well, since you ask, this is the app_info I was using back in June, before the project had written its own MT plan_class.

It will be hugely out-of-date now, so there are lots of version numbers to be updated: I would suggest that you download the current executable first, and use Dependency Walker to get the correct <open_name> for the current DLLs.

My objective in writing this app_info was to run N-Body on three cores of a four-core CPU, leaving enough overhead for BOINC to schedule a single-CPU task and a GPU task from other projects at the same time. So I think it's got most of the wrinkles you might want to use. And it worked back when v1.12 was as far as they'd got.

<app_info> 
 
    <app>
        <name>milkyway_nbody</name>
    </app>

    <file_info>
        <name>milkyway_nbody_1.12_windows_x86_64__mt.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libgomp_64-1_nbody_1.10.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>pthreadGC2_64_nbody_1.10.dll</name>
        <executable/>
    </file_info>

    <app_version>
        <app_name>milkyway_nbody</app_name>
        <version_num>112</version_num>
        <platform>windows_x86_64</platform>
        <avg_ncpus>2.100000</avg_ncpus>
        <max_ncpus>2.100000</max_ncpus>
        <flops>13326025349.445996</flops>
        <plan_class>rh_mt_test</plan_class>
        <api_version>6.13.0</api_version>
        <cmdline>--nthreads 3</cmdline>
        <file_ref>
            <file_name>milkyway_nbody_1.12_windows_x86_64__mt.exe</file_name>
            <open_name>milkyway_nbody.exe</open_name>
            <main_program/>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_name>libgomp_64-1_nbody_1.10.dll</file_name>
            <open_name>libgomp_64-1.dll</open_name>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_name>pthreadGC2_64_nbody_1.10.dll</file_name>
            <open_name>pthreadGC2_64.dll</open_name>
            <copy_file/>
        </file_ref>
    </app_version>

</app_info> 
80) Message boards : Number crunching : Feature request: Run CPU versions of applications for which GPU versions are available - yes/no (Message 60468)
Posted 27 Nov 2013 by Richard Haselgrove
Post:
I don't take the lack of reply so far as a no. Maybe this can not be implemented on short notice. I would appreciate a reply, though, even a 'no'.

In the meantime I'm trying to load the cannon. Can somebody help me with a working N-Body Simulation section for an app_info.xml? The following app_info.xml was created based on the corresponding section in client_state.xml:...

Without checking it exhaustively, that looks about right for the non multi-threaded version of the application - you would need to do more to run it MT, as originally intended. And your technique of copying the elements from client_state is spot-on.

But as with all app_info installations, you have to supply the files as well. Since you're just packaging the project's own application, you can download them from the project servers:

http://milkyway.cs.rpi.edu/milkyway/download/milkyway_nbody_1.38_windows_x86_64.exe
http://milkyway.cs.rpi.edu/milkyway/download/libgomp_64-1_nbody_1.38.dll
http://milkyway.cs.rpi.edu/milkyway/download/pthreadGC2_64_nbody_1.38.dll
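
A minimal sketch of how those three downloads might be declared in app_info.xml, assuming the same <file_info> layout as the v1.12 example in message 60493 above, with file names taken directly from the URLs. The <app> and <app_version> sections are left out here and would still be needed. Whether the <open_name> values from v1.12 still apply to the 1.38 DLLs is not something this sketch confirms.

```xml
<app_info>
    <!-- Only the file declarations are shown; <app> and <app_version> are omitted. -->
    <file_info>
        <name>milkyway_nbody_1.38_windows_x86_64.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>libgomp_64-1_nbody_1.38.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>pthreadGC2_64_nbody_1.38.dll</name>
        <executable/>
    </file_info>
</app_info>
```

The downloaded files go in the project's folder under the BOINC data directory, alongside app_info.xml, with names that match the <name> entries exactly.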


