Welcome to MilkyWay@home

Posts by Paul D. Buck

1) Message boards : Number crunching : [coproc] Insufficient CUDA for de_separation_23_3s_fix errors (Message 46134)
Posted 10 Feb 2011 by Profile Paul D. Buck
Post:
BOINC 6.10.58 would be a downgrade.


Actually, 6.10.58 would be an upgrade. You would be upgrading from a BETA version to a stable RELEASED version.

Most of us that test the later versions of BOINC *NEVER* suggest trying a version that we ourselves have not been running... usually for some considerable time. I personally also watch the change logs carefully to see what changes have been made and how radical of a shift there has been...

There are things to dislike about the 6.12.x series, like the hiding of the event log and a column order that is not what I would choose ... but, though Beta, the push is on to make one of these the next stable. 6.12.13 has only minor tweaks from 6.12.12, like removing the URL change from notices because of a suggestion I made ...

But, the biggest change is to get rid of strict FIFO which should not have been allowed to persist as long as it did ... if you were single project it mattered little, but if you did multiple projects it was a real bad thing ...

Lastly, even the "stable" release versions have their issues ... just not usually big enough that most people notice...
2) Message boards : News : bypassing server set cache limits (Message 46064)
Posted 8 Feb 2011 by Profile Paul D. Buck
Post:
When that runs out, it gets work from a backup project. Hours and hours of work (much more than my set preferences suggest, for some reason). While this lot is worked on, new MW WUs usually trickle in after a few minutes, just sitting there getting older before BOINC is done with the other project's work. (This FIFO behaviour might change in a future release of the BOINC core client.)

FIFO for the GPU is finally gone in the 6.12.x series...
3) Message boards : Number crunching : [coproc] Insufficient CUDA for de_separation_23_3s_fix errors (Message 46011)
Posted 6 Feb 2011 by Profile Paul D. Buck
Post:
Does not obey TSI parameters at all. I have my general and local preferences to switch projects every 5 minutes. 6.10.58 would never obey my rules.

Switching projects every 5 minutes is not a good idea. Unless all projects you are running have tasks shorter than the TSI (task switch interval), well, you are just thrashing. TSI at default allows most projects to get a shot at the CPU and to complete a reasonable amount of work before a switch.

On multi-core systems, particularly with 8 or more cores, a far better strategy is to extend TSI out so that most tasks complete before the TSI expires (mine is set to 720 min, i.e. 12 hours). There are a host of issues with honoring TSI on GPU projects where the tasks are longer than 5 minutes, in that you will waste considerable time unloading the GPU and loading it with the next task, and rinse and repeat...
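
A rough way to see the cost of a short TSI on a GPU is to assume a fixed cost for every swap (unloading one task, loading the next) and compare it to the switch interval. This is only a sketch: the function name and the 20-second swap cost are made up for illustration, and the model of one swap per TSI period is a simplification, not how the BOINC client actually accounts for time.

```python
def switch_overhead_fraction(tsi_minutes: float, swap_seconds: float) -> float:
    """Fraction of wall-clock time spent swapping tasks instead of computing.

    Assumes exactly one swap per TSI period and a fixed swap cost.
    Both are simplifying assumptions for illustration only.
    """
    period_seconds = tsi_minutes * 60.0
    return swap_seconds / (period_seconds + swap_seconds)


# Compare a 5-minute TSI with longer settings; 20 s per swap is hypothetical.
for tsi in (5, 60, 720):
    fraction = switch_overhead_fraction(tsi, swap_seconds=20.0)
    print(f"TSI {tsi:>3} min: {fraction:.2%} of time lost to switching")
```

Whatever the real swap cost is on a given card, the lost fraction shrinks as the TSI grows, which is the argument for extending it.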
4) Message boards : Number crunching : Losing work done when stopping GPU computations. (Message 45628)
Posted 19 Jan 2011 by Profile Paul D. Buck
Post:
Hi there,
I wondered if anyone could help? I'm using my GPU to crunch Milkyway (when the computer has not been used for two minutes) and all is fine except when I start using it again the crunching stops as per instructions, but when it starts again any computation is lost and time elapsed and progress reverts to zero. I'm sure that in the past this wasn't the case or am I imagining that?
Have just replaced my GTX 295 with a GTX 580...wow!!!!

Because of the short run times, minutes only, MW on the GPU does not checkpoint. So, yes, all the work done is lost. But, by not checkpointing, more work is actually done over the long haul because of the time saved avoiding wasteful disk writes...
5) Message boards : Number crunching : nbody (Message 41736)
Posted 25 Aug 2010 by Profile Paul D. Buck
Post:
Checkpoint: tnow = 0.550334. time since last = 686193s
Checkpoint: tnow = 0.429749. time since last = 686747s

These don't mean anything. It's just from subtracting an arbitrary time from 0.0 when the first checkpoint happens (which I added here http://github.com/Milkyway-at-home/milkywayathome_client/commit/b8ea7ee37035eb2e69403cc8c4767f7a58111c54). It's just debug printing since it seems like on some systems the checkpointing is happening way too often. The BOINC default time is supposedly 300 seconds, but most systems seem to do it around 60 seconds. A fair number also seem to be checkpointing every 10 seconds for some reason, which is helping slow things down and might partially explain some of the maximum time exceeded errors.

They have made changes in the checkpointing... I forget which version... somewhere in the 3 or 4 series the number of CPUs was taken into account because people were setting it to a value and because of the multiple CPUs the setting was effectively divided by that count... so if you set 4 minutes on an 8 CPU system the effective checkpointing interval was 30 seconds ...
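
The arithmetic behind that 30-second figure, as a minimal sketch. The function name is hypothetical, and the plain division by CPU count is just the behaviour described above, not something taken from the BOINC source.

```python
def effective_checkpoint_interval(setting_seconds: float, n_cpus: int) -> float:
    """Effective checkpoint interval when the setting is shared across CPUs."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return setting_seconds / n_cpus


# 4 minutes set on an 8 CPU system
print(effective_checkpoint_interval(4 * 60, 8))  # 30.0
```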

Recently (and I don't recall how recently) the multiplier was removed, so we are back to much more rapid CKPTs than most expect... especially on GPU equipped systems (which add processing elements and tasks in work)...
6) Message boards : Number crunching : Boinc 6.10.56 - backup project (Message 40289)
Posted 8 Jun 2010 by Profile Paul D. Buck
Post:
Go to http://boinc.thesonntags.com/collatz/prefs.php?subset=global and change the suspend if CPU option to "0"

I made the change, but still get this message:
6/7/2010 8:24:34 PM		   suspend work if non-BOINC CPU load exceeds 25 %

If you have used local manager preferences then the web preferences will be overridden. You will need to clear them for the web preferences to take or make the change in the manager.

I assume this under Advanced>Preferences>CPU Usage? If so, the answer is yes. Nothing worked, which is why I posted the question here, then I followed up on changing preferences via the web settings.

If you made any setting, or even clicked Ok on that pane, you are now using LOCAL preferences and no change you make with the web site will be honored. You have to open the preferences pane and click on the 'Clear' button, and click 'Yes' on the confirm ... this is one of the long-standing GUI issues where there is no clear indication of when you are using local preferences ... what you should see is some clear indication that you are on local settings or web settings ... but, until that is done, the only way to be sure is to clear the local preferences ...

Oh, and you should force an update to the last project for which you have changed the settings to be sure that they have been updated ...
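
Since the manager gives no clear indication, one way to check is to look for the local override file yourself. The sketch below assumes the client keeps local preferences in a file named global_prefs_override.xml inside the BOINC data directory, and that clearing the local preferences removes it. The location of the data directory varies by platform and install, so it is passed in rather than guessed, and the function name is made up for this example.

```python
from pathlib import Path

# Assumed name of the file holding local (manager-set) preferences.
OVERRIDE_FILE = "global_prefs_override.xml"


def using_local_prefs(data_dir: str) -> bool:
    """Return True if a local preferences override file exists in data_dir."""
    return (Path(data_dir) / OVERRIDE_FILE).is_file()


# Point this at your own BOINC data directory; "." is only a placeholder.
print(using_local_prefs("."))
```

If this reports True, web preference changes will not take until the local preferences are cleared.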
7) Message boards : Number crunching : Boinc 6.10.56 - backup project (Message 40148)
Posted 2 Jun 2010 by Profile Paul D. Buck
Post:
Where's the 64 bit of this version of BOINC?

All Versions List contains all the versions sorted most recent on top.

Note that the latest recommended is 6.10.56 not .57 ...
8) Message boards : Number crunching : Problems with GPU usage for a iMac11,1 with quad core Intel i7 processor and ATI 4850 GPU 512 MB.. (Message 40141)
Posted 2 Jun 2010 by Profile Paul D. Buck
Post:
From the Boinc Download page:

Attach to projects with GPU applications

Projects with NVIDIA applications:
GPUgrid.net
SETI@home
Milkyway@home (Double precision GPU required)
AQUA@home (cuda offline)
Lattice
Collatz Conjecture
PrimeGrid (AP26)

Projects with ATI applications:
Collatz Conjecture
Milkyway@home (Double precision GPU required)
DNETC@Home
You're done! Soon you'll be racking up big credit numbers. Of course, you can attach to other projects too; BOINC will keep both your CPU and GPU busy.


I have seen posts from people with macs running ATI cards as coprocessors. It is just not working with Snow Leopard as far as I can tell.

I just double checked and none of the three projects listed in your message that have an ATI application has one for the Mac. These projects do have an ATI application for Windows, and in some cases also for Linux, but no one has an ATI application running on OS-X ... MW and Collatz can run on the Mac on the CPU side, and there is the one CUDA application from Collatz that runs only on the GPU and the Einstein beta application that runs on the CPU and GPU ... AND THAT IS IT ...

Developers target Windows first, Linux second and the Mac third ... DNETC at the moment is Windows only (ATI, CUDA, and CPU) ... likely the next applications will be made available on Linux and only later can we expect to see a Mac version...

The project with the widest applicability is Collatz... And I would look to them to be the first out with an application for the ATI cards on the Mac if anyone does it ... then likely MW ... but the limitation is that ATI has not released the right drivers yet for the Mac platform and until they do, nothing is going to happen ...
9) Message boards : Number crunching : Boinc 6.10.56 - backup project (Message 40116)
Posted 1 Jun 2010 by Profile Paul D. Buck
Post:
Ah, thanks Paul.

I may try going through BAM again to override the settings there, or turning the cache back on so I have some stored up WU's for those times where MW and Seti are having issues.

Been getting a lot of project has no work when requesting GPU tasks from both of em in the past 5 days or so.

Sadly BOINC does not actually fare well running for long periods of time on the server side... UCB has also shown no interest in finding out why this problem set exists ... though many of us are pretty familiar with some of the symptoms ... one of the more common ones is loss of connection to the database ...

Three day weekends seem to be a particular hazard. At least a half a dozen projects flipped out ... including Einstein which is one of the most reliable ones going ... I was just glad I was able to keep all the GPUs running most of the days, on one project if not another ... because I did a lot of Collatz instead of DNETC or MW my daily total this morning stunk big time ... but, likely I will get it back tomorrow and the next day as I turn back to mostly MW now (assuming I can get the work) ...
10) Message boards : Number crunching : Problems with GPU usage for a iMac11,1 with quad core Intel i7 processor and ATI 4850 GPU 512 MB.. (Message 40108)
Posted 1 Jun 2010 by Profile Paul D. Buck
Post:
I have moved cuda out of the way in /usr/local

Now I get:

Mon 31 May 12:16:58 2010 No NVIDIA library found
Mon 31 May 12:16:58 2010 No usable GPUs found

Guess I need to point it towards OpenCL or ATI stream?

Alan

Alan,

For the Mac there is nothing for users with ATI cards from any project that I know of... if you have a CUDA card and do the work you can get work from Collatz (Pure GPU) and EaH (50/50) ... sadly the PG project that also had a CUDA app for the Mac (though it made for lots of lag) completed and they have not ported another application to GPU for any platform ...

As usual, in many respects, the Mac world lags the Linux and Windows worlds ...

If I were to guess if there would be a project that will release a Mac application for GPUs (though likely CUDA at first) it would be DNETC ...
11) Message boards : Number crunching : Boinc 6.10.56 - backup project (Message 40106)
Posted 1 Jun 2010 by Profile Paul D. Buck
Post:
Any of you guys happen to try this with projects other than Collatz?

I myself tried it with Einstein and when I set 0 resource share on their site it tells me it saved it, but it's still set at 100. (I'm currently using a NVidia GPU).

Any suggestions on a way to override it perhaps?

This change requires a server side update ... I don't think EaH has made the changes yet. Based on the report above I can almost assure you that they have not ... at the moment there are only a few projects using the newest server software that is capable of this ... SaH and Collatz are two I know ... DNETC just indicated that they have updated the server software but I have not tried to see if this is possible there ...

One of the problems with BOINC is that when projects customize the software for their own needs and UCB throws in something like this, well, it causes problems because they do not have sufficient isolation in some of the parts ... so ...
12) Message boards : Number crunching : Boinc 6.10.56 - backup project (Message 40033)
Posted 28 May 2010 by Profile Paul D. Buck
Post:
It doesn't work completely correctly for me. When MW was down yesterday, I had a long period where 2 out of 3 machines had no GPU work. I have Collatz set to a 0 share, MW 100. Using 6.10.56 on the 2 that this didn't work on, using 6.10.43 on the other machine where it DID work promptly. To make matters worse, after a couple of hours of no work, I updated Collatz at the 2 idle machines and the messages tab showed "user requested update, not reporting tasks, not requesting new work". I left them alone after that and by this morning they all had work at Collatz, but I have no idea why they didn't get any sooner.

-Dave

Ok, so it works sorta ... :)

UCB shows little appetite for working on the resource scheduler "trio" of modules that governs all of this (Resource Scheduler, RR Sim, and Work Fetch)... and so we have longstanding and lingering problems ... An alternative problem is that I have seen where BOINC will get a slew of MW work, run it all off before it asks for more, then it will run off the new batch and repeat endlessly ... meaning that about every 20-30 minutes or so I get two idle GPUs for as much as 30 seconds while BOINC goes and gets new work ...

Sadly, one of the only "cures" for this is a project reset or a reset of debts (set the flag in CC Config and do a full BOINC restart) ...
13) Message boards : Number crunching : Computer with rac of 173 and 3000+ tasks in progress ?? (Message 40031)
Posted 28 May 2010 by Profile Paul D. Buck
Post:
Hmm, Tiger Direct is selling GTX480 at $530 this weekend ...

How much is the 5970 going for?

Didn't look, Frys is selling the 470 for $350 I think it was ... no matter ... I need to wait a bit before I go spend that much again ... and I am still more likely to get another 5870 instead ... when I start to get 3 slot MBs I may think to put in a pair of 5870s and one 480/470
14) Message boards : Number crunching : Boinc 6.10.56 - backup project (Message 40030)
Posted 28 May 2010 by Profile Paul D. Buck
Post:
Thanks Fred & Paul, Appreciated.

NP...

{edit}
Looking at my post... I did not explain that with RS of 0 at Collatz this all happened automagically with no user intervention on my part ... so, it works ...
15) Message boards : Number crunching : Boinc 6.10.56 - backup project (Message 40023)
Posted 28 May 2010 by Profile Paul D. Buck
Post:
If you already have Collatz work on hand it will all be run off before you start to run MW tasks ... but, with Collatz set to 0 share it will be there if and when MW runs out of tasks ...

So, you have to clear out the on hand work ...

Because I wanted to get the two projects back into balance I went to this a week or so ago and now am doing almost all MW ... but, when MW had an outage I started up on Collatz but now am back to MW work on all machines ...

If UCB would get rid of the Strict FIFO rule for GPU work you would have seen MW work being done immediately after the change to RS until the Collatz work was in deadline peril at which point BOINC would have switched back to run off the work on hand so as to not blow deadlines ... sadly ... it does not look like the FIFO rule is going away anytime soon ... (it causes other problems as well)...
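
To make the difference concrete, here is a toy comparison of strict FIFO against the behaviour described above (run the higher-share project first, but switch back when on-hand work is in deadline peril). Everything in it is hypothetical: the Task fields, the function names, and the "remaining time times a safety factor" peril test are illustrative assumptions, not the BOINC client's actual scheduler.

```python
from dataclasses import dataclass


@dataclass
class Task:
    project: str
    received: float   # when the task arrived, in hours
    deadline: float   # absolute deadline, in hours
    remaining: float  # estimated hours of work left


def pick_fifo(tasks):
    """Strict FIFO: always run the oldest task on hand."""
    return min(tasks, key=lambda t: t.received)


def pick_share_aware(tasks, shares, now=0.0, safety=2.0):
    """Run tasks in deadline peril first, otherwise favour the higher share."""
    in_peril = [t for t in tasks if t.remaining * safety >= t.deadline - now]
    if in_peril:
        return min(in_peril, key=lambda t: t.deadline)
    # No deadline pressure: highest resource share wins, oldest first on ties.
    return max(tasks, key=lambda t: (shares.get(t.project, 0), -t.received))


shares = {"MW": 100, "Collatz": 0}
tasks = [
    Task("Collatz", received=0.0, deadline=48.0, remaining=1.0),
    Task("MW", received=1.0, deadline=72.0, remaining=0.1),
]
print(pick_fifo(tasks).project)
print(pick_share_aware(tasks, shares).project)
```

With the sample data, FIFO keeps running the older Collatz task, while the share-aware pick goes to MW until the Collatz deadline gets close.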
16) Message boards : Number crunching : Computer with rac of 173 and 3000+ tasks in progress ?? (Message 40022)
Posted 28 May 2010 by Profile Paul D. Buck
Post:
He might just have a dodgy computer that is detaching and reattaching via BAM or something...may be innocent.

Three different mechanisms have been proposed, from the malicious to the benign (in other words, doing it deliberately or it just may be a bug and the user is unaware).

This ties into a larger discussion about host "punishment" for returning bad tasks. I posted a note to this effect, but since UCB is pretty wedded to the idea that I should be ignored, there is no telling if they are going to listen ...

Hmm, Tiger Direct is selling GTX480 at $530 this weekend ...
17) Message boards : Number crunching : Can't stop CPU based WU on ATi system. (Message 39955)
Posted 26 May 2010 by Profile Paul D. Buck
Post:
6.10.56 is the official release, I am running it on my 2 Windows machines.

My Mac is running 6.10.55.


Are the Win BMs worth upgrading to? I still run 6.10.13 OK

*I* think so... I have been running 6.10.56 since it was first published on the alpha list and I think a couple of my systems have not had BOINC stopped or the computer rebooted since then ... I had a couple of GPU stops even with 6.10.45 (and lots with other higher versions) and have not seen any show-stopper bugs in .56 ... there were a couple of minor issues: one (a cosmetic log error) was fixed, and the other was a one-time bug for which I could not return a log with the right debug flags on, so UCB has so far shown no interest (though I think that the log data I did provide should have been sufficient to isolate the problem with roll-over to NaN calculation results)

Anyway, 6.10.56 is *MY* recommendation on both Windows and OS-X ... if you run anything over 6.10.18 I suggest the move to .56 ... then again, who listens to me ... :)
18) Message boards : Number crunching : Computer with rac of 173 and 3000+ tasks in progress ?? (Message 39934)
Posted 25 May 2010 by Profile Paul D. Buck
Post:
No, I sure don't understand it ... because he should only be getting 24 tasks at a time, how he is getting so many issued with none returned is past me ... his system total is falling at the least so that is a good thing... though I don't quite know why it is taking so long ...

There is a proposal to change the "punishment" scheme in BOINC to catch badly behaving hosts ... though it is hard to know what is going on here without some input from the project ... all I can think of is that he is downloading tasks but deleting them without BOINC reporting that back to the server ...
19) Message boards : Number crunching : Can't stop CPU based WU on ATi system. (Message 39880)
Posted 21 May 2010 by Profile Paul D. Buck
Post:
Oh ****. My bad. Thanks for the help guys!

No problem... we all misread, fat-finger, or make some other kind of mistake ... thankfully most of the fanatics seem to stay at SaH so innocent mistakes here are far less contentious ...
20) Message boards : Number crunching : Bittersweet Milestone (Message 39840)
Posted 19 May 2010 by Profile Paul D. Buck
Post:
In the early days of computers there was a "race" between one of the early machines and it lost to a room full of abaci

But that wasn't a fair contest. A single processor against a multicore...

I am surprised you did not note that they were also multi-threaded ... :)


