1)
Message boards :
News :
How the new validator works
(Message 38889)
Posted 19 Apr 2010 by Brian Silvers Post: Is the error rate tracked per user or per computer? Is it possible to have this metric added in the appropriate section so that it is visible to us? With the quick purge rate, specific task errors can quickly disappear from our sight. (Probably a change request for the BOINC dev team, but worthwhile, since this figure is an important part of our contributions and how we manage our systems.) ...and people keep ignoring my idea of a Homogeneous Redundancy-like scheme for GPUs... That would group the various classes of GPUs, much as Homogeneous Redundancy groups classes of CPUs, as best as I understand it... :shrug: |
2)
Message boards :
News :
server issues
(Message 38551)
Posted 10 Apr 2010 by Brian Silvers Post: We're having some database issues which is why the server isn't sending out any work. I've contacted labstaff and elevated the ticket to emergency so hopefully we'll have things fixed shortly. I think we might have to move to yesterday's backup. Are you not doing incrementals???????? |
3)
Message boards :
Number crunching :
Feeder & validator need kicking
(Message 38429)
Posted 8 Apr 2010 by Brian Silvers Post: I've got an Intel i7-based machine that has been getting only 4 CPU work units to crunch at a time (doing it on the CPU only). My safety stock is empty. Obviously the machine would like 8 or more units. I was running 6.10.18 but upgraded today to 6.10.43, as it is now the recommended version, hoping that it would solve the problem. Nope! I also detached/reattached, but that didn't help either. Yesterday I reset my preferences to a 5-day cache and updated the preferences. No help there either. My laptop has since received a nice cache, but my main cruncher is at a loss for enough work. Any ideas? Check these two settings under computing preferences: "On multiprocessors, use at most N processors" and "On multiprocessors, use at most xxx% of the processors" (enforced by version 6.1+). |
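For reference, those two web preferences can also be overridden locally. A hedged sketch, assuming the element names from BOINC's `global_prefs_override.xml` schema of that era (`max_ncpus` for the absolute count, `max_ncpus_pct` for the 6.1+ percentage form) — check your client version's documentation before relying on them:

```xml
<!-- global_prefs_override.xml, placed in the BOINC data directory.
     Element names are assumptions based on the client's preference schema. -->
<global_preferences>
    <!-- absolute cap: "use at most n processors" (pre-6.1 clients) -->
    <max_ncpus>8</max_ncpus>
    <!-- percentage cap: "use at most xxx% of the processors" (6.1+) -->
    <max_ncpus_pct>100.0</max_ncpus_pct>
</global_preferences>
```

If either value is set low (say, 50% on an i7), the client only fetches enough work for the allowed cores, which matches the 4-tasks-at-a-time symptom described above.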
4)
Message boards :
News :
quorum down to 2
(Message 38230)
Posted 6 Apr 2010 by Brian Silvers Post:
I can appreciate your issue. It, however, is not my issue. I do hope you understand that. I know I'm not adding much very often compared to GPUs, but it has been stated in other threads that the CPU results are definitely more accurate at this point than those from the ATI 5800 series cards. As Crunch3r has said, you could restrict 5800 series participation until the issue with them is sorted. In the meantime, I'm setting my preferences to no new tasks... |
5)
Message boards :
News :
quorum down to 2
(Message 38197)
Posted 6 Apr 2010 by Brian Silvers Post: The database is having a bit of trouble keeping up with all the new results due to a quorum of 3, so for the time being I'm dropping it to a quorum of 2. If things are not better soon, could you consider figuring out a way to use a "Homogeneous Redundancy"-like capability to group GPUs separately from CPUs? I know I'm not adding much in comparison to people with GPUs anymore, but now that you're grouping me with GPUs that are having problems validating, I'm sometimes burning electricity for nothing. Thanks... Edit: Oh, and my Pentium 4 got grouped with 3 other 5800 series GPUs, which in another thread you state aren't matching up to other architectures, so my result, which is probably the one you should've accepted, got dumped as invalid simply because they formed a quorum by matching each other... WU 90248204 |
6)
Message boards :
News :
testing new validator
(Message 38100)
Posted 5 Apr 2010 by Brian Silvers Post: Since I don't know if purges are running quickly, and I don't know how much noise has accumulated in this thread since I last read it, I'm going ahead and posting this. It will likely be formatted badly, and may already be covered by the numerous postings, but I just wanted to state that it's quite unfair to me to have an app that is known to be working fine and to spend 4.5 hours on a task for zip, zap, zero... Oh, and in case the 4.5 hours didn't tell you which system is mine, it's the non-GPU system, the first one in the quorum...

name: de_s11_3s_free_6_1544383_1270347650
application: MilkyWay@Home
created: 4 Apr 2010 2:20:50 UTC
minimum quorum: 3
initial replication: 4
max # of error/total/success tasks: 3, 6, 1
errors: Too many success results

Task ID   Computer  Sent                    Time reported or deadline  Status                     Run time (sec)  CPU time (sec)  Claimed credit  Granted credit  Application
95656877  26452     4 Apr 2010 2:22:25 UTC  5 Apr 2010 5:49:34 UTC     Completed, can't validate  0.00            16,347.27       70.47           0.00            Anonymous platform
96480761  26133     5 Apr 2010 5:50:54 UTC  5 Apr 2010 7:17:50 UTC     Completed, can't validate  216.20          212.47          1.11            0.00            MilkyWay@Home v0.21 (ati13ati)
96480762  141414    5 Apr 2010 5:50:38 UTC  5 Apr 2010 6:27:01 UTC     Completed, can't validate  89.52           87.25           0.61            0.00            MilkyWay@Home v0.21 (ati13ati) |
7)
Message boards :
News :
Server outage
(Message 37994)
Posted 4 Apr 2010 by Brian Silvers Post: It's good to hear you're on top of things - I hope BOINC gives you enough capabilities to deal with the scammers. Will these changes affect the distribution of the anonymous platform apps? For instance, will new versions need to be validated before being allowed, assuming you have the capability to enforce validation? How does this deal with CPCW (Cherry-Picking Credit Whoring)? I just got two very different credit per hour rates. The const_v2 searches are considerably faster than const_v3, but yield the same credit. Are you going to be holding a second set of statistics for people who abort longer-running tasks? |
8)
Message boards :
Number crunching :
de_14_3s_free_5... shows no progress on 0.20 CPU client
(Message 37760)
Posted 26 Mar 2010 by Brian Silvers Post:
I'll check that out then... I've been doing Cosmo on this system for the past few days... The other computer is running in service mode and I don't feel like looking at it right now... :/ Edit: Yeah, ps_13_3s_const_v3_5087818_1269576799_0 is working ok... Still, if the project sends out whatever tasks that caused it not to work properly, that case needs to be fixed, unless those types of tasks were not supposed to come from the project. :shrug: All I know is I pulled my AMD off and switched to Cosmo while I waited and semi-monitored my Intel... |
9)
Message boards :
Number crunching :
de_14_3s_free_5... shows no progress on 0.20 CPU client
(Message 37754)
Posted 25 Mar 2010 by Brian Silvers Post:
"Next weekend" as in 2 days from now, or as in 9 days from now? Without the progress percentage, those of us with CPU apps are somewhat "flying blind", hoping that we do not have a bad task. So far everything has been ok, but will it continue that way? |
10)
Message boards :
Number crunching :
de_14_3s_free_5... shows no progress on 0.20 CPU client
(Message 37731)
Posted 24 Mar 2010 by Brian Silvers Post:
Any progress on this? Thanks... |
11)
Message boards :
Number crunching :
de_14_3s_free_5... shows no progress on 0.20 CPU client
(Message 37377)
Posted 15 Mar 2010 by Brian Silvers Post: ok, so are these new tasks good, bad, or hit or miss? Also, this inquiry is in regards to the *CPU* app only. I do not have a GPU-capable system. I came looking at this thread because of, well, the subject title... :-) |
12)
Message boards :
Number crunching :
Now that we have native ATI GPU support, how about longer tasks?
(Message 35760)
Posted 17 Jan 2010 by Brian Silvers Post:
Perhaps I'm getting ahead of the curve with trying to segregate tasks, regardless of quorum. Not sure if there's already a way to do that, but the whole point is that GPU users need to be placed in a different classification category than CPU users. You folks can exclusively have the 3-stream (longer-running) tasks, and leave CPU users with the 1-stream, 2-stream, or other shorter-running tasks. Perhaps I am phrasing the BOINC equivalent wrong, and there is something there already, but if the planned "2 to 4 times increase" in runtime happens again, then that will undo the increase in deadline and will cause people with CPUs to start howling again... I'm advocating making everyone happier, not just a few. Same as I've been doing all along... I think if something like what I'm suggesting is done, it will improve total project throughput and maybe, just maybe, allow you all to have a larger cache. Might not, but it is certainly worth a try if there is a way to do that already or if it is a minimal change. |
13)
Message boards :
Number crunching :
Now that we have native ATI GPU support, how about longer tasks?
(Message 35721)
Posted 17 Jan 2010 by Brian Silvers Post: Thanks for implementing native ATI support! In my opinion, you should consider looking at whether or not there is a way to use Homogeneous Redundancy classes to separate GPU from CPU and give GPUs the longer task and leave the shorter tasks to CPUs. |
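For illustration only (this is not BOINC server code): the idea above amounts to a class function, in the spirit of Homogeneous Redundancy, that maps each host to a bucket, with quorums formed only among results from the same bucket. A minimal sketch under those assumptions, with all names hypothetical:

```python
# Hypothetical sketch of an HR-style class function that buckets hosts
# by compute type, so GPU results are only ever compared with GPU
# results. Field names here are illustrative, not BOINC's actual API.

def hr_class(host):
    """Return a class label; a validator using this scheme would only
    form quorums from results whose hosts share a label."""
    if host.get("ati_gpu"):
        return "gpu_ati"
    if host.get("nvidia_gpu"):
        return "gpu_nvidia"
    return "cpu"

hosts = [
    {"id": 1, "ati_gpu": "Radeon 5870"},
    {"id": 2},                            # CPU-only Pentium 4
    {"id": 3, "nvidia_gpu": "GTX 285"},
]
print([hr_class(h) for h in hosts])  # ['gpu_ati', 'cpu', 'gpu_nvidia']
```

Under such a scheme, the Pentium 4 in message 38197 would never have been put in a quorum with three 5800-series cards in the first place.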
14)
Message boards :
Number crunching :
Testing ATI Application Availability
(Message 35241)
Posted 9 Jan 2010 by Brian Silvers Post: Except that we should all by now be using the wonderful new automagically-downloaded-from-the-server apps. :) Said by Anthony Waters only around 17-18 hours ago:
|
15)
Message boards :
Number crunching :
Milky Way, Project unfriendly.....
(Message 35186)
Posted 8 Jan 2010 by Brian Silvers Post: I'd be happy with an hour of work cached for MW so that when a project maintenance backoff occurs I can keep crunching or lose very little time. But my hour is different from your hour or his hour, as we have different setups. For example, I have a 4850 and a 4870 in my Q9450 box. The longer wu's take approx 3.5 min and 3 min respectively. So that would mean I'd need to have cached 38 wu's; currently I get 24, which is about 39 minutes. With a few shorties of 55 seconds, that drops down to less than 30 minutes. The previously mentioned 5 minute -> 20 minute cache thing was on a 4850 or a 4870, can't remember which. At any rate, the same logistical problem is in place - these tasks were originally designed to be processed by CPUs, not GPUs. Giving you all 100% of the 3-stream tasks is just a band-aid. It will not address the root cause, which is that the tasks just aren't complex enough. Allegedly the MW_GPU project was going to provide tasks of perhaps 100 times the complexity. For whatever reason, that idea was tossed out. It needs to be brought to the front burner again... |
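The cache arithmetic in the post above checks out; a quick sketch (the runtimes come from the post, the function name is mine):

```python
import math

def tasks_needed(minutes, per_task_minutes):
    """Tasks required to keep the listed devices busy for `minutes`,
    given each device's per-task runtime in minutes."""
    throughput = sum(1.0 / t for t in per_task_minutes)  # tasks per minute
    return throughput * minutes

# One 4850 (~3.5 min/task) plus one 4870 (~3 min/task):
print(math.ceil(tasks_needed(60, [3.5, 3.0])))  # 38 tasks for one hour
# The 24 tasks actually cached cover about:
print(round(24 / (1 / 3.5 + 1 / 3.0)))          # 39 minutes
```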
16)
Message boards :
Number crunching :
Milky Way, Project unfriendly.....
(Message 35178)
Posted 8 Jan 2010 by Brian Silvers Post: Why not limit the short WUs to CPU clients? That will at least help a little. Allowing GPU clients to cache more WUs would also solve the problem. The same problem exists as before. Runtimes were increased by a factor of 4. To be effective in handling all of you with GPUs without totally crushing the server, caches would need to be increased by another factor of 10, perhaps 20, particularly if you're wanting a "minimum of several hours". Allegedly 6 tasks was about 5 minutes back then, so 6 tasks are only 20 minutes now. To get to 3 hours, that means 9 times as many... or 54 tasks, but again, that's only 3 hours. I have the feeling most of you won't be happy unless it goes up to 8 hours, so that means an increase in cache of 24 times, or 144 tasks. Bumping your caches up by factors that large will only cause problems. I do not know if the current type of tasks has that much room for expansion in scientific value. That's why this whole time I've said that the real long-term fix is a separate GPU project or a separate type of work. |
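The scaling in the post above can be worked through directly (the 6-tasks-per-20-minutes figure comes from the post; the function itself is mine):

```python
def cache_tasks(target_hours, batch_tasks=6, batch_minutes=20):
    """Tasks needed for a cache covering `target_hours` of work,
    given that `batch_tasks` tasks represent `batch_minutes` of work."""
    minutes_per_task = batch_minutes / batch_tasks  # ~3.33 min/task
    return round(target_hours * 60 / minutes_per_task)

print(cache_tasks(3))   # 54 tasks for a 3-hour cache
print(cache_tasks(8))   # 144 tasks for an 8-hour cache
```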
17)
Message boards :
Number crunching :
Milky Way, Project unfriendly.....
(Message 35153)
Posted 7 Jan 2010 by Brian Silvers Post: Why not limit the short WUs to CPU clients? That will at least help a little. Allowing GPU clients to cache more WUs would also solve the problem. That's what I was trying to say for a very long time, but got a bunch of static from you folks with GPUs. What I was proposing was longer tasks for GPUs only, either by segregating the tasks here in this project or with a separate project. Unfortunately, with what the project did in lengthening the runtime for all users, that helped the situation some at the expense of those on the lower end of the spectrum (re: the rise in complaints about the short deadlines). I would think / hope that the server would be able to differentiate between a GPU and a CPU, so once all is in place, in theory the 3-stream tasks could go to those of you with GPUs and perhaps increase your caches as well, up to perhaps double what they are now. After that, the 1-stream and 2-stream tasks can go to CPU participants, again with double the cache (from 6 to 12). |
18)
Message boards :
Number crunching :
Deadline problem
(Message 34999)
Posted 1 Jan 2010 by Brian Silvers Post: Well, I aborted everything and have it set not to get new wu's from Milky Way. If they want it set up so that only rich people who have advanced computers and/or can afford an extra computer just to run BOINC can participate, then the hell with them for being so elitist. I'll keep my spare computing power for Cosmology, Climate Prediction, World Community Grid and the LHC project (if LHC ever gives any wu's), or maybe join a new one. If they decide to let normal people participate again, maybe I'll come back. These complaints arise any time a project has a short/tight deadline and the person does not have the computer on enough and/or is running several other projects. On the one hand, we had the people with fast GPUs howling about things. On the other, we now have people on the opposite end of the spectrum howling about things. Unless the 1-stream and 2-stream work is of no use to the project anymore, they could address some of the complaining by sending the 3-stream tasks to GPU users and the 1-stream and 2-stream tasks to CPU users. I would think that if the server code is updated for native GPU support, that kind of "homogeneous redundancy"-ish setup should be doable... |
19)
Message boards :
Number crunching :
Deadline problem
(Message 34745)
Posted 21 Dec 2009 by Brian Silvers Post: Another small thing that could be done is if the 5-10% performance improvement in the CUDA code that was changed recently makes any difference at all to CPU processing times, a new stock and 3rd-party optimized application could be made for CPU processing. OK... I didn't think there was going to be any benefit for CPU processing, but didn't know for sure... At any rate, GPUs need longer running tasks, but not CPUs. It would be best if they could distribute the longer tasks to GPUs and start up the shorter-running searches that the scientists were told not to run and send those to CPU users. If not, then the project will probably need to do some PSAs and/or other ways to educate users on why the deadlines are what they are and why they really cannot be changed. |
20)
Message boards :
Number crunching :
Deadline problem
(Message 34735)
Posted 20 Dec 2009 by Brian Silvers Post: I was and am a strong advocate of making the tasks longer, but primarily for GPU users. As best as I understand things, the scientists working on this project had to intentionally not run tasks of the 1-stream or 2-stream variety because they would not run for long enough, so the systems out here pounded on the server, causing many problems, such as slow web site performance and, ironically, work outages. As I said somewhere over the past few weeks, I knew that when task runtimes were increased for everyone, this kind of complaint would crop up. The way to deal with this and please more people is to find a way to send these longer-running 3-stream tasks (tasks that have 3s in their name) to GPU users, and then let the scientists generate 1- and 2-stream tasks again and direct those to CPU users. Another small thing that could be done: if the 5-10% performance improvement in the recently changed CUDA code makes any difference at all to CPU processing times, a new stock and 3rd-party optimized application could be made for CPU processing. The best thing, though, is to try to segregate the tasks... |
©2024 Astroinformatics Group