Message boards : Number crunching : Host with WAY too many tasks
Joined: 19 Feb 08 · Posts: 350 · Credit: 141,284,369 · RAC: 0
Cartoonman, I totally agree with what you posted except on one point: what is the reason, then, to run CPU apps at all? By definition they may take up to 8 days to complete, which means they missed the boat, as you posted. And if one validates against a GPU WU, that GPU WU is also wasted. So maybe it makes sense to think about stopping Separation WUs on CPUs and using the CPU only for the N-Body WUs.
Joined: 19 Jul 10 · Posts: 597 · Credit: 18,982,369 · RAC: 5,800
"However, I don't believe BOINC manager has the ability to see the GPU as a separate variable in terms of caching WUs just yet; correct me if I'm wrong."

Yes, it works. They used that feature not so long ago at SETI after the 3-day outages. There were different "in progress" limits per CPU and per GPU, for example 8 per CPU core and 40 per GPU; that's what they usually started with after an outage. So, for example, a host with a dual-core CPU and one GPU would get 2*8 + 40 = 56 WUs, while a host with a single-core CPU and 4 GPUs would get 8 + 4*40 = 168 WUs. I think something like that should work here as well, and I'm actually surprised that it has not been implemented yet.
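For reference, a stock BOINC server exposes exactly this kind of per-resource limit in the project's config.xml. A minimal sketch, assuming the documented max_wus_in_progress options; the numbers are the SETI-style examples from above, not actual Milkyway settings:

```xml
<!-- Illustrative excerpt from a BOINC project's config.xml.
     Each limit is multiplied by the host's CPU core / GPU count. -->
<boinc>
  <config>
    <max_wus_in_progress>8</max_wus_in_progress>           <!-- per CPU core -->
    <max_wus_in_progress_gpu>40</max_wus_in_progress_gpu>  <!-- per GPU -->
  </config>
</boinc>
```

With those values, the scheduler itself enforces the 2*8 + 40 = 56 and 8 + 4*40 = 168 caps worked out above.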
Joined: 12 Nov 07 · Posts: 2425 · Credit: 524,164 · RAC: 0
Generally most take 4-8 hours; a P4 takes 5-5.5 hours with the current WUs. I haven't seen any that take 8 days, and I'm not sure a computer slow enough for that could even run MW. They seem to validate fine; I have not had any come back invalid. Initially the CPU vs. GPU results were inconclusive and needed another run, but now they seem to validate just fine.

Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
Joined: 19 Feb 08 · Posts: 350 · Credit: 141,284,369 · RAC: 0
For what we are discussing here, we need to talk about turnaround time, not crunching time. Turnaround may take several days on a host that isn't a 24/7 cruncher. Sure, they validate, but do they still help the project once the system has moved on (a.k.a. they missed the boat)?
Joined: 15 Jul 08 · Posts: 383 · Credit: 729,293,740 · RAC: 0
The whole "limit GPU queue size" because of turn around time is invalidated by allowing CPUs to run the WUs. We've suggested several times that the old WUs be limited to GPUs and allow CPUs to run N-Body WUs. What's the downside? 1) Even with an increased GPU WU cache the turn around time would be FAR faster. 2) The larger cache would result in more WUs being run via smoothing out workflow caused by the many outages. 3) More N-Body WUs would be run because all CPUs would be doing them. 4) Fewer people would dump the project due to frustration. I suspect the problem is that the admins don't know how to do this. After all they're scientists, not programmers. There was recently a good article (I believe in Nature) regarding this problem. I'm sure Slicker at Collatz would be willing to help. He's a wizard at such things. |
Joined: 25 Jun 10 · Posts: 284 · Credit: 260,490,091 · RAC: 0
"I'm sure Slicker at Collatz would be willing to help. He's a wizard at such things."

And Slicker has been using different cache sizes for CPU and GPU at Collatz for some time now. And DNETC has even subdivided it by vendor (NVIDIA/ATI) and model (HD 5xxx vs. non-HD 5xxx) for the type of work unit being sent. So everything we are asking for has already been accomplished at other projects.