Why is it so hard to get work?

Author	Message
nick n Send message Joined: 16 Mar 09 Posts: 21 Credit: 69,966 RAC: 0	Message 20753 - Posted: 28 Apr 2009, 22:24:17 UTC Last modified: 28 Apr 2009, 22:27:27 UTC Why is it so hard for everyone to get work all the time? The project has WU's to send as shown on the server status page but it takes several requests to get anything. What is causing this problem and is it going to be fixed soon? ID: 20753 · Rating: 0 · rate: /

Pwrguru Send message Joined: 30 Aug 08 Posts: 24 Credit: 250,053,699 RAC: 0	Message 20758 - Posted: 28 Apr 2009, 23:15:30 UTC - in response to Message 20753. To answer that question would take way too long... Your best bet is to read through the recent threads, and you will gain insight into your question. ID: 20758 · Rating: 0 · rate: /

banditwolf Send message Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0	Message 20763 - Posted: 28 Apr 2009, 23:50:45 UTC - in response to Message 20753. Why is it so hard for everyone to get work all the time? What Pwrguru said. The project has WU's to send as shown on the server status page but it takes several requests to get anything. What is causing this problem..? 1. MW only uses part of a server. 2. More tasks are being asked for than can be created to send. 3. The project is limiting tasks, down to 34k (23%) 'Results in progress' from 145k (on 2-18). 4. The project accepts ATI GPUs, which can crunch through many thousands of these WUs per day. And on and on... and is it going to be fixed soon? If the GPU site goes live with a CUDA app, the ATI app still needs to be made before MW CPU sees any relief. And then restrictions will need to be made to keep GPUs on the GPU version. But the throttling will also need to be lifted so the CPUs can get work. Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. ID: 20763 · Rating: 0 · rate: /

sygopet Send message Joined: 18 Apr 09 Posts: 7 Credit: 5,005,579 RAC: 0	Message 20802 - Posted: 29 Apr 2009, 12:57:08 UTC Now when BOINCSIMAP had a similar problem, where the faster crunchers got 100s of units while others got none, they introduced a rationing scheme . . . (But they can still only provide units for 2 or 3 days a month because they are dependent on basic information from other sources which is produced slowly!) I still have a problem understanding where the MilkyWay bottleneck really is: if it is true that MilkyWay cannot create sufficient units, is that because there is a real maximum or would more hardware allow increased production? Several people have commented that while the server status page indicates several hundred units as being available at any time, these get sucked up so quickly that your chances of getting a few are minute: so let's have a bigger buffer of units. Can we have some ideas for a short-term fix as all the suggestions I've seen are likely to take a month or two before they become effective. ID: 20802 · Rating: 0 · rate: /

Brian Silvers Send message Joined: 21 Aug 08 Posts: 625 Credit: 558,425 RAC: 0	Message 20807 - Posted: 29 Apr 2009, 13:53:12 UTC - in response to Message 20802. I still have a problem understanding where the MilkyWay bottleneck really is: if it is true that MilkyWay cannot create sufficient units, is that because there is a real maximum or would more hardware allow increased production? Several people have commented that while the server status page indicates several hundred units as being available at any time, these get sucked up so quickly that your chances of getting a few are minute: so let's have a bigger buffer of units. Alinator could probably explain this better than I could, but as I understand it, there is a shared memory buffer that handles the transfer of tasks from the WU Generator into the "Feeder" which supplies the "Scheduler". The scheduler and feeder talk to each other through the shared memory segment. The general idea is that this buffer cannot keep up. There may be a bug with the feeder as well. If there is a bug, then that is a BOINC development issue that, while Dave and Travis could choose to try to work on, is not really their responsibility, but the responsibility of the BOINC development team. Can we have some ideas for a short-term fix as all the suggestions I've seen are likely to take a month or two before they become effective. The "short-term fix" is to do work for other projects. This has been something that has been said over and over and over. People tend to get all hostile at the person making this suggestion though, so it is sometimes like the person asking the question really doesn't want an answer and is really saying "we want the long-term fix NOW!" ID: 20807 · Rating: 0 · rate: /
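Editor's sketch: the feeder/scheduler description above can be illustrated with a toy model. This is not BOINC code; the function name `simulate_feeder` and all the numbers are made up for illustration. It assumes a fixed number of shared-memory slots, a feeder that can refill only a limited number of slots per tick, and a scheduler that hands out one task per request. When requests outpace the refill rate, most requests see "No work available" even though the database behind the feeder never runs dry.

```python
def simulate_feeder(slots, refill_per_tick, requests_per_tick, ticks):
    """Toy model of a fixed-size feeder buffer (hypothetical, not BOINC's code).

    Returns (served, refused): how many requests got a task and how many
    were told there was no work.
    """
    buffered = slots  # slots currently holding a task; start full
    served = refused = 0
    for _ in range(ticks):
        # Scheduler side: each request takes one task if any slot is filled.
        for _ in range(requests_per_tick):
            if buffered > 0:
                buffered -= 1
                served += 1
            else:
                refused += 1  # the client would see "No work available"
        # Feeder side: refill a limited number of slots from the database,
        # which is assumed to always have unsent work.
        buffered = min(slots, buffered + refill_per_tick)
    return served, refused


if __name__ == "__main__":
    # Demand (60/tick) is three times the refill rate (20/tick).
    print(simulate_feeder(slots=100, refill_per_tick=20,
                          requests_per_tick=60, ticks=10))
    # Refill keeps up with demand: nobody is refused.
    print(simulate_feeder(slots=100, refill_per_tick=60,
                          requests_per_tick=60, ticks=10))
```

In this model a bigger buffer only delays the shortage by a few ticks; the long-run refusal rate depends on the gap between refill rate and request rate, not on the number of slots.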

Bill Send message Joined: 3 Oct 07 Posts: 21 Credit: 49,862 RAC: 0	Message 20808 - Posted: 29 Apr 2009, 14:04:39 UTC - in response to Message 20753. Why is it so hard for everyone to get work all the time? The project has WU's to send as shown on the server status page but it takes several requests to get anything. What is causing this problem and is it going to be fixed soon? I've been getting "No work available" for weeks in spite of the status page. I mentioned that more than a week ago. ID: 20808 · Rating: 0 · rate: /

banditwolf Send message Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0	Message 20809 - Posted: 29 Apr 2009, 14:42:44 UTC - in response to Message 20808. I've been getting "No work available" for weeks in spite of the status page. I mentioned that more than a week ago. Everyone does. Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. ID: 20809 · Rating: 0 · rate: /

James Sotherden Send message Joined: 3 Jan 09 Posts: 139 Credit: 50,066,562 RAC: 0	Message 20822 - Posted: 29 Apr 2009, 16:36:41 UTC My P4 hasn't had work in a week but my Mac has been getting its 12 WU's every day. ID: 20822 · Rating: 0 · rate: /

Paul D. Buck Send message Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0	Message 20840 - Posted: 29 Apr 2009, 21:07:12 UTC - in response to Message 20808. Why is it so hard for everyone to get work all the time? The project has WU's to send as shown on the server status page but it takes several requests to get anything. What is causing this problem and is it going to be fixed soon? I've been getting "No work available" for weeks in spite of the status page. I mentioned that more than a week ago. Which is a very, very, very short term. Sorry, but true. Guys, software development is slow and hard. A lot of what goes on in BOINC is software development and takes a long time. Like it or not, 6 months is pretty fast for getting problems fixed. A year is not out of the question. Consider this, I reported a problem in 2005 and we are still talking about it ... Problems I reported in BOINC Beta are still with us ... (I was refused permission to suggest a fix) ID: 20840 · Rating: 0 · rate: /

boosted Send message Joined: 4 Feb 08 Posts: 116 Credit: 17,263,566 RAC: 0	Message 20844 - Posted: 29 Apr 2009, 21:26:52 UTC I am just getting tired of all these people saying, go to other projects until this is resolved. If I were running the CPU app that is a valid comment. But since a lot of people (such as myself) only really run it on the GPU, that is not really an option. I already run other projects on my CPU(s). I just do not understand how some people here are still getting 60-100K a day when there are many more others that are barely getting anything. That is what is ticking me off. I also keep reading about how they are working on a CUDA app, why not get the ATI one working first? Take on one thing at a time. There are other projects that are running under CUDA, this is the only one (recognized by BOINC) that runs under ATI. ID: 20844 · Rating: 0 · rate: /

The Gas Giant Send message Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0	Message 20852 - Posted: 29 Apr 2009, 22:21:24 UTC Last modified: 29 Apr 2009, 22:21:45 UTC We all just have to wait and let Travis do his thing. Hassling him about it just slows things down, but a quick 5 minute weekly update would be great! Meanwhile, as it appears my 4850 died due to temperature swings as it got and completed work, I'm not running it again until there is consistent work for it. Dang it! :) Live long and BOINC! ID: 20852 · Rating: 0 · rate: /

bradslab Send message Joined: 28 Mar 09 Posts: 1 Credit: 2,180,458 RAC: 0	Message 20855 - Posted: 29 Apr 2009, 22:31:02 UTC Sounds a lot like 10 year old girls complaining about a stain on their dress. If you can do better, grab a compiler and start writing some code. GPU or CPU, what does it matter? Whether you get 100,000 or 500... you get the same amount of money. The only thing worse than volunteers who complain, is those who complain when getting something for free. ID: 20855 · Rating: 0 · rate: /

Chris S Send message Joined: 20 Sep 08 Posts: 1391 Credit: 203,563,566 RAC: 0	Message 20859 - Posted: 29 Apr 2009, 22:42:10 UTC The only thing worse than volunteers who complain, is those who complain when getting something for free. This is year 11 mate, check out Freehal...... ID: 20859 · Rating: 0 · rate: /

Brian Silvers Send message Joined: 21 Aug 08 Posts: 625 Credit: 558,425 RAC: 0	Message 20861 - Posted: 29 Apr 2009, 22:53:00 UTC - in response to Message 20844. I am just getting tired of all these people saying, go to other projects until this is resolved. What else is there to do? Really? Should you drive to the campus and start demanding that they hurry up? The project is still running, it just cannot keep up with the demand. Contrast this with the continual disaster that is Cosmology@Home. The server crashes and it takes weeks to get it back up and running. The server crashed because they didn't have a UPS attached and there was a power outage. There have been continual lapses in communication from the project for over a year. Even when tasks are crashing or there's a shared memory issue with the scheduler, best-case scenario is that they correct it within 2-3 days. More often it takes a week before someone gets around to checking on the server. The latest snafu over there was that after the server crash and a new server, they released an apparently relatively untested new version with no advance warning about the tasks taking 2-3 times longer and 2-3 times more memory. It is so memory intensive that I dare not try to run it on my Pentium 4 system that only has 1GB of memory. I've seen it using 800MB+ of physical memory on some tasks. Other tasks it takes 600-650MB. If something similar to what goes on there were happening here, then I'd speak up like I do there. However, tasks are not crashing, nor do they take up gobs of resources or a lot of time. The only "issue" is that people can't get enough, and frankly that's an "issue" just because the project pays out so much in contrast to other projects. If you're willing to tell me that the only reason you're upset is because you feel cheated that you can't help move the progress of science along at a quicker pace, then I applaud you and will think about reconsidering my viewpoint. 
I just do not understand how some people here are still getting 60-100K a day when there are many more others that are barely getting anything. That is what is ticking me off. Ah, the heart of the matter... Like I said, it's not about scientific progress... ;-P Like Kevin (campaignforliberty, zeitgeistmovie, et al) said, the more machines you have, the more you are able to get because you are asking for more than others. I also keep reading about how they are working on a CUDA app, why not get the ATI one working first? Take on one thing at a time. There are other projects that are running under CUDA, this is the only one (recognized by BOINC) that runs under ATI. That's because BOINC has endorsed CUDA, and thus getting that up as a project should be easier since they can get help from other people associated with BOINC (and/or SETI). This project running ATI is somewhat of a "rebel". ID: 20861 · Rating: 0 · rate: /
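Editor's sketch: the point that hosts asking more often end up with more work can be shown with a small simulation. It is a hypothetical model, not a description of how the BOINC scheduler actually picks hosts: each machine sends one request per tick, only a few tasks are available per tick, and the requests that happen to arrive first get them. The function name `share_of_work` and the user names are invented for the example.

```python
import random


def share_of_work(machines_per_user, tasks_per_tick, ticks, seed=0):
    """Toy model: tasks go to whichever requests arrive first each tick.

    machines_per_user maps a user name to how many machines that user runs.
    Returns a dict mapping each user to the number of tasks received.
    """
    rng = random.Random(seed)
    # One request per machine per tick, tagged with the owner's name.
    requests = [user for user, n in machines_per_user.items() for _ in range(n)]
    received = {user: 0 for user in machines_per_user}
    for _ in range(ticks):
        rng.shuffle(requests)  # arrival order is random
        for user in requests[:tasks_per_tick]:
            received[user] += 1  # only the first few requests are served
    return received


if __name__ == "__main__":
    print(share_of_work({"farm": 10, "single": 1}, tasks_per_tick=3, ticks=1000))
```

Under these assumptions the ten-machine user collects roughly ten times as much work as the single-machine user, simply because ten of every eleven requests are theirs.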

uBronan Send message Joined: 9 Feb 09 Posts: 166 Credit: 27,520,813 RAC: 0	Message 20864 - Posted: 29 Apr 2009, 23:05:28 UTC The only thing I'd like to see is an update on the progress, or some short info on whether progress is being made or not. The more people are informed, the less they can complain ;) I have days when I receive a fair amount to crunch, but most of the time my machine is doing other tasks since no units flow from the server. There are only a few other projects which interest me, so I stick with the ones I am doing now. ID: 20864 · Rating: 0 · rate: /

Alinator Send message Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0	Message 20871 - Posted: 30 Apr 2009, 0:31:09 UTC - in response to Message 20807. Last modified: 30 Apr 2009, 0:32:07 UTC I still have a problem understanding where the MilkyWay bottleneck really is: if it is true that MilkyWay cannot create sufficient units, is that because there is a real maximum or would more hardware allow increased production? Several people have commented that while the server status page indicates several hundred units as being available at any time, these get sucked up so quickly that your chances of getting a few are minute: so let's have a bigger buffer of units. Alinator could probably explain this better than I could, but as I understand it, there is a shared memory buffer that handles the transfer of tasks from the WU Generator into the "Feeder" which supplies the "Scheduler". The scheduler and feeder talk to each other through the shared memory segment. The general idea is that this buffer cannot keep up. There may be a bug with the feeder as well. If there is a bug, then that is a BOINC development issue that, while Dave and Travis could choose to try to work on, is not really their responsibility, but the responsibility of the BOINC development team. Can we have some ideas for a short-term fix as all the suggestions I've seen are likely to take a month or two before they become effective. The "short-term fix" is to do work for other projects. This has been something that has been said over and over and over. People tend to get all hostile at the person making this suggestion though, so it is sometimes like the person asking the question really doesn't want an answer and is really saying "we want the long-term fix NOW!" You did pretty good with it. ;-) The problem is two pronged: 1.) You can't build up that big a reserve of work because of the nature of the searches themselves. 
You have to wait at some point for the outstanding work to return or the simulation can 'wander off'. 2.) The design of the BOINC backend is such that even with parity in host performance you can suck the scheduler's active, memory-constrained queue dry PDQ at the performance level of the fastest GPUs. This imparts an 'Out of Work' delay while it gets refilled from slower mass storage. The bottom line is: the project is what it is, and the guys are doing the best they can to remedy the situation ultimately. Unfortunately, they don't have infinite time, equipment, or money to throw at the problem, and really don't need to either. As I have said before, we are here to offer our services (and rigs) to help answer their science questions. If they can in return meet our expectations and desires, then so much the better (although some have been pretty outrageous in the past). If they can't, then I guess we all need to learn how to deal with disappointment issues better! :-D Alinator ID: 20871 · Rating: 0 · rate: /

verstapp Send message Joined: 26 Jan 09 Posts: 589 Credit: 497,834,261 RAC: 0	Message 20882 - Posted: 30 Apr 2009, 1:48:35 UTC - in response to Message 20859. I just had a look at the FreeHAL board. Perhaps we don't have it so bad... Won't stop us whingeing, though. Cheers, PeterV. ID: 20882 · Rating: 0 · rate: /

GalaxyIce Send message Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0	Message 20894 - Posted: 30 Apr 2009, 7:00:13 UTC - in response to Message 20861. Last modified: 30 Apr 2009, 7:03:36 UTC I also keep reading about how they are working on a CUDA app, why not get the ATI one working first? Take on one thing at a time. There are other projects that are running under CUDA, this is the only one (recognized by BOINC) that runs under ATI. That's because BOINC has endorsed CUDA, and thus getting that up as a project should be easier since they can get help from other people associated with BOINC (and/or SETI). This project running ATI is somewhat of a "rebel". It's true that this project has done nothing to develop ATI optimized apps so far. This was done by Cluster Physik who is not employed by the MW project in any way, AFAIK. MilkyWay have not chosen to develop ATI GPU apps at any time, at all, so far. That also applies to the CPU optimized apps that Cluster Physik developed for this project - they are not MilkyWay products either. So if you wonder why MilkyWay keep quiet about these optimized apps - it's because they didn't do them. And if they've throttled their server in response to these optimized apps, both GPU and CPU, which they didn't write, then that's probably their best response to keep the whole thing from keeling over. At least they allow the continuing use of these optimized apps - both CPU and GPU. For which at least I am thankful. Hence no whinge from me. Thanks again Cluster Physik. ID: 20894 · Rating: 0 · rate: /

[AF>DoJ] supersonic Send message Joined: 5 Mar 09 Posts: 19 Credit: 102,651,985 RAC: 0	Message 20902 - Posted: 30 Apr 2009, 10:27:09 UTC - in response to Message 20844. I am just getting tired of all these people saying, go to other projects until this is resolved. If I were running the CPU app that is a valid comment. But since a lot of people (such as myself) only really run it on the GPU, that is not really an option. I already run other projects on my CPU(s). I just do not understand how some people here are still getting 60-100K a day when there are many more others that are barely getting anything. That is what is ticking me off. I also keep reading about how they are working on a CUDA app, why not get the ATI one working first? Take on one thing at a time. There are other projects that are running under CUDA, this is the only one (recognized by BOINC) that runs under ATI. I agree with that. My feeling is that the MW people are headed down a tough road in developing a new CUDA-compatible project. During that time, why not take the opportunity to have Cluster Physik's gifted people create a new ATI app for the GPU project? Then, when the CUDA app is ready, both Nvidia and ATI can crunch. ID: 20902 · Rating: 0 · rate: /

Brian Silvers Send message Joined: 21 Aug 08 Posts: 625 Credit: 558,425 RAC: 0	Message 20903 - Posted: 30 Apr 2009, 11:07:52 UTC - in response to Message 20902. Last modified: 30 Apr 2009, 11:11:44 UTC I am just getting tired of all these people saying, go to other projects until this is resolved. If I were running the CPU app that is a valid comment. But since a lot of people (such as myself) only really run it on the GPU, that is not really an option. I already run other projects on my CPU(s). I just do not understand how some people here are still getting 60-100K a day when there are many more others that are barely getting anything. That is what is ticking me off. I also keep reading about how they are working on a CUDA app, why not get the ATI one working first? Take on one thing at a time. There are other projects that are running under CUDA, this is the only one (recognized by BOINC) that runs under ATI. I agree with that. My feeling is that the MW people are headed down a tough road in developing a new CUDA-compatible project. During that time, why not take the opportunity to have Cluster Physik's gifted people create a new ATI app for the GPU project? Then, when the CUDA app is ready, both Nvidia and ATI can crunch. Again, BOINC endorses CUDA... What you're asking is the equivalent of asking Intel to manufacture Phenom processors because they're out of a component that they use to manufacture i7 processors... It's not the "company product". My semi-educated guess is that the GPU project would have to have CUDA as its default supplied application to be sanctioned by David Anderson. Once that is done, the ATI application could be made by an outside source and run under the Anonymous Platform mechanism. ID: 20903 · Rating: 0 · rate: /