Why is it so hard to get work?

Author	Message
boosted Joined: 4 Feb 08 Posts: 116 Credit: 17,263,566 RAC: 0	Message 23345 - Posted: 25 May 2009, 20:41:52 UTC Bill, accept the fact that no matter how many times you say that a DoS attack is happening, the truth is it simply is not. If it were, their servers would be down. As in not functioning. As in dead. Sending no responses to inquiries. THAT is the definition of a DoS attack. To prevent the services of a computer from being reached, or used. To prevent all connections and to force a software overload and crash. THAT is what a DoS attack is. Take that from a person that actually has a degree in this computer networking stuff. Not Wikipedia, the most overused and over-quoted place ever. What is happening is that the work generator cannot keep up with the tens of thousands of computers making requests for work. There simply is not enough work being generated. If there were, people would have it. No one, no matter what you think, has the right to claim this project as theirs (CPU vs GPU). If the operators of this project had a problem with the use of GPUs they would not allow them. But guess what, they are allowed, so by that they do not have a problem with people using them. I am not expressing opinions. However, you seem to propagate yours as if they were fact. This is an opinion: it would seem to me that the project personnel would prefer the use of GPUs because they can do the science faster. However, there is also fact in that. But the overall fact is that no project besides one is 'owned' by the GPU. That is probably just a matter of time while people scramble to make GPU apps for all projects. They simply have FAR more potential.

Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 23346 - Posted: 25 May 2009, 20:48:19 UTC - in response to Message 23345. Actually, the problem isn't that work isn't being generated fast enough; it's that while there's available work, the server can't move it into shared memory fast enough to keep up with work requests (which get work fed from shared memory). I'm really hoping to put this all behind us once we get milkyway_gpu up and running (which is getting really close). Probably another week or two. I know it's been a long time, but instead of working on half-assed semi-fixes that probably wouldn't work anyways, we decided to put the extra time and effort into making a real fix -- splitting up the project into CPU and GPU versions where we can have correspondingly sized workunits. We really do appreciate everyone sticking with us through all of this. There is at least a light at the end of the tunnel now :)
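
A toy model of the bottleneck Travis describes, offered as a sketch rather than BOINC's actual feeder code: the backlog of work is treated as unlimited, a feeder tops up a fixed-size shared-memory cache at a limited rate, and the scheduler can only hand out what is currently cached. The function name, the per-tick model and all numbers (100 slots, 10 refills per tick) are assumptions made up for illustration.

```python
def simulate(slots, refill_per_tick, requests_per_tick, ticks):
    """Toy feeder/scheduler model (hypothetical, not BOINC code).

    The backlog is unlimited; only the shared-memory cache is scarce.
    Returns (requests that got work, requests that found the cache empty).
    """
    cache = 0   # work items currently sitting in shared memory
    sent = 0    # requests that received a work item
    empty = 0   # requests that arrived while the cache was empty
    for _ in range(ticks):
        # Feeder pass: top up the cache, limited by refill rate and free slots.
        cache += min(refill_per_tick, slots - cache)
        # Scheduler pass: each request takes one item if one is cached.
        served = min(requests_per_tick, cache)
        cache -= served
        sent += served
        empty += requests_per_tick - served
    return sent, empty


if __name__ == "__main__":
    print(simulate(slots=100, refill_per_tick=10, requests_per_tick=25, ticks=10))
    print(simulate(slots=100, refill_per_tick=10, requests_per_tick=50, ticks=10))
```

In this model, 25 requests per tick against a refill rate of 10 gives 100 tasks sent and 150 empty replies over 10 ticks. Doubling the request rate to 50 still sends 100 tasks; only the number of empty replies grows.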

GalaxyIce Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0	Message 23348 - Posted: 25 May 2009, 21:03:56 UTC Thanks for the explanation, Travis. Looking forward to what's beyond that tunnel ;)

Blurf Volunteer moderator Project administrator Joined: 13 Mar 08 Posts: 804 Credit: 26,380,161 RAC: 0	Message 23352 - Posted: 25 May 2009, 21:39:46 UTC Please step back and take a deep breath. I am as frustrated as all of you with the lack of work; however, the personal flaming over this issue has been escalating and it needs to stop. Travis is a grad student; he has other responsibilities besides the project. I can assure you via our discussions he is working on it and, as he said, there is a "light at the end of the tunnel". It will get fixed. The process just needs to be finished and it is being worked on. Thanks

Bill & Patsy Joined: 7 Jul 08 Posts: 47 Credit: 13,629,944 RAC: 0	Message 23353 - Posted: 25 May 2009, 21:43:09 UTC Last modified: 25 May 2009, 21:56:30 UTC Thanks Travis and Blurf. It's really good (and a relief!) to hear from you on this. --Bill

GalaxyIce Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0	Message 23355 - Posted: 25 May 2009, 22:01:19 UTC - in response to Message 23339. [quote]Spankinmonkee wrote: One thing that's going on and maybe most are not aware. The Top 2 teams in BOINC are battling it out for bragging rights on MW before the project freezes the old stats.[/quote] It's never occurred to me to take a look at the top teams in MW. I'm pleased to see my team in the top 12, but more so, to see 3 teams from the UK in the top 12. I'm sure we'll be battling for top bragging rights before the project freezes any stats ;) Oh I remember now, as well as the science, this is about fun, friendly competition between individuals and teams...

Alinator Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0	Message 23356 - Posted: 25 May 2009, 22:19:53 UTC - in response to Message 23346. Last modified: 25 May 2009, 22:21:21 UTC [quote]Actually, the problem isn't that work isn't being generated fast enough; it's that while there's available work, the server can't move it into shared memory fast enough to keep up with work requests (which get work fed from shared memory). I'm really hoping to put this all behind us once we get milkyway_gpu up and running (which is getting really close). Probably another week or two. I know it's been a long time, but instead of working on half-assed semi-fixes that probably wouldn't work anyways, we decided to put the extra time and effort into making a real fix -- splitting up the project into CPU and GPU versions where we can have correspondingly sized workunits. We really do appreciate everyone sticking with us through all of this. There is at least a light at the end of the tunnel now :)[/quote] Hmmm... Seems we had come to this conclusion back around the time you started contemplating setting up a separate GPU project about three months ago. :-D As far as the apparent slow progress on getting the new project up, I suggest some of the 'critics' try setting up a BOINC powered project themselves to get a better perspective of what's involved and how 'easy' it is (unless converting case is your science objective, that is). ;-) BTW, could you give a 12 hour purge cycle a try here again? That would make my life a lot easier! :-) <edit> LOL... Just as long as that light isn't a locomotive! :-D Alinator

Brian Silvers Joined: 21 Aug 08 Posts: 625 Credit: 558,425 RAC: 0	Message 23358 - Posted: 25 May 2009, 22:46:15 UTC - in response to Message 23327. [quote]Concerning this latter error, quoting Brian yet again: What you all have been overloading is the BOINC Feeder <--> BOINC Scheduler daemon interfacing. I don't know why this is so difficult for so many to understand. These are FACTS folks. Facts which some have been ignoring. It follows directly that if the DoS attack were stopped, yet even more science would be done because more total work would get through.[/quote] Since you're quoting me, I'd like to say a couple of things... First, the people that are telling you that this isn't a DoS are correct. It does not technically meet the criteria of a DoS. I don't believe that there is any intent to disrupt the service of the project. Greed? Yes, most certainly greed exists. Intent to disrupt service does not... Secondly, there is only a small chance that more work could get done. Most likely the work would be distributed differently, but the same total work would be done. The small chance would be, as I suggested many moons ago over at SETI (and was equally laughed at for suggesting it as I suspect I will be here), that if slower hosts (CPU-based hosts in this instance) could get enough work to where they were not asking for it as often as they are now, the people with the faster hosts (high-end CPU and/or GPU) might be able to get more tasks assigned to them. The problem is, nobody is willing to work together. There are a bunch of people who just want all that they can get and are willing to do whatever it takes to get all that they can get.

Brian Silvers Joined: 21 Aug 08 Posts: 625 Credit: 558,425 RAC: 0	Message 23360 - Posted: 25 May 2009, 22:52:27 UTC - in response to Message 23345. [quote]What is happening is that the work generator cannot keep up with the tens of thousands of computers making requests for work. There simply is not enough work being generated. If there were, people would have it.[/quote] On the contrary, as Travis pointed out, the problem is as I said, the feeder and scheduler mechanism. Getting those of you who are pounding on the thing to ease off might help, but getting you to understand how it might help is nearly hopeless...

Paul D. Buck Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0	Message 23368 - Posted: 26 May 2009, 1:11:24 UTC - in response to Message 23330. [quote]If you keep this argument going then we'll need to consider it trolling. Please remember if you don't like what is happening here you can go elsewhere.[/quote] Interesting how the royal "we" crept in here... Also interesting it is being made by one who has admitted to doing what B&P is saying is not good for the project. As Brian has stated and Travis confirmed, the problem is the speed of service of the feeder mechanism. Hitting the scheduler a lot of times and not allowing the back-off mechanism to work does not help the system operate. But I do find it so fascinating that someone can claim the project as their own and that anyone that does not like what they are doing should leave the project ... absolutely fascinating ... Somewhere there has to be a definition of arrogance and it points right to statements like that one... "If you don't like my scripting screwing up the project, then you should shut up and leave..." So, am I going to get banned too?
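
For readers unfamiliar with the back-off idea mentioned above, here is a minimal sketch of a capped exponential back-off with jitter. It is a generic illustration, not the BOINC client's actual algorithm: the function name, the 60-second base, the 4-hour cap and the jitter range are all assumed values.

```python
import random


def backoff_delay(failures, base=60.0, cap=4 * 3600.0, rng=random.random):
    """Seconds to wait before the next work request (hypothetical values).

    `failures` is the number of consecutive requests that returned no work.
    The delay doubles with each failure, is capped, and is then scaled by a
    random factor so that hosts do not all retry at the same moment.
    """
    if failures <= 0:
        return 0.0
    delay = min(cap, base * 2 ** (failures - 1))
    # rng() is in [0, 1), so the wait is between 50% and 100% of `delay`.
    return delay * (0.5 + 0.5 * rng())


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, round(backoff_delay(n)))
```

A script that re-sends the request every few seconds effectively pins `failures` at zero, which is the behaviour the post objects to: the request rate no longer falls when the server has nothing to give.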

Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 23369 - Posted: 26 May 2009, 1:30:47 UTC - in response to Message 23358. [quote]Secondly, there is only a small chance that more work could get done. Most likely the work would be distributed differently, but the same total work would be done. The small chance would be, as I suggested many moons ago over at SETI (and was equally laughed at for suggesting it as I suspect I will be here), that if slower hosts (CPU-based hosts in this instance) could get enough work to where they were not asking for it as often as they are now, the people with the faster hosts (high-end CPU and/or GPU) might be able to get more tasks assigned to them. The problem is, nobody is willing to work together. There are a bunch of people who just want all that they can get and are willing to do whatever it takes to get all that they can get.[/quote] To be honest, before people started using scripts to hammer the server, we were getting around 9-11 workunits a second. Now we're seeing around 6-7 workunits a second.

Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 23370 - Posted: 26 May 2009, 1:32:03 UTC - in response to Message 23368. [quote]So, am I going to get banned too?[/quote] Only because it would be really fun to whack you with the ban stick. jkjk :D

Spankinmonkee [TopGun] Divisio... Joined: 22 Mar 08 Posts: 38 Credit: 48,762,331 RAC: 0	Message 23371 - Posted: 26 May 2009, 1:38:41 UTC - in response to Message 23370. [quote][quote]So, am I going to get banned too?[/quote] Only because it would be really fun to whack you with the ban stick. jkjk :D[/quote] LMAO

The Gas Giant Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0	Message 23372 - Posted: 26 May 2009, 1:50:49 UTC - in response to Message 23369. Last modified: 26 May 2009, 1:54:21 UTC [quote][quote]Secondly, there is only a small chance that more work could get done. Most likely the work would be distributed differently, but the same total work would be done. The small chance would be, as I suggested many moons ago over at SETI (and was equally laughed at for suggesting it as I suspect I will be here), that if slower hosts (CPU-based hosts in this instance) could get enough work to where they were not asking for it as often as they are now, the people with the faster hosts (high-end CPU and/or GPU) might be able to get more tasks assigned to them. The problem is, nobody is willing to work together. There are a bunch of people who just want all that they can get and are willing to do whatever it takes to get all that they can get.[/quote] To be honest, before people started using scripts to hammer the server, we were getting around 9-11 workunits a second. Now we're seeing around 6-7 workunits a second.[/quote] Well there's a hornet's nest being hit hard... ;) The change in WU availability occurred after the project outage about 60 days ago. We pointed it out at the time via BOINCstats, but the history showing the change has now gone off the page. Are you absolutely sure nothing changed at the time of the outage about 60 days ago? [edit] To stop the scripters hitting the project so hard, you could increase the minimum time between host contacts at the server end. LHC@home increased theirs to just over 15 minutes....maybe you could try 2 minutes and see what happens. I believe it is a simple server side setting. [/edit]
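
The suggestion above, a server-side minimum time between contacts from the same host, can be sketched as a simple per-host gate. This is an illustration of the idea only, not BOINC's scheduler code or its configuration syntax; the class name, the host identifiers and the 120-second value (the "2 minutes" floated in the post) are assumptions.

```python
class MinIntervalGate:
    """Refuse work to a host that asked again too soon (hypothetical sketch)."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self.last_served = {}  # host_id -> time of the last accepted request

    def allow(self, host_id, now):
        """Return True if the host may be served at time `now` (seconds)."""
        last = self.last_served.get(host_id)
        if last is not None and now - last < self.min_interval_s:
            # Too soon: reject without touching the timestamp, so hammering
            # the server neither helps nor extends the wait.
            return False
        self.last_served[host_id] = now
        return True


if __name__ == "__main__":
    gate = MinIntervalGate(min_interval_s=120)
    for t in (0, 5, 10, 60, 120, 125):
        print(t, gate.allow("scripted-host", t))
```

With a 120-second interval, a host that asks every 5 seconds is served at t=0 and again at t=120, and every request in between is turned away cheaply before any work lookup would be needed.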

Slicker [TopGun] Joined: 20 Mar 08 Posts: 46 Credit: 69,382,802 RAC: 0	Message 23376 - Posted: 26 May 2009, 4:02:24 UTC - in response to Message 23369. Q: What do you get when you mix astrophysicists, computers, and crunchers? A: I'll let you know once the whining stops. I had a CEO once tell me "Morale is your own %$&* problem. If you don't like it, leave!" So, I left. I didn't leave because I was unhappy or because the guy was an idiot, but because I didn't think that he had any intention of trying to improve things to get the company back on track. That isn't the case with the MW team. While they may be going for a Guinness World Record for the longest running BOINC project in alpha status, Travis and company are actively working on a resolution. They have been for several months. Sure, we wish it would happen ASAP, but I also wish I was sitting on a beach in the tropics sipping a rum-runner. The MW team tried a few quick solutions with mixed results but decided a long term solution was the proper choice - one that would not only have enough WUs for everyone, but also be able to take advantage of the GPU's capabilities and do even more science. In the meantime, I'll stick it out whether there are enough WUs to go around or not because they at least are trying to fix it, and not with some band aid. I'll live with boxes sitting idle or crunching other projects for now and hope for the best. Until then, I'll take what I can get.

The Gas Giant Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0	Message 23377 - Posted: 26 May 2009, 4:29:36 UTC I haven't seen this set of messages in a long long time....sure was good to see it just now after getting 12 WUs for my C2D machine at w#$^.
26/05/2009 2:24:48 PM Milkyway@home Sending scheduler request: To fetch work.
26/05/2009 2:24:48 PM Milkyway@home Requesting new tasks
26/05/2009 2:24:53 PM Milkyway@home Scheduler request completed: got 0 new tasks
26/05/2009 2:24:53 PM Milkyway@home Message from server: No work sent
26/05/2009 2:24:53 PM Milkyway@home Message from server: (reached per-CPU limit of 6 tasks)
26/05/2009 2:24:53 PM Milkyway@home Message from server: (Project has no jobs available)
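
The numbers in this post fit together if "C2D" is read as a two-core Core 2 Duo (an assumption): a per-CPU limit of 6 tasks times 2 CPUs gives the 12 WUs mentioned, after which the server replies that the limit is reached. A minimal sketch of that cap, with a hypothetical function name and no claim to match the real scheduler logic:

```python
def tasks_to_send(requested, in_progress, ncpus, per_cpu_limit=6):
    """How many tasks a host may still receive under a per-CPU cap.

    per_cpu_limit=6 is taken from the log message above; everything else
    is an illustrative assumption.
    """
    cap = ncpus * per_cpu_limit          # most tasks the host may hold at once
    room = max(0, cap - in_progress)     # slots left under the cap
    return min(requested, room)


if __name__ == "__main__":
    print(tasks_to_send(requested=20, in_progress=0, ncpus=2))   # empty host
    print(tasks_to_send(requested=5, in_progress=12, ncpus=2))   # already full
```

The first call allows 12 tasks and the second allows none, which corresponds to the "reached per-CPU limit of 6 tasks" reply in the log.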

Misfit Joined: 27 Aug 07 Posts: 915 Credit: 1,503,319 RAC: 0	Message 23378 - Posted: 26 May 2009, 4:57:35 UTC - in response to Message 23368. [quote][quote]If you keep this argument going then we'll need to consider it trolling. Please remember if you don't like what is happening here you can go elsewhere.[/quote] Interesting how the royal "we" crept in here... Also interesting it is being made by one who has admitted to doing what B&P is saying is not good for the project. As Brian has stated and Travis confirmed, the problem is the speed of service of the feeder mechanism. Hitting the scheduler a lot of times and not allowing the back-off mechanism to work does not help the system operate. But I do find it so fascinating that someone can claim the project as their own and that anyone that does not like what they are doing should leave the project ... absolutely fascinating ... Somewhere there has to be a definition of arrogance and it points right to statements like that one... "If you don't like my scripting screwing up the project, then you should shut up and leave..."[/quote] And there it is. Light at the end of the tunnel. Truth. me@rescam.org

Brian Silvers Joined: 21 Aug 08 Posts: 625 Credit: 558,425 RAC: 0	Message 23381 - Posted: 26 May 2009, 6:16:58 UTC - in response to Message 23369. [quote][quote]Secondly, there is only a small chance that more work could get done. Most likely the work would be distributed differently, but the same total work would be done. The small chance would be, as I suggested many moons ago over at SETI (and was equally laughed at for suggesting it as I suspect I will be here), that if slower hosts (CPU-based hosts in this instance) could get enough work to where they were not asking for it as often as they are now, the people with the faster hosts (high-end CPU and/or GPU) might be able to get more tasks assigned to them. The problem is, nobody is willing to work together. There are a bunch of people who just want all that they can get and are willing to do whatever it takes to get all that they can get.[/quote] To be honest, before people started using scripts to hammer the server, we were getting around 9-11 workunits a second. Now we're seeing around 6-7 workunits a second.[/quote] Is that the number of WUs outbound or inbound? I believe you mean outbound, but I want to make sure because of the use of the word "getting".

GalaxyIce Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0	Message 23384 - Posted: 26 May 2009, 7:04:42 UTC - in response to Message 23287. [quote]Ageless wrote: I have taken my systems off of Milkyway as per immediately. They won't return until the GPU project is up&running and the kinks have been ironed out over there. And even then, what's to keep the scripters from keeping their scripts running to get more work for their CPUs once their GPUs have moved? Nothing.[/quote] I don't think anyone should be standing down their MilkyWay crunching. Your penultimate sentence here is interesting, Jord: "what's to keep the scripters from keeping their scripts running to get more work for their CPUs once their GPUs have moved?" It will be interesting to see if the complainers still continue to blame GPU crunchers even if there is no GPU crunching here anymore ;)

BarryAZ Joined: 1 Sep 08 Posts: 520 Credit: 302,538,504 RAC: 0	Message 23385 - Posted: 26 May 2009, 7:37:55 UTC - in response to Message 23284. True enough -- but I don't have any ATI 38xx or 48xx -- and I figure to keep it that way just to starve those evil GPU folks out <just flat out kidding>. ... and if all the CPUers stopped crunching for a while then there'd be more WUs for us GPUers. :D