Welcome to MilkyWay@home

request regarding "deferred communication" times

Message boards : Number crunching : request regarding "deferred communication" times

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 49787 - Posted: 1 Jul 2011, 2:25:31 UTC

i'm sure many of you are already well aware of some of the more common "deferred communication" times seen under normal crunching conditions. for instance, after a scheduler request for new work, communication is deferred for a minimum of 60 seconds if either the host successfully fetches new work, or if the host fails to fetch new work b/c it has reached a limit on tasks in progress.

however, you might not be aware of another deferred communication that occurs less often, and it is b/c of how seldom it recurs that you may not have seen it before...not that it's a very rare thing - it's just something that could easily happen while you're away from your computer for an hour or more. specifically, i'm talking about the deferred communication time of 60 minutes after the host fails to fetch new work b/c the project is temporarily shut down for maintenance. my gripe with the "deferred communication" limit of 60 minutes is that i've found (or so it seems) that the servers go down for maintenance periods lasting only a few minutes far more often than they do for maintenance periods of an hour or longer. that being the case, when my GPU client requests new work from the server and gets deferred for 60 minutes due to a maintenance outage that only ends up lasting a few minutes, the maximum cache of 12 tasks gets crunched in ~20 minutes, leaving my GPU waiting for the next work fetch, and even worse, idling for the remaining 40 minutes of the deferred communication period.

could deferred communication times for maintenance outages be reduced to something like 5 minutes (instead of a whole hour) without slowing down the servers (and thus the whole project) significantly? alternatively, a cache size 3 times larger than the current maximum (36 tasks instead of 12) would take my GPU ~60 minutes to complete, eliminating any GPU idle time during the 60-minute deferred communication period, whether the actual maintenance outage lasts an hour or longer, or only a few minutes. but i don't know how feasible the latter suggestion is, considering the maximum cache size was reduced to just 12 tasks only a few months ago in another attempt to streamline the project and get it back up to pace.

has anyone else experienced these particular losses in productivity?
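
For illustration, the idle-time arithmetic in this post can be sketched in Python. The figures are the ones quoted above (12 tasks in ~20 minutes, so roughly 100 seconds per task, and a 60-minute deferral). The function name and the assumption that every task takes the same time are illustrative only, not anything the project defines.

```python
def idle_seconds(cache_tasks, seconds_per_task, deferral_seconds):
    """Seconds the GPU sits idle if the cache runs dry before the deferral ends."""
    busy_seconds = cache_tasks * seconds_per_task
    return max(0, deferral_seconds - busy_seconds)


# ~20 minutes for 12 tasks is about 100 seconds per task (assumed constant).
SECONDS_PER_TASK = 100
DEFERRAL = 60 * 60  # the 60-minute maintenance deferral

print(idle_seconds(12, SECONDS_PER_TASK, DEFERRAL) // 60)  # idle minutes, 12-task cache
print(idle_seconds(36, SECONDS_PER_TASK, DEFERRAL) // 60)  # idle minutes, 36-task cache
```

With these inputs the sketch reports 40 idle minutes for the 12-task cache and none for a 36-task cache, which matches the estimate in the post.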

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 49788 - Posted: 1 Jul 2011, 3:46:56 UTC

You bet, we all have the problem. A cache increase would be nice but it's been asked for more times than I can count on my fingers and toes. I wouldn't hold your breath waiting...

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 49789 - Posted: 1 Jul 2011, 4:09:07 UTC

yeah, i guess i should have rephrased my question at the end there. after all, every MW@H participant must get deferred for 60 minutes when the servers are down for maintenance. i guess what i really meant to ask was how many folks are actually aware of the 60-minute deferred communication, since it happens far less often than the typical 60-second minimum deferred communication between work fetches. also, i didn't really search for increased cache requests, so i was unaware of how many times it's been asked for already...i have to assume that requests have also already been made about decreasing the length of the 60-minute deferred communication for server maintenance, considering most outages seem to be on the scale of minutes?

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 49792 - Posted: 1 Jul 2011, 16:02:39 UTC - in response to Message 49789.  

> i have to assume that requests have also already been made about decreasing the length of the 60-minute deferred communication for server maintenance, considering most outages seem to be on the scale of minutes?

As a workaround for this problem I run a Windows scheduler job that issues a BOINC update request to the MW server every 15 minutes.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 49793 - Posted: 1 Jul 2011, 16:16:35 UTC - in response to Message 49792.  

that would suffice...and i'm glad it can be done using the Windows scheduler (as opposed to having to write a script or a batch file). i'll look into it when i get home this evening...

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 49794 - Posted: 1 Jul 2011, 17:53:54 UTC

ok, so i decided to start playing with the Windows task scheduler function on my computer here at work, and already i have questions:

1) i went to start menu -> programs -> accessories -> scheduled tasks, and double clicked "add scheduled task," at which point i'm prompted by the scheduled tasks wizard to "click the program you want Windows to run." this is great and all, but i really don't want Windows to simply run the BOINC manager - i want Windows to send a scheduler request to the MW@H server via boincmgr.exe every 20 minutes or so. i'm not clear on how the Windows scheduler can do this for me based on the limited options of the Windows task scheduler wizard.

2) next, i'm asked by the wizard how often i'd like the task performed, the most repetitive option of which is "once daily." seriously? Windows can't perform a task more often than once daily? again, i don't see how the Windows scheduler can do this for me based on the limited options of the Windows task scheduler wizard.

i noticed you're running Windows 7, whereas i'm only running WinXP Pro SP3 32-bit. perhaps the WinXP task scheduler isn't nearly as capable as the one in Windows 7?

FruehwF
Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 49797 - Posted: 1 Jul 2011, 19:19:14 UTC
Last modified: 1 Jul 2011, 19:41:31 UTC

[Edit] I can also give you an answer to question 1 :-)

There is a boinccmd.exe in the BOINC program directory.

With it you can make a little batch script named, e.g.,

update_MW.bat

with this text in it:
boinccmd --project http://milkyway.cs.rpi.edu/milkyway/ update

(more info:
http://boinc.berkeley.edu/wiki/Boinccmd_tool)

You can call this batch file from the scheduled task.


I can give you an answer for question 2:

When you finish the wizard, check the box named "Open advanced Prop..."
Alternatively, if you have already saved the task, choose "Properties" in the context menu.

In the Properties dialog, choose the "Schedule" tab.

There choose "Daily" and "every 1 days", then press the "Advanced.." button.

In the next dialog, check the box named "Repeat task",
choose "every 15 minutes" and a duration of 24 hours.

"OK" "OK" "OK"

And you've got what you want.
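
For anyone who would rather not use a batch file, the same request can be sketched in Python. This is only an illustration: it assumes boinccmd is reachable under the name given (the default "boinccmd" is an assumption about your PATH) and that a BOINC client is running locally. The function names are hypothetical; the command-line arguments are the ones from update_MW.bat above.

```python
import subprocess

MW_URL = "http://milkyway.cs.rpi.edu/milkyway/"


def update_command(project_url, boinccmd="boinccmd"):
    """Argument list equivalent to the single line in update_MW.bat."""
    return [boinccmd, "--project", project_url, "update"]


def request_update(project_url, boinccmd="boinccmd"):
    """Run boinccmd and return its exit code (needs a running BOINC client)."""
    result = subprocess.run(update_command(project_url, boinccmd))
    return result.returncode


if __name__ == "__main__":
    # Print the command only; call request_update(MW_URL) to actually send it.
    print(" ".join(update_command(MW_URL)))
```

Passing the full path to boinccmd.exe as the second argument avoids depending on PATH.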

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 49799 - Posted: 1 Jul 2011, 19:47:01 UTC

excellent...thank you very much. the 2nd part makes perfect sense, as i was able to get all the way through the wizard and find the option to run the task at any interval i'd like. i'll attempt the script part when i get home from work.

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 49850 - Posted: 3 Jul 2011, 20:49:25 UTC - in response to Message 49797.  

> boinccmd --project http://milkyway.cs.rpi.edu/milkyway/ update

To update all the machines on your local net from one you can use the following in your batch file:

REM Update projects on remote machine machine1 (IP 192.168.1.25)
"c:\Program Files\Boinc\boinccmd.exe" --host 192.168.1.25 --passwd whatever --project http://milkyway.cs.rpi.edu/milkyway/ update

REM Update projects on remote machine machine2 (IP 192.168.1.26)
"c:\Program Files\Boinc\boinccmd.exe" --host 192.168.1.26 --passwd whatever --project http://milkyway.cs.rpi.edu/milkyway/ update

...
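
If the list of machines grows, the repeated lines can be generated rather than typed. The Python sketch below only renders the batch text shown above; the host names, IPs and password are the placeholder values from this post, and the helper name batch_lines is hypothetical.

```python
MW_URL = "http://milkyway.cs.rpi.edu/milkyway/"
BOINCCMD = r"c:\Program Files\Boinc\boinccmd.exe"  # path as written in the post


def batch_lines(hosts, password, project_url=MW_URL, boinccmd=BOINCCMD):
    """Return the REM/command line pairs for each (name, ip) host."""
    lines = []
    for name, ip in hosts:
        lines.append(f"REM Update projects on remote machine {name} (IP {ip})")
        lines.append(
            f'"{boinccmd}" --host {ip} --passwd {password} '
            f"--project {project_url} update"
        )
        lines.append("")  # blank line between hosts, as in the batch file above
    return lines


hosts = [("machine1", "192.168.1.25"), ("machine2", "192.168.1.26")]
print("\n".join(batch_lines(hosts, "whatever")))
```

The printed text can be saved as a .bat file and called from the scheduled task in the same way as update_MW.bat.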

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 50050 - Posted: 11 Jul 2011, 4:49:41 UTC

ok, i have the batch file running every 15 minutes now. the only part that was a royal pain was the fact that scheduling a task wouldn't work without setting a password for the Windows user account it was being executed under. generally, i don't like to use a Windows account password b/c i restart my machine a lot, and would prefer not to have to enter a password every time i restart. but i guess it's a small price to pay to guarantee that my GPU doesn't go idle for more than 15 minutes at a time while the servers are producing and distributing work.

FruehwF
Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 50051 - Posted: 11 Jul 2011, 8:24:59 UTC - in response to Message 50050.  

Hi!

Do you run BOINC as a service, or are you logged in?

Because if you are logged in, there is an option in the Scheduled Task dialog:
"Run only if logged in"
When you check this box you don't need to give a user and password.
The scheduled task only needs a user and password if it should run even when you are not logged in.
I've read that you can also set up BOINC as a service (it runs when no user is logged in), but then I think you also have to configure a user and password in BOINC.
But I haven't tested that so far.

regards

Franz

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 50053 - Posted: 11 Jul 2011, 11:48:03 UTC - in response to Message 50051.  

> Because if you are logged in, there is an option in the Scheduled Task dialog:
> "Run only if logged in"
> When you check this box you don't need to give a user and password.

thanks...that solved it.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 52109 - Posted: 29 Dec 2011, 3:25:24 UTC

i'm bumping this old thread b/c my scheduled tasks have stopped working. i don't exactly know how long ago they stopped working, but i only noticed it today.

if you haven't already read the thread, i was originally looking for an automated way to get BOINC to send scheduler requests to various DC project servers at regular intervals. i was shown how to create a batch (.bat) file that would force BOINC to send a scheduler request to a DC project server. i was then shown how to create a scheduled task that would execute such a batch file at regular intervals. it was almost too easy, as every task i scheduled worked like a charm. that was back in July...

...fast forward to today. my scheduled tasks are not working like they should be...in fact, they're not working at all. individual components of the processes are working, but they aren't working together to produce results. i can confirm that the Windows scheduling function is working b/c i see a DOS-like window appear and disappear in a flash at the exact time a task is scheduled to execute. i can also confirm that the batch files function correctly, b/c when i double-click them i can visually confirm via the event log that BOINC sends a scheduler request to that project's server. and yet oddly enough, even though i can see that the Windows scheduling function is doing something when it's supposed to, and even though i can confirm the functionality of the batch files, Windows isn't executing them like it should when it should.

so far i only have 2 suspicions. the first one goes something like this - a few weeks ago i changed my host's "computer name" from ema-002 to ema-001. i have no idea if that broke the scheduled tasks at the time and i simply didn't notice until today. so i decided to open one of my scheduled tasks (specifically, the one that executes the batch file that sends a scheduler request to the MW@H server), and this is what i saw...note the "Run As" box:

[screenshot not preserved]

why it still says EMA-002 after i changed the computer name to ema-001, i don't know. also, i'm not sure about the capital letters, b/c the computer name definitely has lower case letters in it. if i try to change it to EMA-001 (or ema-001 for that matter), i get the following error:

[screenshot not preserved]

i started to think that renaming my computer might have left some loose ends, so i changed it back to ema-002. but still, the scheduled tasks do not run. i'm thinking that maybe doing a system restore to an early enough point in time might get things working again, but i'm skeptical...


my second suspicion is the nasty virus i had to remove from this host just a few days ago. it was the type of virus that causes any executable file on the PC to actually open the virus' executable instead. i had to edit the registry to gain access to my programs and executables before i could even start working on getting rid of the virus. long story short, i got rid of the virus no problem, but perhaps some changes to the registry could be the source of my problems with Windows task scheduler? i'm not really sure where to go from here if this is the cause...perhaps a system restore would work in this instance too?

i really can't think of any alternatives other than to try a system restore at this point...anybody have any suggestions?

thanks,
Eric

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 52110 - Posted: 29 Dec 2011, 5:19:07 UTC

*UPDATE*

so i tried a few different restore points, one of which was before the virus, and another of which was well before i did any changing of the computer's name. neither got my scheduled tasks working again. i would think a system restore would have easily undone any computer renaming, whereas a system restore might not guarantee restoring your registry to a previous state. so i'm becoming more suspicious of possible registry damage either caused by the virus i just had, or by the removal process. looks like i might not have the convenience of scheduled tasks until i either reinstall WinXP or upgrade to Win7 or something...

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,761,780
RAC: 27,723
Message 52115 - Posted: 29 Dec 2011, 12:54:45 UTC - in response to Message 50050.  

What you have done is monopolize the limited number of Server connections so that the rest of us get "Server Busy" messages. The Server has a limited number of connections it can make in a given time frame, and with 'you' connecting every 15 minutes the rest of us can't get thru. The 60 minute time out was set because when the Server does come back up after an outage, 50 bazillion people all want work RIGHT NOW, and the server couldn't handle all the connections and was breaking. The 60 minute clock for you started at a different time than for me, so we were staggered and then had optimal chances of connecting. What you may be doing with this 'work around' is making Berkeley, the writers and maintainers of Boinc, rethink the connection strategies to thwart you. Dr. A is ADAMANT that what you are doing is not 'Boinc friendly' and is therefore to be thwarted. I am STRICTLY a cruncher, I have no power over anything, I am just saying you may not want to get too invested in fixing this as it may not last. Have fun while it does though!
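
The staggering argument can be illustrated with a toy model. This Python sketch is not how the BOINC scheduler or client is implemented. It simply assumes each client's first failed request landed at a uniformly random minute within an hour-long window (an assumption made for illustration) and compares a fixed 60-minute deferral with every client retrying in the same minute. All names and numbers are hypothetical.

```python
import random
from collections import Counter


def peak_requests_per_minute(retry_minutes):
    """Largest number of retries that land in any single minute."""
    return max(Counter(retry_minutes).values())


def staggered_retries(n_clients, deferral=60, failure_window=60, seed=0):
    """Each client failed at a random minute and retries `deferral` minutes later."""
    rng = random.Random(seed)
    return [rng.randrange(failure_window) + deferral for _ in range(n_clients)]


def simultaneous_retries(n_clients, server_up_minute=3):
    """Every client retries in the same minute, right after the server returns."""
    return [server_up_minute] * n_clients


clients = 600
print("simultaneous peak:", peak_requests_per_minute(simultaneous_retries(clients)))
print("staggered peak:", peak_requests_per_minute(staggered_retries(clients)))
```

In the simultaneous case the peak equals the number of clients by construction. In the staggered case the retries are spread over the following hour, so the busiest minute sees only a fraction of them.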

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 52117 - Posted: 29 Dec 2011, 21:14:43 UTC

are you sure the reason the server breaks so much is due to folks like me who contact the server every 15 minutes? are you certain that it isn't at least in part [and probably mostly] due to the fact that the MW@H programmers at one point reduced the queue length to 12 tasks (to prevent excessive errors and invalids due to people downloading more work than they could complete before their tasks' deadlines), and in the process forced hosts to request new work from the MW@H server more often b/c their queues don't hold as much as they used to? while i'm sure their intentions were not to increase the number of attempted server contacts, it was a necessary sacrifice in order to reduce the number of errors and invalids that were occurring.

take into consideration that my HD 5870 GPU crunches through a full queue of 12 tasks every ~13 minutes. also take into consideration that any given host does not wait for the entire work queue to disappear before contacting the project server again for more work - that is, the BOINC client requests new work from the project server of its own volition quite a few times over the ~13 minutes it would otherwise take for my GPU to clear the whole cache. that kind of scheduler request is out of my control, and it occurs far more often than my automated scheduler requests that get issued every 15 minutes.

i understand that all this has been done in an effort to level the playing field and make things as fair as possible for all participants. but the fact is that not all participants are "created equal" with respect to host hardware and compute power. if we have a 3 minute outage during which my host (with the 5870 GPU) and someone else's host (with just a dual core CPU for example) contact the server and get backed off for 60 minutes, being idle for that 60 minutes hurts me (and more importantly the project itself) far more than it hurts the other host b/c my host is missing out on far more actual work...i'd call it "potential" work if the server were actually down for an appreciable amount of time, but if it's only down for a few minutes, then i'm missing out on real work that the server is issuing while my machine sits idle.

now imagine all the folks who come back from being away from their computers, only to find that their MW@H hosts just started a 60-minute server back-off a few minutes ago. if they know that their host still has almost an hour to go before it sends another scheduler request, and they've already determined that the outage was a "shorty" (and that the server is back up and running again), do you really think they're just going to let their hosts ride out the back-off period? or do you think they're going to manually hit the update button in the BOINC manager given the opportunity?

...just some food for thought...

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,761,780
RAC: 27,723
Message 52129 - Posted: 31 Dec 2011, 12:53:19 UTC - in response to Message 52117.  

I am munching on your thoughts but Dr. A is not; his mind is made up. He is the original programmer of Boinc and thru all of the Seti problems has tweaked it as best as he thinks it should be tweaked. He continues to tweak it to this day and it is still 'his baby', and he is still the main programmer. As you can tell I have been crunching for a VERY long time and have had multiple conversations with Dr. A; few went my way, as neither of us likes to admit someone else has a better idea. We often agreed to disagree in the end.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 52137 - Posted: 31 Dec 2011, 16:35:54 UTC

yeah, i guess it is what it is...

in the meantime i've done everything i can to get my scheduled tasks working again, short of reformatting and reinstalling Windows. and it'll probably be weeks before i even get around to doing that, so i've given up for the time being lol...

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 52144 - Posted: 31 Dec 2011, 21:03:19 UTC
Last modified: 31 Dec 2011, 21:05:47 UTC

well i've decided not to give up just quite yet...

...i just got done installing a fresh copy of WinXP SP3 32-bit on another DC rig. i installed BOINC and attached to some projects. i then created some scheduled tasks to update those projects, and right off the bat they're not working on this machine. i went a step further and scheduled a task that had similar characteristics to the BOINC-related scheduled tasks - specifically, i scheduled a task that would execute (open) WinXP's built-in calculator every 60 seconds, and that task works flawlessly.

what baffles me is that the BOINC-related batch files that update individual projects execute without a problem (as verified by double-clicking them and witnessing the results of that action in the BOINC event log)...it's just that the Windows scheduled tasks that are supposed to tell the batch files to run at specific times aren't doing it. i've verified that the scheduler works (by scheduling a completely BOINC-unrelated task) and i've verified that my batch files work. what gives? aside from the two methods being physically different, why would running the batch files as scheduled tasks be any different than running the batch files by manually double-clicking them?
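
The thread does not establish why the scheduled run behaves differently from a double-click. One hedged way to narrow it down, assuming the difference lies in the working directory, the user account or the PATH that the task runs with (an assumption, not something confirmed here), is to log those details from both kinds of run and compare them. The Python sketch below uses only the standard library; the function names and the log format are hypothetical.

```python
import os
import shutil
from datetime import datetime


def environment_snapshot():
    """Collect details that can differ between a double-click and a scheduled run."""
    return {
        "time": datetime.now().isoformat(timespec="seconds"),
        "cwd": os.getcwd(),
        "user": os.environ.get("USERNAME") or os.environ.get("USER", ""),
        "boinccmd_on_path": shutil.which("boinccmd") or "not found",
    }


def append_snapshot(log_path):
    """Append one snapshot line to log_path so two runs can be compared."""
    snap = environment_snapshot()
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(" | ".join(f"{key}={value}" for key, value in snap.items()) + "\n")
    return snap
```

Calling append_snapshot once by hand and once from a scheduled task, with the same log path, would put the two environments side by side.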

Bigred
Joined: 23 Nov 07
Posts: 33
Credit: 300,042,542
RAC: 0
Message 52149 - Posted: 1 Jan 2012, 1:37:16 UTC - in response to Message 52144.  

Why not simply set up a backup project that will kick in when MW goes into backoff mode? Simply attach to another project such as Moo wrapper and set its resource share to 0, and then it will kick in when MW has no work for whatever reason.

©2024 Astroinformatics Group