adequately supply or WUs: PLEASE: increase max_wus_in_progress or enable the Computing preference settings
Author | Message |
---|---|
Send message Joined: 19 Jul 10 Posts: 627 Credit: 19,361,776 RAC: 3,531 |
Are you sure about that? I have a feeling MW uses some kind of an over ride on cache settings........ Yes, on my computer with HD3850 I have ~2.4 days cache, which is way more than the 40 WUs I get from Milkyway. As soon as one WU finishes, BOINC requests new work. |
Send message Joined: 19 Jul 10 Posts: 627 Credit: 19,361,776 RAC: 3,531 |
I have pretty much set milkyway preferences the SAME as I have for all the other projects I run, and for the other projects, the finishing of a WU automatically sends a request for more WUs Do you run other projects at the same time? Then BOINC will fill up its cache according to your settings from the other projects and not request work from Milkyway until it thinks it needs more work. Try running Milkyway only, and then BOINC should request new work as soon as a WU finishes. BOINC is not good at keeping an equal amount of work in cache from many (read: more than one per device) projects if it sometimes gets no work from one of them. Another counterpoint to this being a BOINC manager issue: I set a request for additional work on "Computing preferences" (Maintain enough tasks to keep busy for at least ... ... and up to an additional Well, you can't get more than what is allowed by the project. by 'cache' do you mean the Preference settings: Yes. The "additional" setting should be low if you are using BOINC v7, something like 0.01. |
Send message Joined: 22 Jan 11 Posts: 375 Credit: 64,707,164 RAC: 10 |
Are you sure about that? I have a feeling MW uses some kind of an over ride on cache settings........ A? Unless I've just misread what you've said or you've typo'd, you've just contradicted yourself ;). I agree with your above post: my cache setting is 3 days and yet I also only have 40 WUs atm, about an hour's worth! So that goes back to what I said, MW overrides the BOINC cache setting, which is what the op said too. Team AnandTech - SETI@H, DPAD, F@H, MW@H, A@H, LHC, POGS, R@H, Einstein@H, DHEP, WCG Main rig - Ryzen 5 3600, MSI B450 G.Pro C. AC, RTX 3060Ti 8GB, 32GB DDR4 3200, Win 10 64bit 2nd rig - i7 4930k @4.1 GHz, HD 7870 XT 3GB(DS), 16GB DDR3 1866, Win7 |
Send message Joined: 19 Jul 10 Posts: 627 Credit: 19,361,776 RAC: 3,531 |
Are you sure about that? I have a feeling MW uses some kind of an over ride on cache settings........ Not if you read it together with my post before yours (I assume your post was an answer to my 61573). The main thing in that post was that BOINC should request new work as soon as one WU finishes. So yes, I'm sure that BOINC should request new work as soon as it finishes one WU. That MW has WU limits everybody should know by now. MW over rides the BOINC cache setting, which is what the op said too. That's part of the problem as I understand it. God Is Love etc. in message 61575 wrote: However, again, why do other projects send a request on a WU completion and MW does not? This is another issue, which has nothing to do with the WU limit. For some reason his client is not requesting new tasks as soon as he finishes one. My first guess is that the cache is full with tasks from other projects. |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
BOINC needs to request work first, then the MW server can respond and send WUs or say "max WU limit reached". If there is no request for new WUs, then it's a problem of the BOINC client. BOINC decides if it 'thinks' it needs more MW WUs. BOINC also backs off for some time from a project if there was a connection problem to the server. Add to that the short runtimes of MW GPU WUs, and a fair number of WUs can finish before the next request for new WUs takes place. Is BOINC set to send finished WUs right away? (see 'report_results_immediately') Do finished WUs stay in the queue too when no new request happens? What does the log say when results are sent back? With run times of less than 1 minute, the minimum wait time between 2 requests set by the server (60 seconds for MW servers) comes into play too. |
Send message Joined: 30 May 13 Posts: 18 Credit: 5,655,668 RAC: 5 |
"sigh" Well, you can't get more than what is allowed by the project. I am still holding out the hope that it is possible to convince MW to allow more than just 3 WU per processor... and when the WUs only run for ~an hour, this means a meager three hour limit. I am also asking MW to adjust the algorithm to initiate a request for a new WU whenever one completes. Other projects do this. I fail to see how this is a matter of 'filling a cache' because that would be a matter of the amount of work requested or 'in inventory', and as MW themselves admit this is set BY MW at 3 WU per processor.
some WU requests are tagged "requested by user" (although this does not necessarily mean an explicit "update" issued): [from event log Messages]
SZTAKI Desktop Grid Sending scheduler request: Requested by user
fightmalaria@home Sending scheduler request: Requested by user
boincsimap Sending scheduler request: Requested by user
EDGeS@Home Sending scheduler request: Requested by user
eon2 Sending scheduler request: Requested by user
some requests are tagged "requested by project":
World Community Grid Sending scheduler request: Requested by project
eon2 Sending scheduler request: Requested by project
other WU requests are generated:
malariacontrol.net Sending scheduler request: To fetch work
eon2 Sending scheduler request: To fetch work
and other scheduler requests are initiated to report completed tasks:
malariacontrol.net Sending scheduler request: To report completed tasks
EDGeS@Home Sending scheduler request: To report completed tasks
boincsimap Sending scheduler request: To report completed tasks
Einstein@Home Sending scheduler request: To report completed tasks
eon2 Sending scheduler request: To report completed tasks
World Community Grid Sending scheduler request: To report completed tasks
it seems obvious that the different kinds of scheduler requests issued are project-dependent.
LLP, PhD, Prof. Engr. I think, therefor I THINK I am. God is Love, Jesus proves it. God is Love ... all (well -- most, anyway) project stats |
Send message Joined: 30 May 13 Posts: 18 Credit: 5,655,668 RAC: 5 |
On two machines I am pretty much constrained to run BOINC 5.8. Is it possible to set report_results_immediately in that version? I searched in the directory files and found no relevant hit in any of the files. LLP, PhD, Prof. Engr. I think, therefor I THINK I am. God is Love, Jesus proves it. God is Love ... all (well -- most, anyway) project stats |
Send message Joined: 23 Sep 12 Posts: 159 Credit: 16,977,106 RAC: 0 |
So we have been calculating the limit in the background; it was changed in the past due to problems with the time results were returned. The general calculations in the background look like this: given 35,000 hosts, each having 8 work units, that's 280,000 workunits in the database (each with potentially 2-3 results). And that's only counting active work units. If we double it, it could put a big strain on the database. That's why we have such a small limit. Our algorithm is asynchronous, so we change the work units based on the returned results. So we need to balance what we have out versus what is being returned. I am trying to gauge the scope of the issue in the user base with the thread and what other resolutions may help individual users. I do not think, with the scope of the work we are doing, that we can change it much and still have the units converge quickly. Other projects release work units and wait for all the results. The type of data processing we use would take much longer in this format. We update the parameters we are searching based on the returned results. So our algorithm is very sensitive to the communication time with the users. We concede this will not work for everyone, and we try to make it work for as many users as possible. In this case I am hoping that the user community can find a solution for the original poster on settings that helps mitigate the issues they are having. I personally think that if the BOINC version is 5.8, updating it may help with some of the problems requesting new work units, but I am not familiar with the transition from 5.x clients to higher version clients. Jeff Thompson |
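The estimate in the post above is easy to reproduce. The following is only a back-of-the-envelope sketch using the figures quoted there (35,000 hosts, 8 work units each, 2-3 results per work unit); the function names are made up for illustration and say nothing about how the project's database is actually organised.

```python
# Back-of-the-envelope estimate of active work units and result rows.
# The numbers come from the post above; the helper names are hypothetical.

def active_workunits(hosts: int, wus_per_host: int) -> int:
    """Work units in progress if every host holds its full allowance."""
    return hosts * wus_per_host


def result_rows(workunits: int, results_per_wu: int) -> int:
    """Result records if each work unit carries this many results."""
    return workunits * results_per_wu


if __name__ == "__main__":
    hosts = 35_000
    for limit in (8, 16):  # current limit and a doubled limit
        wus = active_workunits(hosts, limit)
        low, high = result_rows(wus, 2), result_rows(wus, 3)
        print(f"limit {limit:2d}: {wus:,} work units, "
              f"{low:,} to {high:,} result rows")
```

Doubling the per-host limit doubles every figure, which is the strain on the database the post refers to.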
Send message Joined: 19 Jul 10 Posts: 627 Credit: 19,361,776 RAC: 3,531 |
I am still holding out the hope that it is possible to convince MW to allow more than just 3 WU per processor... and when the WUs only run for ~an hour, this means a meager three hour limit. Considering a runtime of just about 1 hour, I assume a quite fast CPU (your computers are still hidden)... i7 perhaps? There are much older/slower CPUs out there. And still you have a larger cache than the slowest ATI GPU capable of running Milkyway (HD3850). I have such a GPU and no problems keeping my cache "full" unless the servers are down, and that didn't happen too often recently; the last time I can recall was in February. So the issue of generally keeping enough work in your cache (i.e. BOINC not requesting tasks when a WU finishes) must be on your end and has not much to do with the WU limit. The WU limit just limits the amount of WUs in your cache; it does not stop BOINC from requesting new tasks up to this limit as you are reporting. I am also asking MW is to adjust the algorithm to initiate a request for a new WU whenever one completes. Other projects do this. The first thing to try would be setting all other projects to "no new tasks" and letting them finish and report all WUs as I have written above, i.e. run only Milkyway. If that does not help, update BOINC to a less ancient version. If you crunch with your CPU only, I can recommend from my own experience 6.10.18 (or 6.12.34 if you want to use BOINC's backup project feature); if you want to use some GPUs, even more recent versions might be necessary. All those versions should request new work as soon as one WU finishes (if no other projects are using the same device and the cache is set large enough), so your cache should stay full (within the project's limits). it seems obvious that the difference kinds of scheduler requests issued are project-dependent. Sure. BOINC offers the possibility to trigger some additional scheduler requests or limit the amount of them. Milkyway only limits the scheduler requests to one every 60 seconds, nothing more; the rest is up to your BOINC client. And one request per minute is completely enough even for the fastest GPUs. |
Send message Joined: 19 Jul 10 Posts: 627 Credit: 19,361,776 RAC: 3,531 |
On two machines I am pretty much constrained to run BOINC 5.8. This will report completed results immediately when they are ready; it will not change anything about requesting new tasks. I'd suggest that you enable work_fetch_debug (see client configuration) and post the output, so we can stop guessing what your BOINC client is doing and why it is not requesting work after finishing a WU. |
Send message Joined: 30 May 13 Posts: 18 Credit: 5,655,668 RAC: 5 |
thanks. will try to get with this. things getting busy just now, both at work and at home. ttfn (to quote Tigger) LLP, PhD, Prof. Engr. I think, therefor I THINK I am. God is Love, Jesus proves it. God is Love ... all (well -- most, anyway) project stats |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
The old days of BOINC 5.8 ... some memories came back :) BOINC delays results so it can send them back in bulk; the reason is to be nice to the servers. BOINC 5.8 is a problem since it had no option to force sending results immediately. You had to push the project's update to do it by hand. The only way out at that time was the job scheduler and BOINC command line options to force BOINC to send back results; it had to be done sensibly or it was 'hammering' the servers. If you can upgrade to any BOINC 6.1.+ (using 6.10.58 myself because I don't like the newer interface), you could use the cc_config setting report_results_immediately that I mentioned earlier. With that option, results are sent back 60 seconds after they are finished, and then BOINC should realize that it needs to ask for a refill of the work cache. |
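For anyone who wants to try the two settings named in this thread, here is a minimal sketch that writes a cc_config.xml containing report_results_immediately (under options) and work_fetch_debug (under log_flags). It assumes a client new enough to understand both; as noted above, 5.8 is not. The output file name and the helper function are illustrative only: the file normally lives in the BOINC data directory, and the client has to re-read its config files or be restarted before the change takes effect.

```python
# Sketch: generate a cc_config.xml with the two settings discussed in the thread.
# Assumes a BOINC client version that supports both options.
import xml.etree.ElementTree as ET


def build_cc_config(report_immediately: bool = True,
                    work_fetch_debug: bool = True) -> str:
    """Return the XML text of a small cc_config.xml."""
    root = ET.Element("cc_config")

    # Logging flags: work_fetch_debug makes the client explain its
    # work-fetch decisions in the event log.
    log_flags = ET.SubElement(root, "log_flags")
    ET.SubElement(log_flags, "work_fetch_debug").text = (
        "1" if work_fetch_debug else "0")

    # Behaviour options: report finished tasks right away instead of in bulk.
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "report_results_immediately").text = (
        "1" if report_immediately else "0")

    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml_text = build_cc_config()
    # Written to the current directory; copy it into the BOINC data directory.
    with open("cc_config.xml", "w", encoding="utf-8") as fh:
        fh.write(xml_text + "\n")
    print(xml_text)
```

Keep in mind what was said earlier in the thread: reporting results immediately does not by itself change when the client asks for new work, so the debug output is the more useful half for diagnosing this problem.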
Send message Joined: 30 May 13 Posts: 18 Credit: 5,655,668 RAC: 5 |
Actually, I am getting more and more convinced that this problem is not nearly so much a matter of the BOINC manager as it has to do with some MW parameters:
4/24/2014 9:49:55 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/24/2014 9:49:55 AM|Milkyway@Home|Requesting 136288 seconds of new work
4/24/2014 9:50:00 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/24/2014 9:50:00 AM|Milkyway@Home|Message from server: No tasks sent
4/24/2014 9:50:00 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/24/2014 9:50:00 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/24/2014 9:50:00 AM|Milkyway@Home|Reason: requested by project
4/24/2014 9:50:00 AM|Milkyway@Home|Deferring communication for 2 hr 5 min 43 sec
with WUs running ~an hour, there is somewhere between 2 and at most 3 hours of processing left. So, why would MW defer communication for over 2 hours??? LLP, PhD, Prof. Engr. I think, therefor I THINK I am. God is Love, Jesus proves it. God is Love ... all (well -- most, anyway) project stats |
Send message Joined: 30 May 13 Posts: 18 Credit: 5,655,668 RAC: 5 |
PS i have been doing almost all my posting here at MW. Is it just MW forum that is without any built in spell-checker? Thx. LLP, PhD, Prof. Engr. I think, therefor I THINK I am. God is Love, Jesus proves it. God is Love ... all (well -- most, anyway) project stats |
Send message Joined: 19 Jul 10 Posts: 627 Credit: 19,361,776 RAC: 3,531 |
with WUs running ~an hour, there is somewhere between 2 and at most 3 hours of processing left. Because of many consecutive unsuccessful requests. This is to prevent too many clients connecting to the servers if there is no work available. This timer should, however, be reset once a WU finishes, so simply let it run without clicking on any buttons. Is it just MW forum that is without any built in spell-checker? No, all BOINC forums are like that. |
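The back-off described in this post can be pictured with a small sketch. This is not the BOINC client's actual code: the 60 second floor, the 4 hour ceiling, the doubling rule and the class name are all assumptions chosen for illustration. It only shows the general idea of a randomised delay that grows with consecutive empty replies and is cleared when a task finishes.

```python
# Illustrative only: a randomised back-off that grows with consecutive
# "no work" replies and resets when a task completes.
# All constants and names here are assumptions, not BOINC internals.
import random


class ProjectBackoff:
    def __init__(self, min_s: float = 60.0, max_s: float = 4 * 3600.0,
                 rng: random.Random = None):
        self.min_s = min_s          # assumed shortest delay between requests
        self.max_s = max_s          # assumed longest delay
        self.failures = 0           # consecutive requests that got no work
        self.rng = rng or random.Random()

    def on_no_work(self) -> float:
        """Record an empty reply and return the next delay in seconds."""
        self.failures += 1
        ceiling = min(self.max_s, self.min_s * 2 ** self.failures)
        # A random delay between the floor and the growing ceiling spreads
        # clients out so they do not all retry at the same moment.
        return self.rng.uniform(self.min_s, ceiling)

    def on_task_finished(self) -> None:
        """A finished task clears the back-off, so work can be requested again."""
        self.failures = 0


if __name__ == "__main__":
    backoff = ProjectBackoff(rng=random.Random(1))
    for attempt in range(1, 9):
        delay = backoff.on_no_work()
        print(f"empty reply {attempt}: wait {delay / 60:.1f} min")
    backoff.on_task_finished()
    print("task finished, consecutive failures =", backoff.failures)
```

Because the delay is drawn at random, successive waits do not have to increase steadily, which would fit the uneven intervals people see in their event logs.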
Send message Joined: 22 Jan 11 Posts: 375 Credit: 64,707,164 RAC: 10 |
(The 1hr edit limit is far more annoying!!, is that a BOINC forum thing too?). Team AnandTech - SETI@H, DPAD, F@H, MW@H, A@H, LHC, POGS, R@H, Einstein@H, DHEP, WCG Main rig - Ryzen 5 3600, MSI B450 G.Pro C. AC, RTX 3060Ti 8GB, 32GB DDR4 3200, Win 10 64bit 2nd rig - i7 4930k @4.1 GHz, HD 7870 XT 3GB(DS), 16GB DDR3 1866, Win7 |
Send message Joined: 13 Mar 08 Posts: 804 Credit: 26,380,161 RAC: 0 |
(The 1hr edit limit is far more annoying!!, is that a BOINC forum thing too?). Yes |
Send message Joined: 30 May 13 Posts: 18 Credit: 5,655,668 RAC: 5 |
Thanks for your reply, Link, however: so simply let it run without clicking on any buttons It is not I who is doing any clicking; MW issues repeated requests on its own, if you look at one of my earlier posts: (Message 61495 ) Why is that MW will repeatedly request more WU minute by minute when the limit had just recently been filled (emphasis added here). LLP, PhD, Prof. Engr. I think, therefor I THINK I am. God is Love, Jesus proves it. God is Love ... all (well -- most, anyway) project stats |
Send message Joined: 19 Jul 10 Posts: 627 Credit: 19,361,776 RAC: 3,531 |
if you look at one of my earlier posts: (Message 61495 ) Your BOINC client sends those requests, and while the server sends a message for the user, AFAIK it has no possibility to tell the same to the BOINC client. But as you see, after a few such requests your BOINC client won't ask every minute but every few hours (see "Deferring communication for 2 hr 5 min 43 sec") or when the next WU finishes. If, after telling you it's deferring communication for 2+ hours, it's still sending a request every minute, then there's clearly something wrong at your end and you really should try to update that ancient client. |
Send message Joined: 30 May 13 Posts: 18 Credit: 5,655,668 RAC: 5 |
With only at most +/- 3 hr of work-in-progress allowed, why does MW do this? And, this is the routine of MW:
4/25/2014 5:00:00 AM|Milkyway@Home|Requesting 118489 seconds of new work, and reporting 6 completed tasks
4/25/2014 5:00:05 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:00:05 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:00:05 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:00:07 AM|Milkyway@Home|Starting de_separation_86_DR8_Rev_7_4_002_1398336302_358186_1
4/25/2014 5:00:07 AM|Milkyway@Home|Starting task de_separation_86_DR8_Rev_7_4_002_1398336302_358186_1 using milkyway version 100
4/25/2014 5:01:06 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:01:06 AM|Milkyway@Home|Requesting 89686 seconds of new work
4/25/2014 5:01:37 AM|Milkyway@Home|Scheduler request failed: HTTP internal server error
4/25/2014 5:01:37 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:01:37 AM|Milkyway@Home|Requesting 89686 seconds of new work
4/25/2014 5:01:42 AM|Milkyway@Home|Scheduler request failed: HTTP file not found
4/25/2014 5:01:42 AM|Milkyway@Home|Deferring communication for 1 min 0 sec
4/25/2014 5:01:42 AM|Milkyway@Home|Reason: scheduler request failed
4/25/2014 5:02:43 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:02:43 AM|Milkyway@Home|Requesting 89611 seconds of new work
4/25/2014 5:02:48 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:02:48 AM|Milkyway@Home|Message from server: No tasks sent
4/25/2014 5:02:48 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/25/2014 5:02:48 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:02:48 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:03:54 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:03:54 AM|Milkyway@Home|Requesting 89559 seconds of new work
4/25/2014 5:03:59 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:03:59 AM|Milkyway@Home|Message from server: No tasks sent
4/25/2014 5:03:59 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/25/2014 5:03:59 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:03:59 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:05:06 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:05:06 AM|Milkyway@Home|Requesting 89499 seconds of new work
4/25/2014 5:05:11 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:05:11 AM|Milkyway@Home|Message from server: No tasks sent
4/25/2014 5:05:11 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/25/2014 5:05:11 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:05:11 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:06:17 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:06:17 AM|Milkyway@Home|Requesting 89443 seconds of new work
4/25/2014 5:06:22 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:06:22 AM|Milkyway@Home|Message from server: No tasks sent
4/25/2014 5:06:22 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/25/2014 5:06:22 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:06:22 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:06:22 AM|Milkyway@Home|Deferring communication for 2 min 13 sec
4/25/2014 5:06:22 AM|Milkyway@Home|Reason: no work from project
4/25/2014 5:08:39 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:08:39 AM|Milkyway@Home|Requesting 90117 seconds of new work
4/25/2014 5:08:44 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:08:44 AM|Milkyway@Home|Message from server: No tasks sent
4/25/2014 5:08:44 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/25/2014 5:08:44 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:08:44 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:08:44 AM|Milkyway@Home|Deferring communication for 1 min 30 sec
4/25/2014 5:08:44 AM|Milkyway@Home|Reason: no work from project
4/25/2014 5:10:16 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:10:16 AM|Milkyway@Home|Requesting 90041 seconds of new work
4/25/2014 5:10:21 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:10:21 AM|Milkyway@Home|Message from server: No tasks sent
4/25/2014 5:10:21 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/25/2014 5:10:21 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:10:21 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:10:21 AM|Milkyway@Home|Deferring communication for 17 min 51 sec
4/25/2014 5:10:21 AM|Milkyway@Home|Reason: no work from project
4/25/2014 5:28:17 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:28:17 AM|Milkyway@Home|Requesting 91552 seconds of new work
4/25/2014 5:28:22 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:28:22 AM|Milkyway@Home|Message from server: No tasks sent
4/25/2014 5:28:22 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/25/2014 5:28:22 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:28:22 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:28:22 AM|Milkyway@Home|Deferring communication for 9 min 58 sec
4/25/2014 5:28:22 AM|Milkyway@Home|Reason: no work from project
4/25/2014 5:38:21 AM|Milkyway@Home|Sending scheduler request: To fetch work
4/25/2014 5:38:21 AM|Milkyway@Home|Requesting 92636 seconds of new work
4/25/2014 5:38:26 AM|Milkyway@Home|Scheduler RPC succeeded [server version 701]
4/25/2014 5:38:26 AM|Milkyway@Home|Message from server: No tasks sent
4/25/2014 5:38:26 AM|Milkyway@Home|Message from server: This computer has reached a limit on tasks in progress
4/25/2014 5:38:26 AM|Milkyway@Home|Deferring communication for 1 min 1 sec
4/25/2014 5:38:26 AM|Milkyway@Home|Reason: requested by project
4/25/2014 5:38:26 AM|Milkyway@Home|Deferring communication for 2 hr 5 min 46 sec
2 hr 5 min ??? Thank you for addressing this. LLP, PhD, Prof. Engr. I think, therefor I THINK I am. God is Love, Jesus proves it. God is Love ... all (well -- most, anyway) project stats |
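Event logs like the ones posted in this thread are easier to compare once the "Deferring communication" lines are pulled out and converted to seconds. The sketch below assumes the `timestamp|project|message` layout seen in those excerpts; the function names are invented for illustration, and other client versions may format their log differently.

```python
# Sketch: extract "Deferring communication for ..." entries from a pasted
# BOINC event log in the timestamp|project|message layout shown in this thread.
import re

UNIT_SECONDS = {"hr": 3600, "min": 60, "sec": 1}
DEFER_PREFIX = "Deferring communication for "


def parse_duration(text: str) -> int:
    """Convert text such as '2 hr 5 min 46 sec' into seconds."""
    return sum(int(value) * UNIT_SECONDS[unit]
               for value, unit in re.findall(r"(\d+)\s*(hr|min|sec)", text))


def deferrals(log_lines):
    """Return (timestamp, project, seconds) for every deferral line."""
    found = []
    for line in log_lines:
        parts = line.strip().split("|", 2)
        if len(parts) != 3:
            continue  # not a timestamp|project|message line
        timestamp, project, message = parts
        if message.startswith(DEFER_PREFIX):
            seconds = parse_duration(message[len(DEFER_PREFIX):])
            found.append((timestamp, project, seconds))
    return found


if __name__ == "__main__":
    sample = [
        "4/25/2014 5:38:26 AM|Milkyway@Home|Message from server: No tasks sent",
        "4/25/2014 5:38:26 AM|Milkyway@Home|Deferring communication for 1 min 1 sec",
        "4/25/2014 5:38:26 AM|Milkyway@Home|Reason: requested by project",
        "4/25/2014 5:38:26 AM|Milkyway@Home|Deferring communication for 2 hr 5 min 46 sec",
    ]
    for timestamp, project, seconds in deferrals(sample):
        print(f"{timestamp}  {project}  deferred {seconds} s")
```

Run over a whole log, this makes it straightforward to see whether the client really keeps asking every minute after a long deferral has been announced, which is the question being argued here.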
©2024 Astroinformatics Group