30 Workunit Limit Per Request

Author	Message
Mad_Max Joined: 2 Aug 11 Posts: 13 Credit: 44,453,057 RAC: 0	Message 68466 - Posted: 1 Apr 2019, 19:06:14 UTC Last modified: 1 Apr 2019, 19:18:46 UTC
P.S. Actually, <next_rpc_delay> is NOT a server-side delay; that one has another name (I forgot it). <next_rpc_delay> is also a client-side delay, but this time it is not a minimum delay (do not contact the server until this time has passed) but a maximum delay (DO contact the server after this time has passed, even if there is no need for it, e.g. nothing to report and no need to ask for new tasks).
Is it really needed? This option forces ALL attached clients to contact the server every 10 minutes, even if the client is not currently doing any work for the project. For example, here is a log snippet from one of my computers where MW is set as a backup project (low priority):
....................................
01/04/2019 15:54:30 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 15:54:30 | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: job cache full)
01/04/2019 15:54:32 | Milkyway@Home | Scheduler request completed
01/04/2019 16:04:37 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 16:04:37 | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: job cache full)
01/04/2019 16:04:40 | Milkyway@Home | Scheduler request completed
01/04/2019 16:14:44 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 16:14:44 | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: job cache full)
01/04/2019 16:14:46 | Milkyway@Home | Scheduler request completed
01/04/2019 16:24:52 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 16:24:52 | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: job cache full)
01/04/2019 16:24:54 | Milkyway@Home | Scheduler request completed
01/04/2019 16:34:56 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 16:34:56 | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: job cache full)
01/04/2019 16:34:58 | Milkyway@Home | Scheduler request completed
01/04/2019 16:45:03 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 16:45:03 | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: job cache full)
01/04/2019 16:45:05 | Milkyway@Home | Scheduler request completed
01/04/2019 16:55:06 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 16:55:06 | Milkyway@Home | Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: job cache full)
01/04/2019 16:55:09 | Milkyway@Home | Scheduler request completed
01/04/2019 17:05:14 | Milkyway@Home | Sending scheduler request: Requested by project.
...........................................
and so on every ~10 min.
It keeps hammering the server with useless requests. Usually this is useful only for specific purposes, like canceling WUs in progress from the server side. The server cannot contact a client directly, so instead it asks the client to "check in" every X minutes/hours for possible new instructions. But even for this case a few hours is usually enough. Every 10 minutes is overkill.
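For anyone who wants to check the interval in their own Event Log, here is a minimal Python sketch. It assumes log lines in the format shown in the post above (a day/month/year timestamp as the first `|`-separated field); the function name `request_intervals` is made up for illustration and is not part of BOINC.

```python
from datetime import datetime


def request_intervals(log_text):
    """Return the gaps, in seconds, between consecutive
    'Sending scheduler request' entries in a BOINC event log."""
    stamps = []
    for line in log_text.splitlines():
        if "Sending scheduler request" not in line:
            continue
        # The timestamp is the first '|'-separated field, e.g. '01/04/2019 15:54:30'.
        # Strip spaces and any stray backslash left over from copy/paste escaping.
        stamp = line.split("|", 1)[0].strip(" \\")
        stamps.append(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


sample = """\
01/04/2019 15:54:30 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 15:54:32 | Milkyway@Home | Scheduler request completed
01/04/2019 16:04:37 | Milkyway@Home | Sending scheduler request: Requested by project.
01/04/2019 16:14:44 | Milkyway@Home | Sending scheduler request: Requested by project.
"""

print(request_intervals(sample))  # [607, 607], i.e. a bit over 10 minutes each
```

On the three requests copied from the log, the gaps come out at 607 seconds each, which matches the "every ~10 min" observation.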

jpmboy Joined: 29 Apr 17 Posts: 33 Credit: 7,041,502,264 RAC: 0	Message 68467 - Posted: 1 Apr 2019, 23:55:51 UTC - in response to Message 68465. Last modified: 2 Apr 2019, 0:43:55 UTC
I'm having this issue, and my two Titan Vs sit idle for most of the day waiting for a task-batch download. Each task completes in less than 1 minute, and I run 12 tasks at a time (6 per GPU), so 200 tasks complete in ~17 minutes. We really need a fix for this that accounts for GPUs that are not DP-crippled. ;)
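The figures in the post above hang together as simple arithmetic. A minimal Python sketch, assuming exactly 1 minute per task (the post says "less than 1 min", so this is an upper bound) and ignoring download and scheduling overhead; the function name is hypothetical:

```python
def minutes_to_drain(queued_tasks, concurrent_tasks, minutes_per_task):
    """Idealized time to finish a queue when a fixed number of tasks
    run side by side and every task takes the same time."""
    return queued_tasks / concurrent_tasks * minutes_per_task


# 200 queued tasks, 12 running at once (6 per GPU on 2 GPUs), 1 minute each
print(round(minutes_to_drain(200, 12, 1.0), 1))  # 16.7
```

That is where the "~17 min" comes from: a 200-task batch is gone in roughly a quarter of an hour, so any longer gap before the next batch arrives leaves the GPUs idle.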

VietOZ Joined: 28 Mar 18 Posts: 14 Credit: 761,475,797 RAC: 0	Message 68468 - Posted: 2 Apr 2019, 4:24:39 UTC - in response to Message 68467.
2 GPUs should give you 400 tasks. Run 1 task per GPU and set up 6 instances. That still gives you 12 tasks at a time. Add another GPU to your coproc.xml and lock it. Set the cache to 10/10 and you'll get 600 tasks per instance. Make a tickler for every 5 minutes; that way, when the 600 tasks run out, the machine still gets work after a few minutes idle. I know it's not a long-term solution, but help Jake out. Give him time to pinpoint the problem. There are many ways to get around this. My VII may be losing about 80k points per day with this issue ... not really a big deal.
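The post does not say what the "tickler" looks like. One plausible reading is a small script that forces a project update on every client instance at a fixed interval. The following Python sketch is written under that assumption: it relies on `boinccmd` being on the PATH, the function names are made up, and the project URL and instance addresses are placeholders you would have to supply yourself.

```python
import subprocess
import time


def build_update_command(project_url, host=None):
    """Build a boinccmd call that asks a client to contact the project now."""
    cmd = ["boinccmd"]
    if host:
        # host may include a port ('localhost:31417') when several
        # client instances run side by side on one machine
        cmd += ["--host", host]
    cmd += ["--project", project_url, "update"]
    return cmd


def tickle(project_url, hosts, interval_s=300, rounds=1):
    """Send an update request to each instance, wait interval_s seconds, repeat."""
    for i in range(rounds):
        for host in hosts:
            # check=False: a failed request should not stop the loop
            subprocess.run(build_update_command(project_url, host), check=False)
        if i < rounds - 1:
            time.sleep(interval_s)
```

A call such as `tickle(url, ["localhost:31416", "localhost:31417"], interval_s=300, rounds=12)` would cover an hour at 5-minute intervals; the ports here are illustrative only, and instances protected by a GUI RPC password would also need `--passwd`.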

jpmboy Joined: 29 Apr 17 Posts: 33 Credit: 7,041,502,264 RAC: 0	Message 68469 - Posted: 2 Apr 2019, 12:08:46 UTC - in response to Message 68468.
Thank you for the reply. I'll try jerry-rigging something when I get a chance.

Jim1348 Joined: 9 Jul 17 Posts: 100 Credit: 16,967,906 RAC: 0	Message 68475 - Posted: 4 Apr 2019, 21:06:50 UTC
I don't know if this is the right thread or not, but since no one seems to know what the problem is, I will try it. I attached my RX 570 (Win7 64-bit) at 10 AM and got 74 work units. They take 1 minute 49 seconds to run. Then after a couple of hours I got another 72, and then a few hours thereafter 71 work units. Then, at 5 PM I ran out and got nothing. But a manual request caused another 71 to download. If they run out again, it is back to Folding. I can't stay up all night to get them.

Jim1348 Joined: 9 Jul 17 Posts: 100 Credit: 16,967,906 RAC: 0	Message 68478 - Posted: 5 Apr 2019, 16:05:07 UTC - in response to Message 68475.
"Then, at 5 PM I ran out and got nothing. But a manual request caused another 71 to download. If they run out again, it is back to Folding. I can't stay up all night to get them."
After the last work unit finishes and it gets zero on the next request, BOINC waits 10 minutes and tries again. Then, it gets a full load (85 on the last request). It is a strange problem, but not a major one for me.

Link Joined: 19 Jul 10 Posts: 750 Credit: 20,231,302 RAC: 8,305	Message 68479 - Posted: 5 Apr 2019, 19:19:24 UTC Last modified: 5 Apr 2019, 19:22:10 UTC
@ all having issues getting WUs here because of that 10-minute thing... Have you tried setting your cache to "Store at least 0.01 days of work" (plus whatever additional you like), and then setting "network activity based on preferences" rather than "network activity always"? This should actually limit scheduler requests to once every 14.4 minutes... I think. I don't have a Milkyway-compatible GPU to try that. Just a thought until it's fixed on the server...
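The 14.4-minute figure is simply 0.01 days expressed in minutes. A minimal Python check of that conversion; it says nothing about whether the client really spaces its scheduler requests this way, which the post itself leaves open ("I think"):

```python
def days_to_minutes(days):
    """Convert a cache setting given in days to minutes."""
    return days * 24 * 60


# "Store at least 0.01 days of work"
print(round(days_to_minutes(0.01), 1))  # 14.4
```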

Keith Myers Joined: 24 Jan 11 Posts: 734 Credit: 564,777,765 RAC: 12,467	Message 68480 - Posted: 5 Apr 2019, 20:07:16 UTC - in response to Message 68465.
"200 computed tasks in less 10 minutes? It is not possible even for fastest machines. Very best computers with few modern powerful GPUs working in parallel dedicated to the single project of MW can do 200 tasks 'only' in ~20-40 min."
Beg to differ. If I have a host with 8 RTX 2080 Ti cards or similar, I can easily crunch through 200 tasks in ten minutes. There are many hosts with mining-rig pedigrees that have multiple GPUs. I have a minimum of 3 cards in every host.

bluestang Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0	Message 68481 - Posted: 5 Apr 2019, 22:53:57 UTC - in response to Message 68480.
"200 computed tasks in less 10 minutes? It is not possible even for fastest machines. Very best computers with few modern powerful GPUs working in parallel dedicated to the single project of MW can do 200 tasks 'only' in ~20-40 min."
"Beg to differ. If I have a host with 8 RTX 2080 TI cards or similar, I can easily crunch through 200 tasks in ten minutes. There are many hosts with mining rig pedigrees that have multiple gpus. I have a minimum of 3 cards in every host."
Now that's an XtremeSystem! Just like our team likes :)

mmonnin Joined: 2 Oct 16 Posts: 167 Credit: 1,012,160,885 RAC: 0	Message 68483 - Posted: 6 Apr 2019, 1:59:19 UTC - in response to Message 68480. Last modified: 6 Apr 2019, 2:01:26 UTC
"200 computed tasks in less 10 minutes? It is not possible even for fastest machines. Very best computers with few modern powerful GPUs working in parallel dedicated to the single project of MW can do 200 tasks 'only' in ~20-40 min."
"Beg to differ. If I have a host with 8 RTX 2080 TI cards or similar, I can easily crunch through 200 tasks in ten minutes. There are many hosts with mining rig pedigrees that have multiple gpus. I have a minimum of 3 cards in every host."
Then your task count is higher with more cards. 200 is the limit for 1 GPU and the statement was in regards to 1 single GPU. Only a Titan V or Radeon VII is crunching 200 tasks per GPU in that time. It seems like it's hard enough to get the admins to realize the issue wasn't how many tasks can be downloaded at once but the timeout issue completely preventing tasks from downloading at all. Please stay on topic instead of e-peening about omg my gpus can do it in 10minutes.

bluestang Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0	Message 68484 - Posted: 6 Apr 2019, 3:06:26 UTC
8x 2080ti = 1x Radeon VII for MilkyWay lol (Sorry, couldn't resist)

San-Fernando-Valley Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0	Message 68485 - Posted: 6 Apr 2019, 7:41:54 UTC - in response to Message 68483.
"... Please stay on topic instead of e-peening about omg my gpus can do it in 10minutes."
THIS last sentence of yours is way off .... You are missing the point. Have a nice day! (just had to say it)

Keith Myers Joined: 24 Jan 11 Posts: 734 Credit: 564,777,765 RAC: 12,467	Message 68487 - Posted: 6 Apr 2019, 22:52:10 UTC - in response to Message 68483.
"and the statement was in regards to 1 single GPU"
That WAS NOT apparent from just your post that no computer could crunch through 200 tasks in 10 minutes.

Hurr1cane78 Joined: 7 May 14 Posts: 57 Credit: 208,535,911 RAC: 0	Message 68490 - Posted: 7 Apr 2019, 11:43:15 UTC
Keen observer of the skies and this forum, enlighten me with more stats from the RTX 2080 Ti owners, please. Seriously, with maxed-out instances on only a single RTX 2080 Ti, how many WUs can you peel, please?

Aurum Joined: 11 Jul 17 Posts: 20 Credit: 1,430,091,009 RAC: 12,247	Message 68491 - Posted: 8 Apr 2019, 20:42:41 UTC Last modified: 8 Apr 2019, 20:43:11 UTC
I never get enough MW WUs. I can run 200 at a time. Some computers don't get any while some others get a steady supply. They all have the same app_config.xml. Server Status shows 10,000 WUs ready but I have computers that haven't gotten any in forever. Is there some kind of governor on this project??? I can't find anything to explain the difference.

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 68492 - Posted: 9 Apr 2019, 15:25:23 UTC
Hey Everyone,
Sorry I didn't have a chance to reply last week. I was at a conference sharing some of the great new results we have. I am back now, so I am going to start reading through this thread to see what I can do to fix this issue on my end. I saw that some people were suggesting changing the rpc_delay, which I think we have set to 90 seconds. I'll post again soon when I put a plan together.
Jake

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 68493 - Posted: 9 Apr 2019, 16:01:48 UTC
Hey Everyone,
I see that the rpc_delay was not set to 90 seconds as I thought. I have updated this accordingly. Hopefully this will solve our issues.
Best,
Jake

Manfred Reiff Joined: 27 Apr 18 Posts: 11 Credit: 72,923,580 RAC: 0	Message 68494 - Posted: 9 Apr 2019, 16:44:16 UTC - in response to Message 68428.
For me, increasing the feeder's shared memory size did not work. I still receive only 20 (!) workunits at any given time. I'm using an Intel Core i9-7900X with 20 logical processors. Increasing the amount of downloaded work (from 2 + 2 days to, for instance, 2 + 4 or 2 + 5 days) does not help. The maximum number of workunits never exceeds 20. I can't see a connection with the type of CPU used, because I also use a computer with an Intel Core i7-8700K (with 12 logical processors). That computer is getting many more workunits at any given time. At present there are 41 WUs waiting, and I'll get new workunits for finished and uploaded WUs. That's a bit irritating...
Greetings from the summery, warm German midlands
Manfred

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 68495 - Posted: 9 Apr 2019, 17:47:35 UTC
Hey Manfred,
Most of these solutions were implemented with GPUs in mind and are limited to GPUs. I'll take a look at the number of CPU workunits currently allowed and maybe try doubling it.
Best,
Jake

Vortac Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0	Message 68496 - Posted: 10 Apr 2019, 10:20:54 UTC Last modified: 10 Apr 2019, 10:21:35 UTC
Looks like an rpc_delay of 90 secs is a bit hard on the server? Since yesterday, my Event Log has been showing a lot of these:
10/04/2019 12:09:37 | Project communication failed: attempting access to reference site
10/04/2019 12:09:37 | Milkyway@Home | Scheduler request failed: Failure when receiving data from the peer
10/04/2019 12:09:38 | Internet access OK - project servers may be temporarily down.
and these:
10/04/2019 12:17:09 | Milkyway@Home | Scheduler request failed: Couldn't connect to server
10/04/2019 12:17:10 | Project communication failed: attempting access to reference site
10/04/2019 12:17:12 | Internet access OK - project servers may be temporarily down

30 Workunit Limit Per Request - Fix Implemented