Message boards : News : 30 Workunit Limit Per Request - Fix Implemented
Author | Message |
---|---|
Send message Joined: 13 Dec 17 Posts: 46 Credit: 2,421,362,376 RAC: 0 |
Looks like rpc_delay of 90 secs is a bit hard on the server? Since yesterday, my Event Log is showing a lot of these:
Same with me, it seems quite hard on the server indeed. |
Send message Joined: 27 Apr 18 Posts: 11 Credit: 72,923,580 RAC: 0 |
Hi Jake, I don't know which settings you changed on your server(s), but yesterday evening local time (CEST = UTC+2h) I received many, many more new workunits for both of my computers that are currently crunching Milkyway@Home. Before those "changes" I only received 20 WUs for my newest computer (Intel Core i9-7900X with 20 processors) and approx. 60 for my upgraded computer (now: Intel Core i7-8700K with 12 processors). Now I received approx. 600 WUs for each computer, that's 1,200 in total! Just, WOW! A lot of WUs to be calculated until April 22... Whatever you changed - many thanks!
Notice: At present I'm only working CPU WUs. My GeForce 1080 Ti (Intel Core i9 computer) is currently working on Collatz WUs with some Einstein@Home interruptions. My second computer (Intel Core i7) is equipped with a GeForce 1070 Ti. That GPU is solely used for Collatz WUs.
Greetings from Remscheid
Manfred |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
I'll bump the RPC delay up to 3 minutes. If this is too long, let me know. Best, Jake |
Send message Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0 |
Much better at 3 minutes, since it won't be hammering the server with requests. As long as you let us have a nice big cache full of WUs, we are all good :) Although I'm not sure the WU download issue is fixed for GPUs like @Manfred said it is for CPUs. I still can't keep a full cache topped up; only after the tasks all run out does it download a full batch again??? And I'm getting this a lot now:
4/10/2019 11:41:58 AM | Milkyway@Home | Sending scheduler request: Requested by project.
4/10/2019 11:41:58 AM | Milkyway@Home | Reporting 9 completed tasks
4/10/2019 11:41:58 AM | Milkyway@Home | Requesting new tasks for AMD/ATI GPU
4/10/2019 11:42:10 AM | | Project communication failed: attempting access to reference site
4/10/2019 11:42:10 AM | Milkyway@Home | Scheduler request failed: Failure when receiving data from the peer
4/10/2019 11:42:11 AM | | Internet access OK - project servers may be temporarily down.
Doesn't look like the stats sites are making contact either, so I'm not sure what's up. |
Send message Joined: 30 Mar 09 Posts: 63 Credit: 621,582,726 RAC: 0 |
Hi Jake, I get this;
10-4-2019 18:08:24 | Milkyway@Home | Sending scheduler request: Requested by project.
10-4-2019 18:08:24 | Milkyway@Home | Reporting 2 completed tasks
10-4-2019 18:08:24 | Milkyway@Home | Requesting new tasks for AMD/ATI GPU
10-4-2019 18:08:24 | Milkyway@Home | [http] HTTP_OP::init_post(): http://milkyway.cs.rpi.edu/milkyway_cgi/cgi
10-4-2019 18:08:24 | Milkyway@Home | [http] HTTP_OP::libcurl_exec(): ca-bundle set
10-4-2019 18:08:25 | Milkyway@Home | [http] [ID#1] Info: Connection 14796 seems to be dead!
10-4-2019 18:08:25 | Milkyway@Home | [http] [ID#1] Info: Closing connection 14796
10-4-2019 18:08:25 | Milkyway@Home | [http] [ID#1] Info: timeout on name lookup is not supported
10-4-2019 18:08:25 | Milkyway@Home | [http] [ID#1] Info: Hostname was NOT found in DNS cache
10-4-2019 18:08:25 | Milkyway@Home | [http] [ID#1] Info: Trying 128.113.126.23...
10-4-2019 18:08:46 | Milkyway@Home | [http] [ID#1] Info: connect to 128.113.126.23 port 80 failed: Timed out
10-4-2019 18:08:46 | Milkyway@Home | [http] [ID#1] Info: Failed to connect to milkyway.cs.rpi.edu port 80: Timed out
10-4-2019 18:08:46 | Milkyway@Home | [http] [ID#1] Info: Closing connection 14797
10-4-2019 18:08:46 | Milkyway@Home | [http] HTTP error: Couldn't connect to server
10-4-2019 18:08:47 | Milkyway@Home | Message from task: 0
10-4-2019 18:08:47 | | Project communication failed: attempting access to reference site
10-4-2019 18:08:47 | | [http] HTTP_OP::init_get(): http://www.google.com/
10-4-2019 18:08:47 | | [http] HTTP_OP::libcurl_exec(): ca-bundle set
10-4-2019 18:08:47 | Milkyway@Home | Computation for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554838388_259291_0 finished
10-4-2019 18:08:47 | Milkyway@Home | Starting task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554838388_259043_0
10-4-2019 18:08:47 | Milkyway@Home | Scheduler request failed: Couldn't connect to server
10-4-2019 18:08:47 | | [http] [ID#0] Info: Found bundle for host www.google.com: 0x420e9d0
10-4-2019 18:08:47 | | [http] [ID#0] Info: Re-using existing connection! (#14792) with host www.google.com
10-4-2019 18:08:47 | | [http] [ID#0] Info: Connected to www.google.com (172.217.168.196) port 80 (#14792)
10-4-2019 18:08:47 | | [http] [ID#0] Sent header to server: GET / HTTP/1.1
10-4-2019 18:08:47 | | [http] [ID#0] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.6.9)
10-4-2019 18:08:47 | | [http] [ID#0] Sent header to server: Host: www.google.com
10-4-2019 18:08:47 | | [http] [ID#0] Sent header to server: Accept: */*
10-4-2019 18:08:47 | | [http] [ID#0] Sent header to server: Accept-Encoding: deflate, gzip
10-4-2019 18:08:47 | | [http] [ID#0] Sent header to server: Content-Type: application/x-www-form-urlencoded
10-4-2019 18:08:47 | | [http] [ID#0] Sent header to server: Accept-Language: nl_NL
10-4-2019 18:08:47 | | [http] [ID#0] Sent header to server:
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: HTTP/1.1 200 OK
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Date: Wed, 10 Apr 2019 16:08:44 GMT
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Expires: -1
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Cache-Control: private, max-age=0
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Content-Type: text/html; charset=ISO-8859-1
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Content-Encoding: gzip
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Server: gws
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Content-Length: 5391
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: X-XSS-Protection: 0
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: X-Frame-Options: SAMEORIGIN
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Set-Cookie: 1P_JAR=2019-04-10-16; expires=Fri, 10-May-2019 16:08:44 GMT; path=/; domain=.google.com
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server: Set-Cookie: NID=181=bwE86MbCUawsBGBqs_fbVtR-dbTSa4bch72lxHSb7KkM9bxznXqB3nk65g-EKFS_M1mB9LlYUDrSGlqc_-pq6UbxzZAeQxp9LgGNmFXK2NFqR3FjtVseMt_SDO5_oV-GaCnjzLZSwHAWyNtXGIKj_Is-PwPRUgugsACmhglVlDA; expires=Thu, 10-Oct-2019 16:08:44 GMT; path=/; domain=.google.com; HttpOnly
10-4-2019 18:08:47 | | [http] [ID#0] Received header from server:
10-4-2019 18:08:47 | | [http] [ID#0] Info: Connection #14792 to host www.google.com left intact
10-4-2019 18:08:48 | | Internet access OK - project servers may be temporarily down.
I will set some more debug logs to help you understand the problem |
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
According to the BOINC Wiki, next_rpc_delay means: "Make another RPC ASAP after this amount of time elapses". So, with a next_rpc_delay of 180 secs, we are forcing ALL clients to contact the server every 3 mins, even if they have nothing to report. That looks like a huge burden on the server. I have checked the settings of my other BOINC projects (in ProgramData\BOINC) and apparently none of them use the next_rpc_delay setting at all; they use only request_delay. Perhaps we would be better off with just a request_delay setting of 180 secs and next_rpc_delay left unspecified? |
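For readers who want to check this on their own machine: both values show up in the per-project scheduler reply files the BOINC client keeps in its data directory (the sched_reply_*.xml files mentioned in this thread). A rough sketch of the relevant elements, with purely illustrative values, might look like this:

```xml
<scheduler_reply>
    <!-- minimum delay (seconds) before the client may contact the scheduler again -->
    <request_delay>91.000000</request_delay>
    <!-- if present, the client is told to contact the server again as soon as
         this many seconds have elapsed, even if it has nothing to report -->
    <next_rpc_delay>180.000000</next_rpc_delay>
    ...
</scheduler_reply>
```

Both values are chosen by the project server; the client simply obeys whatever the reply contains.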
Send message Joined: 30 Mar 09 Posts: 63 Credit: 621,582,726 RAC: 0 |
Another log;
10-4-2019 18:30:58 | | [work_fetch] ------- start work fetch state -------
10-4-2019 18:30:58 | | [work_fetch] target work buffer: 259200.00 + 0.00 sec
10-4-2019 18:30:58 | | [work_fetch] --- project states ---
10-4-2019 18:30:58 | Milkyway@Home | [work_fetch] REC 119466.096 prio -38.414 can't request work: scheduler RPC backoff (8735.63 sec)
10-4-2019 18:30:58 | | [work_fetch] --- state for CPU ---
10-4-2019 18:30:58 | | [work_fetch] shortfall 1386459.51 nidle 0.00 saturated 28117.82 busy 0.00
10-4-2019 18:30:58 | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
10-4-2019 18:30:58 | | [work_fetch] --- state for AMD/ATI GPU ---
10-4-2019 18:30:58 | | [work_fetch] shortfall 113602.96 nidle 0.00 saturated 144319.80 busy 0.00
10-4-2019 18:30:58 | Milkyway@Home | [work_fetch] share 0.000
10-4-2019 18:30:58 | | [work_fetch] ------- end work fetch state -------
10-4-2019 18:30:58 | | [work_fetch] No project chosen for work fetch
10-4-2019 18:31:01 | | [work_fetch] Request work fetch: Backoff ended for Cosmology@Home
10-4-2019 18:31:04 | | [work_fetch] ------- start work fetch state -------
10-4-2019 18:31:04 | | [work_fetch] target work buffer: 259200.00 + 0.00 sec
10-4-2019 18:31:04 | | [work_fetch] --- project states ---
10-4-2019 18:31:04 | Milkyway@Home | [work_fetch] REC 119466.096 prio -54.314 can't request work: scheduler RPC backoff (8730.30 sec)
10-4-2019 18:31:04 | | [work_fetch] --- state for CPU ---
10-4-2019 18:31:04 | | [work_fetch] shortfall 1386444.14 nidle 0.00 saturated 28125.51 busy 0.00
10-4-2019 18:31:04 | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
10-4-2019 18:31:04 | | [work_fetch] --- state for AMD/ATI GPU ---
10-4-2019 18:31:04 | | [work_fetch] shortfall 113650.18 nidle 0.00 saturated 144237.46 busy 0.00
10-4-2019 18:31:04 | Milkyway@Home | [work_fetch] share 0.000
10-4-2019 18:31:04 | | [work_fetch] ------- end work fetch state -------
10-4-2019 18:31:05 | | [http_xfer] [ID#1] HTTP: wrote 2946 bytes
10-4-2019 18:31:05 | | [work_fetch] Request work fetch: RPC complete
10-4-2019 18:31:10 | | [work_fetch] ------- start work fetch state -------
10-4-2019 18:31:10 | | [work_fetch] target work buffer: 259200.00 + 0.00 sec
10-4-2019 18:31:10 | | [work_fetch] --- project states ---
10-4-2019 18:31:10 | Milkyway@Home | [work_fetch] REC 119466.096 prio -38.405 can't request work: scheduler RPC backoff (8723.90 sec)
10-4-2019 18:31:10 | | [work_fetch] --- state for CPU ---
10-4-2019 18:31:10 | | [work_fetch] shortfall 1386446.16 nidle 0.00 saturated 28124.50 busy 0.00
10-4-2019 18:31:10 | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
10-4-2019 18:31:10 | | [work_fetch] --- state for AMD/ATI GPU ---
10-4-2019 18:31:10 | | [work_fetch] shortfall 113699.39 nidle 0.00 saturated 144155.11 busy 0.00
10-4-2019 18:31:10 | Milkyway@Home | [work_fetch] share 0.000
10-4-2019 18:31:10 | | [work_fetch] ------- end work fetch state -------
10-4-2019 18:31:10 | | [work_fetch] No project chosen for work fetch
10-4-2019 18:31:12 | | [work_fetch] Request work fetch: Backoff ended for Cosmology@Home
10-4-2019 18:31:15 | | [work_fetch] ------- start work fetch state -------
10-4-2019 18:31:15 | | [work_fetch] target work buffer: 259200.00 + 0.00 sec
10-4-2019 18:31:15 | | [work_fetch] --- project states ---
10-4-2019 18:31:15 | Milkyway@Home | [work_fetch] REC 119466.096 prio -54.303 can't request work: scheduler RPC backoff (8718.85 sec)
10-4-2019 18:31:15 | | [work_fetch] --- state for CPU ---
10-4-2019 18:31:15 | | [work_fetch] shortfall 1386446.17 nidle 0.00 saturated 28124.49 busy 0.00
10-4-2019 18:31:15 | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
10-4-2019 18:31:15 | | [work_fetch] --- state for AMD/ATI GPU ---
10-4-2019 18:31:15 | | [work_fetch] shortfall 113741.79 nidle 0.00 saturated 144090.83 busy 0.00
10-4-2019 18:31:15 | Milkyway@Home | [work_fetch] share 0.000
10-4-2019 18:31:15 | | [work_fetch] ------- end work fetch state -------
10-4-2019 18:31:15 | | [work_fetch] No project chosen for work fetch
10-4-2019 18:32:00 | Milkyway@Home | Message from task: 0
10-4-2019 18:32:00 | | [work_fetch] Request work fetch: application exited
10-4-2019 18:32:00 | Milkyway@Home | Computation for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554838388_240454_1 finished
10-4-2019 18:32:00 | Milkyway@Home | Starting task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554838388_188776_1
10-4-2019 18:32:01 | | [work_fetch] ------- start work fetch state -------
10-4-2019 18:32:01 | | [work_fetch] target work buffer: 259200.00 + 0.00 sec
10-4-2019 18:32:01 | | [work_fetch] --- project states ---
10-4-2019 18:32:01 | Milkyway@Home | [work_fetch] REC 119467.655 prio -54.178 can't request work: scheduler RPC backoff (8673.18 sec)
10-4-2019 18:32:01 | | [work_fetch] --- state for CPU ---
10-4-2019 18:32:01 | | [work_fetch] shortfall 1386531.14 nidle 0.00 saturated 28082.01 busy 0.00
10-4-2019 18:32:01 | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
10-4-2019 18:32:01 | | [work_fetch] --- state for AMD/ATI GPU ---
10-4-2019 18:32:01 | | [work_fetch] shortfall 114107.80 nidle 0.00 saturated 143502.34 busy 0.00
10-4-2019 18:32:01 | Milkyway@Home | [work_fetch] share 0.000
10-4-2019 18:32:01 | | [work_fetch] ------- end work fetch state -------
10-4-2019 18:32:01 | | [work_fetch] No project chosen for work fetch |
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
Alright, I can see in my sched_reply_milkyway.cs.rpi.edu_milkyway.xml that next_rpc_delay has now been removed. The server is working much more smoothly now. But the old problem remains: it's possible to get 200 tasks now, but the client works through them without getting any new ones. It reports XX completed tasks about every 90 secs and requests new work every time, but it doesn't get any. I have no idea what is causing this problem, but it's probably unrelated to next_rpc_delay. Jake, perhaps you could try increasing request_delay from 91 secs to 180 or so, maybe it would help? It's just a wild guess, I can't think of anything else; perhaps someone more knowledgeable will be of more help. |
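On the server side, these knobs normally live in the project's config.xml. The option names below are the standard BOINC scheduler settings; whether MilkyWay actually uses them, and the 180-second value, are assumptions for illustration only:

```xml
<config>
    <!-- standard BOINC option: minimum delay between scheduler contacts from a host;
         commonly reflected in the request_delay sent back to clients (value illustrative) -->
    <min_sendwork_interval>180</min_sendwork_interval>
    <!-- leaving next_rpc_delay out (or commented) means clients are not
         forced to poll the scheduler on a fixed interval -->
    <!-- <next_rpc_delay>180</next_rpc_delay> -->
</config>
```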
Send message Joined: 27 Apr 18 Posts: 11 Credit: 72,923,580 RAC: 0 |
Yep, Vortac! I agree with you. My computers are "running" through all the downloaded workunits, and uploading finished WUs is going very smoothly after some upload problems during our afternoon hours. Please, Jake and all the others working for the project, preserve this excellent state of affairs for eternity... (if possible)
In the past I changed the settings for downloading new M@H WUs from 2 + 2 days (= 4 days) to 2 + 4 or 2 + 5 days or more, over and over again. But I did not receive more than 20 WUs for my i9-7900X computer for months. The i7-8700K computer always received 50+ new workunits. During server downtimes my i9 often ran out of workunits. Hopefully that's not important anymore?! I hope I can finish most workunits (or all of them) within the time limits (April 21 and 22).
Greetings from Remscheid
Manfred "Yoda" |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey Vortac, I'm equally stumped here. My only thought is that sometimes we get unlucky and you request work right when another group of people has also requested, before the feeder can refill the queue. I'm going to do a little thinking before I try to implement any solutions to this issue for now. Best, Jake |
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
I'm equally stumped here. My only thought is that sometimes we get unlucky and you request work right when another group of people has also requested, before the feeder can refill the queue. I'm going to do a little thinking before I try to implement any solutions to this issue for now.
Nah, I don't think it's down to luck. After the queue is completely emptied (all tasks completed AND reported), it always gets refilled with 200 new tasks the first time the client contacts the server. But there's always a few minutes of idle time, from the moment the queue is completely emptied to the moment the client contacts the server and gets a new set of 200 tasks. A few minutes are not a big deal obviously, but it would be interesting to track down the exact issue; I haven't seen anything similar before. |
Send message Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0 |
Until you do find a solution, can you increase the max allowed WUs in progress at once per GPU from the current 200 (600 max) to maybe double, say 1200 max? That would at least decrease the amount of idle time per day. |
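If that per-GPU cap is enforced with the stock BOINC scheduler options (an assumption; the project could be using custom code), doubling it would be a small change in the project's config.xml, roughly like this, with illustrative values:

```xml
<config>
    <!-- standard BOINC option: maximum jobs in progress per GPU on a host.
         200 would match the "200 per GPU (600 max)" seen with three GPUs;
         400 would double it. -->
    <max_wus_in_progress_gpu>400</max_wus_in_progress_gpu>
</config>
```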
Send message Joined: 8 Jan 18 Posts: 44 Credit: 43,781,437 RAC: 4,605 |
I'm equally stumped here. My only thought is that sometimes we get unlucky and you request work right when another group of people has also requested, before the feeder can refill the queue. I'm going to do a little thinking before I try to implement any solutions to this issue for now.
I've only been casually lurking in this thread, but now, after re-reading through everything, I have a couple of questions. Is everyone that is experiencing this problem running out of tasks for their CPU, GPU, or both? It may be helpful to know if you are restricting the number of CPUs or GPUs that work on tasks. If you are experiencing this problem with just GPUs, are they AMD GPUs or Nvidia GPUs? The reason I ask is that I am having a similar problem with SETI@Home. I can maintain a healthy queue of CPU tasks, but my AMD GPU (Vega 8, part of the Ryzen 3 2200G) will only download one task at a time, if I'm lucky. I have not really played with any settings to attempt to trick it into downloading more tasks. I did pose this question on the BOINC forum here, and the bug is noted and is being worked on (eventually). I don't know if the problem here at MW is the same one that I am experiencing at SETI. My Ryzen rig only crunches MW as a backup and I have not tested to see if the same problem occurs if I run MW exclusively. However, I thought that I would point out the problem that I have been experiencing in case it is the same one that you are experiencing. |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey Everyone, I bumped up the number of workunits allowed at any given time a little bit. Let me know if that helps. Jake |
Send message Joined: 27 Apr 18 Posts: 11 Credit: 72,923,580 RAC: 0 |
Hi Bill, to start with: I only calculate CPU-based Milkyway@Home tasks. My GPUs are doing Collatz workunits with some Einstein@Home workunits (approx. 10-20 per day) throughout the day.
Until Tuesday I had a lot of problems with Milkyway@Home. I received only 20 workunits at any given time for my computer equipped with an Intel Core i9-7900X with 20 processors. For my other computer (equipped with an Intel Core i7-8700K with 12 processors) I received dozens of workunits. At first I tried to change the amount of downloaded workunits from 2 + 2 days to 2 + 4, 2 + 5 and more, but... no luck! Tuesday evening (local time = UTC+2 hours) I suddenly received hundreds of M@H workunits for both of my computers (in total: 1,200 workunits). Thanks to the fast CPUs I had worked that number down to 600 by this morning. Unfortunately I didn't disable downloading new tasks on my i9 machine before I went to work at 6:30am, so I got 280+ new tasks today, for a current total of 900 for both machines. Now I have disabled downloads so I can finish this huge amount of tasks first, by 23 April. But thanks to Jake, the Milkyway@Home servers seem to work much, much better and much more stably than before!!!
Earlier I also worked on both M@H CPU and GPU tasks, but at the moment I'm concentrating on Collatz Conjecture GPU tasks, so I only crunch M@H CPU tasks. On the i9 CPU I can work on 20 tasks in parallel. That's enough...
Concerning your problem with SETI@Home: I had the same trouble with S@H and Einstein@Home. It took a lot of patience and weeks of experiments to find a solution that works FOR ME. Maybe my advice will also help you. My standard setting is "receiving tasks for 2 days plus another 2 days", in total for 4 days. I increased the maximum number of downloaded tasks several times to 2 + 4 days, but it didn't work. Manual updating wasn't successful either. So I let the manager do the downloading automatically. To do this I stopped and closed the manager and RESTARTED (!) Windows 10 (although this probably isn't necessary). Then I restarted Windows and then the BOINC manager, and voilà, new tasks were downloaded. On another occasion this did not work, so I waited until the next morning; when I restarted BOINC, it worked. I also noticed that you have to activate ALL tasks of the respective project in your manager before the manager will download new tasks. I don't know why, but if you have suspended some or all of a project's tasks, downloading new tasks will fail, in my experience.
Under normal conditions a high number of tasks for a single project is not necessary. But in my experience the S@H, E@H and M@H servers are down frequently, for maintenance or other reasons, for several hours about once every week (sometimes only a few hours, sometimes even longer). From late autumn 2018 until, let's say, February or the beginning of March this year, Milkyway@Home had a lot of trouble with its server stability. Servers were down for many days, so you ran out of new tasks if you used the standard settings (that's why I increased the maximum number of tasks to be downloaded) and your computer was just consuming electric power without doing anything useful.
I haven't gone deeper into the other options the projects offer. I only created special "app_config.xml" files to tweak project behaviour, with the help of more experienced users. I hope this, and the changes Jake has made since you posted, will help. Best wishes and good luck! Manfred
BTW... my graphics cards are an nVidia GeForce GTX 1080 Ti and a GTX 1070 Ti. |
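For anyone curious about the app_config.xml files Manfred mentions: they go in the project's folder inside the BOINC data directory and use the standard BOINC client format. A minimal sketch for MilkyWay might look like the following; the app name and the usage values are assumptions and should be checked against your own client_state.xml:

```xml
<app_config>
    <app>
        <!-- short application name as it appears in client_state.xml (assumed here) -->
        <name>milkyway</name>
        <gpu_versions>
            <!-- run two tasks per GPU, each reserving a small slice of a CPU core -->
            <gpu_usage>0.5</gpu_usage>
            <cpu_usage>0.05</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
```

After editing, the file can be re-read with Options → Read config files in the BOINC Manager, or by restarting the client.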
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
Bill, I looked at your logs and I think this problem is completely different. For some reason, your client requests 0 secs of GPU work for SETI@home - and receives the same. But in this case (Milkyway@home), the client requests lots of GPU work, yet none is assigned by the server:
11/04/2019 18:18:06 | Milkyway@Home | [work_fetch] set_request() for NVIDIA GPU: ninst 1 nused_total 32.87 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 127884.51
11/04/2019 18:18:06 | Milkyway@Home | [sched_op] Starting scheduler request
11/04/2019 18:18:06 | Milkyway@Home | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (127884.51 sec, 0.00 inst)
11/04/2019 18:18:06 | Milkyway@Home | Sending scheduler request: To fetch work.
11/04/2019 18:18:06 | Milkyway@Home | Reporting 6 completed tasks
11/04/2019 18:18:06 | Milkyway@Home | Requesting new tasks for NVIDIA GPU
11/04/2019 18:18:06 | Milkyway@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
11/04/2019 18:18:06 | Milkyway@Home | [sched_op] NVIDIA GPU work request: 127884.51 seconds; 0.00 devices
11/04/2019 18:18:09 | Milkyway@Home | Scheduler request completed: got 0 new tasks
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] Server version 713
11/04/2019 18:18:09 | Milkyway@Home | Project requested delay of 91 seconds
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554915636_381123_0
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554915636_381183_0
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554915636_381167_0
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554915636_381153_0
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554915636_381131_0
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1554915636_381217_0
11/04/2019 18:18:09 | Milkyway@Home | [work_fetch] backing off NVIDIA GPU 699 sec
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
11/04/2019 18:18:09 | Milkyway@Home | [sched_op] Reason: requested by project
11/04/2019 18:18:09 | | [work_fetch] Request work fetch: RPC complete |
Send message Joined: 8 Jan 18 Posts: 44 Credit: 43,781,437 RAC: 4,605 |
Thanks for the input, Manfred and Vortac. From what I understand in the post on BOINC, the problem I'm experiencing is an artificially high REC (not RAC) due to an incorrect GFLOPS calculation. Perhaps I should have mentioned that before. Regardless, if my problem isn't related to your problem then I suppose it is moot. Manfred, I did adjust the storage requirements in the past, and it did not help at all. My problem is an actual bug in the code that needs to get corrected. Other people will fix the bug once time allows, so I will just have to wait it out until then. |
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
This is how a successful RPC with the Milkyway server looks in the Event Log. No tasks are reported (because the queue is completely empty by now) but 200 new tasks are assigned:
11/04/2019 18:56:52 | Milkyway@Home | [work_fetch] set_request() for NVIDIA GPU: ninst 1 nused_total 0.00 nidle_now 0.20 fetch share 1.00 req_inst 1.00 req_secs 129416.79
11/04/2019 18:56:52 | Milkyway@Home | [sched_op] Starting scheduler request
11/04/2019 18:56:52 | Milkyway@Home | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (129416.79 sec, 1.00 inst)
11/04/2019 18:56:52 | Milkyway@Home | Sending scheduler request: Requested by user.
11/04/2019 18:56:52 | Milkyway@Home | Requesting new tasks for NVIDIA GPU
11/04/2019 18:56:52 | Milkyway@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
11/04/2019 18:56:52 | Milkyway@Home | [sched_op] NVIDIA GPU work request: 129416.79 seconds; 1.00 devices
11/04/2019 18:56:56 | | [work_fetch] Request work fetch: project finished uploading
11/04/2019 18:56:56 | Milkyway@Home | Scheduler request completed: got 200 new tasks
11/04/2019 18:56:56 | Milkyway@Home | [sched_op] Server version 713
11/04/2019 18:56:56 | Milkyway@Home | Project requested delay of 91 seconds
11/04/2019 18:56:56 | Milkyway@Home | [sched_op] estimated total CPU task duration: 0 seconds
11/04/2019 18:56:56 | Milkyway@Home | [sched_op] estimated total NVIDIA GPU task duration: 12201 seconds
11/04/2019 18:56:56 | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
11/04/2019 18:56:56 | Milkyway@Home | [sched_op] Reason: requested by project
11/04/2019 18:56:56 | | [work_fetch] Request work fetch: RPC complete |
Send message Joined: 24 Jul 12 Posts: 40 Credit: 7,123,301,054 RAC: 0 |
Vortac, very interesting. From your logs, this one worked:
11/04/2019 18:56:52 | Milkyway@Home | [sched_op] NVIDIA GPU work request: 129416.79 seconds; 1.00 devices
11/04/2019 18:56:56 | Milkyway@Home | Scheduler request completed: got 200 new tasks
and this one failed:
11/04/2019 18:18:06 | Milkyway@Home | [sched_op] NVIDIA GPU work request: 127884.51 seconds; 0.00 devices
11/04/2019 18:18:09 | Milkyway@Home | Scheduler request completed: got 0 new tasks
Notice the failed request had 0.00 devices. The question here is: why 0 devices? Joe |
Send message Joined: 7 May 14 Posts: 57 Credit: 206,540,646 RAC: 6 |
Hi, I'm still not getting new WUs after completion? I have to manually update if I catch it?? |