New Server Update
Author | Message |
---|---|
Send message Joined: 12 Nov 16 Posts: 3 Credit: 4,435,482 RAC: 0 |
When I look at the sched_reply_milkyway.cs.rpi.edu_milkyway.xml file, it refers to the following files: http://milkyway3.phys.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64__opencl_ati_101.exe http://milkyway3.phys.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64.exe When I open these links in Chrome, the site is unreachable. This is a new install on my current computer, so the executables hadn't been downloaded yet. When I look at the URL https://milkyway.cs.rpi.edu/milkyway/download/, I am able to find the executables. This seems to be a problem from the migration: the scheduler didn't change the URLs? The real files are now at: https://milkyway.cs.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64__opencl_ati_101.exe https://milkyway.cs.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64.exe I downloaded them manually, but had difficulty getting BOINC to recognize that the files were already downloaded, to use them, and to stop reporting stalled downloads. Now it seems to work. |
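The host mismatch described in this post can be illustrated with a minimal Python sketch. It only shows the mapping from the stale download host to the one that actually serves the files; it is not the fix that was applied on the server. The function name `rewrite_download_url` is hypothetical, and the assumption that the new host should always be reached over https comes only from the URLs quoted above.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts taken from the post above; treating them as a fixed pair is an assumption.
OLD_HOST = "milkyway3.phys.rpi.edu"
NEW_HOST = "milkyway.cs.rpi.edu"


def rewrite_download_url(url: str) -> str:
    """Return the URL with the stale host swapped for the reachable one.

    URLs that do not point at OLD_HOST are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return url
    # Keep path, query and fragment; switch scheme to https as in the working links.
    return urlunsplit(("https", NEW_HOST, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    stale = "http://milkyway3.phys.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64.exe"
    print(rewrite_download_url(stale))
```

Applied to the two stale links in the post, this yields the two working links listed there.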
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey eeeeee, Thanks for catching that. I'll take a look and see why it's still putting the local DNS name in instead of milkyway.cs.rpi.edu. Jake |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey bluestang, Thank you so much! Unfortunately, I can't use the banner image you provided because RPI requires us to have the Rensselaer logo on the banner. When you get the new one made up, I'll take a look at it and run it by Heidi. If she approves, it will become the new banner. I really appreciate it. Best, Jake |
Send message Joined: 15 Dec 10 Posts: 3 Credit: 135,590,045 RAC: 2,913 |
Is there a non-techie fix for the failed download problem? Thanks |
Send message Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
Hi Jake, maybe I'm not understanding all these things correctly, BUT: ... I would personally worry more about the download problem than about an "unimportant" banner (that doesn't impair crunching) ... For example: Check out messages ... #68390 or mine #68377 ... That milkyway ..... exe is still not downloading, unless one uses a box of tools ... ... and my box of tools is wearing out ... PLEASE FIX ! Thanks, in spite of all that, for your continuous effort to sort everything out ... I know from our own server/software migrations that that is a lot of work (mostly frustrating), but it will eventually be rewarded. Greetings |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey San-Fernando-Valley, I have now implemented a fix for the download issue. Hopefully everything is working on your ends now. Best, Jake |
Send message Joined: 8 Jan 18 Posts: 44 Credit: 43,598,770 RAC: 4,913 |
I found two very minor bugs with the website. To prevent all the little stuff from bogging down the more important crunching stuff, I have created a new post in the website forum here. Besides, nobody had posted in there since 2017, so I'm sure the forum was lonely ;). |
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
It's still hard to obtain enough work. I often get "Scheduler request completed: got 0 new tasks" and "Project has no tasks available". |
Send message Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
Hi Jake, thanks for your time !! Working fine now - just like before the server migration. Have a beer on me! BTW: The present header looks fine to me. Cheers SFV |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey Vortac, Sorry to hear that you can't get enough work. I'll think about increasing the number of workunits we cache on the server. We have considerably more resources on the server now so that shouldn't be an issue anymore. Jake |
Send message Joined: 18 Jul 10 Posts: 76 Credit: 636,511,214 RAC: 28,125 |
Jake - I have the same issue as Vortac. For whatever reason I run out of tasks (GPU). The BOINC Manager keeps requesting new tasks but it just says "Got 0 tasks". If I manually ask for a project update, I get 30 new tasks. Then a couple of minutes later 30 more. Then 30 more. I don't think the "30" is the issue. Maybe there is a "debug" parameter that might indicate WHY it is not downloading new tasks. |
Send message Joined: 18 Jul 10 Posts: 76 Credit: 636,511,214 RAC: 28,125 |
Vortac - Do you have a fast GPU? I have a theory on what is happening. I finish a GPU task every 4 minutes. Milkyway reports it and requests new tasks. There is a "timer" in Milkyway that only lets you download new tasks every 600 seconds (I think). It is called "backoff". If a new download request is made before the 600 seconds are up, no tasks get downloaded and the timer is reset to 600 seconds. Since I am finishing a task every 240 seconds, I eventually run out of work. Now the requests for new work every 240 seconds stop. Eventually, the BOINC Manager will request new work. If no work is downloaded, the BOINC Manager keeps increasing the time between requests until the 600-second threshold is exceeded. Then new work starts flowing again. |
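The theory in this post can be written down as a small Python simulation. This is only a model of the poster's hypothesis, not of confirmed BOINC scheduler behavior: it assumes every completed task triggers one request, that a request is granted only if at least `min_interval` seconds have passed since the previous request, and that every request (granted or not) restarts the timer. The function name `count_granted_requests` and its parameters are hypothetical; the 240 and 600 second values come from the post.

```python
def count_granted_requests(completions: int,
                           task_secs: int = 240,
                           min_interval: int = 600) -> int:
    """Count how many work requests would be granted under the hypothesis.

    One request is sent each time a task completes (every task_secs seconds).
    A request is granted only if min_interval seconds have elapsed since the
    previous request, and every request resets that timer.
    """
    last_request = 0  # the request at t=0 that fetched the initial queue
    granted = 0
    for k in range(1, completions + 1):
        now = k * task_secs
        if now - last_request >= min_interval:
            granted += 1
        last_request = now  # timer restarts whether or not work was granted
    return granted


if __name__ == "__main__":
    # Fast GPU: a task every 240 s, so requests always arrive too early.
    print(count_granted_requests(30, task_secs=240))
    # Slower device: a task every 700 s, so each request clears the timer.
    print(count_granted_requests(30, task_secs=700))
```

Under these assumptions a host that finishes tasks faster than the interval never receives work until its queue is empty, which matches the symptom reported above, while a slower host is refilled on every request.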
Send message Joined: 22 Apr 09 Posts: 95 Credit: 4,808,181,963 RAC: 0 |
Indeed, my machine with Titan V is running out of work regularly. My other machine with two 7970s (which are much slower) is also running out of work occasionally, but not nearly so often. |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Vortac and wb8ili, Is this the case for every request? What is the most workunits you have received from a request? Our current configuration settings allow for 600 downloads per request, so I'm trying to pinpoint where this error is occurring. Best, Jake |
Send message Joined: 18 Jul 10 Posts: 76 Credit: 636,511,214 RAC: 28,125 |
Jake - I get 30 tasks when I manually request an update. Always 30. I can manually request tasks every 90 seconds. Less than 90 seconds gets a "last request too recent" message. Every 90+ seconds I can get 30 new tasks. My theory (below) has to be modified to indicate that "user requested" requests for work give different results than reporting/requests. Shown below is a typical sequence (I added the -->). Task ends. Request for work. No tasks downloaded. And then two messages which I think might be important.
--> 3/27/2019 2:38:58 PM | Milkyway@Home | Computation for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1553630102_244511_0 finished
3/27/2019 2:38:59 PM | Milkyway@Home | Starting task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1553630102_244495_0
3/27/2019 2:39:00 PM | | [work_fetch] ------- start work fetch state -------
3/27/2019 2:39:00 PM | | [work_fetch] target work buffer: 432000.00 + 0.00 sec
3/27/2019 2:39:00 PM | | [work_fetch] --- project states ---
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] REC 1130.804 prio -0.034 can request work
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] REC 105434.987 prio -99.943 can request work
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] REC 29324.189 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] REC 65.675 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:00 PM | | [work_fetch] --- state for CPU ---
3/27/2019 2:39:00 PM | | [work_fetch] shortfall 0.00 nidle 0.00 saturated 764204.67 busy 0.00
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] share 1.000
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | | [work_fetch] --- state for NVIDIA GPU ---
3/27/2019 2:39:00 PM | | [work_fetch] shortfall 395988.98 nidle 0.00 saturated 36011.02 busy 0.00
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] share 0.000 no applications
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] share 1.000
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | | [work_fetch] ------- end work fetch state -------
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] set_request() for NVIDIA GPU: ninst 1 nused_total 139.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 395988.98
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (395988.98 sec, 0.00 inst)
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Sending scheduler request: To fetch work.
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Reporting 1 completed tasks
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
--> 3/27/2019 2:39:01 PM | Milkyway@Home | Scheduler request completed: got 0 new tasks
3/27/2019 2:39:01 PM | Milkyway@Home | [work_fetch] backing off NVIDIA GPU 873 sec
3/27/2019 2:39:01 PM | | [work_fetch] Request work fetch: RPC complete
3/27/2019 2:39:06 PM | | [work_fetch] ------- start work fetch state -------
3/27/2019 2:39:06 PM | | [work_fetch] target work buffer: 432000.00 + 0.00 sec
3/27/2019 2:39:06 PM | | [work_fetch] --- project states ---
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] REC 1130.804 prio -1.023 can request work
--> 3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] REC 105434.987 prio -3331.879 can't request work: scheduler RPC backoff (85.82 sec)
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] REC 29324.189 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] REC 65.675 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:06 PM | | [work_fetch] --- state for CPU ---
3/27/2019 2:39:06 PM | | [work_fetch] shortfall 0.00 nidle 0.00 saturated 764197.27 busy 0.00
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] share 1.000
3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | | [work_fetch] --- state for NVIDIA GPU ---
3/27/2019 2:39:06 PM | | [work_fetch] shortfall 395994.02 nidle 0.00 saturated 36005.98 busy 0.00
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] share 0.000 no applications
--> 3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] share 0.000 project is backed off (resource backoff: 867.71, inc 600.00)
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | | [work_fetch] ------- end work fetch state -------
3/27/2019 2:39:06 PM | | [work_fetch] No project chosen for work fetch |
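For anyone wanting to pull the backoff values out of an event log like the one in this post, here is a minimal Python sketch. It assumes the log entries have the `timestamp | project | message` shape seen above and that the backoff message reads `[work_fetch] backing off <resource> <seconds> sec`; both are inferred from this one excerpt rather than from BOINC documentation. The names `Backoff` and `parse_backoff` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Backoff(NamedTuple):
    timestamp: str
    project: str
    resource: str
    seconds: float


# Pattern inferred from the single "backing off" entry in the log excerpt.
_BACKOFF = re.compile(
    r"\[work_fetch\] backing off (?P<resource>.+?) (?P<secs>\d+(?:\.\d+)?) sec$"
)


def parse_backoff(line: str) -> Optional[Backoff]:
    """Return the backoff recorded in one log entry, or None if there is none."""
    # Split into at most three fields: timestamp, project, message.
    fields = [field.strip() for field in line.split("|", 2)]
    if len(fields) != 3:
        return None
    timestamp, project, message = fields
    match = _BACKOFF.search(message)
    if match is None:
        return None
    return Backoff(timestamp, project, match.group("resource"),
                   float(match.group("secs")))


if __name__ == "__main__":
    entry = ("3/27/2019 2:39:01 PM | Milkyway@Home | "
             "[work_fetch] backing off NVIDIA GPU 873 sec")
    print(parse_backoff(entry))
```

Running it over a saved log, one entry per line, gives a quick list of when each project backed off and for how long, which makes it easier to compare against the interval between completed tasks.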
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Okay, I think I pinpointed the issue. I think we have too few workunits preloaded into shared memory. The workunits are unsent and available in the database, but they're never being pulled into memory for the scheduler to assign. Working on a fix. Best, Jake |
Send message Joined: 2 Oct 16 Posts: 167 Credit: 1,007,578,527 RAC: 32,302 |
Vortac and wb8ili, I am seeing the same as wb8ili and I described it above: It seems like I can't get any work until my queue runs completely dry and then I'll download 200 more tasks. Those 200 will complete and the queue will continue to drop. Tasks are being reported immediately but no tasks are downloaded to keep the queue topped off. If I try to manually update I'm just told the last request was too recent. That continues until everything is gone, I run a couple of tasks from a backup project and then I can update to get 200 more tasks. The older server would keep tasks at 80 pretty much at all times without user intervention. |
Send message Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0 |
Since it's a website thing, I posted a new banner in that thread... https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4421&postid=68445#68445 |
Send message Joined: 12 Nov 16 Posts: 3 Credit: 4,435,482 RAC: 0 |
The fix is indeed working. I reset the project and tracked the files, and the correct URL is now being sent. Good fix! |