Welcome to MilkyWay@home

New Server Update


eeeeee

Joined: 12 Nov 16
Posts: 3
Credit: 4,173,114
RAC: 1,706
Message 68390 - Posted: 25 Mar 2019, 15:27:36 UTC - in response to Message 68338.  

When I look at the
sched_reply_milkyway.cs.rpi.edu_milkyway.xml
file, it refers to the following files:
http://milkyway3.phys.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64__opencl_ati_101.exe
http://milkyway3.phys.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64.exe

When I open these links in Chrome, the site is unreachable. This is a new install on my current computer, so the executables hadn't been downloaded yet.

When I look at the URL https://milkyway.cs.rpi.edu/milkyway/download/, I am able to find the executables. This seems to be a problem from the migration: maybe the scheduler is still sending the old URLs?
The real files are now at:
https://milkyway.cs.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64__opencl_ati_101.exe
https://milkyway.cs.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64.exe

I downloaded them manually, but had difficulty getting BOINC to recognize that the files were already downloaded, to use them, and to stop reporting stalled downloads. Now it seems to work.
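
For anyone who wants to check the URLs from their own scheduler reply, here is a minimal Python sketch of the host swap described above. The two host names come from this post; the idea that a plain host and scheme swap is enough, and the helper name `rewrite_download_url`, are my assumptions and not anything the project has confirmed.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names taken from the post above; treating the fix as a simple
# host/scheme swap is an assumption.
OLD_HOST = "milkyway3.phys.rpi.edu"
NEW_HOST = "milkyway.cs.rpi.edu"


def rewrite_download_url(url: str) -> str:
    """Point a download URL at the new host over https; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return url
    return urlunsplit(("https", NEW_HOST, parts.path, parts.query, parts.fragment))


old_urls = [
    "http://milkyway3.phys.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64__opencl_ati_101.exe",
    "http://milkyway3.phys.rpi.edu/milkyway/download/milkyway_1.46_windows_x86_64.exe",
]
new_urls = [rewrite_download_url(u) for u in old_urls]
for u in new_urls:
    print(u)
```

This only computes where the files should live now; it does not touch any BOINC client files.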
ID: 68390 · Rating: 0
Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 25 Feb 13
Posts: 580
Credit: 75,271,794
RAC: 0
Message 68393 - Posted: 26 Mar 2019, 14:18:46 UTC

Hey eeeeee,

Thanks for catching that. I'll take a look and see why it's still putting the local DNS name in instead of milkyway.cs.rpi.edu.

Jake
ID: 68393 · Rating: 0
Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 25 Feb 13
Posts: 580
Credit: 75,271,794
RAC: 0
Message 68394 - Posted: 26 Mar 2019, 14:21:06 UTC

Hey bluestang,

Thank you so much! Unfortunately, I can't use the banner image you provided because RPI requires us to have the Rensselaer logo on the banner. When you get the new one made up, I'll take a look at it and run it by Heidi. If she approves, it will become the new banner.

I really appreciate it.

Best,

Jake
ID: 68394 · Rating: 0
Steven Case

Joined: 15 Dec 10
Posts: 3
Credit: 128,199,278
RAC: 3,405
Message 68395 - Posted: 26 Mar 2019, 16:01:55 UTC

Is there a non-techie fix for the failed download problem?
Thanks
ID: 68395 · Rating: 0
San-Fernando-Valley

Joined: 13 Apr 17
Posts: 17
Credit: 37,361,247
RAC: 7,408
Message 68396 - Posted: 26 Mar 2019, 16:06:58 UTC

Hi Jake,

maybe I'm not understanding all these things correctly, BUT:

... I would personally worry more about the download problem, than about an "unimportant" banner (that doesn't impair crunching) ...

For example: Check out messages ... #68390 or mine #68377 ...

That milkyway ..... exe is still not downloading, unless one uses a box of tools ...
... and my box of tools is wearing out ...

PLEASE FIX !

Thanks, in spite of all that, for your continuous effort to sort everything out ...

I know from our own server/software migrations that that is a lot of work (mostly frustrating), but it will eventually be rewarded.

Greetings
ID: 68396 · Rating: 0
Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 25 Feb 13
Posts: 580
Credit: 75,271,794
RAC: 0
Message 68397 - Posted: 26 Mar 2019, 16:09:55 UTC

Hey San-Fernando-Valley,

I have now implemented a fix for the download issue. Hopefully everything is working on your ends now.

Best,

Jake
ID: 68397 · Rating: 0
Bill
Joined: 8 Jan 18
Posts: 35
Credit: 15,847,818
RAC: 61,679
Message 68401 - Posted: 26 Mar 2019, 16:57:12 UTC

I found two very minor bugs with the website. To keep the little stuff from bogging down the more important crunching discussion, I have created a new post in the website forum. Besides, nobody had posted in there since 2017, so I'm sure the forum was lonely ;).
ID: 68401 · Rating: 0
Vortac

Joined: 22 Apr 09
Posts: 95
Credit: 2,676,260,785
RAC: 2,136,689
Message 68415 - Posted: 27 Mar 2019, 8:48:45 UTC

It's still hard to obtain enough work. I often get "Scheduler request completed: got 0 new tasks" and "Project has no tasks available".
ID: 68415 · Rating: 0
San-Fernando-Valley

Joined: 13 Apr 17
Posts: 17
Credit: 37,361,247
RAC: 7,408
Message 68416 - Posted: 27 Mar 2019, 13:10:26 UTC - in response to Message 68397.  

Hi Jake,
thanks for your time !!

Working fine now - just like before the server migration.

Have a beer on me!

BTW: The present header looks fine to me.

Cheers
SFV
ID: 68416 · Rating: 0
Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 25 Feb 13
Posts: 580
Credit: 75,271,794
RAC: 0
Message 68418 - Posted: 27 Mar 2019, 16:07:30 UTC

Hey Vortac,

Sorry to hear that you can't get enough work. I'll think about increasing the number of workunits we cache on the server. We have considerably more resources on the server now so that shouldn't be an issue anymore.

Jake
ID: 68418 · Rating: 0
wb8ili

Joined: 18 Jul 10
Posts: 74
Credit: 365,109,939
RAC: 281,873
Message 68419 - Posted: 27 Mar 2019, 16:16:16 UTC

Jake -

I have the same issue as Vortac. For whatever reason I run out of tasks (GPU). The BOINC Manager keeps requesting new tasks but it just says "Got 0 tasks". If I manually ask for a project update, I get 30 new tasks. Then a couple of minutes later 30 more. Then 30 more.

I don't think the "30" is the issue.

Maybe there is a "debug" parameter that might indicate WHY it is not downloading new tasks.
ID: 68419 · Rating: 0
wb8ili

Joined: 18 Jul 10
Posts: 74
Credit: 365,109,939
RAC: 281,873
Message 68422 - Posted: 27 Mar 2019, 17:07:12 UTC

Vortac - Do you have a fast GPU?

I have a theory on what is happening.

I finish a GPU task every 4 minutes.
Milkyway reports it and requests new tasks.
There is a "timer" in Milkyway that only lets you download new tasks every 600 seconds (I think). It is called "backoff".
If a new download request is made before the 600 seconds are up, no tasks get downloaded and the timer is reset to 600 seconds.
Since I am finishing a task every 240 seconds, I eventually run out of work.

Now the requests for new work every 240 seconds stop.
Eventually, the BOINC Manager will request new work. If no work is downloaded, the BOINC Manager keeps increasing the time between requests until the 600-second threshold is exceeded.
Then new work starts flowing again.
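
To make the theory concrete, here is a toy Python simulation of it. This models only my guess as described above, not the real BOINC client or scheduler code. The 240 s task time, 600 s timer and batch of 30 come from this thread; the starting queue of 10 tasks and the doubling of the wait time after each empty reply are assumptions made just for the sketch.

```python
def simulate(initial_tasks=10, task_secs=240, server_gap=600, batch=30, horizon=6000):
    """Toy model: a request sooner than server_gap after the previous one
    gets no tasks, and every request restarts the timer."""
    queue = initial_tasks
    last_request = 0        # assume the queue was filled by a request at t=0
    wait = task_secs        # idle wait between requests once the queue is empty
    t = 0
    events = []             # (time, tasks granted, queue length after request)
    while t < horizon:
        if queue > 0:
            t += task_secs  # finish one task, then report and request work
            queue -= 1
            wait = task_secs
        else:
            t += wait       # nothing to run: wait, then ask again
            wait *= 2       # assumed: client doubles the wait after an empty reply
        granted = batch if t - last_request >= server_gap else 0
        last_request = t
        queue += granted
        events.append((t, granted, queue))
    return events


events = simulate()
print("queue first empty at", next(t for t, _, q in events if q == 0))
print("first new batch at", next(t for t, granted, _ in events if granted))
```

In this model no request succeeds while tasks are finishing every 240 seconds; work only arrives after the queue has run dry and the gap between requests has grown past 600 seconds.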
ID: 68422 · Rating: 0
Vortac

Joined: 22 Apr 09
Posts: 95
Credit: 2,676,260,785
RAC: 2,136,689
Message 68424 - Posted: 27 Mar 2019, 18:13:16 UTC - in response to Message 68422.  

Indeed, my machine with Titan V is running out of work regularly. My other machine with two 7970s (which are much slower) is also running out of work occasionally, but not nearly so often.
ID: 68424 · Rating: 0
Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 25 Feb 13
Posts: 580
Credit: 75,271,794
RAC: 0
Message 68425 - Posted: 27 Mar 2019, 18:31:57 UTC

Vortac and wb8ili,

Is this the case for every request? What is the largest number of workunits you have received from a single request? Our current configuration settings allow for 600 downloads per request, so I'm trying to pinpoint where this error is occurring.

Best,
Jake
ID: 68425 · Rating: 0
wb8ili

Joined: 18 Jul 10
Posts: 74
Credit: 365,109,939
RAC: 281,873
Message 68426 - Posted: 27 Mar 2019, 18:47:07 UTC

Jake -

I get 30 tasks when I manually request an update. Always 30.

I can manually request tasks every 90 seconds. Less than 90 seconds gets a "last request too recent" message.

Every 90+ seconds I can get 30 new tasks.

My theory (from my earlier post) has to be modified to indicate that "user requested" requests for work give different results than the automatic reporting/requests.

Shown below is a typical sequence (I added the --->).

Task ends.
Request for work
No tasks downloaded.
And then two messages which I think might be important.



--> 3/27/2019 2:38:58 PM | Milkyway@Home | Computation for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1553630102_244511_0 finished
3/27/2019 2:38:59 PM | Milkyway@Home | Starting task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1553630102_244495_0
3/27/2019 2:39:00 PM | | [work_fetch] ------- start work fetch state -------
3/27/2019 2:39:00 PM | | [work_fetch] target work buffer: 432000.00 + 0.00 sec
3/27/2019 2:39:00 PM | | [work_fetch] --- project states ---
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] REC 1130.804 prio -0.034 can request work
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] REC 105434.987 prio -99.943 can request work
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] REC 29324.189 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] REC 65.675 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:00 PM | | [work_fetch] --- state for CPU ---
3/27/2019 2:39:00 PM | | [work_fetch] shortfall 0.00 nidle 0.00 saturated 764204.67 busy 0.00
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] share 1.000
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | | [work_fetch] --- state for NVIDIA GPU ---
3/27/2019 2:39:00 PM | | [work_fetch] shortfall 395988.98 nidle 0.00 saturated 36011.02 busy 0.00
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] share 0.000 no applications
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] share 1.000
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | | [work_fetch] ------- end work fetch state -------
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] set_request() for NVIDIA GPU: ninst 1 nused_total 139.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 395988.98
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (395988.98 sec, 0.00 inst)
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Sending scheduler request: To fetch work.
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Reporting 1 completed tasks
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
--> 3/27/2019 2:39:01 PM | Milkyway@Home | Scheduler request completed: got 0 new tasks
3/27/2019 2:39:01 PM | Milkyway@Home | [work_fetch] backing off NVIDIA GPU 873 sec
3/27/2019 2:39:01 PM | | [work_fetch] Request work fetch: RPC complete
3/27/2019 2:39:06 PM | | [work_fetch] ------- start work fetch state -------
3/27/2019 2:39:06 PM | | [work_fetch] target work buffer: 432000.00 + 0.00 sec
3/27/2019 2:39:06 PM | | [work_fetch] --- project states ---
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] REC 1130.804 prio -1.023 can request work
--> 3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] REC 105434.987 prio -3331.879 can't request work: scheduler RPC backoff (85.82 sec)
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] REC 29324.189 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] REC 65.675 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:06 PM | | [work_fetch] --- state for CPU ---
3/27/2019 2:39:06 PM | | [work_fetch] shortfall 0.00 nidle 0.00 saturated 764197.27 busy 0.00
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] share 1.000
3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | | [work_fetch] --- state for NVIDIA GPU ---
3/27/2019 2:39:06 PM | | [work_fetch] shortfall 395994.02 nidle 0.00 saturated 36005.98 busy 0.00
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] share 0.000 no applications
--> 3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] share 0.000 project is backed off (resource backoff: 867.71, inc 600.00)
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | | [work_fetch] ------- end work fetch state -------
3/27/2019 2:39:06 PM | | [work_fetch] No project chosen for work fetch
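
If anyone wants to pull the interesting lines out of a longer log, a small Python sketch along these lines might help. The two regular expressions only cover the message formats that appear in the excerpt above; the function name `summarize` and the assumption that every line is "timestamp | project | message" are mine.

```python
import re

# Patterns match only the message formats seen in the excerpt above.
BACKOFF_RE = re.compile(r"\[work_fetch\] backing off (?P<resource>.+?) (?P<secs>\d+) sec")
REPLY_RE = re.compile(r"Scheduler request completed: got (?P<count>\d+) new tasks")


def summarize(lines):
    """Return (replies, backoffs): tasks received per reply and backoffs per resource."""
    replies, backoffs = [], []
    for line in lines:
        # Drop the hand-added "-->" marker, then split into timestamp | project | message.
        fields = [f.strip() for f in line.lstrip("-> ").split("|", 2)]
        if len(fields) != 3:
            continue
        _, project, message = fields
        m = REPLY_RE.search(message)
        if m:
            replies.append((project, int(m.group("count"))))
        m = BACKOFF_RE.search(message)
        if m:
            backoffs.append((project, m.group("resource"), int(m.group("secs"))))
    return replies, backoffs


sample = [
    "--> 3/27/2019 2:39:01 PM | Milkyway@Home | Scheduler request completed: got 0 new tasks",
    "3/27/2019 2:39:01 PM | Milkyway@Home | [work_fetch] backing off NVIDIA GPU 873 sec",
    "3/27/2019 2:39:01 PM | | [work_fetch] Request work fetch: RPC complete",
]
replies, backoffs = summarize(sample)
print(replies)
print(backoffs)
```

Counting how often a "got 0 new tasks" reply is followed by a backoff should show whether the pattern above repeats on every request or only some of them.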
ID: 68426 · Rating: 0
Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist

Joined: 25 Feb 13
Posts: 580
Credit: 75,271,794
RAC: 0
Message 68427 - Posted: 27 Mar 2019, 18:52:59 UTC

Okay, I think I pinpointed the issue. I think we have too few workunits preloaded into shared memory. The workunits are unsent and available in the database, but they're never being pulled into memory for the scheduler to assign.
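
As a rough illustration of that failure mode, here is a toy Python model. It is not the actual BOINC feeder or scheduler code, and every number in it (cache sizes, refill interval, request rate) is a made-up assumption. The database is treated as never running out; the only limit is how many workunits sit in the in-memory cache between refills.

```python
def run(cache_slots, refill_every, request_every, per_request, total_secs=600):
    """Toy feeder/scheduler model: the scheduler can only hand out what is
    currently cached, even though the database always has more work."""
    cached = cache_slots          # start with a full cache
    sent = 0
    empty_replies = 0
    for t in range(1, total_secs + 1):
        if t % refill_every == 0:
            cached = cache_slots  # feeder tops the cache back up
        if t % request_every == 0:
            n = min(per_request, cached)
            cached -= n
            sent += n
            if n == 0:
                empty_replies += 1   # client sees "got 0 new tasks"
    return sent, empty_replies


# Hypothetical numbers: one request per second asking for 30 tasks,
# feeder refilling every 5 seconds.
small = run(cache_slots=100, refill_every=5, request_every=1, per_request=30)
large = run(cache_slots=1000, refill_every=5, request_every=1, per_request=30)
print("small cache:", small)
print("large cache:", large)
```

With the small cache some requests come back empty even though work exists in the database; with the larger cache none do.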

Working on a fix.

Best,

Jake
ID: 68427 · Rating: 0
mmonnin

Joined: 2 Oct 16
Posts: 145
Credit: 501,142,386
RAC: 125,904
Message 68433 - Posted: 27 Mar 2019, 22:12:55 UTC - in response to Message 68425.  

Vortac and wb8ili,

Is this the case for every request? What is the largest number of workunits you have received from a single request? Our current configuration settings allow for 600 downloads per request, so I'm trying to pinpoint where this error is occurring.

Best,
Jake


I am seeing the same as wb8ili and I described it above:

It seems like I can't get any work until my queue runs completely dry and then I'll download 200 more tasks. Those 200 will complete and the queue will continue to drop. Tasks are being reported immediately but no tasks are downloaded to keep the queue topped off. If I try to manually update I'm just told the last request was too recent. That continues until everything is gone, I run a couple of tasks from a backup project and then I can update to get 200 more tasks. The older server would keep tasks at 80 pretty much at all times without user intervention.
ID: 68433 · Rating: 0
bluestang

Joined: 13 Oct 16
Posts: 101
Credit: 779,574,593
RAC: 1,684,195
Message 68446 - Posted: 28 Mar 2019, 20:15:19 UTC

Since it's a website thing, I posted a new banner in that thread...

https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4421&postid=68445#68445
ID: 68446 · Rating: 0
eeeeee

Joined: 12 Nov 16
Posts: 3
Credit: 4,173,114
RAC: 1,706
Message 68448 - Posted: 28 Mar 2019, 21:48:10 UTC - in response to Message 68397.  

The fix is indeed working. I reset the project and tracked the files, and the correct URL is now being sent. Good fix!
ID: 68448 · Rating: 0

©2020 Astroinformatics Group