Running Modfit on MilkyWay@home

Thunder
Joined: 9 Jul 08
Posts: 85
Credit: 44,034,515
RAC: 3,030

Message 64859 - Posted: 11 Jul 2016, 14:28:06 UTC
Last modified: 11 Jul 2016, 14:29:58 UTC

Well, there's still some sort of issue: for the last 4 minutes I've gotten back-to-back "Scheduler request completed: got 0 tasks" messages, and the fast machine ran out of work.

Like I said, the change made a considerable difference, but it's still not enough to actually keep a fast GPU "fed".

Edit: The next request, after waiting a couple of minutes, only sent 9 tasks, so evidently there are still times when the server runs dry.
____________

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 438
Credit: 9,633,115
RAC: 158,695

Message 64860 - Posted: 11 Jul 2016, 15:38:56 UTC
Last modified: 11 Jul 2016, 18:47:28 UTC

I will continue to tweak it a bit later today. I have to give a talk in an hour and a half, so expect to see an improvement in a couple of hours.

Jake

[edit]
Hey, I just increased it a bit more. Let me know if you are still seeing requests that do not return work.

[/edit]

Thunder
Joined: 9 Jul 08
Posts: 85
Credit: 44,034,515
RAC: 3,030

Message 64864 - Posted: 11 Jul 2016, 22:34:02 UTC
Last modified: 11 Jul 2016, 22:36:24 UTC

Here's a machine that I noticed had run completely out of work. I tried updating and got this (all times are US Central):

36000 Milkyway@Home 7/11/2016 5:27:05 PM Reporting 12 completed tasks
36001 Milkyway@Home 7/11/2016 5:27:05 PM Requesting new tasks for AMD/ATI GPU
36002 Milkyway@Home 7/11/2016 5:27:07 PM Scheduler request completed: got 0 new tasks
36003 Milkyway@Home 7/11/2016 5:28:39 PM update requested by user
36004 Milkyway@Home 7/11/2016 5:28:42 PM Sending scheduler request: Requested by user.
36005 Milkyway@Home 7/11/2016 5:28:42 PM Requesting new tasks for AMD/ATI GPU
36006 Milkyway@Home 7/11/2016 5:28:45 PM Scheduler request completed: got 0 new tasks
36007 Milkyway@Home 7/11/2016 5:30:11 PM update requested by user
36008 Milkyway@Home 7/11/2016 5:30:15 PM Sending scheduler request: Requested by user.
36009 Milkyway@Home 7/11/2016 5:30:15 PM Requesting new tasks for AMD/ATI GPU
36010 Milkyway@Home 7/11/2016 5:30:17 PM Scheduler request completed: got 0 new tasks
36011 Milkyway@Home 7/11/2016 5:31:21 PM update requested by user
36012 Milkyway@Home 7/11/2016 5:31:23 PM Sending scheduler request: Requested by user.
36013 Milkyway@Home 7/11/2016 5:31:23 PM Requesting new tasks for AMD/ATI GPU
36014 Milkyway@Home 7/11/2016 5:31:26 PM Scheduler request completed: got 5 new tasks
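
A quick way to put a number on this is to count how many completed scheduler requests came back empty. The snippet below is only a rough sketch: it assumes the event log has been copied out as plain text, the four lines in `LOG` are the "Scheduler request completed" lines from the excerpt above, and the helper names `task_counts` and `empty_fraction` are made up for illustration.

```python
import re

# The four "Scheduler request completed" lines from the log excerpt above.
LOG = """\
36002 Milkyway@Home 7/11/2016 5:27:07 PM Scheduler request completed: got 0 new tasks
36006 Milkyway@Home 7/11/2016 5:28:45 PM Scheduler request completed: got 0 new tasks
36010 Milkyway@Home 7/11/2016 5:30:17 PM Scheduler request completed: got 0 new tasks
36014 Milkyway@Home 7/11/2016 5:31:26 PM Scheduler request completed: got 5 new tasks
"""

# Matches the reply line and captures how many tasks the server sent.
PATTERN = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def task_counts(log_text):
    """Return the number of tasks granted by each completed scheduler request."""
    return [int(match.group(1)) for match in PATTERN.finditer(log_text)]


def empty_fraction(counts):
    """Fraction of requests that returned no work (0.0 if there were no requests)."""
    if not counts:
        return 0.0
    return sum(1 for count in counts if count == 0) / len(counts)


counts = task_counts(LOG)
print(counts, empty_fraction(counts))
```

For this excerpt, three of the four requests returned nothing.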

The server status page (which I know only shows a snapshot) says only 12 tasks were available to send as of its most recent update.

Edit: Have tried several times since on 2 different machines and most updates are returning 0 tasks. :-(
____________

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 438
Credit: 9,633,115
RAC: 158,695

Message 64865 - Posted: 12 Jul 2016, 13:42:49 UTC

Thunder,

Looks like it just needed a little while to catch up. Your computer looks like it has 50+ workunits to crunch through now. Let me know if it still looks like a problem on your end.

Jake

Thunder
Joined: 9 Jul 08
Posts: 85
Credit: 44,034,515
RAC: 3,030

Message 64866 - Posted: 12 Jul 2016, 13:57:26 UTC

Only because I'm updating it manually any time I'm near it. (For the two machines that have reasonably fast GPUs, 50 tasks is about 4 minutes of work for one and 55 minutes for the other.)

It's kind of become "feast or famine" now. Most scheduler requests either return 0 tasks or nearly 60. There's very little in-between.

Judging from the increase in speed of the whole project, I'm guessing you've increased the available work and the users have responded by getting it (and getting it done).

I know you're concerned about computers returning a lot of error tasks and how that affects the science. One possible idea would be to increase the maximum error tasks on each WU from 2 to 3 (as a start) and see if that reduces the number that are completely thrown out as possibly "bug" WUs. I know of very few projects that handle as many tasks as this one does and have that threshold set so low.
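
To give a rough feel for why the error limit matters, here is a back-of-the-envelope sketch. It is a deliberately simplified model, not how the BOINC server actually decides: it assumes each task errors out independently with the same probability, that one good result is enough to save a WU, and that the WU is thrown out as soon as its error count reaches the limit. The 10% error rate and the function name `discard_probability` are invented for illustration.

```python
def discard_probability(error_rate, max_errors):
    """Chance that a WU is thrown out under the simplified model.

    Assumptions (not taken from the project): tasks fail independently with
    probability `error_rate`, and the WU is discarded once `max_errors`
    tasks have failed without a single success in between.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if max_errors < 1:
        raise ValueError("max_errors must be at least 1")
    # Every one of the first `max_errors` tasks has to fail.
    return error_rate ** max_errors


# Hypothetical 10% per-task error rate, comparing a limit of 2 against 3.
for limit in (2, 3):
    print(limit, discard_probability(0.10, limit))
```

Under these assumptions, each extra allowed error cuts the share of discarded WUs by a factor equal to the per-task error rate.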

Beyond that, there are mechanisms in BOINC to distinguish "trusted" computers from untrusted ones and assign more or less work accordingly.
____________

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 438
Credit: 9,633,115
RAC: 158,695

Message 64867 - Posted: 12 Jul 2016, 14:16:13 UTC

I will continue playing around with server settings today.

Jake

Thunder
Joined: 9 Jul 08
Posts: 85
Credit: 44,034,515
RAC: 3,030

Message 64868 - Posted: 12 Jul 2016, 14:41:26 UTC
Last modified: 12 Jul 2016, 14:45:23 UTC

I noticed I just downloaded a new set of parameters, and it looks like all of the tasks for them are failing with computation errors.

http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=691866&offset=0&show_names=0&state=6&appid=

Edit: It also just deferred communication on the last scheduler request for uh... a full 24 hours. O.o
____________

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 438
Credit: 9,633,115
RAC: 158,695

Message 64869 - Posted: 12 Jul 2016, 15:02:05 UTC

I just killed the runs with errors. Sorry about that.

Jake

Thunder
Joined: 9 Jul 08
Posts: 85
Credit: 44,034,515
RAC: 3,030

Message 64901 - Posted: 16 Jul 2016, 17:41:27 UTC

I'm not sure if something in particular changed on Friday or if it's just a matter of more computers being switched back to the project, but the server is back to not having nearly enough work available.

For about the last 24 hours, I've been seeing about 80-90% of scheduler requests for MW work (not N-body) giving the "got 0 new tasks" response.

I think it's a matter of how much work is being generated, since the server status page consistently shows the unsent tasks somewhere in the teens and the in-progress numbers dropping steadily. :-(
____________

Vortac
Joined: 22 Apr 09
Posts: 77
Credit: 1,052,830,866
RAC: 51,164

Message 64902 - Posted: 16 Jul 2016, 23:22:21 UTC - in response to Message 64901.

Same here, getting plenty of "0 new tasks" messages. Of course, on a powerful machine even one such occurrence is enough to empty the GPU queue. Sometimes I even get "No tasks sent. This computer has reached a limit on tasks in progress" although I have 0 (zero) queued GPU tasks. I guess it's referring to N-body CPU tasks, which are plentiful.

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 438
Credit: 9,633,115
RAC: 158,695

Message 64904 - Posted: 18 Jul 2016, 16:00:47 UTC

Okay, thank you for letting me know. I have an idea for how to fix that.

Jake

Thunder
Joined: 9 Jul 08
Posts: 85
Credit: 44,034,515
RAC: 3,030

Message 64908 - Posted: 19 Jul 2016, 14:41:33 UTC

I don't know what you changed recently, but all morning the server has been INCREDIBLY reliable about always sending the full amount requested!

It's been great! :-)
____________

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 438
Credit: 9,633,115
RAC: 158,695

Message 64909 - Posted: 19 Jul 2016, 15:18:58 UTC

Hey Thunder,

I recompiled the server binaries to have a larger work unit cushion (they now hold more work units in reserve). If you find it still runs out on you, let me know. There is still quite a bit of tweaking I can do.

Jake

Thunder
Joined: 9 Jul 08
Posts: 85
Credit: 44,034,515
RAC: 3,030

Message 64910 - Posted: 19 Jul 2016, 18:17:26 UTC - in response to Message 64909.

So far no troubles running out! (If I could dance, I'd be doing a little jig.)

Considering the number of tasks in progress for the project has increased by over 30k since that change, I'd say a lot more people are happy as well. :-)

As I've said before, if you are willing to increase the number allowed on-hand a bit, that would be ideal (I think about 90 instead of 60 would keep mine fully fed even switching between projects), but *NOT* if that's going to cause you to get so many errors/aborts/etc. as to compromise the science.
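
For a sense of scale, here is the arithmetic behind that, using the figure quoted earlier in this thread (50 tasks in about 4 minutes on the fastest machine). It is just a sketch that assumes a steady crunch rate and no new work arriving in the meantime; the function name `cache_minutes` is made up.

```python
def cache_minutes(cache_size, tasks_done, minutes_taken):
    """Minutes a cache of `cache_size` tasks lasts at a steady crunch rate.

    The rate is derived from `tasks_done` tasks finished in `minutes_taken`
    minutes; no refill during that time is assumed.
    """
    if tasks_done <= 0 or minutes_taken <= 0:
        raise ValueError("tasks_done and minutes_taken must be positive")
    return cache_size * minutes_taken / tasks_done


# Current limit of 60 tasks versus the suggested 90, at 50 tasks per 4 minutes.
for limit in (60, 90):
    print(limit, cache_minutes(limit, tasks_done=50, minutes_taken=4))
```

At that rate, 60 tasks last a little under 5 minutes and 90 tasks a little over 7, so either way a single empty scheduler reply can leave the fast GPU idle.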

Seriously good work, Jake! This was an absolute sea change in the function of the project. :-)
____________

Thunder
Joined: 9 Jul 08
Posts: 85
Credit: 44,034,515
RAC: 3,030

Message 64934 - Posted: 23 Jul 2016, 22:56:26 UTC

Unfortunately, this weekend has brought a fresh starvation of work. About 2/3 of requests are getting 0 tasks, and those that do return work are dribbling out a few tasks at a time.

Perhaps once an hour I'm getting a full 60 tasks when requested.
____________

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 438
Credit: 9,633,115
RAC: 158,695

Message 64938 - Posted: 25 Jul 2016, 12:29:04 UTC

Hey Thunder,

I am looking into how to make the work unit generator check more often how many work units it needs to create. That should keep the queue full of work units, but it is going to take a little work.

Jake
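
The trade-off between the size of the cushion and how often the generator checks can be sketched with a toy model. This is not the actual server logic: it assumes the generator tops the queue back up to a fixed cushion at every check and that clients drain it at a constant rate. All the numbers and the function name `starved_fraction` are hypothetical.

```python
def starved_fraction(cushion, demand_per_sec, check_interval_sec):
    """Fraction of each check interval during which the queue sits empty.

    Toy model: the queue is refilled to `cushion` work units at every check
    and drained at a constant `demand_per_sec` until it hits zero.
    """
    if cushion < 0 or demand_per_sec <= 0 or check_interval_sec <= 0:
        raise ValueError("cushion must be >= 0; demand and interval must be > 0")
    seconds_until_empty = cushion / demand_per_sec
    return max(0.0, 1.0 - seconds_until_empty / check_interval_sec)


# Hypothetical load: clients pull 50 work units per second.
print(starved_fraction(cushion=1000, demand_per_sec=50, check_interval_sec=60))
print(starved_fraction(cushion=1000, demand_per_sec=50, check_interval_sec=30))
print(starved_fraction(cushion=3000, demand_per_sec=50, check_interval_sec=60))
```

In this model the queue stays full only when the cushion covers a whole interval's worth of demand, so a bigger cushion and more frequent checks are two ways of reaching the same goal.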



Copyright © 2017 AstroInformatics Group