Work should be flowing

Author Message
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 40811 - Posted: 5 Jul 2010, 21:06:22 UTC

As usual, I go out of town and the server crashes. Work should be flowing; just give the server some time to catch up.
--Travis
____________

Black_Jac
Joined: 10 May 10
Posts: 9
Credit: 13,403,734
RAC: 0

Message 40820 - Posted: 5 Jul 2010, 23:15:14 UTC

Yeah, it is running. But only half...

My scheduler is only asking for CPU tasks.

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0

Message 40821 - Posted: 5 Jul 2010, 23:18:28 UTC

You crashed, Collatz kept dropping off the internet and SETI was only allowing 20 units at a time.

I had to get work from DNETC just to keep the GPU warm this weekend.
____________

Black_Jac
Joined: 10 May 10
Posts: 9
Credit: 13,403,734
RAC: 0

Message 40822 - Posted: 5 Jul 2010, 23:21:15 UTC - in response to Message 40820.

Yeah, it is running. But only half...

My scheduler is only asking for CPU tasks.



Scratch that...Just gave the scheduler a Pwnt, sorted it out.

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0

Message 40824 - Posted: 6 Jul 2010, 0:17:39 UTC

Thanks, Travis. All seems to be running well ATM.
____________
Go away, I was asleep


Arif Mert Kapicioglu
Joined: 14 Dec 09
Posts: 159
Credit: 573,720,351
RAC: 0

Message 40836 - Posted: 7 Jul 2010, 11:41:23 UTC

Recently, I have been receiving one task per GPU. Has the limit been reduced again?

Also, the server has been crashing more frequently in the last several weeks. How about upgrading the server?

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0

Message 40837 - Posted: 7 Jul 2010, 14:26:21 UTC - in response to Message 40836.

Recently, I have been receiving one task per GPU. Has the limit been reduced again?
Is BOINC's 'connect every' interval set to more than 7 days? Or you may have built up a debt to another project that BOINC is trying to work off. This is a BOINC problem, not a MW one.


____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Arif Mert Kapicioglu
Joined: 14 Dec 09
Posts: 159
Credit: 573,720,351
RAC: 0

Message 40842 - Posted: 7 Jul 2010, 20:11:17 UTC - in response to Message 40837.

I set it to 5 and it solved the problem. Thank you.

David C!
Joined: 30 Oct 09
Posts: 4
Credit: 76,212
RAC: 0

Message 40921 - Posted: 14 Jul 2010, 11:08:36 UTC

I keep having problems with Milkyway. I don't seem to be getting credit for completed work, and after completion sometimes there is no work available.

7/14/2010 6:11:25 AM Milkyway@home Requesting new tasks
7/14/2010 6:11:26 AM Milkyway@home Scheduler request completed: got 0 new tasks
7/14/2010 6:11:26 AM Milkyway@home Message from server: No work available
7/14/2010 6:19:31 AM Milkyway@home Sending scheduler request: To fetch work.
7/14/2010 6:19:31 AM Milkyway@home Requesting new tasks
7/14/2010 6:19:32 AM Milkyway@home Scheduler request completed: got 0 new tasks
7/14/2010 6:19:32 AM Milkyway@home Message from server: No work available
7/14/2010 6:41:39 AM Milkyway@home Sending scheduler request: To fetch work.

It seems to be a waste of resources to work on this project.
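
A log excerpt like the one above can be tallied with a short script to see how often the scheduler comes back empty. This is only a sketch: the function name `count_empty_replies` is made up for illustration, and it assumes the client writes "Scheduler request completed: got N new tasks" exactly as it appears in the lines quoted in this post.

```python
import re

# Matches the scheduler reply line as it appears in the log excerpt above.
GOT_TASKS = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def count_empty_replies(lines):
    """Return (empty, total): scheduler replies with 0 tasks, and all replies seen."""
    empty = 0
    total = 0
    for line in lines:
        match = GOT_TASKS.search(line)
        if match is None:
            continue  # not a scheduler reply line
        total += 1
        if int(match.group(1)) == 0:
            empty += 1
    return empty, total


# Lines copied from the log excerpt above.
sample_log = [
    "7/14/2010 6:11:25 AM Milkyway@home Requesting new tasks",
    "7/14/2010 6:11:26 AM Milkyway@home Scheduler request completed: got 0 new tasks",
    "7/14/2010 6:11:26 AM Milkyway@home Message from server: No work available",
    "7/14/2010 6:19:32 AM Milkyway@home Scheduler request completed: got 0 new tasks",
]

empty, total = count_empty_replies(sample_log)
print(f"{empty} of {total} scheduler replies returned no work")
```

A run of empty replies over a long stretch points at the server side (no work generated), while a mix of empty and non-empty replies is more likely ordinary scheduling.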

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0

Message 40924 - Posted: 14 Jul 2010, 11:24:39 UTC - in response to Message 40921.

a) No need to post the same question in multiple threads.

b) Your WUs are waiting for validation;
see http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=116980

c) You could run an optimized app for your CPU (see the Number Crunching section).

BarryAZ
Joined: 1 Sep 08
Posts: 519
Credit: 281,480,125
RAC: 205

Message 40929 - Posted: 14 Jul 2010, 15:58:10 UTC

The title of the thread makes a good statement -- work should be flowing. Unfortunately, it hasn't been, nor has the validator been functioning, for the past 12+ hours. I take it that Travis is out of town again, as it seems that 1) problems happen when he is not around to babysit the servers and 2) no one else at RPI is empowered or trained to resolve the problems.
____________

Roland Mengel
Joined: 11 Jul 10
Posts: 1
Credit: 475,666
RAC: 0

Message 41061 - Posted: 25 Jul 2010, 11:59:20 UTC

Can't get any new work. Anybody know why? Same for SETI.
Thanks, R. Mengel
____________

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 29

Message 41062 - Posted: 25 Jul 2010, 12:14:41 UTC - in response to Message 41061.

Can't get any new work. Anybody know why? Same for SETI.
Thanks, R. Mengel

No problem for me on either project.

Werkstatt
Joined: 19 Feb 08
Posts: 350
Credit: 123,760,875
RAC: 1,243

Message 41063 - Posted: 25 Jul 2010, 15:31:06 UTC - in response to Message 41061.

Can't get any new work. Anybody know why? Same for SETI.
Thanks, R. Mengel


Sorry, no problem for me.

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0

Message 41069 - Posted: 26 Jul 2010, 9:26:34 UTC

I was thinking how well Milkyway was doing up until yesterday, then the validator jammed up last night. Was it my thoughts that jinxed the system?

It's going to take a day or so before the servers are kicked, I think.
____________
Go away, I was asleep


BarryAZ
Joined: 1 Sep 08
Posts: 519
Credit: 281,480,125
RAC: 205

Message 41082 - Posted: 26 Jul 2010, 19:48:16 UTC - in response to Message 41081.

Shane, sorry for the off topic query, but there hasn't been any reply to the server problem which popped up (again) yesterday (Sunday). Folks probably expected some sort of information this morning, but perhaps Travis is out of town....
____________

Shane Reilly
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 2 May 10
Posts: 57
Credit: 2,138
RAC: 0

Message 41084 - Posted: 26 Jul 2010, 20:18:46 UTC
Last modified: 26 Jul 2010, 20:46:13 UTC

I sent Travis an email regarding the issue with the validator. Any additional information you could give on the problem would be helpful. Urgent issues can be emailed to astro@cs.rpi.edu to alert all the project developers (there are currently 11).

BarryAZ
Joined: 1 Sep 08
Posts: 519
Credit: 281,480,125
RAC: 205

Message 41085 - Posted: 26 Jul 2010, 20:47:47 UTC

OK -- thanks for the reply.

The problem appears identical to the server issues that occurred about two weeks ago. No new work is available and work units are not getting validated.

I suspect resolving this (temporarily) would need a server (or process) restart.

As to resolving this at the root cause (as it is at least somewhat repetitive) I've no clue (aside from having Travis live onsite with the server 24/7 <smile>).

Again, my suspicion as to why there has been no fix (temporary or otherwise) or reply to posts over in the number crunching message board is that Travis is not around at the moment, and, frankly, when he's not around, things go to automatic pilot with less direct attention. It seems that the server is 'sensitive' to Travis not being around to keep it company and tends to cajole Travis (and us) by becoming problematic when he is not around.
____________

Shane Reilly
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 2 May 10
Posts: 57
Credit: 2,138
RAC: 0

Message 41089 - Posted: 26 Jul 2010, 21:34:01 UTC
Last modified: 26 Jul 2010, 21:36:49 UTC

I had a brief conversation with Travis and he says the validator had crashed and should be up and running now. Thank you for the heads up.

BarryAZ
Joined: 1 Sep 08
Posts: 519
Credit: 281,480,125
RAC: 205

Message 41476 - Posted: 15 Aug 2010, 18:53:43 UTC - in response to Message 41089.

The validator has crashed again. In fact, until the root cause issue underneath the validator (and work generator) crashes is recognized and dealt with, you can expect to see this as very much a recurring problem (it has been a recurring problem now for months).

As noted elsewhere, the workaround is to set up an automatic process that stops and restarts the various processes, or does a full server shutdown and restart, to clear out the problems (temporarily) -- but this is only a workaround, as it is fairly clear that there is an underlying root cause problem which needs attention.
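
To make the workaround concrete, here is a minimal sketch of such an automatic restart in Python. Everything in it is an assumption for illustration: the function name `restart_if_down`, the health check passed in as `is_alive`, and the stop/start commands are hypothetical placeholders, not the project's actual tooling. It only papers over the crashes; it does nothing about the root cause.

```python
import subprocess
import time
from typing import Callable, Sequence


def restart_if_down(
    is_alive: Callable[[], bool],
    stop_cmd: Sequence[str],
    start_cmd: Sequence[str],
    run: Callable[[Sequence[str]], object] = subprocess.run,
    settle_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Issue a stop followed by a start when the health check fails.

    Returns True if a restart was issued, False if the daemons looked healthy.
    `run` and `sleep` are injectable so the logic can be exercised without
    touching a real server.
    """
    if is_alive():
        return False
    run(stop_cmd)          # hypothetical stop command supplied by the operator
    sleep(settle_seconds)  # give processes time to exit before starting again
    run(start_cmd)         # hypothetical start command supplied by the operator
    return True


# Dry run with a fake runner: records the commands instead of executing them.
issued = []
restarted = restart_if_down(
    is_alive=lambda: False,           # pretend the validator is down
    stop_cmd=["./stop_daemons"],      # placeholder command
    start_cmd=["./start_daemons"],    # placeholder command
    run=issued.append,
)
print(restarted, issued)
```

In practice something like this would be called from a periodic job, with `is_alive` checking whatever signal the operators trust (for example, whether validated results are still being recorded).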
____________



Copyright © 2018 AstroInformatics Group