Welcome to MilkyWay@home

Problems downloading Stars and Volume files

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 272 - Posted: 9 Nov 2007, 19:11:00 UTC

What seems to be happening is that people are downloading a lot of work units, so some of them have already been processed and removed from the server by the time a user gets to them in their queue. I'm looking into a way to fix this so there won't be stuck downloads.
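
The failure mode described above can be pictured with a toy model. This is only a sketch under stated assumptions, not the project's server logic: it assumes a client reaches one queued work unit per time step and that the server keeps a work unit's input files for a fixed number of steps. The names `count_stuck_downloads`, `queue_length`, and `retention_steps` are hypothetical.

```python
def count_stuck_downloads(queue_length: int, retention_steps: int) -> int:
    """Count queued work units whose files are gone before the client reaches them.

    Assumptions (not from the project): the client reaches one work unit per
    step, and the server deletes a work unit's files after `retention_steps`.
    """
    stuck = 0
    for position in range(queue_length):
        reached_at = position  # step at which the client gets to this work unit
        if reached_at >= retention_steps:
            stuck += 1  # files were already removed: "file not found"
    return stuck


print(count_stuck_downloads(queue_length=20, retention_steps=5))  # 15
print(count_stuck_downloads(queue_length=2, retention_steps=5))   # 0
```

In this model a long client queue is what produces the stuck downloads; a short queue never outlives the server's retention window.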

Travis
Message 286 - Posted: 9 Nov 2007, 20:11:18 UTC

I've updated our server to limit the number of in-progress work units a client has. This should help fix the problem with people not being able to download files.
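
A per-host cap like this can be sketched as follows. This is an illustrative model only, not the BOINC scheduler's code: the `Scheduler` class, its method names, and the limit value of 2 are all assumptions made for the example.

```python
from collections import defaultdict


class Scheduler:
    """Hypothetical scheduler that caps in-progress work units per host."""

    def __init__(self, max_in_progress: int):
        self.max_in_progress = max_in_progress
        self.in_progress = defaultdict(int)  # host id -> work units currently out

    def request_work(self, host_id: str, wanted: int) -> int:
        """Grant at most as many work units as the host's remaining room allows."""
        room = self.max_in_progress - self.in_progress[host_id]
        granted = max(0, min(wanted, room))
        self.in_progress[host_id] += granted
        return granted

    def report_completed(self, host_id: str, count: int) -> None:
        """Free up room when the host reports finished work units."""
        self.in_progress[host_id] = max(0, self.in_progress[host_id] - count)


scheduler = Scheduler(max_in_progress=2)
print(scheduler.request_work("host-a", wanted=5))  # 2
print(scheduler.request_work("host-a", wanted=1))  # 0
scheduler.report_completed("host-a", count=1)
print(scheduler.request_work("host-a", wanted=3))  # 1
```

Because a host can never hold more than the cap, its queue stays short enough that files are unlikely to be removed before it reaches them.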

pieface
Joined: 7 Nov 07
Posts: 13
Credit: 20,143,762
RAC: 0
Message 288 - Posted: 9 Nov 2007, 20:59:13 UTC

Hmmm... how about making the WUs 50x longer? That would sure save a lot of upload/download on multi-core machines. I think one of mine went through four or five hundred WUs yesterday (counting the ones that choked on download). I'm sure this must be stressing your servers as well?

Ray Murray
Joined: 8 Oct 07
Posts: 24
Credit: 111,325
RAC: 2
Message 289 - Posted: 9 Nov 2007, 21:12:25 UTC

Happy to report that I have had no stuck download problems with the latest bunch of WUs (under Windows), which arrive in pairs. I presume, from the "Reached host limit of 2" message, that this is by design.

Seeing a checkpoint message, I was even brave enough to exit and restart BOINC to test that, and checkpointing works OK. (Big issue on some other projects with much longer WUs.)

Running Ubuntu on a virtual machine gives "unrecoverable error" for every WU even before it gets a chance to start, so I've set no new work on that one for now.

Matthias Lehmkuhl
Joined: 29 Sep 07
Posts: 18
Credit: 4,533,464
RAC: 0
Message 312 - Posted: 10 Nov 2007, 14:44:15 UTC
Last modified: 10 Nov 2007, 14:46:48 UTC

I got a download error - Linux 32-bit with BOINC 5.10.28

Sa 10 Nov 2007 15:09:33 CET|Milkyway@home|Sending scheduler request: To fetch work. Requesting 173 seconds of work, reporting 0 completed tasks
Sa 10 Nov 2007 15:09:38 CET|Milkyway@home|Scheduler request succeeded: got 1 new tasks
Sa 10 Nov 2007 15:09:40 CET|Milkyway@home|Started download of astronomy_1.07_i686-pc-linux-gnu
Sa 10 Nov 2007 15:09:40 CET|Milkyway@home|Started download of parameters_generated_1194582642_44170
Sa 10 Nov 2007 15:09:41 CET|Milkyway@home|Giving up on download of parameters_generated_1194582642_44170: file not found
Sa 10 Nov 2007 15:09:41 CET|Milkyway@home|Started download of stars.txt
Sa 10 Nov 2007 15:09:43 CET|Milkyway@home|Finished download of astronomy_1.07_i686-pc-linux-gnu
Sa 10 Nov 2007 15:09:43 CET|Milkyway@home|Giving up on download of stars.txt: file not found
Sa 10 Nov 2007 15:09:43 CET|Milkyway@home|Started download of volume.txt
Sa 10 Nov 2007 15:09:45 CET|Milkyway@home|Giving up on download of volume.txt: file not found

Edit:
After that I got only this message:
Sa 10 Nov 2007 15:33:32 CET|Milkyway@home|Sending scheduler request: To fetch work. Requesting 173 seconds of work, reporting 0 completed tasks
Sa 10 Nov 2007 15:33:37 CET|Milkyway@home|Scheduler request succeeded: got 0 new tasks


Matthias
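
Log excerpts like the one above can be scanned for failed downloads with a short script. This is a minimal sketch that assumes every message line has the three pipe-separated fields seen in the excerpt (timestamp, project, message); the function name `failed_downloads` is hypothetical.

```python
import re

# Matches messages such as "Giving up on download of stars.txt: file not found"
GIVE_UP = re.compile(r"^Giving up on download of (?P<name>\S+): (?P<reason>.+)$")


def failed_downloads(log_text: str) -> list:
    """Return (timestamp, file name, reason) for every abandoned download."""
    failures = []
    for line in log_text.splitlines():
        parts = line.split("|", 2)  # timestamp | project | message
        if len(parts) != 3:
            continue  # not a message-log line
        timestamp, _project, message = parts
        match = GIVE_UP.match(message.strip())
        if match:
            failures.append((timestamp, match["name"], match["reason"]))
    return failures


sample = """\
Sa 10 Nov 2007 15:09:40 CET|Milkyway@home|Started download of parameters_generated_1194582642_44170
Sa 10 Nov 2007 15:09:41 CET|Milkyway@home|Giving up on download of parameters_generated_1194582642_44170: file not found
Sa 10 Nov 2007 15:09:43 CET|Milkyway@home|Finished download of astronomy_1.07_i686-pc-linux-gnu
Sa 10 Nov 2007 15:09:43 CET|Milkyway@home|Giving up on download of stars.txt: file not found
"""

for timestamp, name, reason in failed_downloads(sample):
    print(f"{timestamp}  {name}: {reason}")
```

Listing the failing file names this way makes it easier to tell whether the parameters files or the shared stars/volume files are the ones going missing.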

Matthias Lehmkuhl
Message 313 - Posted: 10 Nov 2007, 15:32:40 UTC

One more download error with my Linux machine

Sa 10 Nov 2007 16:06:29 CET|Milkyway@home|Sending scheduler request: To fetch work. Requesting 174 seconds of work, reporting 0 completed tasks
Sa 10 Nov 2007 16:06:34 CET|Milkyway@home|Scheduler request succeeded: got 1 new tasks
Sa 10 Nov 2007 16:06:36 CET|Milkyway@home|Started download of parameters_generated_1194740565_7292
Sa 10 Nov 2007 16:06:36 CET|Milkyway@home|Started download of stars.txt
Sa 10 Nov 2007 16:06:38 CET||Project communication failed: attempting access to reference site
Sa 10 Nov 2007 16:06:38 CET|Milkyway@home|Finished download of parameters_generated_1194740565_7292
Sa 10 Nov 2007 16:06:38 CET|Milkyway@home|Giving up on download of stars.txt: file not found
Sa 10 Nov 2007 16:06:38 CET|Milkyway@home|Started download of volume.txt
Sa 10 Nov 2007 16:06:41 CET||Access to reference site succeeded - project servers may be temporarily down.
Sa 10 Nov 2007 16:06:41 CET|Milkyway@home|Giving up on download of volume.txt: file not found

This time the download of the parameters_generated file was OK.

After that, again only this message:

Sa 10 Nov 2007 16:07:39 CET|Milkyway@home|Sending scheduler request: To report completed tasks. Requesting 432 seconds of work, reporting 1 completed tasks
Sa 10 Nov 2007 16:07:44 CET|Milkyway@home|Scheduler request succeeded: got 0 new tasks

Matthias

Travis
Message 316 - Posted: 10 Nov 2007, 16:02:24 UTC

I think the problem with these is that the test we were running finished, so we stopped generating new work. There should be more work up and available soon.

zombie67 [MM]
Joined: 29 Aug 07
Posts: 115
Credit: 501,600,404
RAC: 4,799
Message 319 - Posted: 10 Nov 2007, 17:21:16 UTC - in response to Message 316.  
Last modified: 10 Nov 2007, 17:22:19 UTC

I think the problem with these is that the test we were running finished, so we stopped generating new work. There should be more work up and available soon.


Just FYI, even after the limit was applied to the number of WUs a machine can get at a time, downloads are still getting stuck. When I checked my machines this morning, over 10 of them had this problem.

Edit: all the stuck files were the parameters* files. None were stuck trying to download stars or volume.

Stick
Joined: 8 Oct 07
Posts: 52
Credit: 5,630,511
RAC: 223
Message 327 - Posted: 10 Nov 2007, 18:04:42 UTC - in response to Message 319.  

Edit: all the stuck files were the parameters* files. None were stuck trying to download stars or volume.


I've gotten several sets of "stuck" parameter files. And download "retries" always fail immediately. The only way around the problem I've found is to "abort" them.

Philadelphia
Joined: 9 Nov 07
Posts: 131
Credit: 180,454
RAC: 0
Message 340 - Posted: 10 Nov 2007, 19:27:17 UTC - in response to Message 327.  

Edit: all the stuck files were the parameters* files. None were stuck trying to download stars or volume.


I've gotten several sets of "stuck" parameter files. And download "retries" always fail immediately. The only way around the problem I've found is to "abort" them.


Ditto for me, I just suspended the project temporarily until the problem is identified. I even detached and reattached to see if that would help with no luck.

Travis
Message 341 - Posted: 10 Nov 2007, 19:32:59 UTC - in response to Message 340.  

Edit: all the stuck files were the parameters* files. None were stuck trying to download stars or volume.


I've gotten several sets of "stuck" parameter files. And download "retries" always fail immediately. The only way around the problem I've found is to "abort" them.


Ditto for me, I just suspended the project temporarily until the problem is identified. I even detached and reattached to see if that would help with no luck.


Some of the parameter sets from the first run got messed up. The best thing to do is just abort them. With the increased work unit time and the server limiting queues, I think the problem should be fixed for future searches.

Travis
Message 343 - Posted: 10 Nov 2007, 20:05:22 UTC - in response to Message 341.  

I've been trying to find a way to manually restore these work units server-side so they can be downloaded without people having to abort them, or something else along those lines, but no luck so far.

Maxxina
Joined: 3 Oct 07
Posts: 11
Credit: 412,763
RAC: 128
Message 359 - Posted: 10 Nov 2007, 22:16:54 UTC

I downloaded stars, but volume is still not found.

Travis
Message 361 - Posted: 10 Nov 2007, 22:24:43 UTC - in response to Message 359.  

I downloaded stars, but volume is still not found.


I think I'm going to move to having BOINC download the volume file from a fixed URL, as opposed to specifying a file in the work unit. I think this should help, as that file shouldn't ever be deleted.

Maxxina
Message 364 - Posted: 10 Nov 2007, 22:41:07 UTC

A little confused: should the work be running without the volume file? Because my jobs are running ;]

AMDave
Joined: 29 Aug 07
Posts: 3
Credit: 16,603,856
RAC: 0
Message 372 - Posted: 11 Nov 2007, 1:07:40 UTC
Last modified: 11 Nov 2007, 1:20:30 UTC

Installed 5.10.8 on AMD64, RHEL5, 2.6.18-8.1.15.el5

I can see the files downloading into the folder
~/BOINC/projects/milkyway.cs.rpi.edu_milkyway

astronomy_1.07_i686-pc-linux-gnu
parameters_generated_1194765844_83
parameters_generated_1194773034_362
parameters_generated_1194773041_456
parameters_generated_1194802758_61
parameters_generated_1194802761_94
parameters_generated_1194804157_264
parameters_generated_1194804159_288
parameters_generated_1194804163_343
stars.txt
volume2.txt


But the BOINC Manager says all of the downloads have failed.

Some of the downloaded files have zero size, including stars.txt.
The message log shows the downloads as successful,
so it looks like they were sent from the server that way.
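
A quick way to spot such zero-size files is to scan the project folder. This is a minimal sketch; the helper name `empty_files` is hypothetical, and it only looks at files directly inside the given directory.

```python
from pathlib import Path


def empty_files(project_dir) -> list:
    """Return the sorted names of zero-byte files directly inside project_dir."""
    root = Path(project_dir).expanduser()
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_file() and entry.stat().st_size == 0
    )
```

For the folder mentioned above, the call would be `empty_files("~/BOINC/projects/milkyway.cs.rpi.edu_milkyway")`; any name it returns is a file that arrived empty.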

AMDave
Message 373 - Posted: 11 Nov 2007, 1:59:18 UTC - in response to Message 372.  

After about 15 attempts I just got a complete stars.txt file and now I'm crunching something.

rebirther
Joined: 28 Aug 07
Posts: 52
Credit: 8,353,747
RAC: 0
Message 388 - Posted: 11 Nov 2007, 15:27:10 UTC

If anyone has issues downloading missing files, go here:
http://milkyway.cs.rpi.edu/milkyway/download/
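
Fetching a missing file from that directory could be scripted roughly as below. This is a sketch under assumptions: it assumes the files sit directly under the download URL with the same names the client uses (for example stars.txt), and the helper names `download_url` and `fetch_missing` are hypothetical. Whether the BOINC client accepts files placed in the project folder by hand is not confirmed here.

```python
from pathlib import Path
from urllib.parse import urljoin
from urllib.request import urlretrieve

DOWNLOAD_BASE = "http://milkyway.cs.rpi.edu/milkyway/download/"


def download_url(filename: str) -> str:
    """Build the URL for a file, assuming it lives directly under DOWNLOAD_BASE."""
    return urljoin(DOWNLOAD_BASE, filename)


def fetch_missing(filenames, project_dir) -> list:
    """Download each file that is absent or zero-size; return the names fetched."""
    target = Path(project_dir).expanduser()
    fetched = []
    for name in filenames:
        destination = target / name
        if destination.exists() and destination.stat().st_size > 0:
            continue  # already present and non-empty
        urlretrieve(download_url(name), str(destination))
        fetched.append(name)
    return fetched


print(download_url("stars.txt"))
```

It is probably safest to stop the BOINC client before copying files into its project folder.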