Message boards : News : Server Trouble
Author | Message |
---|---|
Joined: 13 Dec 17 Posts: 46 Credit: 2,421,362,376 RAC: 0 |
Well, I did try all sorts of 'gymnastics' (maybe not in that particular sequence, but definitely suspend, restart, etc... all that). This is just ridiculous, there shouldn't be any need for that sort of babysitting. The admins should sit and spend some time to sort the server issues completely and not do a piecemeal job of scratching here and there and hoping that things will sort themselves out. |
Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
[quote]Well, I did try all sorts of 'gymnastics' (maybe not in that particular sequence, but definitely suspend, restart, etc... all that). This is just ridiculous, there shouldn't be any need for that sort of babysitting. The admins should sit and spend some time to sort the server issues completely and not do a piecemeal job of scratching here and there and hoping that things will sort themselves out.[/quote] I don't see why you're getting so upset about it. Join more than one project, then you won't even care when one isn't available. It's not like they're supplying your wages or food. |
Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
It takes no time at all if there are enough disks. If the system isn't loaded to the max, one or two disks broken in a RAID don't cause a problem. And when you put another in, it rebuilds it in the background without anyone noticing. I used to run a server with RAID 6, two could fail without problem. As soon as one failed, I just put another one in. Didn't even have to touch the keyboard or reboot or anything. Slide one out, slide one in. |
Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
Peter: +1 |
Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
Max: [quote]Well, ... (maybe not in that particular sequence, but definitely suspend, restart, etc... all that). ...[/quote] ... the sequence does matter, at least in my case, and it has nothing to do with the still "unsolved" server issues ... Just let it sit a while (usually 20 minutes at most) and you'll get new WUs. I know, it could all be better (like at EatH), but that is life ... Relax, have a beer or two on me! Have a great Sunday. |
Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
Max: Where is it Sunday? I just press update on the project and get some. That removes any 3-hour backoff created by my client. |
Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
Peter: [quote]Where is it Sunday? I just press update on the project and get some. That removes any 3 hour backoff created by my client.[/quote] Well, tomorrow is Sunday. When I press update nothing happens ... The backoff time first goes up with each try, and after several more repeats it goes down to around 1:30 minutes, and that is it. Nothing is loaded, so I have to do some fiddling .... (as mentioned before). OK, so I'll just say "have a nice day". Cheers |
Joined: 15 Jun 13 Posts: 15 Credit: 2,070,897,222 RAC: 0 |
I'm not getting any workunits either... but I notice on the status page there are 3,591,958 workunits waiting for validation right now. It may be trying to churn through these before it starts handing out new work. It's down by about a million since the new drive was installed, so, if I'm right, it could be a few more days before new units start going out. I don't mind though... my GPU is just churning through my backup project. :) |
Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
It dropped to that a while ago then stopped, and was sending out new work too. Maybe he's paused things to let the disk rebuild get done? I'm thinking these are hamster-powered disks he's using. I tried a couple of other GPU projects and some projects are terrible. SRBase only uses your first GPU. Numberfields doesn't work on 280X cards. Both of these keep giving me GPU work after I told them not to. So I keep aborting them until the server learns its lesson! |
Joined: 15 Jun 13 Posts: 15 Credit: 2,070,897,222 RAC: 0 |
Hmm, I see that the status page hasn't updated in a few hours. Normally it updates once or twice an hour... Disk rebuild times obviously vary depending on the disk, but I've seen as low as an hour or two for a 300GB 10k rpm drive, to two days for 4TB 7200rpm. No idea what MW's disk specs are. The validation queue was consistently going up until Tom announced he'd replaced the drive, so the rebuild was probably quick. I get the feeling the validation queue is lower now, but the page simply isn't updating for some reason. As far as other GPU projects go, I've been pretty happy with MLC@Home. Amicable Numbers also worked well, but that requires a ton of system RAM. EDIT: Got exactly one new task, just now (19:06 UTC). So it's working, just light-years (heh heh) behind. Lots for it to still catch up on. EDIT 2: And 299 more 90 seconds later! So it's kind of working, here and there. |
Joined: 3 May 18 Posts: 7 Credit: 45,954 RAC: 0 |
Hallo together, hallo Tom, how long make your server trouble???? I have many works done for this group,but the points I don't get. The last work I don't have get my points for work, because by the server crash!!! Please repair the server and give me my workpoints.!!! Thanks Tom.....!!! |
Joined: 15 Jun 13 Posts: 15 Credit: 2,070,897,222 RAC: 0 |
[quote]Hallo together, hallo Tom, how long make your server trouble???? I have many works done for this group,but the points I don't get. The last work I don't have get my points for work, because by the server crash!!![/quote] Please check out the Server Status page. Right now it says "Workunits waiting for validation: 3423883". This is why you don't have your points yet. This number is slowly decreasing now, so you will get your points within the next few days. |
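The "next few days" guess can be sanity-checked with simple arithmetic. Below is a minimal sketch in Python using the two queue readings quoted in this thread (3,591,958 and 3,423,883). The 12-hour gap between the readings is a made-up placeholder, because the thread does not give timestamps, and `days_to_drain` is a hypothetical helper name. It also assumes the queue keeps draining at a constant rate, which may well not hold.

```python
def days_to_drain(earlier_count, later_count, hours_between):
    """Estimate days until the queue empties, assuming a constant drain rate."""
    if hours_between <= 0:
        raise ValueError("hours_between must be positive")
    drained = earlier_count - later_count
    if drained <= 0:
        return None  # queue is flat or growing: no finite estimate
    rate_per_hour = drained / hours_between
    return later_count / rate_per_hour / 24


# Queue readings quoted in the thread; the 12-hour gap is an ASSUMED placeholder.
estimate = days_to_drain(3_591_958, 3_423_883, hours_between=12)
if estimate is None:
    print("Queue is not shrinking, no estimate possible")
else:
    print(f"Roughly {estimate:.1f} days to clear the queue")
```

Plug in the real time between two Server Status readings to get a less hand-wavy number. If the queue starts growing again, the function returns `None` rather than a meaningless negative estimate.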
Joined: 3 May 18 Posts: 7 Credit: 45,954 RAC: 0 |
Hallo Tom, Thank you very much!!! I love the work for the universe! It is fantastic work and I learn much about the cosmos! |
Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
[quote]Hmm, I see that the status page hasn't updated in a few hours. Normally it updates once or twice an hour...[/quote] My main desktop's data drive is a 4TB 7200rpm (TV and security camera and software installers) is 3/4 full and that only takes 5.5 hours to backup, so should be similar for a rebuild. I've never run a server with a 7200 drive! But if he's still running validations and serving us at the same time, the rebuild will be slower. [quote]No idea what MW's disk specs are.[/quote] Tom seems embarrassed to say! He really ought to get some SSDs. [quote]As far as other GPU projects go, I've been pretty happy with MLC@Home. Amicable Numbers also worked well, but that requires a ton of system RAM.[/quote] I love this one because it's the only double precision one, and I have cards very good at that, I bought them for MW on purpose, since I like the science topic. [quote]EDIT: Got exactly one new task, just now (19:06 UTC). So it's working, just light-years (heh heh) behind. Lots for it to still catch up on. EDIT 2: And 299 more 90 seconds later! So it's kind of working, here and there.[/quote] If I leave my computers alone, sometimes I spot they've got a full batch of 300 per GPU. If I pester them I get nothing. I've acquired five R9 280X cards now. I love it when they reduce the price from £130 to £50 because they "don't work". No display output? Don't care! Some of the VRAM broken, won't run big programs like Einstein, but MW is ok! |
Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
L-I-V-T: RELAX ... |
Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
L-I-V-T: An Infernal Lucifer cannot relax. |
Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0 |
UH OH! trouble in paradise again.. 3/19/2022 10:09:24 PM | Milkyway@Home | Reporting 2 completed tasks 3/19/2022 10:09:24 PM | Milkyway@Home | Requesting new tasks for CPU 3/19/2022 10:09:46 PM | Milkyway@Home | Scheduler request failed: Failure when receiving data from the peer 3/19/2022 10:09:47 PM | | Project communication failed: attempting access to reference site 3/19/2022 10:09:48 PM | | Internet access OK - project servers may be temporarily down. |
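For anyone who wants to keep an eye on this without staring at the event log, here is a rough Python sketch that splits the log lines pasted above into timestamp, project, and message, then picks out the failures. It assumes the `M/D/YYYY h:mm:ss AM/PM | project | message` layout shown in this post; other clients or locales may format the log differently. `parse_line` is a hypothetical helper, not part of BOINC.

```python
from datetime import datetime

# The five event-log lines from the post above, one per line.
LOG = """\
3/19/2022 10:09:24 PM | Milkyway@Home | Reporting 2 completed tasks
3/19/2022 10:09:24 PM | Milkyway@Home | Requesting new tasks for CPU
3/19/2022 10:09:46 PM | Milkyway@Home | Scheduler request failed: Failure when receiving data from the peer
3/19/2022 10:09:47 PM |  | Project communication failed: attempting access to reference site
3/19/2022 10:09:48 PM |  | Internet access OK - project servers may be temporarily down.
"""


def parse_line(line):
    """Split one log line into (datetime, project, message)."""
    # Split on the first two pipes only, so pipes inside the message survive.
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    return when, project, message


entries = [parse_line(line) for line in LOG.splitlines() if line.strip()]
failures = [entry for entry in entries if "failed" in entry[2].lower()]

for when, project, message in failures:
    # Lines with an empty project column come from the client itself.
    print(when.isoformat(), project or "(client)", message)
```

Matching on the word "failed" is a crude filter chosen for this example. It catches both the scheduler failure and the follow-up communication failure, but it is not an official list of BOINC error messages.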
Joined: 15 Jun 13 Posts: 15 Credit: 2,070,897,222 RAC: 0 |
[quote]My main desktop's data drive is a 4TB 7200rpm (TV and security camera and software installers) is 3/4 full and that only takes 5.5 hours to backup, so should be similar for a rebuild.[/quote] Rebuild times don't match the transfer rate of the drives, especially when rebuilding from parity. I've heard stories of multiple-day rebuilds of very large drives. (Larger than MW probably has or needs.) [quote]I've never run a server with a 7200 drive![/quote] It's actually more common than you think... or at least it was when I worked on that stuff several years back. Most commonly, they were the "capacity" tier of hybrid SAN arrays. One array I maintained had a full 3U tray of 3TB 7.2k disks, and rebuilding one of them once took over a day. They still use 7.2k in file servers, too - heck, AWS even offers them for cloud file servers, if you were to set one up. [quote]But if he's still running validations and serving us at the same time, the rebuild will be slower.[/quote] Considering the validations weren't going at all when the failed drive was removed, I'm inclined to think the rebuild finished already. (Unless the validation pace picks up massively in the next several hours, in which case, I guess not!) [quote]Tom seems embarrassed to say! He really ought to get some SSDs.[/quote] I doubt MW has the money. I mean, just a few years ago they were begging for money just to keep the research team going. I don't think they have funding for new hardware. [quote]I love this one because it's the only double precision one, and I have cards very good at that, I bought them for MW on purpose, since I like the science topic.[/quote] Yeah, I have a Titan V myself... it's a pretty good performer for normal FP32 things, but so are cards a fifth of the price! So for other projects it's at least pretty productive, but still a waste overall. [quote]If I leave my computers alone, sometimes I spot they've got a full batch of 300 per GPU. If I pester them I get nothing.[/quote] That's funny... I only got 300 tasks today by pestering it. Otherwise, getting nothing. :D [quote]I've acquired five R9 280X cards now. I love it when they reduce the price from £130 to £50 because they "don't work". No display output? Don't care! Some of the VRAM broken, won't run big programs like Einstein, but MW is ok![/quote] Nice :D the 280X is a pretty good FP64 card, especially at those prices. |
Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
Validation queue is increasing again. Dare I say there is a secondary problem? |
Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
[quote]Rebuild times don't match the transfer rate of the drives, especially when rebuilding from parity. I've heard stories of multiple-day rebuilds of very large drives. (Larger than MW probably has or needs.)[/quote] In my experience it goes as fast as the drive, but I did build overpowered servers so there was plenty CPU time and drive speed available. Having a $35K budget did help. I obviously made more than one with that, but it all fitted in one cabinet. After I'd released a colleague I locked in it for a laugh. [quote]It's actually more common than you think... or at least it was when I worked on that stuff several years back. Most commonly, they were the "capacity" tier of hybrid SAN arrays. One array I maintained had a full 3U tray of 3TB 7.2k disks, and rebuilding one of them once took over a day. They still use 7.2k in file servers, too - heck, AWS even offers them for cloud file servers, if you were to set one up.[/quote] At the time I didn't need an enormous capacity, so I chose speed. They got bigger after a couple of years when the originals started failing and larger ones were cheaper. You'd think enterprise drives would be reliable.... I got them replaced under warranty but didn't wait for the replacements, I bought larger ones then sold the replacements. They would be refurbished drives and I didn't want that crap. I wonder if MW bought them from me on eBay? [quote]Considering the validations weren't going at all when the failed drive was removed, I'm inclined to think the rebuild finished already. (Unless the validation pace picks up massively in the next several hours, in which case, I guess not!)[/quote] It's going backwards now. [quote]I doubt MW has the money. I mean, just a few years ago they were begging for money just to keep the research team going. I don't think they have funding for new hardware.[/quote] A disk is peanuts compared to the staff wages. He did say recent donations had been good, and he's going to put it on the homepage. [quote]Yeah, I have a Titan V myself... it's a pretty good performer for normal FP32 things, but so are cards a fifth of the price! So for other projects it's at least pretty productive, but still a waste overall.[/quote] Only time I'll buy a Nvidia is if it's primarily for gaming, or if there are no DP projects. I detest this shrinking of DP speed. But the Nvidia I was soon going to buy has gone from £800 to £1200! Bitcoins causing a shortage? Surely that's been going on for years now? [quote]That's funny... I only got 300 tasks today by pestering it. Otherwise, getting nothing. :D[/quote] I think it's just luck. With 7 computers asking, one of them will notice. When I see loads appear on the screen, I tell the rest to ask. That's why there are none left for you :-P [quote]Nice :D the 280X is a pretty good FP64 card, especially at those prices.[/quote] The 7970 is pretty much identical (5% slower) and much cheaper. I find those sometimes. |
©2024 Astroinformatics Group