Welcome to MilkyWay@home

Server Trouble


Wrend
Joined: 4 Nov 12
Posts: 96
Credit: 251,528,484
RAC: 0
200 million credit badge · 9 year member badge
Message 72601 - Posted: 8 Apr 2022, 19:14:50 UTC - in response to Message 72600.  

Likewise for me, having double precision optimized Titan Black GPUs, I prefer utilizing them with MW@H where they can do some good, so currently set up for 8 tasks in parallel on the 2 GPUs (4 each), when I can get them at least, and 8 Einstein tasks on the i7-3930K CPU. My computer is getting a bit long in the tooth, but still does a decent job of it.
ID: 72601 · Rating: 0

arcturus
Joined: 20 Nov 07
Posts: 50
Credit: 2,288,695
RAC: 0
2 million credit badge · 14 year member badge
Message 72602 - Posted: 8 Apr 2022, 20:37:55 UTC - in response to Message 72601.  

Is the mortality rate for GPUs any higher than CPUs when running tasks such as this? With their sky high price these days it's a costly piece of hardware to lose.
ID: 72602 · Rating: 0

Keith Myers
Joined: 24 Jan 11
Posts: 631
Credit: 487,577,796
RAC: 123,799
300 million credit badge · 11 year member badge · extraordinary contributions badge
Message 72605 - Posted: 8 Apr 2022, 21:58:04 UTC - in response to Message 72602.  

Is the mortality rate for GPUs any higher than CPUs when running tasks such as this? With their sky high price these days it's a costly piece of hardware to lose.

I've never lost a gpu in 20 years of using them. Only thing that can break is the cooling solution. And replacement fans or conversion to water blocks is the solution.
ID: 72605 · Rating: 0

Wrend
Joined: 4 Nov 12
Posts: 96
Credit: 251,528,484
RAC: 0
200 million credit badge · 9 year member badge
Message 72606 - Posted: 8 Apr 2022, 22:06:04 UTC - in response to Message 72602.  
Last modified: 8 Apr 2022, 22:18:15 UTC

I haven't used them nonstop for crunching the whole time, but my Titan Black cards have been in use for about... I guess 7 years now. In addition to the occasional dust removal, so far I've only had to replace the thermal paste on the GPUs once when they started overheating and throttling themselves. That did the trick and they're almost as good as new, dropping over 10°C under load.

Bearing that in mind, you probably want to keep a close eye on loads, voltage levels, temps, and fan speeds so you know what to expect from your cards and what you're willing to ask of them.

For now I've settled for having my cards run at about 80% load and 75% target power while dynamically (power load based) clocked down to 862MHz, with a fan speed ranging between 50 and 60%.

Of course your mileage may vary, but all things in moderation, as they say.
ID: 72606 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72610 - Posted: 8 Apr 2022, 22:31:03 UTC - in response to Message 72595.  

just for MW@H. That would probably kick some serious butt, but will need to wait and see if the project stabilizes regarding the recent spate of down times. I have only been here since November 2021, but have seen several major down times. Eventually it will get sorted out to the good,
MW was running smoothly for years before this recent problem.

I've got 4 GPUs on one machine - one plugged in normally, the other three just sat on the bookshelf above, connected with a quad USB riser into one PCI-E socket.

Romeo and Juliet!

https://imgur.com/a/Ah9uOh2
ID: 72610 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72611 - Posted: 8 Apr 2022, 22:35:55 UTC - in response to Message 72602.  

Is the mortality rate for GPUs any higher than CPUs when running tasks such as this? With their sky high price these days it's a costly piece of hardware to lose.
I buy old ones for $100 or less, often with busted display outputs, so they sell cheap, but I don't need the display. The R9 280X is the best card for MW. They last me 2 or 3 years. The gridcoins pay for that. CPUs never ever wear out.
ID: 72611 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72612 - Posted: 8 Apr 2022, 22:36:26 UTC - in response to Message 72605.  

Is the mortality rate for GPUs any higher than CPUs when running tasks such as this? With their sky high price these days it's a costly piece of hardware to lose.

I've never lost a gpu in 20 years of using them. Only thing that can break is the cooling solution. And replacement fans or conversion to water blocks is the solution.
What's the oldest GPU you've ever still had running?
ID: 72612 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72613 - Posted: 8 Apr 2022, 22:37:55 UTC - in response to Message 72606.  

I haven't used them nonstop for crunching the whole time, but my Titan Black cards have been in use for about... I guess 7 years now. In addition to the occasional dust removal, so far I've only had to replace the thermal paste on the GPUs once when they started overheating and throttling themselves. That did the trick and they're almost as good as new, dropping over 10°C under load.

Bearing that in mind, you probably want to keep a close eye on loads, voltage levels, temps, and fan speeds so you know what to expect from your cards and what you're willing to ask of them.

For now I've settled for having my cards run at about 80% load and 75% target power while dynamically (power load based) clocked down to 862MHz, with a fan speed ranging between 50 and 60%.

Of course your mileage may vary, but all things in moderation, as they say.
I run mine flat out (stock, not overclocked) and get 2 or 3 years at full blast 24/7 before they give up. But they were already several years old when I bought them. They could have been gamers' cards, could have been miners' cards; I don't often know the history.
ID: 72613 · Rating: 0

HRFMguy
Joined: 12 Nov 21
Posts: 188
Credit: 173,259,176
RAC: 2,021,316
100 million credit badge
Message 72614 - Posted: 9 Apr 2022, 0:44:18 UTC - in response to Message 72610.  

Romeo and Juliet!

Beautiful, and look very contented!
ID: 72614 · Rating: 0

HRFMguy
Joined: 12 Nov 21
Posts: 188
Credit: 173,259,176
RAC: 2,021,316
100 million credit badge
Message 72615 - Posted: 9 Apr 2022, 0:57:21 UTC - in response to Message 72610.  

MW was running smoothly for years before this recent problem.

I sure hope they can get back to that. Prof Heidi was talking about a king-sized multi-galaxy study in her recent video. I would be all over that, but it will take a smooth-running, well-oiled server suite to pull it off. Tom is spread way too thin...
ID: 72615 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72616 - Posted: 9 Apr 2022, 0:57:44 UTC - in response to Message 72614.  

Romeo and Juliet!

Beautiful, and look very contented!
I got them at 6 months old from a woman whose "brother's father" (don't ask!) used to breed cockatiels and died (I think, maybe it was a care home). They are already in love. I think they need to be 2 years old before they breed, though.
ID: 72616 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72617 - Posted: 9 Apr 2022, 0:58:39 UTC - in response to Message 72615.  

MW was running smoothly for years before this recent problem.

I sure hope they can get back to that. Prof Heidi was talking about a king-sized multi-galaxy study in her recent video. I would be all over that, but it will take a smooth-running, well-oiled server suite to pull it off. Tom is spread way too thin...
There's another new student here, Dylan, but I don't know how much of his time is on MW.
ID: 72617 · Rating: 0

.clair.
Joined: 3 Mar 13
Posts: 49
Credit: 766,158,330
RAC: 8,844
500 million credit badge · 9 year member badge
Message 72618 - Posted: 9 Apr 2022, 1:03:57 UTC - in response to Message 72612.  
Last modified: 9 Apr 2022, 1:07:40 UTC

Is the mortality rate for GPUs any higher than CPUs when running tasks such as this? With their sky high price these days it's a costly piece of hardware to lose.

I've never lost a gpu in 20 years of using them. Only thing that can break is the cooling solution. And replacement fans or conversion to water blocks is the solution.
What's the oldest GPU you've ever still had running?

I have an HD4650 {AGP slot and XP32} running Moo! Wrapper that runs cool and overclocked {tick, tick, tick, boom...}
But here on MW I had two 7970s cook. I always kept temps below 70°C, though they were the single-fan `hair drier` kind; now I prefer to keep temps below 60°C for longer life.
The bottom one of a set of three in the mobo slots was a twin-fan card {still works OK}; now I won't buy any single-fan `coffin` coolers on high-power cards.
And replacement fans from eBay China have always worked for me {most fit OK}.
ID: 72618 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72619 - Posted: 9 Apr 2022, 1:12:40 UTC - in response to Message 72618.  
Last modified: 9 Apr 2022, 1:14:45 UTC

I have an HD4650 {AGP slot and XP32} running Moo! Wrapper that runs cool and overclocked {tick, tick, tick, boom...}
I had something like that and it would only run SETI. While replacing the fan my screwdriver slipped and ripped a gouge in the circuit board. I threw it out.

But here on MW I had two 7970s cook. I always kept temps below 70°C, though they were the single-fan `hair drier` kind; now I prefer to keep temps below 60°C for longer life.
The bottom one of a set of three in the mobo slots was a twin-fan card {still works OK}; now I won't buy any single-fan `coffin` coolers on high-power cards.
And replacement fans from eBay China have always worked for me {most fit OK}.
I have one 7970 and five 280X (same chip). I set the fans to be off at 50C, 50% at 70C and 100% at 80C (for quietness), with a graph slope in between so it's gradual. There's also no speed drop until a 9C difference, so I don't hear them speeding up and down too much. They usually sit at about 73C. I have the same fan speeds set for CPUs.

I replace fans with cheap high speed PWM 80mm case fans from Evercool and just strap them onto the GPU. They actually run quieter.
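
For anyone wanting to reproduce a fan curve like the one described above, here is a rough sketch in Python. It only models the numbers quoted in this post (off at 50C, 50% at 70C, 100% at 80C, straight lines in between, and no slowdown until the temperature has fallen 9C). The names `curve_speed` and `FanController` are made up for illustration, and the way the 9C rule is interpreted here is an assumption; real fan control tools may handle hysteresis differently.

```python
# Curve points quoted in the post: (temperature in C, fan speed in %).
CURVE = [(50.0, 0.0), (70.0, 50.0), (80.0, 100.0)]
# Assumed meaning of "no speed drop until 9C difference".
HYSTERESIS_C = 9.0


def curve_speed(temp_c):
    """Fan speed (%) from the curve alone, linearly interpolated."""
    if temp_c <= CURVE[0][0]:
        return CURVE[0][1]
    if temp_c >= CURVE[-1][0]:
        return CURVE[-1][1]
    for (t0, s0), (t1, s1) in zip(CURVE, CURVE[1:]):
        if t0 <= temp_c <= t1:
            return s0 + (s1 - s0) * (temp_c - t0) / (t1 - t0)


class FanController:
    """Hypothetical controller: speeds up at once, slows down lazily."""

    def __init__(self):
        self.ref_temp = None  # temperature that set the current speed
        self.speed = 0.0

    def update(self, temp_c):
        rising = self.ref_temp is None or temp_c >= self.ref_temp
        # Only follow the curve downwards once the temperature has
        # dropped by the full hysteresis margin.
        if rising or self.ref_temp - temp_c >= HYSTERESIS_C:
            self.ref_temp = temp_c
            self.speed = curve_speed(temp_c)
        return self.speed
```

With these numbers, a card sitting at 73C would land at 65% fan speed, and the fan would hold that speed until the card cooled to 64C or heated up further.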
ID: 72619 · Rating: 0

Cavalary
Joined: 23 Aug 11
Posts: 21
Credit: 8,499,980
RAC: 9,952
5 million credit badge · 11 year member badge
Message 72620 - Posted: 9 Apr 2022, 1:16:34 UTC

Yep, something seems to have been fixed, got WUs 3h ago too.
ID: 72620 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72621 - Posted: 9 Apr 2022, 1:17:56 UTC
Last modified: 9 Apr 2022, 1:19:23 UTC

Fan curve: https://imgur.com/a/GS1Cebb

4 GPUs on one machine :-) https://www.dropbox.com/s/8af33dmzkdi21n4/4gpu.jpg?dl=0 Running cooler than usual as they're on Folding not MW.
ID: 72621 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72622 - Posted: 9 Apr 2022, 1:21:33 UTC - in response to Message 72620.  

Yep, something seems to have been fixed, got WUs 3h ago too.
I did not. This is at 11:53pm UTC:

231	Milkyway@Home	09-04-2022 12:53 AM	Sending scheduler request: To fetch work.	
232	Milkyway@Home	09-04-2022 12:53 AM	Requesting new tasks for AMD/ATI GPU	
233	Milkyway@Home	09-04-2022 12:53 AM	Scheduler request completed: got 0 new tasks	
234	Milkyway@Home	09-04-2022 12:53 AM	Project requested delay of 91 seconds
ID: 72622 · Rating: 0

Wrend
Joined: 4 Nov 12
Posts: 96
Credit: 251,528,484
RAC: 0
200 million credit badge · 9 year member badge
Message 72624 - Posted: 9 Apr 2022, 3:23:54 UTC

I've been able to get a few more batches by requesting updates here and there during the day, but more recently, within the past couple of hours or so, I seem to be getting them more consistently and automatically.
ID: 72624 · Rating: 0

arcturus
Joined: 20 Nov 07
Posts: 50
Credit: 2,288,695
RAC: 0
2 million credit badge · 14 year member badge
Message 72625 - Posted: 9 Apr 2022, 3:26:40 UTC

Thanks for the GPU longevity answers. I'll stick with more worry-free CPUs only. That's one thing I like about World Community Grid: it's CPU only.
ID: 72625 · Rating: 0

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 731
Credit: 330,035,975
RAC: 208,504
300 million credit badge · 11 year member badge
Message 72626 - Posted: 9 Apr 2022, 3:38:23 UTC - in response to Message 72625.  
Last modified: 9 Apr 2022, 3:44:01 UTC

Thanks for the GPU longevity answers. I'll stick with more worry-free CPUs only. That's one thing I like about World Community Grid: it's CPU only.
The amount of work a GPU can do is tremendous compared with CPUs. Just buy old, cheap second-hand ones, then it doesn't annoy you if they die. Plus, you don't need to buy a MB, RAM, disk, etc. for them. I get ones which have faulty outputs really cheap. You don't need a display if they're the second card, or if you access the machine remotely.

WCG does actually give out covid GPU tasks occasionally. I'm expecting a lot of tasks when they come back online though, since the scientists will be starved of results - they were previously limiting GPU work as the scientists couldn't keep up.

And if you want to do a lot of biology, consider Folding@home on AMD GPUs. It's not Boinc, but it's easy enough to set up. If you have Nvidia GPUs, you can use GPUGrid on Boinc.

I told my 4 GPU machine to ask MW for work every 2 minutes and I'm getting quite a bit :-) Sorry Tom. If you can fix the annoying problem of not being able to send and receive at once, I'll stop doing that....
ID: 72626 · Rating: 0

©2022 Astroinformatics Group