Welcome to MilkyWay@home

Server Downtime March 28, 2022 (12 hours starting 00:00 UTC)

Bull Schuck
Joined: 20 Feb 22
Posts: 12
Credit: 16,836,989
RAC: 0
Message 72560 - Posted: 7 Apr 2022, 16:31:22 UTC - in response to Message 72557.  

From what I can see, it looks like the long, long list of n-body simulations is what is sticking in queue for other folks to process for validation. Is the system set up to run through all of them as a first pass before sending out follow-up WUs? Just trying to figure out why my Macs, which are only processing Separation tasks, have about 25 - 50% of tasks in Validation Inconclusive but my Windows laptop, running almost all N-body simulations, has > 90% of tasks in Validation Inconclusive.

To all the folks who have been here a while and are crunching heavy numbers, I tip my hat. To the folks who are actually running this operation, like Tom, you have my deepest thanks. I stopped my Physics education with my BA. The ability for folks to progress beyond that is something I can only marvel at.
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 72561 - Posted: 7 Apr 2022, 18:05:31 UTC

Oh that's very interesting that there's a difference between separation tasks and n-body tasks in validation inconclusive. There isn't a priority for one application over the other, but N-body only runs on CPU, which can muck things up sometimes. I wonder if there's a problem somewhere in n-body that's causing this...
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 72562 - Posted: 7 Apr 2022, 18:07:57 UTC

Seems the main application processes crashed when I cleared out the stale DB processes. I've brought them back up now.
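For readers wondering what "clearing out stale DB processes" could look like in practice, here is a minimal sketch. It is not the project's actual procedure. It assumes a MySQL-style database, rows shaped like the output of SHOW PROCESSLIST, and a made-up definition of "stale" (a query running longer than a threshold). The function names, the 600-second cutoff, and the sample rows are all hypothetical.

```python
from typing import Dict, List


def stale_query_ids(processlist: List[Dict], max_seconds: int = 600) -> List[int]:
    """Return the ids of queries that have been running longer than max_seconds.

    Each row is assumed to carry the "Id", "Command" and "Time" columns
    that SHOW PROCESSLIST reports. The 600 s default is an arbitrary
    illustrative cutoff, not a value taken from the project.
    """
    stale = []
    for row in processlist:
        # "Sleep" rows are idle connections, not hung queries, so skip them.
        if row["Command"] == "Query" and row["Time"] > max_seconds:
            stale.append(row["Id"])
    return stale


def kill_statements(ids: List[int]) -> List[str]:
    """Build (but do not execute) one KILL statement per stale query id."""
    return [f"KILL {query_id};" for query_id in ids]


# Hypothetical sample rows, invented for this example.
rows = [
    {"Id": 11, "Command": "Query", "Time": 4200},
    {"Id": 12, "Command": "Sleep", "Time": 9000},
    {"Id": 13, "Command": "Query", "Time": 3},
]

print(kill_statements(stale_query_ids(rows)))
```

The sketch only prints the statements instead of running them, so an administrator could review the list first. As the post above suggests, killing queries can take dependent server processes down with them, so those may need restarting afterwards.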
alanb1951
Joined: 16 Mar 10
Posts: 211
Credit: 108,188,090
RAC: 4,959
Message 72563 - Posted: 7 Apr 2022, 18:23:28 UTC - in response to Message 72562.  
Last modified: 7 Apr 2022, 18:33:27 UTC

> Seems the main application processes crashed when I cleared out the stale DB processes. I've brought them back up now.


Yup - I had just finished preparing a post about that and decided to check whether anyone else had beaten me to it - all that typing for nothing :-)

Thanks for sorting it out...

Cheers - Al.
arcturus
Joined: 20 Nov 07
Posts: 54
Credit: 2,663,789
RAC: 0
Message 72564 - Posted: 7 Apr 2022, 19:23:20 UTC

Seeing the word 'crashes' quite a bit lately. Appears there's work to be done in post hard drive failure recovery.
Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 72565 - Posted: 7 Apr 2022, 21:41:55 UTC

Turned off the n-body workunit generator, but left the separation workunit generator on. I'll check on the numbers again tomorrow and see if I want to turn the n-body workunit generator back on at that point.
HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,038,236
RAC: 0
Message 72566 - Posted: 7 Apr 2022, 22:02:18 UTC - in response to Message 72565.  

> Turned off the n-body workunit generator, but left the separation workunit generator on. I'll check on the numbers again tomorrow and see if I want to turn the n-body workunit generator back on at that point.

Agreed. Based on this chart, I think you can leave it off for a few days....

Bill F
Joined: 4 Jul 09
Posts: 92
Credit: 17,303,727
RAC: 2,383
Message 72568 - Posted: 7 Apr 2022, 23:32:27 UTC - in response to Message 72549.  

I think that you should stop guessing.... you sound very negative and cold in the view of life.

Who pays you to volunteer?

Bill F
In October of 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.


Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 72570 - Posted: 8 Apr 2022, 6:16:46 UTC - in response to Message 72556.  

> I just killed some stale processes in the DB. It seems like some queries get hung up because the WU tables are quite large at the moment, and then they slow everything down because they lock certain parts of the DB while they run. Killing them every so often seems to help. I had been trying to avoid doing that to see if the server could handle things on its own at this point, but apparently it still needs the regular kicks now and then to keep functioning as per normal.
Boinc really needs some paid programmers to sort this crap out. Their software (server and client side) is a piece of shit.

> Sorry about the validation backlog and the slowness, hopefully what I just did speeds things up a bit. I'll keep killing stale tasks regularly until the validation backlog AND the WUs ready to send both decrease.
Thanks for your help.

> As for the pay, if you donate to milkyway, none of that goes into my (or any other milkyway personnel's) pocket. The funds are used strictly for project upgrades and development. Our salaries are funded by grants and/or teaching, not volunteer donations. We don't ask for donations because they make us richer, they just help us make the project better. Just wanted to clear that up!
Thanks. So everything we donate will go to server hardware?
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 72571 - Posted: 8 Apr 2022, 6:18:14 UTC - in response to Message 72558.  

>>> Also for a system administrator working full time it's roughly a $50k starting position
>> Lucky you, I got $26K, after 5 years. The UK doesn't pay workers very well. I assume you're talking US dollars?

> I live in Australia, and for us "junior system admin" is a $50-80k per year salary depending on experience. Also I was quoting Australian dollar. So USD would be $37k starting
Still more than rip-off Britain.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 72572 - Posted: 8 Apr 2022, 6:23:06 UTC - in response to Message 72560.  

> From what I can see, it looks like the long, long list of n-body simulations is what is sticking in queue for other folks to process for validation. Is the system set up to run through all of them as a first pass before sending out follow-up WUs? Just trying to figure out why my Macs, which are only processing Separation tasks, have about 25 - 50% of tasks in Validation Inconclusive but my Windows laptop, running almost all N-body simulations, has > 90% of tasks in Validation Inconclusive.

> To all the folks who have been here a while and are crunching heavy numbers, I tip my hat. To the folks who are actually running this operation, like Tom, you have my deepest thanks. I stopped my Physics education with my BA. The ability for folks to progress beyond that is something I can only marvel at.
+1 I have a BSc Hons in Physics and wish I'd continued.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 72573 - Posted: 8 Apr 2022, 6:27:56 UTC - in response to Message 72564.  

> Seeing the word 'crashes' quite a bit lately. Appears there's work to be done in post hard drive failure recovery.
What needs to be done is to kick the "programmers" at Boinc up the behind.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 72574 - Posted: 8 Apr 2022, 6:29:50 UTC - in response to Message 72568.  

> I think that you should stop guessing.... you sound very negative and cold in the view of life.

> Who pays you to volunteer?
If you want to argue with me, you'll need to provide more specific queries.
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 72576 - Posted: 8 Apr 2022, 10:39:43 UTC - in response to Message 72573.  

>> Seeing the word 'crashes' quite a bit lately. Appears there's work to be done in post hard drive failure recovery.

> What needs to be done is to kick the "programmers" at Boinc up the behind.


Unfortunately they don't get paid to fix Boinc, it's ALL volunteer and though they do a decent job of keeping things going they are not putting forth much effort into upgrading Boinc to the next level.

And since I did not go to College at all I am glad to see all you guys that have and the degrees you guys have is humbling
Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,900,464
RAC: 0
Message 72584 - Posted: 8 Apr 2022, 14:29:36 UTC

None of my N-body Simulation tasks seem to be getting validated. Is anyone getting theirs validated?
Robert Coplin
Joined: 23 Sep 13
Posts: 19
Credit: 36,223,867
RAC: 0
Message 72589 - Posted: 8 Apr 2022, 15:00:52 UTC - in response to Message 72584.  

I've only had about 10 of my N-body work units validated, all of them created on March 6th, and I still have over 9,400 more N-body work units waiting to be validated.
Bull Schuck
Joined: 20 Feb 22
Posts: 12
Credit: 16,836,989
RAC: 0
Message 72590 - Posted: 8 Apr 2022, 15:36:03 UTC

So it looks like a) the N-body tasks are getting sent out (I just got ~70), and for the first time in a while, they are not all _0 tasks. I am seeing _1 and _2 tasks, meaning that credit may be around the corner!
AND b) the Separation tasks are not getting sent out reliably. I am not seeing any and my queue has run dry.

I recall Tom pointing out that he was going to suspend the N-body tasks, did the wrong switch get flipped? I've done that before.

Please don't take this as complaining. Nobody "owes" my computers tasks or credit. Just trying to capture the situation today.
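The _0, _1 and _2 suffixes mentioned in the post above can be read off a task name mechanically. Here is a minimal sketch, assuming BOINC's usual naming, where a task name is the workunit name followed by an underscore and a replica number. The task names below are hypothetical, not real MilkyWay@home tasks, and the function names are made up for illustration.

```python
from typing import Iterable


def replica_index(task_name: str) -> int:
    """Return the trailing replica number of a task name such as 'wu_name_2'."""
    base, sep, suffix = task_name.rpartition("_")
    if not sep or not base or not suffix.isdigit():
        raise ValueError(f"no replica suffix in {task_name!r}")
    return int(suffix)


def count_nonzero_replicas(task_names: Iterable[str]) -> int:
    """Count tasks whose replica number is greater than zero."""
    return sum(1 for name in task_names if replica_index(name) > 0)


# Hypothetical task names, invented for this example.
tasks = [
    "example_nbody_wu_0001_0",
    "example_nbody_wu_0002_1",
    "example_nbody_wu_0003_2",
]

print(count_nonzero_replicas(tasks))
```

A nonzero suffix only says the task is not the first copy of its workunit. Whether it is a follow-up sent for validation depends on how many initial copies the project issues per workunit, which this sketch does not know.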
Cavalary
Joined: 23 Aug 11
Posts: 33
Credit: 11,062,253
RAC: 0
Message 72596 - Posted: 8 Apr 2022, 16:28:14 UTC

Yep, I just noticed that it's been over a day since I got any more tasks too. I'm only running Separation and only on CPU. The server message says GPU tasks are available, but nothing seems to be coming for CPU.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 72607 - Posted: 8 Apr 2022, 22:19:27 UTC - in response to Message 72576.  

>> What needs to be done is to kick the "programmers" at Boinc up the behind.
> Unfortunately they don't get paid to fix Boinc, it's ALL volunteer and though they do a decent job of keeping things going they are not putting forth much effort into upgrading Boinc to the next level.
There are upgrades and I have them (if you run windows I can give you them). David Anderson is refusing to release an official version. Some people are trying to persuade him....

> And since I did not go to College at all I am glad to see all you guys that have and the degrees you guys have is humbling
If it makes you feel any better, my degree never did me any good. Couldn't find a job in physics, got IT jobs based solely on my hobby!
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 72608 - Posted: 8 Apr 2022, 22:23:18 UTC - in response to Message 72590.  

> So it looks like a) the N-body tasks are getting sent out (I just got ~70), and for the first time in a while, they are not all _0 tasks. I am seeing _1 and _2 tasks, meaning that credit may be around the corner!
> AND b) the Separation tasks are not getting sent out reliably. I am not seeing any and my queue has run dry.
Same here. I've just connected FOUR GPUs to one machine and unfortunately that machine hates Folding at Home (don't know why, it crashes the AMD driver), so I was going to try Milkyway on the GPUs which seems to be easier on the driver, but there are none, I'm having to run collatz on it!

> I recall Tom pointing out that he was going to suspend the N-body tasks, did the wrong switch get flipped? I've done that before.
So have I and it caused an explosion, at least with computers you can't start fires.

> Please don't take this as complaining. Nobody "owes" my computers tasks or credit. Just trying to capture the situation today.
Indeed. I'm happy to run whatever is available and credit comes whenever. I just use credit to see how well I'm doing, it's also a nice indicator of if a computer is playing up.
©2024 Astroinformatics Group