Welcome to MilkyWay@home

Daily graphs of server_status

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,910,661
RAC: 12,491
Message 74473 - Posted: 17 Oct 2022, 15:06:48 UTC
Last modified: 17 Oct 2022, 15:23:11 UTC

I guess it's time to do daily postings of what the graph looks like for workunits and discussions(?):

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 74474 - Posted: 17 Oct 2022, 15:28:36 UTC - in response to Message 74473.  

I guess it's time to do daily postings of what the graph looks like for workunits and discussions(?):


5 million waiting for validation by the end of tomorrow?

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,910,661
RAC: 12,491
Message 74475 - Posted: 17 Oct 2022, 15:37:25 UTC - in response to Message 74474.  

I guess it's time to do daily postings of what the graph looks like for workunits and discussions(?):


5 million waiting for validation by the end of tomorrow?


I am going to omit the graph from the quote.

But yes, it does seem to be trending that way.

alk44
Joined: 2 Mar 20
Posts: 131
Credit: 315,523,238
RAC: 32,497
Message 74476 - Posted: 17 Oct 2022, 21:37:17 UTC - in response to Message 74473.  

Yes, it's obvious that the computers at Milkyway are not able to keep up with the WUs we are producing. I don't understand why they are not able to keep up, since we are seemingly not finishing any more now than when it suddenly stopped being able to keep up. The system always says everything is running, and I've heard very little from Tom on the subject.
Wish we knew what is wrong.

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,029,129
RAC: 37,181
Message 74477 - Posted: 17 Oct 2022, 21:43:27 UTC - in response to Message 74473.  

I guess it's time to do daily postings of what the graph looks like for workunits and discussions(?):
Yes, please do.

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 74486 - Posted: 18 Oct 2022, 15:01:11 UTC - in response to Message 74476.  

Yes, it's obvious that the computers at Milkyway are not able to keep up with the WUs we are producing. I don't understand why they are not able to keep up, since we are seemingly not finishing any more now than when it suddenly stopped being able to keep up. The system always says everything is running, and I've heard very little from Tom on the subject.
Wish we knew what is wrong.


The workunit generators don't generate tasks (wingman tasks or initial tasks) if the WU pools have more tasks than they should. So when the nbody pool had like 100k tasks in it, any tasks that you sent back were essentially put on hold because the WU generator wouldn't make any wingman tasks for validation until the pool was cleared.

It's not so much that the "computer can't keep up" as it is that the WU pool being overfilled is a nasty bug. I want to migrate to the new server hardware ASAP, but other people have to get things compiled and sorted out before that can happen. Currently, the server is slow because it is short on memory, so in that regard the computer can't keep up.
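
The gating behaviour described above can be sketched roughly as follows in Python. This is only an illustration: `WorkunitPool`, `max_unsent`, and `tasks_to_generate` are hypothetical names, not taken from the project's server code, and the real generator may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class WorkunitPool:
    """Hypothetical stand-in for a pool of unsent tasks (e.g. the nbody pool)."""
    unsent_tasks: int
    max_unsent: int  # assumed threshold above which generation pauses


def tasks_to_generate(pool: WorkunitPool, pending_wingman: int, pending_initial: int) -> int:
    """Return how many new tasks the generator would create on this pass.

    While the pool holds more tasks than its limit, nothing is generated,
    so returned results keep waiting for a wingman task.
    """
    if pool.unsent_tasks > pool.max_unsent:
        return 0
    # Assumption: below the limit, every pending request is served.
    return pending_wingman + pending_initial


# An overfilled pool (100k tasks against an assumed limit of 1,000) creates nothing.
overfilled = WorkunitPool(unsent_tasks=100_000, max_unsent=1_000)
print(tasks_to_generate(overfilled, pending_wingman=50, pending_initial=20))
```

In this sketch, results waiting for validation pile up for as long as the pool stays above its limit, even though every server process reports as running.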

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,910,661
RAC: 12,491
Message 74488 - Posted: 18 Oct 2022, 16:17:21 UTC

I guess it's that time for a daily graph of server_status:

alk44
Joined: 2 Mar 20
Posts: 131
Credit: 315,523,238
RAC: 32,497
Message 74489 - Posted: 18 Oct 2022, 16:55:48 UTC - in response to Message 74486.  

Yes, it's obvious that the computers at Milkyway are not able to keep up with the WUs we are producing. I don't understand why they are not able to keep up, since we are seemingly not finishing any more now than when it suddenly stopped being able to keep up. The system always says everything is running, and I've heard very little from Tom on the subject.
Wish we knew what is wrong.


The workunit generators don't generate tasks (wingman tasks or initial tasks) if the WU pools have more tasks than they should. So when the nbody pool had like 100k tasks in it, any tasks that you sent back were essentially put on hold because the WU generator wouldn't make any wingman tasks for validation until the pool was cleared.

It's not so much that the "computer can't keep up" as it is that the WU pool being overfilled is a nasty bug. I want to migrate to the new server hardware ASAP, but other people have to get things compiled and sorted out before that can happen. Currently, the server is slow because it is short on memory, so in that regard the computer can't keep up.



Thanks Tom, really appreciate the update info. Is the memory a type that we can help you with, or is it "special" and only something we can contribute money for you to buy? Or is it even worth it, since you are talking about migrating to a new server anyway? It certainly will be nice when everything is updated and running smoothly again.
Thanks for your great efforts!!

Allen

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,029,129
RAC: 37,181
Message 74490 - Posted: 18 Oct 2022, 18:23:29 UTC - in response to Message 74489.  

Yes, it's obvious that the computers at Milkyway are not able to keep up with the WUs we are producing. I don't understand why they are not able to keep up, since we are seemingly not finishing any more now than when it suddenly stopped being able to keep up. The system always says everything is running, and I've heard very little from Tom on the subject.
Wish we knew what is wrong.


The workunit generators don't generate tasks (wingman tasks or initial tasks) if the WU pools have more tasks than they should. So when the nbody pool had like 100k tasks in it, any tasks that you sent back were essentially put on hold because the WU generator wouldn't make any wingman tasks for validation until the pool was cleared.

It's not so much that the "computer can't keep up" as it is that the WU pool being overfilled is a nasty bug. I want to migrate to the new server hardware ASAP, but other people have to get things compiled and sorted out before that can happen. Currently, the server is slow because it is short on memory, so in that regard the computer can't keep up.



Thanks Tom, really appreciate the update info. Is the memory a type that we can help you with, or is it "special" and only something we can contribute money for you to buy? Or is it even worth it, since you are talking about migrating to a new server anyway? It certainly will be nice when everything is updated and running smoothly again.
Thanks for your great efforts!!

Allen
+1 What Allen said. How much $$ to bulk up the memory? If the memory is upgraded after the new server gets online, the old one can be used as a backup, or it can be run in parallel with the new one. Maybe they could split the load 50/50. I presume you are going to need more hardware for the multi-galaxy sims anyway, so there's also that....

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 74496 - Posted: 19 Oct 2022, 9:24:36 UTC - in response to Message 74490.  
Last modified: 19 Oct 2022, 9:44:56 UTC

Is it possible to track 2 items on a graph? It would be interesting to see Nbody average run time and total waiting for validation on the same graph. My own, probably misguided, view is that things have not been the same since we got these very long Nbody Simulation WUs.

I have messed around with the graphs, and going back 90 days there seems to be a massive spike in Nbody average run time that coincides with the growth of waiting for validation. Average Nbody run time is around 15 times higher than it used to be. I may be misguided in thinking the two are linked.
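
One way to put a number on how closely two such series move together is the Pearson correlation coefficient. A minimal Python sketch follows. The function name `pearson` and the sample values are made-up placeholders, not data from the server_status page, and a high coefficient would not by itself show that one series causes the other.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Placeholder samples only: average run time vs. count waiting for validation.
run_time = [1.0, 2.0, 3.0, 4.0]
waiting = [2.0, 4.0, 6.0, 8.0]
print(pearson(run_time, waiting))
```

Values near +1 mean the two series rise and fall together, values near 0 mean no linear relationship, and values near -1 mean one falls as the other rises.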

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,910,661
RAC: 12,491
Message 74498 - Posted: 19 Oct 2022, 16:14:07 UTC
Last modified: 19 Oct 2022, 16:14:23 UTC

A daily graph:


Seems we have a small recovery

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,910,661
RAC: 12,491
Message 74499 - Posted: 19 Oct 2022, 16:21:11 UTC - in response to Message 74496.  

Is it possible to track 2 items on a graph? It would be interesting to see Nbody average run time and total waiting for validation on the same graph. My own, probably misguided, view is that things have not been the same since we got these very long Nbody Simulation WUs.

I have messed around with the graphs, and going back 90 days there seems to be a massive spike in Nbody average run time that coincides with the growth of waiting for validation. Average Nbody run time is around 15 times higher than it used to be. I may be misguided in thinking the two are linked.


You'll need to add this query:


Then add this override:


Then you'll get this graph:

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 74500 - Posted: 19 Oct 2022, 16:47:29 UTC - in response to Message 74499.  

Thanks for that, Kiska, most informative. There seems to me to be a good correlation, especially latterly.

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 74501 - Posted: 19 Oct 2022, 18:33:00 UTC - in response to Message 74500.  
Last modified: 19 Oct 2022, 18:33:23 UTC

Thanks again, Kiska. Would it be possible to make that the daily graph, please?

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,910,661
RAC: 12,491
Message 74508 - Posted: 20 Oct 2022, 13:56:19 UTC

Here is the daily graph:


I am sure you'll note my weird posting times; that is because I do it when I get home from work :D

Another change is the addition of average run times on the right axis.

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 74509 - Posted: 20 Oct 2022, 14:19:17 UTC
Last modified: 20 Oct 2022, 14:19:29 UTC

Thanks Tom, really appreciate the update info. Is the memory a type that we can help you with, or is it "special" and only something we can contribute money for you to buy? Or is it even worth it, since you are talking about migrating to a new server anyway? It certainly will be nice when everything is updated and running smoothly again.
Thanks for your great efforts!!

Allen


The new server hardware has more than twice as much memory as the current server, so once we migrate it should be less of a problem. The problem is migrating...!

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,910,661
RAC: 12,491
Message 74510 - Posted: 20 Oct 2022, 14:27:47 UTC

Here is another graph I decided to add:

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,910,661
RAC: 12,491
Message 74511 - Posted: 20 Oct 2022, 14:28:46 UTC - in response to Message 74509.  

Thanks Tom, really appreciate the update info. Is the memory a type that we can help you with, or is it "special" and only something we can contribute money for you to buy? Or is it even worth it, since you are talking about migrating to a new server anyway? It certainly will be nice when everything is updated and running smoothly again.
Thanks for your great efforts!!

Allen


The new server hardware has more than twice as much memory as the current server, so once we migrate it should be less of a problem. The problem is migrating...!


Hope you're finding these graphs useful :D

btw sorry for the 1 request per minute to the server_status page
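
For anyone wanting to collect similar data, here is a minimal Python sketch of sampling a page at a fixed interval. How the graphs in this thread are actually collected is not stated, so `poll_status`, `fetch`, and the 60-second default are assumptions for illustration only. `fetch` stands in for whatever function retrieves the page.

```python
import time
from typing import Callable, List


def poll_status(
    fetch: Callable[[], str],
    samples: int,
    interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call `fetch` `samples` times, pausing `interval_s` seconds between calls."""
    results: List[str] = []
    for i in range(samples):
        results.append(fetch())
        # No pause after the final sample.
        if i < samples - 1:
            sleep(interval_s)
    return results
```

Passing `sleep` as a parameter lets the loop be exercised without real waiting, and keeping the interval at a minute or more keeps the load on the server small.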

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 74512 - Posted: 20 Oct 2022, 14:50:53 UTC - in response to Message 74508.  

Thanks, very useful.