Server Downtime March 28, 2022 (12 hours starting 00:00 UTC)
Author | Message |
---|---|
Send message Joined: 10 Apr 19 Posts: 408 Credit: 120,203,200 RAC: 0 |
“No no no, please don't set that, something more sensible please? Unless you actually have that amount of RAM it's not going to help.” Yeah, I was surprised when I saw that it was already set to that value. Maybe in the future I'll set it to 1 GB in order to moderate the amount of memory allocated to the feeder and scheduler pool? I just cycled a bunch of unsent WUs that were waiting to get sent, which should reduce the load on the server after they get deleted from the DB. I will try turning off the WU generators if the number of WUs waiting to get sent doesn't decrease after the server status page updates. |
Send message Joined: 10 Apr 19 Posts: 408 Credit: 120,203,200 RAC: 0 |
Tom - thanks for the clarification. I think it's supposed to update every 30 min, or at least every hour. It used to update pretty frequently. I think it just takes forever for the DB queries that generate it to go through because of how slow things are at the moment. |
Send message Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
@ Tom: Nothing has changed regarding the unfindable 600 tasks in progress. Still not getting any GPU work ... It is late - "see" you tomorrow! |
Send message Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
“It looks like those 600 tasks in progress, but nowhere to be seen in BOINC Manager, are tasks solely with a minimum quorum of one.” What should they be? I can't access most of mine, it says "can't find workunit", but those I can access say "minimum quorum" 1, but "initial replication" 2. If they only need 1, why make 2? |
Send message Joined: 16 Mar 10 Posts: 213 Credit: 108,358,578 RAC: 4,707 |
To those of you commenting on tasks that the MilkyWay site says are "In Progress" but aren't showing up in BOINC Manager (or equivalent)... They are tasks that have got "orphaned" because of network issues -- if you check your BOINC log for around the time the MW site claims the tasks were sent you'll probably find there's a request for work that failed on a timeout. For what it's worth, one of my machines currently has 195 In Progress according to the MW site but BOINC Manager shows 131; the other is alleged to have 117 but actually has 100! In both cases, I can associate all the "lost" tasks with attempted connections that timed out. I don't know whether these can be picked up again if one resets the project; in theory, the server should be able to determine what tasks are "lost" as the result of the reset and send them again (up to 12 at a time, I think). However, when I tried that with one of my machines a few days ago I only got new tasks when the reset completed; that might've been because of the server issues, though. I'll try again next time I run out of work (unless someone else beats me to a conclusive test now the server is behaving somewhat better...) If they can't be re-claimed by a reset, I guess we're stuck with them in the same way some of us have odd tasks left over (in various states) from the database record renumbering exercise of early 2021 :-) Cheers - Al. |
Send message Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
@ Peter: “It looks like those 600 tasks in progress, but nowhere to be seen in BOINC Manager, are tasks solely with a minimum quorum of one. What should they be? I can't access most of mine, it says "can't find workunit", but those I can access say "minimum quorum" 1, but "initial replication" 2. If they only need 1, why make 2?” Haven't tried to access all of the 600. They are distributed over several rigs. I checked about 50-60 of them and they all have a "minimum quorum" and "initial replication" of 1 (one). Strange - hope Tom can fix this. |
Send message Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
@ alanb1951: “To those of you commenting on tasks that the MilkyWay site says are "In Progress" but aren't showing up in BOINC Manager (or equivalent)...” Thanks for the interesting info! But what do you mean by "BOINC log"? Do you mean the "Event log"? It gets cleared when you exit BOINC. I can't find any other log anywhere. I did a "Reset", but nothing has changed. My other (main) problem seems to be that I am still unable to get tasks. The "update" request gives the "Event log" message "3/30/2022 8:52:03 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks" repeatedly. Hope Tom can do something about this. Have a nice day! |
Send message Joined: 13 Apr 17 Posts: 256 Credit: 604,411,638 RAC: 0 |
I'm completely at a loss. Restarted BOINC "I don't know how many times" and suddenly I am getting tasks on all rigs without having changed anything. GREAT! Sounds like magic ... |
Send message Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
@ Peter: I've looked at more and the replication is only 2 on some. I guess that was a mistake or one got lost as per earlier messages in here. |
Send message Joined: 11 Jul 20 Posts: 2 Credit: 35,499,678 RAC: 324 |
Good day. I want to ask if anyone has the same problem with their account as I do. Since 28 Mar no credit has been added to my account for the finished work I have submitted. All finished work goes to the folder waiting for validation and then moves to the folder partial validation. Since the same date my average credit has not changed either, although the average credit should decrease when no credit is being applied. |
Send message Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
“Good day.” It's not you, the server is overloaded after a disk problem. Credit will appear when it's caught up. Keep an eye on the server status page at the 3 million tasks not yet validated. When those get done, you should get credit. |
Send message Joined: 21 Feb 22 Posts: 66 Credit: 817,008 RAC: 0 |
Monty - it isn't you, it is the system. The server lost a hard drive and was limping along for a while, and now that there is a new hard drive in the system, it will take some time for it to recover and catch up on all the validating. The system did manage to go from over 4 million waiting for validation to 3,088,742 now, so I'll take that as a good sign of things moving in the right direction. I seem to be getting WUs in a random way. I do a manual request here and there so the time between requests doesn't get too long, but otherwise I haven't restarted BOINC or my computer. I've got some now, so I'll hope the system is on the mend after all the stuff Tom did the other day. I do think the one poster was right and that the DB is overloaded and slow, which is why there are WUs but we don't always get them when we ask. |
Send message Joined: 11 Jul 20 Posts: 2 Credit: 35,499,678 RAC: 324 |
Thank you for the explanation |
Send message Joined: 8 Nov 11 Posts: 205 Credit: 2,900,464 RAC: 0 |
As far as I can see validation is still at least 9-10 days behind. Although my numbers are low compared to others, my Valid Inconclusive number is steadily increasing, covering almost the whole of March. I am still bemused that the volume of WU’s ready to send is up around 18 million; I thought this was going to be stopped until things stabilised. Surely it would be better to stop generating new WU’s until things have caught up? |
Send message Joined: 31 Mar 12 Posts: 96 Credit: 152,502,177 RAC: 0 |
“@ Peter: I've looked at more and the replication is only 2 on some. I guess that was a mistake or one got lost as per earlier messages in here.” It's not a mistake, it's a feature of BOINC. Some "Workunits" get chosen to replicate 2 "tasks"; others may only have 1 "task". Source: https://boinc.berkeley.edu/trac/wiki/ValidationSummary#Adaptivereplication and https://boinc.berkeley.edu/trac/wiki/AdaptiveReplication |
Send message Joined: 31 Mar 12 Posts: 96 Credit: 152,502,177 RAC: 0 |
“As far as I can see validation is still at least 9-10 days behind. Although compared to others in numbers mine are low, my Valid Inconclusive number is steadily increasing covering almost the whole of March.” It has, according to my stats. You can view this dashboard at: https://grafana.kiska.pw/goto/hA52pAy7k?orgId=1 |
Send message Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
“As far as I can see validation is still at least 9-10 days behind. Although compared to others in numbers mine are low, my Valid Inconclusive number is steadily increasing covering almost the whole of March.” Boinc server software, like the client software, is a cobbled together piece of crap. It's not easy to control what it does. |
Send message Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
WTF? Boinc being intelligent? “@ Peter: I've looked at more and the replication is only 2 on some. I guess that was a mistake or one got lost as per earlier messages in here.” |
Send message Joined: 8 Nov 11 Posts: 205 Credit: 2,900,464 RAC: 0 |
“As far as I can see validation is still at least 9-10 days behind. Although compared to others in numbers mine are low, my Valid Inconclusive number is steadily increasing covering almost the whole of March.” “Boinc server software, like the client software, is a cobbled together piece of crap. It's not easy to control what it does.” Love the explanation! |
Send message Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
:-) If you say anything like that on the main boinc forums you get banned. I've been banned 38 times. “As far as I can see validation is still at least 9-10 days behind. Although compared to others in numbers mine are low, my Valid Inconclusive number is steadily increasing covering almost the whole of March.” “Boinc server software, like the client software, is a cobbled together piece of crap. It's not easy to control what it does.” |
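
For anyone who wants to check how often their scheduler requests came back empty, here is a rough Python sketch that tallies such replies from lines copied out of the Event log. It assumes every line has the same `timestamp | project | message` shape as the one Event log line quoted earlier in the thread. The function name `summarize_scheduler_replies` and the third sample line are made up for illustration, and other BOINC versions may word the message differently.

```python
import re
from collections import Counter

# Assumed line shape, taken from the single Event log line quoted in the thread:
#   "3/30/2022 8:52:03 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks"
LINE_RE = re.compile(
    r"^(?P<stamp>[^|]+?)\s*\|\s*(?P<project>[^|]*?)\s*\|\s*(?P<message>.*)$"
)
# Assumption: the reply message always reports the number of tasks this way.
GOT_RE = re.compile(r"Scheduler request completed: got (\d+) new tasks?")


def summarize_scheduler_replies(lines):
    """Count scheduler replies per project, split into empty and non-empty ones."""
    summary = Counter()
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if not parsed:
            continue  # not in the assumed three-field format
        got = GOT_RE.search(parsed.group("message"))
        if not got:
            continue  # some other kind of Event log message
        kind = "empty" if int(got.group(1)) == 0 else "with_work"
        summary[(parsed.group("project"), kind)] += 1
    return summary


sample = [
    "3/30/2022 8:52:03 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks",
    "3/30/2022 8:57:10 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks",
    # Hypothetical line, added only so the sample has one non-empty reply.
    "3/30/2022 9:02:31 AM | Milkyway@Home | Scheduler request completed: got 12 new tasks",
]

for (project, kind), count in sorted(summarize_scheduler_replies(sample).items()):
    print(project, kind, count)
```

A long run of "empty" replies while the server status page still lists work ready to send would fit the overloaded-database explanation given above, though it does not prove it.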
©2024 Astroinformatics Group