Welcome to MilkyWay@home

Validation Pending too many tasks

Septimus

Joined: 8 Nov 11
Posts: 205
Credit: 2,882,853
RAC: 272
Message 74427 - Posted: 12 Oct 2022, 15:48:29 UTC - in response to Message 74426.  

Now it's 3690194... not sure who is getting WUs validated; it certainly is not me.

San-Fernando-Valley
Joined: 13 Apr 17
Posts: 256
Credit: 604,411,638
RAC: 0
Message 74428 - Posted: 12 Oct 2022, 17:34:00 UTC

Response times are back "in the cellar".

HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,029,129
RAC: 37,181
Message 74429 - Posted: 12 Oct 2022, 19:03:53 UTC

Cheer up folks! It's only gonna get worse! The dead time between separation GPU reloads is gone! Wohoo! I seem to be getting new work at about 125 or so left to finish. Noice, so to speak.

AndreyOR
Joined: 13 Oct 21
Posts: 43
Credit: 225,019,025
RAC: 9,810
Message 74432 - Posted: 12 Oct 2022, 20:59:08 UTC - in response to Message 74429.  

Really? I haven't seen it yet, and I just had to manually request tasks as the queue emptied out. Even with an empty queue I haven't been getting the max 300 tasks like before; I just got 224, and even fewer before that.

Speedy51
Joined: 12 Jun 10
Posts: 57
Credit: 6,163,587
RAC: 156
Message 74434 - Posted: 13 Oct 2022, 0:25:15 UTC

I believe that being sent over 200 _0 GPU tasks doesn't help clear the queue; in fact, it makes the queue longer.
    I believe the best way to clear the waiting backlog is to:
    Stop creating new tasks and allow the queue(s) to clear
    Then allow new tasks to be created again so that work can be validated


HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,029,129
RAC: 37,181
Message 74435 - Posted: 13 Oct 2022, 1:43:52 UTC - in response to Message 74432.  

Well krap. Just watched it count down to zero and had to tickle it to send more work. Earlier today I was characterizing the AMD GPU by changing the app config file. Maybe that is what did it. I must confess, I did go all happy feet over it. At least I got the 300 WUs.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,943,471
RAC: 22,405
Message 74437 - Posted: 13 Oct 2022, 12:08:50 UTC - in response to Message 74434.  

It would also help if they could put any resends next in line to be sent out instead of at the end of the now very long queue. Maybe put resends in their own folder and tell the server to send those out before going back to the normal queue to send out tasks.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,943,471
RAC: 22,405
Message 74438 - Posted: 13 Oct 2022, 12:11:32 UTC - in response to Message 74434.  

It would also help if they could put any non-_0 task next in line to be sent out instead of at the end of the now very long queue. Maybe put any task with a non-_0 suffix in its name in its own folder and tell the server to send those out before going back to the normal folder to send out tasks.
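
As a rough illustration of the idea above, here is a minimal Python sketch of a dispatcher that keeps resends in their own queue and drains it first. It is not BOINC server code: the class, function, and task names are hypothetical, and the only assumption taken from this thread is that a first copy of a task ends in _0 while later copies end in a higher number.

```python
from collections import deque


def is_resend(task_name: str) -> bool:
    """Treat names ending in _1, _2, ... as resends; _0 is the first copy (assumed naming)."""
    suffix = task_name.rsplit("_", 1)[-1]
    return suffix.isdigit() and int(suffix) > 0


class TwoQueueDispatcher:
    """Hypothetical dispatcher: resends are always handed out before new work."""

    def __init__(self):
        self.resends = deque()  # tasks with a non-_0 suffix
        self.normal = deque()   # first-copy (_0) tasks, in creation order

    def add(self, task_name: str) -> None:
        target = self.resends if is_resend(task_name) else self.normal
        target.append(task_name)

    def next_task(self):
        # Drain the resend queue first, then fall back to the normal queue.
        if self.resends:
            return self.resends.popleft()
        if self.normal:
            return self.normal.popleft()
        return None


dispatcher = TwoQueueDispatcher()
for name in ["wu_a_0", "wu_b_0", "wu_a_1"]:  # made-up task names
    dispatcher.add(name)
print(dispatcher.next_task())  # wu_a_1: the resend jumps ahead of both _0 tasks
```

With a single queue, wu_a_1 would only go out after every _0 task created before it.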

Speedy51
Joined: 12 Jun 10
Posts: 57
Credit: 6,163,587
RAC: 156
Message 74440 - Posted: 13 Oct 2022, 20:48:59 UTC - in response to Message 74438.  

Another way to achieve this is to give anything higher than _0 a shorter deadline. In theory, this should push it to the front of your processing queue.

alanb1951
Joined: 16 Mar 10
Posts: 208
Credit: 105,442,501
RAC: 36,767
Message 74441 - Posted: 13 Oct 2022, 20:49:17 UTC - in response to Message 74438.  
Last modified: 13 Oct 2022, 20:57:00 UTC

There is an option to the feeder that allows for "priority" tasks to get precedence, but priority is apparently only assigned to work units that have overdue results (which, I think, includes "Not started by deadline" as well as "No Reply") so it wouldn't apply here.

However, there is also an option to use priority then work-unit id -- theoretically that would help clear out the older units first, so it looks like a sensible default for projects that don't need to prioritize new work[1]. If that's already engaged here, it doesn't seem to be the solution to clearing this backlog, although it ought to push out non _0 tasks as early as possible...

[Edit - just seen Speedy51's post, which effectively points to the same place!]

Cheers - Al.

[1] In general, using adaptive replication (as at MilkyWay) might suggest that getting as many work units as possible processed as quickly as possible is the goal, in which case prioritizing retries that are not due to time-outs may not be appropriate (or necessary if there isn't already a huge backlog!...)
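
To make the ordering concrete, here is a small Python sketch of "priority first, then work-unit id". It only illustrates the sort order described above and is not the feeder's actual implementation; the Result record, its field names, and the sample values are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in for one unsent task."""
    name: str
    workunit_id: int
    priority: int


def feeder_order(results):
    """Highest priority first; within the same priority, oldest work unit (lowest id) first."""
    return sorted(results, key=lambda r: (-r.priority, r.workunit_id))


unsent = [
    Result("wu_300_0", workunit_id=300, priority=0),
    Result("wu_100_1", workunit_id=100, priority=0),
    Result("wu_200_1", workunit_id=200, priority=5),
]
for r in feeder_order(unsent):
    print(r.name)
```

Under this ordering the boosted wu_200_1 goes out first, and the resend wu_100_1 beats the newer wu_300_0 simply because its work unit is older, not because it is a resend.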

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,853
RAC: 272
Message 74442 - Posted: 14 Oct 2022, 6:16:02 UTC - in response to Message 74441.  

The waiting-for-validation backlog is over 4 million and climbing. The Simulation backlog seems to be around 14 days and the Separation backlog around 7. Would it be a good idea to stop generating new tasks until the backlog is reduced?

AndreyOR
Joined: 13 Oct 21
Posts: 43
Credit: 225,019,025
RAC: 9,810
Message 74443 - Posted: 14 Oct 2022, 6:23:47 UTC - in response to Message 74441.  

It seems like whatever the problem is it's affecting the Validator. If tasks were getting validation attempts they'd be getting marked as valid, invalid, or inconclusive. Instead they're stuck in pending. Upon quick look it seems like almost none of the pending tasks have "wing-man" tasks generated yet. Otherwise they'd show up as assigned (to a machine) or unsent. So the only thing that the Task Generator can do is generate new, _0, tasks. But that also doesn't seem to be working well, at least for Separation, as work there is hard to get.

In general, tasks get assigned to users in the order they were created. When validation starts working again and second-attempt tasks start getting generated, they'll go to the back of the queue, so to get to them we need to process everything in front of them. So I'd say the best thing to do is just to keep crunching. Hopefully things can get resolved soon on the server side. The good thing is that there's a plan to replace the server, by the end of the year I believe, which should prevent a recurrence of the significant problems the project has experienced this year.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,943,471
RAC: 22,405
Message 74444 - Posted: 14 Oct 2022, 10:37:20 UTC - in response to Message 74443.  
Last modified: 14 Oct 2022, 10:38:59 UTC

You say you are having trouble getting tasks, and I see that you are doing GPU Separation tasks. I'm doing both CPU and GPU Separation tasks, not on the same machine, and I only have problems getting tasks when the website is unresponsive. I have had to set up a backup CPU project for when I can't get those tasks, but I seem to be keeping the cache full enough of GPU tasks for the few machines running those. My cache on each machine is set to 1 day plus an additional 1/2 day.

Slywy
Joined: 22 Jul 12
Posts: 11
Credit: 1,008,373
RAC: 0
Message 74450 - Posted: 15 Oct 2022, 15:36:18 UTC - in response to Message 74424.  

Putting this here so I'll remember what it is now: Workunits waiting for validation 3685233


Now Workunits waiting for validation 4329195

Higher than before but some have been validated. Backlog + new?

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,853
RAC: 272
Message 74460 - Posted: 16 Oct 2022, 8:37:24 UTC - in response to Message 74450.  
Last modified: 16 Oct 2022, 8:39:46 UTC

Now it's 4379516... gone up again.

Simulation tasks are waiting around 17 days for validation. For Separation, some (one or two) validate instantly, but most take around 7 days. I have emptied my cache now.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,943,471
RAC: 22,405
Message 74462 - Posted: 16 Oct 2022, 10:53:11 UTC - in response to Message 74404.  

On 11 Oct 2022 I posted this:

In progress (747) · Validation pending (4510) · Validation inconclusive (256) · Valid (3269) · Invalid (2) · Error (1)

Workunits waiting for validation 3009782


And today, 16 Oct 2022, I have this:
In progress (692) · Validation pending (7400) · Validation inconclusive (409) · Valid (4311) · Invalid (2) · Error (2)

Workunits waiting for validation 4397896.

So almost 11 days worth of tasks have gotten validated in 5 calendar days

I am ONLY doing Separation tasks, both CPU and GPU ones.
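
For anyone who wants to check the arithmetic on the two snapshots above, here is a minimal Python sketch that computes the change in each counter and the average change per day. The numbers are copied from this post; the dictionary keys and the helper name are made up, and the sketch assumes no results dropped off the account's lists between the two dates.

```python
from datetime import date

# Counts copied from the two snapshots in this post (keys are hypothetical labels).
oct_11 = {"pending": 4510, "inconclusive": 256, "valid": 3269, "waiting_workunits": 3009782}
oct_16 = {"pending": 7400, "inconclusive": 409, "valid": 4311, "waiting_workunits": 4397896}

days = (date(2022, 10, 16) - date(2022, 10, 11)).days  # 5 calendar days


def growth(before, after, days):
    """Return {counter: (total change, average change per day)}."""
    return {k: (after[k] - before[k], (after[k] - before[k]) / days) for k in before}


changes = growth(oct_11, oct_16, days)
for key, (delta, per_day) in changes.items():
    print(f"{key}: {delta:+d} total, {per_day:+.1f} per day")
```

On these figures the project-wide "waiting for validation" count grew by 1,388,114 in five days, about 278 thousand per day, while this account's Valid count grew by 1,042 and its Validation pending count by 2,890.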

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,853
RAC: 272
Message 74464 - Posted: 16 Oct 2022, 11:34:41 UTC - in response to Message 74462.  

I can only do CPU tasks as I have an Intel GPU, which is not supported. Maybe GPU tasks are validated quicker as they will undoubtedly be shorter.

San-Fernando-Valley
Joined: 13 Apr 17
Posts: 256
Credit: 604,411,638
RAC: 0
Message 74465 - Posted: 16 Oct 2022, 11:48:19 UTC

Hate to say it, but my N-Body Simulation tasks are doing fine ...

Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 151,909,995
RAC: 12,498
Message 74468 - Posted: 16 Oct 2022, 15:31:35 UTC

Here is mine
State: All (125) · In progress (91) · Validation pending (0) · Validation inconclusive (6) · Valid (28) · Invalid (0) · Error (0)
Application: All (133) · Milkyway@home N-Body Simulation (125) 


I am doing NBody which seems to be validating tasks from... 4 weeks ago



Available here https://grafana.kiska.pw/d/boinc/boinc?orgId=1&var-project=milkyway@home&from=now-7d&to=now&viewPanel=3

The above link is for last 7 days, but you can increase or decrease the time range

Skillz
Joined: 28 May 17
Posts: 76
Credit: 4,386,301,288
RAC: 0
Message 74470 - Posted: 17 Oct 2022, 0:30:08 UTC - in response to Message 74464.  
Last modified: 17 Oct 2022, 0:30:42 UTC

Nope, I have over 200k GPU tasks waiting validation and that number just keeps increasing.

I backed a few of my rigs off it to work on something else. Might end up taking them all off to work on something else until the problem starts to resolve itself.

©2024 Astroinformatics Group