Welcome to MilkyWay@home

Validation Pending too many tasks

Jesse Viviano
Joined: 4 Feb 11
Posts: 86
Credit: 60,913,150
RAC: 0
Message 74471 - Posted: 17 Oct 2022, 1:45:33 UTC

I am wondering if something is wrong with either the transitioner or the validator. When enough successful tasks for a work unit have been returned, it is normally the transitioner's job to send the work unit to the validator's queue. The validator can then declare the tasks valid or invalid, or report an inconclusive validation, in which case another task must be generated for the work unit so that the validator can try again with more results. For most projects, inconclusive generally means that one of the tasks appears to be bad, so another task must be made to find and disqualify the bad result. This project uses inconclusive both for that case and for when it decides that it needs more tasks before validation can be performed. I have noticed that when the transitioner malfunctions in some projects, I have to wait for my result to hit its deadline. When the deadline passes, something notices that enough tasks to attempt validation of the work unit have been returned, so the work unit is sent to the validator instead of having another task generated to replace a late task.
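
To make the hand-off described above concrete, here is a minimal sketch in Python. It is not the project's or BOINC's actual code: the names `WorkUnit`, `Outcome` and `check_work_unit` are made up for illustration, and results are compared by exact equality, whereas a real validator would probably compare with some tolerance.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PENDING = "pending"            # not enough successful results yet
    VALID = "valid"                # enough results agree
    INCONCLUSIVE = "inconclusive"  # results disagree; another task is needed


@dataclass
class WorkUnit:
    min_quorum: int  # assumed: successful results needed before validation is tried
    results: list    # outputs of the successful tasks returned so far


def check_work_unit(wu: WorkUnit) -> Outcome:
    # Transitioner-like step: only hand off once the quorum is met.
    if len(wu.results) < wu.min_quorum:
        return Outcome.PENDING
    # Validator-like step: look for a value that min_quorum results agree on.
    for candidate in set(wu.results):
        if wu.results.count(candidate) >= wu.min_quorum:
            return Outcome.VALID
    return Outcome.INCONCLUSIVE
```

In this sketch a work unit stays pending until the quorum is reached, which is the state the stuck tasks appear to be in.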

I have noticed that the only tasks in my queue that are validating are the ones that have 3 or more tasks in the work unit.

I currently have taken my machine off of BOINC duty, but that is because my apartment's air conditioner has failed and has nothing to do with this project's validation problem.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 74479 - Posted: 17 Oct 2022, 23:48:12 UTC - in response to Message 74462.  

On 11 Oct 2022 I posted this:

In progress (747) · Validation pending (4510) · Validation inconclusive (256) · Valid (3269) · Invalid (2) · Error (1)

Total Workunits waiting for validation 3009782


And today, 16 Oct 2022, I have this:
In progress (692) · Validation pending (7400) · Validation inconclusive (409) · Valid (4311) · Invalid (2) · Error (2)

Total Workunits waiting for validation 4397896.

17 Oct 2022 for me:
In progress (978) · Validation pending (8503) · Validation inconclusive (388) · Valid (3783) · Invalid (3) · Error (5)

Very little progress being made at all!!

Total Workunits waiting for validation 4922896
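
As a rough check on how fast the site-wide total is growing, the three figures above can be turned into a per-day rate. This is only a sketch: it assumes each total was read on the date given next to it, and the helper name `daily_growth` is made up for illustration.

```python
from datetime import date

# Site-wide "Total Workunits waiting for validation" figures quoted above.
samples = [
    (date(2022, 10, 11), 3_009_782),
    (date(2022, 10, 16), 4_397_896),
    (date(2022, 10, 17), 4_922_896),
]


def daily_growth(samples):
    """Return the average increase per day between consecutive samples."""
    rates = []
    for (d0, n0), (d1, n1) in zip(samples, samples[1:]):
        rates.append((n1 - n0) / (d1 - d0).days)
    return rates


rates = daily_growth(samples)
print(rates)
```

That works out to about 278,000 more work units waiting per day between 11 and 16 Oct, and 525,000 in the single day after that, so the backlog is not just growing but speeding up.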

Slywy
Joined: 22 Jul 12
Posts: 11
Credit: 1,008,373
RAC: 0
Message 74482 - Posted: 18 Oct 2022, 11:12:45 UTC

I have three ready to report from last night that aren't going anywhere. 50 with validation pending.

Slywy
Joined: 22 Jul 12
Posts: 11
Credit: 1,008,373
RAC: 0
Message 74483 - Posted: 18 Oct 2022, 12:25:06 UTC - in response to Message 74482.  
Last modified: 18 Oct 2022, 12:25:21 UTC

I have three ready to report from last night that aren't going anywhere. 50 with validation pending.

I just realized servers are down.

San-Fernando-Valley
Joined: 13 Apr 17
Posts: 256
Credit: 604,411,638
RAC: 0
Message 74484 - Posted: 18 Oct 2022, 14:28:08 UTC - in response to Message 74479.  

mikey:
I guess you are talking about Separation tasks ...

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 74485 - Posted: 18 Oct 2022, 14:58:46 UTC

It seems like whatever the problem is it's affecting the Validator. If tasks were getting validation attempts they'd be getting marked as valid, invalid, or inconclusive. Instead they're stuck in pending. Upon quick look it seems like almost none of the pending tasks have "wing-man" tasks generated yet. Otherwise they'd show up as assigned (to a machine) or unsent. So the only thing that the Task Generator can do is generate new, _0, tasks. But that also doesn't seem to be working well, at least for Separation, as work there is hard to get.


The workunit generators don't generate tasks (wingman tasks or initial tasks) if the WU pools have more tasks than they should. So when the nbody pool had like 100k tasks in it, any tasks that you sent back were essentially put on hold because the WU generator wouldn't make any wingman tasks for validation until the pool was cleared.
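
A minimal sketch of that behaviour, assuming a single pool limit checked once per generator pass. The function name, the limit value and the task labels are all hypothetical and are not taken from the project's server code.

```python
POOL_LIMIT = 1_000  # hypothetical threshold; the real limit is not stated here


def generator_pass(unsent_in_pool, wingmen_needed, new_wanted, pool_limit=POOL_LIMIT):
    """Return the tasks one generator pass would create."""
    if unsent_in_pool > pool_limit:
        # Pool is over its limit: neither wingman nor initial tasks are made,
        # so results already returned sit waiting for a wingman.
        return []
    return ["wingman"] * wingmen_needed + ["initial"] * new_wanted
```

With something like 100k tasks in the pool, `generator_pass(100_000, 3, 2)` returns an empty list, which is the "on hold" state described above.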

AndreyOR
Joined: 13 Oct 21
Posts: 44
Credit: 226,874,311
RAC: 17,400
Message 74491 - Posted: 18 Oct 2022, 20:25:38 UTC - in response to Message 74485.  

So it seems like the problems are unlikely to go away until after the migration to new hardware (and fixing the issues that might come with that) as the workunit pool overfill bug seems to be very persistent.

The WU overfill bug still doesn't explain the growing validation queue, I don't think. When the task generator creates new tasks, why does it seem like no wingman/validation tasks are being created, just new, initial ones, given the large and ever-growing Waiting for Validation queue? Overfill or not, I don't see why the Validation queue is so large and growing.

Skillz
Joined: 28 May 17
Posts: 76
Credit: 4,398,902,783
RAC: 14,697
Message 74492 - Posted: 18 Oct 2022, 23:32:15 UTC - in response to Message 74491.  

So it seems like the problems are unlikely to go away until after the migration to new hardware (and fixing the issues that might come with that) as the workunit pool overfill bug seems to be very persistent.

The WU overfill bug still doesn't explain the growing validation queue, I don't think. When the task generator creates new tasks, why does it seem like no wingman/validation tasks are being created, just new, initial ones, given the large and ever-growing Waiting for Validation queue? Overfill or not, I don't see why the Validation queue is so large and growing.


The bug is the reason why. When there are already too many tasks waiting to be sent, new tasks are not created. The new tasks are the wingman tasks that need to be created, and they won't be created until the backlog of tasks waiting to be sent goes down enough for the work generators to start generating new work.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 74493 - Posted: 19 Oct 2022, 1:50:44 UTC - in response to Message 74484.  

mikey:
I guess you are talking about Separation tasks ...


yes

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 74494 - Posted: 19 Oct 2022, 1:57:29 UTC - in response to Message 74491.  

When the task generator creates new tasks, why does it seem like no wingman/validation tasks are being created, just new, initial ones, given the large and ever-growing Waiting for Validation queue? Overfill or not, I don't see why the Validation queue is so large and growing.


Because in a lot of cases, in the past anyway, no wingman task is needed, and instead of cancelling them they just never get generated in the first place. The problem seems to be that when the server decides it needs a wingman task, it adds it to the end of the task list, not to a separate folder that would get cleared before more _0 tasks are sent out.
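
The difference between the two arrangements can be sketched like this. It is only an illustration of the idea, not how the server is known to work: the function names are made up, and the `_1` suffix for a wingman task is an assumption based on the `_0` naming of initial tasks.

```python
from collections import deque


def send_order_single_queue(initial_tasks, wingman_tasks):
    """Wingman tasks are appended behind the initial (_0) tasks already queued."""
    queue = deque(initial_tasks)
    queue.extend(wingman_tasks)
    return list(queue)


def send_order_separate_queue(initial_tasks, wingman_tasks):
    """Wingman tasks sit in their own queue, which is drained first."""
    return list(wingman_tasks) + list(initial_tasks)
```

With `["a_0", "b_0"]` already queued and `["x_1"]` needed, the single queue sends `x_1` last, while the separate queue sends it first.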

AndreyOR
Joined: 13 Oct 21
Posts: 44
Credit: 226,874,311
RAC: 17,400
Message 74495 - Posted: 19 Oct 2022, 6:33:37 UTC

Skillz, mikey,
There's no huge backlog like a few months ago that will take many weeks to clear. Current queues get cleared within a couple of days or so as the numbers are in thousands and tens of thousands instead of millions or tens of millions. So work generation still occurs just somewhat irregularly. With such a huge and growing validation queue, I'm wondering why the work that's being generated is not almost all wingman/validation work. Yes, as tasks get generated (new or resends) they go to the back of the queue but the queue gets cleared out every couple of days or so, thus I'd expect validation to be occurring regularly and not be piling up.

If you look at your Validation Inconclusive tasks you'll notice that there's a task In Progress or Unsent. In Validation Pending - there's nothing like that so that makes me think that validation hasn't been attempted yet on those tasks. I could be missing something but I suspect that something may be up with the validator.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 74497 - Posted: 19 Oct 2022, 10:48:07 UTC - in response to Message 74495.  

Skillz, mikey,
There's no huge backlog like a few months ago that will take many weeks to clear. Current queues get cleared within a couple of days or so as the numbers are in thousands and tens of thousands instead of millions or tens of millions. So work generation still occurs just somewhat irregularly. With such a huge and growing validation queue, I'm wondering why the work that's being generated is not almost all wingman/validation work. Yes, as tasks get generated (new or resends) they go to the back of the queue but the queue gets cleared out every couple of days or so, thus I'd expect validation to be occurring regularly and not be piling up.

If you look at your Validation Inconclusive tasks you'll notice that there's a task In Progress or Unsent. In Validation Pending - there's nothing like that so that makes me think that validation hasn't been attempted yet on those tasks. I could be missing something but I suspect that something may be up with the validator.


That makes sense why it's the way it is then, thanks for seeing that.

A lot of people have had 'ghost tasks' in the past, and most projects get a few of them here and there. It will time out for me and then hopefully get sent out for real to someone else.

My current numbers are:
In progress (957) · Validation pending (7855) · Validation inconclusive (484) · Valid (3588) · Invalid (2) · Error (5)

Which is a few more errors than the other day and A LOT more Validation pending tasks. The Validation inconclusive tasks have also gone up, but the Valid count is still about the same. You can see I have a couple hundred more tasks in progress, and that's because Einstein is running out of their GRP#1 tasks until they get a new batch at the end of the month.

posted on 11 Oct:
In progress (747) · Validation pending (4510) · Validation inconclusive (256) · Valid (3269) · Invalid (2) · Error (1)

Skillz
Joined: 28 May 17
Posts: 76
Credit: 4,398,902,783
RAC: 14,697
Message 74502 - Posted: 20 Oct 2022, 2:17:55 UTC

Oh yes, you are right. I didn't realize the work waiting to be sent was so low. I just assumed it was like last time and the work ready to send was way up there.

Something is wrong with the validator. It seems every time the server restarts it starts working down the validation backlog, but then it stops and the pending validations just start to rise again. Whatever the server does when it first starts up and starts validating tasks isn't able to keep up with the number of tasks being returned, so the few thousand it does get overshadowed within a day by the returns.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 74503 - Posted: 20 Oct 2022, 3:00:15 UTC - in response to Message 74495.  

Skillz, mikey,
There's no huge backlog like a few months ago that will take many weeks to clear. Current queues get cleared within a couple of days or so as the numbers are in thousands and tens of thousands instead of millions or tens of millions. So work generation still occurs just somewhat irregularly. With such a huge and growing validation queue, I'm wondering why the work that's being generated is not almost all wingman/validation work. Yes, as tasks get generated (new or resends) they go to the back of the queue but the queue gets cleared out every couple of days or so, thus I'd expect validation to be occurring regularly and not be piling up.


That's a good thing

If you look at your Validation Inconclusive tasks you'll notice that there's a task In Progress or Unsent. In Validation Pending - there's nothing like that so that makes me think that validation hasn't been attempted yet on those tasks. I could be missing something but I suspect that something may be up with the validator.


The first part makes sense and why it's growing I guess, I agree that the validator seems to have a problem, now whether that's the memory problem Tom was talking about or something else I don't know.

Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,900,464
RAC: 0
Message 74504 - Posted: 20 Oct 2022, 8:42:30 UTC - in response to Message 74503.  
Last modified: 20 Oct 2022, 8:48:13 UTC

As far as I can tell, the current problem with validation did not start until early September, when the profile of Nbody jobs changed from a few minutes to an unspecified number of hours, even days. Ever since then the total waiting for validation has escalated. All the Nbody jobs I did had been aborted at least twice before I got them. Whether they will ever validate I don’t know. The coincidence between increasing Nbody run times and the validation backlog surely needs checking out?

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 74505 - Posted: 20 Oct 2022, 10:34:02 UTC - in response to Message 74504.  

As far as I can tell, the current problem with validation did not start until early September, when the profile of Nbody jobs changed from a few minutes to an unspecified number of hours, even days. Ever since then the total waiting for validation has escalated. All the Nbody jobs I did had been aborted at least twice before I got them. Whether they will ever validate I don’t know. The coincidence between increasing Nbody run times and the validation backlog surely needs checking out?


I agree

mrchips
Joined: 31 Oct 10
Posts: 15
Credit: 281,009,768
RAC: 204
Message 74507 - Posted: 20 Oct 2022, 11:43:03 UTC

I have stopped running for now, yesterday my count was Validation pending (4414)
This morning it is at Validation pending (3550). Without me running any WU
I will stop for the weekend and see what the count is Monday.

AndreyOR
Joined: 13 Oct 21
Posts: 44
Credit: 226,874,311
RAC: 17,400
Message 74515 - Posted: 20 Oct 2022, 20:31:25 UTC
Last modified: 20 Oct 2022, 20:33:01 UTC

Validation is still happening, just at a very slow pace, so if one stops crunching, a reduction in one's Validation Pending is to be expected. Even without stopping there are occasional temporary reductions. However, users stopping is likely to slow things down even more, as there will be even fewer machines doing the little validation that is happening.

Unfortunately, it seems unlikely that things will get fixed until after the server migration.

I haven't run N-Body much over the last few months. Is N-Body validation also a problem or is it just Separation?

Martin
Joined: 28 May 22
Posts: 17
Credit: 402,111,833
RAC: 0
Message 74516 - Posted: 20 Oct 2022, 20:37:21 UTC

Suddenly, current tasks I've just returned are showing Completed and validated!

Martin

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 74518 - Posted: 20 Oct 2022, 21:00:40 UTC - in response to Message 74516.  

Suddenly, current tasks I've just returned are showing Completed and validated!

Martin


WOO HOO!!!

©2024 Astroinformatics Group