Welcome to MilkyWay@home

Tasks Completed, but validation tasks remain Unsent

Message boards : Number crunching : Tasks Completed, but validation tasks remain Unsent

Searching to find the meaning ...
Joined: 5 Dec 15
Posts: 4
Credit: 875,220
RAC: 48
Message 76807 - Posted: 25 Jan 2024, 17:58:59 UTC
Last modified: 25 Jan 2024, 18:00:28 UTC

Two of my tasks are Completed, validation inconclusive:

https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=963991857
24 Jan 2024, 4:42:42 UTC CPU time (sec) 43,471.86 or over 12 hours
yet Task 936324031 is still Unsent
and
https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=964000770
23 Jan 2024, 22:51:35 UTC CPU time (sec) 54,474.03 or over 15 hours
yet Task 936319099 is still Unsent

Not running any more tasks for now.
LLP, PhD, Prof. Engr.
I think => I THINK I am.
My thinking is not the source of my being, nor does it prove my existence to you.
The Living Word of God
World Youth Day
ID: 76807 · Rating: 0
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76809 - Posted: 26 Jan 2024, 1:01:22 UTC - in response to Message 76807.  

Searching to find the meaning ... wrote:
Two of my tasks are Completed, validation inconclusive:

https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=963991857
24 Jan 2024, 4:42:42 UTC CPU time (sec) 43,471.86 or over 12 hours
yet Task 936324031 is still Unsent
and
https://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=964000770
23 Jan 2024, 22:51:35 UTC CPU time (sec) 54,474.03 or over 15 hours
yet Task 936319099 is still Unsent

Not running any more tasks for now.


Validation Inconclusive just means 'waiting for a wingman'. Keep crunching and they will get validated in the end. MW only sends out the original task; then, if it needs a wingman task, it generates it, but that task goes to the end of the list of available tasks, so they can take a while to validate.
ID: 76809 · Rating: 0
Joseph Pistritto
Joined: 18 Jan 22
Posts: 1
Credit: 1,010,177
RAC: 2,380
Message 76816 - Posted: 26 Jan 2024, 15:26:29 UTC - in response to Message 76809.  

Everything I've completed in the last 10 days or so is "Validation Inconclusive"... I'm still getting new tasks (slowly), but it's happening.
ID: 76816 · Rating: 0
Link
Joined: 19 Jul 10
Posts: 624
Credit: 19,299,762
RAC: 2,614
Message 76817 - Posted: 26 Jan 2024, 16:06:39 UTC - in response to Message 76816.  

Yes, this is a known "issue", which should resolve itself in the next 2-4 weeks, I guess. The more we crunch, the sooner this will happen. Check the "Admin Updates Discussion" thread in the News section.
ID: 76817 · Rating: 0
Searching to find the meaning ...
Joined: 5 Dec 15
Posts: 4
Credit: 875,220
RAC: 48
Message 76818 - Posted: 26 Jan 2024, 22:17:48 UTC - in response to Message 76809.  
Last modified: 26 Jan 2024, 22:23:14 UTC

mikey wrote:
MW only sends out the original task then if it needs a wingman task it generates it
both WUs say
minimum quorum	        1
initial replication	2 

...not sure what's the difference between quorum and replication, but quite obviously a send task IS needed to complete validation.

I've not heard the term 'wingman' task, but the task to complete the validation for BOTH WUs had already been generated, but neither has been sent to be run ... both have status Unsent.
ID: 76818 · Rating: 0
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76823 - Posted: 27 Jan 2024, 11:46:29 UTC - in response to Message 76818.  

Searching to find the meaning ... wrote:
MW only sends out the original task then if it needs a wingman task it generates it
both WUs say
minimum quorum	        1
initial replication	2 

...not sure what's the difference between quorum and replication, but quite obviously a send task IS needed to complete validation.

I've not heard the term 'wingman' task, but the task to complete the validation for BOTH WUs had already been generated, but neither has been sent to be run ... both have status Unsent.


No, a wingman is not always needed. Apparently, if you return a certain number of valid tasks in a row (I think the number is 10), the server considers your PC trustworthy and will only periodically send out a wingman task for that PC. BUT as soon as a wingman shows that your PC is not trustworthy anymore, the process starts all over from zero again. A PC can become untrustworthy because of dust, overclocking, components wearing out, etc.

Link tried to explain WHY they haven't been sent out yet: the server made a million tasks and all wingman tasks go to the end of the list. So in a couple of weeks we should be getting A LOT of _1 tasks (the initial tasks end in _0). Then everyone's tasks should become valid, or the project will send out a third task to try to figure out which of the first two PCs has the right answer.
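
To illustrate the queueing behaviour described above, here is a minimal toy sketch. It assumes a strict first-in, first-out send queue and that every _0 result comes back before the next task is sent; both are simplifications for illustration, not a description of the actual MilkyWay@Home scheduler. It also ignores the "trusted host" shortcut mentioned above. The function name and the "wuN" task names are made up for this example.

```python
from collections import deque


def tasks_sent_before_first_wingman(initial_tasks: int) -> int:
    """Count how many tasks go out before the first _1 task is sent.

    Toy model (assumption): a strict FIFO queue, and each returned _0
    result makes the server append the matching _1 task to the end.
    """
    queue = deque(f"wu{i}_0" for i in range(initial_tasks))
    sent = 0
    while queue:
        task = queue.popleft()
        if task.endswith("_1"):
            return sent
        sent += 1
        # The _0 result is returned, so its wingman task is generated
        # and placed behind everything that is already queued.
        queue.append(task[:-2] + "_1")
    return sent


print(tasks_sent_before_first_wingman(5))
```

In this model the first _1 task is sent only after every queued _0 task has gone out, which is why a very large batch of new _0 tasks delays all validations.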
ID: 76823 · Rating: 0
xii5ku
Joined: 1 Jan 17
Posts: 37
Credit: 111,040,115
RAC: 35,489
Message 76839 - Posted: 29 Jan 2024, 14:58:32 UTC - in response to Message 76817.  
Last modified: 29 Jan 2024, 15:03:30 UTC

Link wrote:
Yes, this is a known "issue", which should resolve itself in the next 2-4 weeks, I guess. The more we crunch, the sooner this will happen. Check the "Admin Updates Discussion" thread in the News section.
Here is an estimation of my own, according to which the pile of 'validation inconclusive' will last for about two months:

  • Each valid result from mid January used to earn about 1,000 credits. At least this is the order of magnitude of what I see in the current few valid results of the ~20 top hosts. There is quite some credit variation though. Let's go with 1,100 credit/result on average. (source)
  • Before January 17, the server maintained a level of ~1,000 ready-to-send NBody tasks. Then the mishap with the huge number of new tasks happened. The admin removed many of them on January 24, such that there is now a level of ~690,000 ready-to-send. (source)
  • Before January 17, MilkyWay@Home gave out typically ~14 M credit per day globally, sometimes more. (source)
  • So I guess that MilkyWay@Home received on the order of 13,000 valid results per day before the mishap.
  • Let's optimistically assume that there is still the same amount of computer capacity active.
    Also let's assume that average workunit size stays the same as earlier in January, and that the fraction of successful returns remains the same.
  • If so, ~690,000 "*_0" tasks / ~13,000 results/day = ~50 days = 7...8 weeks is what it takes until there are results returned for all of these "*_0" tasks.
  • From what I understand, only after this will the server start to assign "*_1" tasks to hosts. And obviously after that, the server needs to receive valid results from "*_1" tasks in order to validate the pile of earlier "*_0" results.
    Note, there is currently a quite constant level of ~690,000 tasks ready to send. This is because for each "*_0" result returned, the server generates a "*_1" task. (This is true for success returns as well as error returns.) That is, the current pile of tasks ready to send is slowly containing fewer _0 tasks and more _1 tasks. But still, the server assigns all those _0 tasks first because they were queued earlier.
  • We need to count these 7...8 weeks from January 17 on, because from that day there were only a few _1, _2, _3... tasks left from before and far more _0 tasks stuffed into the queue. This means that hosts will begin to receive "*_1" tasks in the middle of March, maybe early March.
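
The arithmetic in the list above can be sketched as follows. The inputs are the rough figures from the post (about 14 M credits per day, about 1,100 credits per result, about 690,000 tasks ready to send); the function name is made up for this example, and the result is only as good as those assumptions. Without the intermediate rounding to 13,000 results per day it comes out at about 54 days.

```python
from datetime import date, timedelta


def days_to_drain(ready_to_send: float, credit_per_day: float,
                  credit_per_result: float) -> float:
    """Days until all queued tasks have been returned, assuming the
    pre-mishap credit rate reflects the rate of valid results."""
    results_per_day = credit_per_day / credit_per_result
    return ready_to_send / results_per_day


days = days_to_drain(690_000, 14_000_000, 1_100)
# Counted from January 17, the day the queue was flooded with _0 tasks.
first_wingman_date = date(2024, 1, 17) + timedelta(days=round(days))
print(f"{days:.0f} days, about {days / 7:.1f} weeks, "
      f"until roughly {first_wingman_date}")
```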

Does this make sense?

ID: 76839 · Rating: 0
Link
Joined: 19 Jul 10
Posts: 624
Credit: 19,299,762
RAC: 2,614
Message 76843 - Posted: 29 Jan 2024, 15:46:51 UTC - in response to Message 76839.  
Last modified: 29 Jan 2024, 15:48:55 UTC

xii5ku wrote:
Does this make sense?
Yes, but it doesn't take into account that we have two types of tasks here: the long tasks, which need 96000-116000 CPU seconds on my computer and for which we get around 1000 credits, and the short tasks, which need less than 10000 CPU seconds (and as you can see, they are the majority, at least on my list).

The oldest _1 waiting to be sent out from my WU list is 935953561, created on the 17th of January.
The oldest _0 I got on the 16th is 932856512
The newest _0 I got today (29th) is 935473682

That means we processed 2,617,170 tasks in 13 days. That's 201,321 tasks per day.

There are 479,879 tasks left between the 935473682 I got today and 935953561. At 201,321 tasks per day, 935953561 should be sent out in about 2.5 days.
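
The same calculation as a small sketch, using only the task numbers quoted above. It assumes that task numbers are handed out consecutively and that the tasks are sent in roughly numerical order. The variable names are made up for this example. The unrounded result is about 2.4 days.

```python
# Task numbers taken from the post above.
oldest_0 = 932_856_512          # oldest _0 received, on the 16th
newest_0 = 935_473_682          # newest _0 received, on the 29th
oldest_waiting_1 = 935_953_561  # oldest _1 still waiting to be sent
days_elapsed = 13

processed = newest_0 - oldest_0
tasks_per_day = processed / days_elapsed
remaining = oldest_waiting_1 - newest_0
days_left = remaining / tasks_per_day

print(f"{processed:,} tasks in {days_elapsed} days "
      f"= {tasks_per_day:,.0f} per day")
print(f"{remaining:,} tasks left = {days_left:.1f} days")
```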

Does that make more sense?
ID: 76843 · Rating: 0
xii5ku
Joined: 1 Jan 17
Posts: 37
Credit: 111,040,115
RAC: 35,489
Message 76844 - Posted: 29 Jan 2024, 19:50:46 UTC - in response to Message 76843.  
Last modified: 29 Jan 2024, 20:26:20 UTC

Link wrote:
Yes, but it doesn't take into account that we have two types of tasks here: the long tasks, which need 96000-116000 CPU seconds on my computer and for which we get around 1000 credits, and the short tasks, which need less than 10000 CPU seconds (and as you can see, they are the majority, at least on my list).
Good point. I missed these because hardly any of them can be found among the valid tasks which are currently left in the database. I think these short tasks get ~100 credits. (source)

The current top host has got 4000 inconclusive results by now. A couple of hours ago I copy+pasted 500 of its then most recent inconclusive results into a spreadsheet. Of these, 222 took 3,300...3,600 CPU seconds and 278 took 35,600...45,400 CPU seconds. I.e. this host had 44 % short tasks and 56 % long tasks recently.

Let's say it's fifty-fifty long and short tasks, which gives ~600 average credits per result. If this was the same earlier this month, then the ~14,000,000 credits/day before January 17 mean ~23,000 valid results per day. And to get ~690,000 tasks returned validly would take 30 days (4 weeks) = until mid February if that rate remained constant. Edit: The average during 2023-12-21...2024-01-17 actually was 17,000,000 credits/day = almost 30,000 valid results/day, which would translate to 24 days for 690,000 tasks.
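
A short sketch of this revised estimate, under the same assumptions as stated in the paragraph above: a fifty-fifty mix of long (~1,100 credits) and short (~100 credits) tasks and a constant daily credit rate. The function and variable names are made up for this example.

```python
def days_for_queue(ready_to_send: float, credit_per_day: float,
                   avg_credit_per_result: float) -> float:
    """Days needed to get `ready_to_send` valid results at a constant rate."""
    valid_results_per_day = credit_per_day / avg_credit_per_result
    return ready_to_send / valid_results_per_day


# Fifty-fifty mix of long (~1,100 credits) and short (~100 credits) tasks.
avg_credit = 0.5 * 1_100 + 0.5 * 100

days_at_14m = days_for_queue(690_000, 14_000_000, avg_credit)
days_at_17m = days_for_queue(690_000, 17_000_000, avg_credit)

print(f"average credit per result: {avg_credit:.0f}")
print(f"at 14 M credits/day: {days_at_14m:.0f} days")
print(f"at 17 M credits/day: {days_at_17m:.0f} days")
```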

Link wrote:
The oldest _0 I got on the 16th is 932856512
The newest _0 I got today (29th) is 935473682

That means we processed 2,617,170 tasks in 13 days. That's 201,321 tasks per day.
We need the rate of successfully computed results. Your figure also includes all aborted tasks, computation errors, and timeouts. I am not saying though that we have a huge ratio of error returns; I don't know why your figure is 7...9 times as much as mine.

Edit 2: Oh wait. Your figure is possibly skewed a lot because during late January 16 ... mid January 24, there were 3 million tasks-ready-to-send on the server. 2.3 million of those no longer exist since January 24 because Kevin deleted them, but they may be included in the workunit numbers which you found.
ID: 76844 · Rating: 0
Link
Joined: 19 Jul 10
Posts: 624
Credit: 19,299,762
RAC: 2,614
Message 76848 - Posted: 30 Jan 2024, 9:24:30 UTC - in response to Message 76844.  

xii5ku wrote:
We need the rate of successfully computed results. Your figure also includes all aborted tasks, computation errors, and timeouts.
No, for the rate at which the tasks are sent it doesn't matter what happens with them later on the clients.


xii5ku wrote:
Oh wait. Your figure is possibly skewed a lot because during late January 16 ... mid January 24, there were 3 million tasks-ready-to-send on the server. 2.3 million of those no longer exist since January 24 because Kevin deleted them, but they may be included in the workunit numbers which you found.
Yes.
ID: 76848 · Rating: 0
Link
Joined: 19 Jul 10
Posts: 624
Credit: 19,299,762
RAC: 2,614
Message 76851 - Posted: 30 Jan 2024, 10:35:46 UTC

So let's try again...

932856512, sent 16 Jan 2024, 20:40:44 UTC
933039926, sent 23 Jan 2024, 11:27:26 UTC

That is 183,414 tasks sent out in about 177 hours.

935342117, sent 24 Jan 2024, 13:12:42 UTC
935473011, sent 29 Jan 2024, 16:29:58 UTC

That is 130,894 tasks sent out in about 123 hours.

So total 314,308 tasks in 300 hours, or 1,047.7 tasks per hour.

There are 480,550 tasks left between 935473011 and 935953561 (apparently the server isn't sending the tasks in exact numerical order, but close enough). At 1,047.7 tasks per hour, 935953561 should be sent out about 458 hours (about 19 days) after 935473011, so around the 17th of February.
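
The estimate can be reproduced with a short sketch that takes the task numbers and send times quoted above. It assumes consecutive task numbering and a constant send rate, and the names are made up for this example. Note that, computed from the timestamps as written, the first interval comes to about 159 hours rather than 177, which would move the estimate about a day earlier, to around the 16th of February. The conclusion stays in the same range either way.

```python
from datetime import datetime, timedelta

# (first task number, time sent, last task number, time sent), all UTC.
intervals = [
    (932_856_512, datetime(2024, 1, 16, 20, 40, 44),
     933_039_926, datetime(2024, 1, 23, 11, 27, 26)),
    (935_342_117, datetime(2024, 1, 24, 13, 12, 42),
     935_473_011, datetime(2024, 1, 29, 16, 29, 58)),
]
target_task = 935_953_561  # oldest _1 task still waiting to be sent

tasks_sent = sum(last - first for first, _, last, _ in intervals)
hours_from_timestamps = sum(
    (end - start).total_seconds() for _, start, _, end in intervals
) / 3600

last_task, last_sent = intervals[-1][2], intervals[-1][3]
remaining = target_task - last_task


def estimated_send_time(hours_observed: float) -> datetime:
    """Extrapolate when `target_task` goes out at a constant send rate."""
    tasks_per_hour = tasks_sent / hours_observed
    return last_sent + timedelta(hours=remaining / tasks_per_hour)


eta_stated = estimated_send_time(300)  # hours as stated in the post
eta_timestamps = estimated_send_time(hours_from_timestamps)

print(f"{tasks_sent:,} tasks sent, {remaining:,} remaining")
print(f"with 300 h: {eta_stated:%d %b %Y}")
print(f"with {hours_from_timestamps:.0f} h: {eta_timestamps:%d %b %Y}")
```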

I don't think I can get it more exact than that. Did I still miss something?
ID: 76851 · Rating: 0
February Thunder
Joined: 9 Aug 21
Posts: 1
Credit: 3,140,964
RAC: 1,327
Message 76853 - Posted: 30 Jan 2024, 12:47:37 UTC - in response to Message 76816.  

Joseph Pistritto wrote:
Everything I've completed in the last 10 days or so is "Validation Inconclusive"... I'm still getting new tasks (slowly), but it's happening.

I have been experiencing a similar situation on my computers since January 16, 2024
ID: 76853 · Rating: 0
xii5ku
Joined: 1 Jan 17
Posts: 37
Credit: 111,040,115
RAC: 35,489
Message 76855 - Posted: 30 Jan 2024, 15:46:31 UTC
Last modified: 30 Jan 2024, 15:53:59 UTC

xii5ku wrote:
~23,000 valid results per day [typically returned until January 17]
Link wrote:
1,047.7 tasks per hour [being assigned to hosts on average during the last few days]
I.e. our updated estimations are in the same ballpark. Meaning that we are currently on the way to get workunits validly completed again sometime in mid February.


xii5ku wrote:
We need the rate of successfully computed results.
Link wrote:
No, for the rate at which the tasks are sent it doesn't matter what happens with them later on the clients.
There are two related, but not identical questions: When will the server start to assign _1 tasks to hosts? When will the server receive _1 results which match _0 results so that successful validations are happening again? I for one was more occupied by the latter than by the former question. However, concerning this latter question, I admit on second thought that intermittent occurrences of error returns don't actually defer the point in time at which validations start happening again (ignoring unrealistic corner cases). They merely reduce the rate of validations, after validations started happening again.

--------

February Thunder wrote:
I have been experiencing a similar situation on my computers since January 16, 2024
The very same is happening on all hosts which are currently active, without exception. It is because there was accidentally a very unusual amount of new work queued on the server at once, combined with how MilkyWay@Home is implementing workunit validation.
ID: 76855 · Rating: 0
rz5rqt
Joined: 5 Sep 09
Posts: 9
Credit: 559,488,021
RAC: 65,789
Message 76856 - Posted: 31 Jan 2024, 12:48:54 UTC

Very good thread, gentlemen. You validated many of my thoughts. So... how hard would it be for the admin to tell us how many of the 690,000 tasks "ready to send" are initial (_0) tasks and how many are "validation" (wingman, _1) tasks? If these tasks are just files sitting in a UNIX/Linux directory, they should be easy to count. If the numbers were posted once a week, we could see our progress.
ID: 76856 · Rating: 0
Link
Joined: 19 Jul 10
Posts: 624
Credit: 19,299,762
RAC: 2,614
Message 76859 - Posted: 31 Jan 2024, 14:51:54 UTC - in response to Message 76856.  
Last modified: 31 Jan 2024, 14:53:15 UTC

No, they are not just files, they are database entries. I'm not sure how hard it is to find them, but considering that the admins still have not been able to find and delete old separation WUs from 2021, I guess we shouldn't hope for official numbers of _0 and _1 tasks. But if you assume that the 480,550 tasks left between 935473011 and 935953561 are nearly all _0 and everything else is _1, you will be pretty close to the truth. Or, if we take my newest task, 935501281, then there are about 452,280 _0s left, probably a bit fewer than that.
ID: 76859 · Rating: 0


©2024 Astroinformatics Group