Welcome to MilkyWay@home

Marked as Invalid? (Part 2)

Message boards : Number crunching : Marked as Invalid? (Part 2)

Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 37988 - Posted: 3 Apr 2010, 16:36:19 UTC
Last modified: 3 Apr 2010, 16:37:07 UTC

Out of the blue today, a bunch of GPU WUs prefixed with "de_s11_3s_free_6" were rejected by the server as invalid. They all appeared to run to completion in a normal amount of time. The vast majority of the other WUs of the same family seem to be running just fine. The "stderr out" reported for all of the "de_s11_3s_free_6" series looks pretty much identical.

Any ideas how to figure out what happened to the handful that were rejected?

Arif Mert Kapicioglu
Joined: 14 Dec 09
Posts: 161
Credit: 589,318,064
RAC: 0
Message 37991 - Posted: 3 Apr 2010, 19:23:30 UTC

Brian, I experienced the same thing with "de_s11_3s_free_6" work units: some were marked as invalid although the run time was just fine.

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 37992 - Posted: 3 Apr 2010, 22:42:35 UTC

Same thing on my Quad with a 5830.

Tails
Joined: 19 Feb 10
Posts: 17
Credit: 7,573,117
RAC: 0
Message 37993 - Posted: 4 Apr 2010, 0:44:43 UTC

I think Travis is working on the new validator system.

Crunch3r
Volunteer developer
Joined: 17 Feb 08
Posts: 363
Credit: 258,227,990
RAC: 0
Message 38002 - Posted: 4 Apr 2010, 12:30:17 UTC - in response to Message 37993.  

Something is messed up with the validator. I'm also seeing WUs marked as invalid...
It would be nice if someone could fix that.


Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 38032 - Posted: 5 Apr 2010, 5:01:08 UTC - in response to Message 38002.  
Last modified: 5 Apr 2010, 5:01:47 UTC

About 1/6 of mine, including 1 CPU WU that ran for 9.5 hrs, are being rejected lately. A new status is reported instead of just "Invalid": "Workunit error - check skipped".

loeakaodas
Joined: 2 Jan 09
Posts: 34
Credit: 93,631,891
RAC: 0
Message 38038 - Posted: 5 Apr 2010, 5:51:20 UTC

I also seem to be getting lots of invalid results.

UBT - JohnR
Joined: 10 Mar 08
Posts: 7
Credit: 60,169,291
RAC: 0
Message 38047 - Posted: 5 Apr 2010, 6:35:36 UTC

All my GPUs are getting some "Completed, can't validate" and "Completed, marked as invalid". None of them are overclocked.
I'm not too bothered by all the Pendings, as those should be OK in time, but when WUs have 5 or 6 results all marked as invalid, I would hope that the validator will check these results again once the work on it is finished.

R.Stanneveld
Joined: 13 Dec 09
Posts: 5
Credit: 38,436,391
RAC: 0
Message 38056 - Posted: 5 Apr 2010, 7:37:47 UTC

It's like no one cares :P
Ah well, back to Collatzzzzz.

Gary Roberts
Joined: 1 Mar 09
Posts: 56
Credit: 1,984,937,499
RAC: 0
Message 38060 - Posted: 5 Apr 2010, 8:24:19 UTC - in response to Message 38056.  

"It's like no one cares :P
Ah well, back to Collatzzzzz."

It's more like you are reading the wrong thread!!

Try looking at the rather active top thread in the news forum and you'll see that Travis is actually working pretty hard trying to solve these issues.

Cheers,
Gary.

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 38063 - Posted: 5 Apr 2010, 9:16:52 UTC
Last modified: 5 Apr 2010, 9:23:39 UTC

I had looked at this thread but hadn't checked my Milkyway GPU results. I see I am getting the same, with mainly -

"Completed, validation inconclusive"

or

"Completed, can't validate"

- as others are reporting.

This can be seen here, where about half of them report "validation inconclusive". The rest, when looked at in more detail, are like this, with multiple resends.

Looks like there is a lot of marginal work being sent out, or the recent server changes have affected something that was marginal.

For this rig the problem seems to have hit me with a higher rate, or mix, of these inconclusive WUs over the last 24 hours, although the problem has been about for some time.

The interesting thing is there are a significant number of WUs coming down, being crunched and returned that validate in the expected way.

This suggests the problem lies with the work being sent out.
Go away, I was asleep



Edboard
Joined: 22 Feb 09
Posts: 20
Credit: 105,156,399
RAC: 0
Message 38064 - Posted: 5 Apr 2010, 9:44:07 UTC

Today (April 5) I counted all the units done and approx. 1/3 are marked as invalid (160 invalid and 340 valid). All of them took a normal amount of time to process. I have stopped crunching Milkyway for a while.

Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 38072 - Posted: 5 Apr 2010, 10:34:55 UTC - in response to Message 38064.  

"I have approx. 1/3 marked as invalid (160 invalid and 340 valid). All of them took a normal amount of time to process. I have stopped crunching Milkyway for a while."
Over the last 10hrs, it has rejected 120 of mine. It would normally do about 400 WU's over that time period on the HD5870 so I'm getting roughly the same appalling failure rate.

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 38074 - Posted: 5 Apr 2010, 10:42:40 UTC
Last modified: 5 Apr 2010, 11:14:25 UTC

Looks like it's the new validator; it's possible Tobias is still working on it. Either way, there is not much point crunching until it's sorted out, given the output it has been giving for the last 24 hrs, so I've gone NNT for now.


EDIT: There is a discussion going on about this over at the News forum; Travis is working on it, see

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=1644&nowrap=true#38053


Regards
Zy

STE\/E
Joined: 29 Aug 07
Posts: 486
Credit: 576,517,594
RAC: 36,582
Message 38080 - Posted: 5 Apr 2010, 11:16:51 UTC

If it's the project's intention to drive away its participants, then it's doing a fine job of it. NNW for now for me ...

96643461 90256685 5 Apr 2010 10:59:35 UTC 5 Apr 2010 11:02:46 UTC Completed, can't validate 87.34 84.23 0.48 0.00 Anonymous platform
96643460 90255706 5 Apr 2010 10:59:35 UTC 5 Apr 2010 11:01:16 UTC Completed, can't validate 86.44 84.31 0.48 0.00 Anonymous platform
96643452 90242207 5 Apr 2010 10:59:35 UTC 5 Apr 2010 11:07:17 UTC Completed, can't validate 87.09 84.30 0.48 0.00 Anonymous platform
96623554 89640557 5 Apr 2010 10:22:06 UTC 5 Apr 2010 10:26:09 UTC Completed, can't validate 87.47 84.08 0.48 0.00 Anonymous platform
96623521 90246510 5 Apr 2010 10:22:06 UTC 5 Apr 2010 10:23:53 UTC Completed, can't validate 88.25 84.67 0.48 0.00 Anonymous platform
96623229 90261544 5 Apr 2010 10:22:06 UTC 5 Apr 2010 10:41:07 UTC Completed, can't validate 86.84 84.33 0.48 0.00 Anonymous platform
96323728 90180280 5 Apr 2010 0:42:22 UTC 5 Apr 2010 0:48:03 UTC Completed, marked as invalid 81.64 79.11 0.66 0.00 Anonymous platform
96323693 90180245 5 Apr 2010 0:42:22 UTC 5 Apr 2010 0:58:02 UTC Completed, can't validate 81.78 79.25 0.66 0.00 Anonymous platform
96323627 85656042 5 Apr 2010 0:42:22 UTC 5 Apr 2010 0:48:03 UTC Completed, can't validate 89.52 86.72 0.72 0.00 Anonymous platform
STE\/E

Blurf
Volunteer moderator
Project administrator
Joined: 13 Mar 08
Posts: 804
Credit: 26,380,161
RAC: 0
Message 38134 - Posted: 5 Apr 2010, 20:13:51 UTC

From Front Page:

Validator Strictness

I've lowered the strictness of the validator from 10e-11 to 10e-10. I'm hoping this should significantly reduce the number of WUs flagged invalid. If the issue persists I might have to lower it farther to 10e-9. The new application will have the strictness back at 10e-11, so keep that in mind if you're compiling your own versions.
The issue we're having seems to be that the ATI 48xx GPUs and the ATI 58xx GPUs are returning different results, and if too many of either make it into the quorum they will invalidate the other results (including stock results). I'm still trying to determine if the 58xx GPU or the 48xx GPU is the one correctly validating against the stock application.
I've also updated the validator so if you check your tasks they will show what fitness they reported, so you can compare vs other tasks for the same workunit.
I'm hoping we should have this issue straightened out shortly, and thanks for your patience.
5 Apr 2010 16:35:39 UTC
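
For anyone wondering how a strictness setting like the one in the announcement above could throw out otherwise normal results, here is a minimal sketch of a tolerance-based quorum check. It is not the project's actual validator. The function names, the status labels, the quorum size of 2, the use of an absolute difference, and the fitness values are all assumptions made up for illustration. The announcement's "10e-11" and "10e-10" are read here as tolerances of 1e-11 and 1e-10, which is also an assumption.

```python
from typing import List, Optional


def agrees(a: float, b: float, tolerance: float) -> bool:
    """True when two reported fitness values differ by no more than tolerance."""
    return abs(a - b) <= tolerance


def pick_canonical(fitnesses: List[float], tolerance: float,
                   min_quorum: int = 2) -> Optional[float]:
    """Return the first fitness that min_quorum results agree on, else None.

    Hypothetical helper: a result counts as agreeing with itself.
    """
    for candidate in fitnesses:
        matches = sum(1 for f in fitnesses if agrees(candidate, f, tolerance))
        if matches >= min_quorum:
            return candidate
    return None


def mark_results(fitnesses: List[float], tolerance: float,
                 min_quorum: int = 2) -> List[str]:
    """Label each result against the canonical fitness (labels are made up)."""
    canonical = pick_canonical(fitnesses, tolerance, min_quorum)
    if canonical is None:
        return ["inconclusive"] * len(fitnesses)
    return ["valid" if agrees(f, canonical, tolerance) else "invalid"
            for f in fitnesses]


# Invented values: two results that match exactly and one that is off by 5e-11.
reported = [-3.12345678901, -3.12345678901, -3.12345678906]

strict = mark_results(reported, tolerance=1e-11)   # third result is rejected
relaxed = mark_results(reported, tolerance=1e-10)  # all three are accepted
print(strict)
print(relaxed)
```

In this sketch, whichever group reaches the quorum first decides the canonical value, so two results from one GPU family that agree with each other would mark a differing third result as invalid, even if that third result came from the stock application. Loosening the tolerance lets results that differ only slightly validate together.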



Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 38139 - Posted: 5 Apr 2010, 21:15:10 UTC

A massive number of WUs are marked:
"Completed, can't validate"

The Workunit is marked:
"errors: Too many total results"

Nobody has a valid WU? Don't believe it.

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 38143 - Posted: 5 Apr 2010, 22:16:18 UTC
Last modified: 5 Apr 2010, 22:36:24 UTC

What does this mean:
"errors: Too many success results"

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=86084176

Does that make sense at all?

Edit:

Here's some more "Too many success results"

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=88320323
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=84473249

What the heck does "Too many success results" mean?

Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 38148 - Posted: 5 Apr 2010, 22:26:59 UTC - in response to Message 38074.  
Last modified: 5 Apr 2010, 22:27:33 UTC

"...there is not much point crunching until it's sorted out, given the output it has been giving for the last 24 hrs, so I've gone NNT for now."
What's NNT?

Just checked the task view and MW's rejection rate now exceeds 50% of WU's run using dead-stock applications with non-overclocked hardware. I see absolutely no point in staying either.

BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38150 - Posted: 5 Apr 2010, 22:37:31 UTC - in response to Message 38148.  

NNT = No New Tasks -- basically completing all tasks in the queue and letting other projects do the work (primarily Collatz for ATI GPU workstations).

I expect that Travis will fix things reasonably soon unless MW needs no more work done.






©2024 Astroinformatics Group