Welcome to MilkyWay@home

Validation inconclusive

alanb1951

Joined: 16 Mar 10
Posts: 208
Credit: 105,461,806
RAC: 36,122
Message 73974 - Posted: 21 Jul 2022, 3:55:22 UTC - in response to Message 73971.  

Uh oh. In 2038 Boinc will cease to be.
https://en.wikipedia.org/wiki/Year_2038_problem


I would think before then they can find a way to make it work
I can't decide whether Peter was trying to make a joke or not :-). However, this one is probably less of a problem than Y2K, and a lot of diligence avoided most of the problems that could've caused... (And I suspect we're more likely to have had a mass extinction event by 2038 than this being a problem...)

In this case the only real issue is how the Operating System returns date/time information. As per the Wikipedia item, some O/S flavours return a 32-bit integer number of seconds since the "UNIX Epoch", and that would be problematic! Other systems may return a floating-point number of days or seconds since some base date; as long as those return a double rather than a float, there's already no problem in getting date/time data after the 2038 barrier. More recent Linux versions with 64-bit system libraries already return a 64-bit integer instead, so no problems there either. It then depends on how the system libraries make the date/time available to applications -- all it needs is for routines not to be constrained to 32-bit (or less) precision...
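To make the 32-bit case concrete, here is a minimal sketch (not BOINC code) of what happens when a seconds-since-epoch count is squeezed into a signed 32-bit integer, compared with a wider counter. The helper names `as_int32` and `to_utc` are made up for illustration.

```python
import ctypes
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def as_int32(seconds: int) -> int:
    """Reinterpret a seconds count as a signed 32-bit integer (wraps on overflow)."""
    return ctypes.c_int32(seconds).value

def to_utc(seconds: int) -> datetime:
    """Convert seconds since the UNIX epoch to a UTC datetime."""
    return UNIX_EPOCH + timedelta(seconds=seconds)

last_ok = 2**31 - 1  # largest value a signed 32-bit counter can hold

print(to_utc(as_int32(last_ok)))      # last representable second in 2038
print(to_utc(as_int32(last_ok + 1)))  # 32-bit counter wraps to a negative value (a 1901 date)
print(to_utc(last_ok + 1))            # a wider (e.g. 64-bit) counter simply carries on
```

The point is only that the arithmetic itself is fine; the trouble comes from the width of the variable holding the result.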

The remaining issue would be whether applications are coded to meet the standards expected by various system libraries -- for instance, using a time_t variable rather than a native C data type for a date/time numeric value (and competent programmers use those size-agnostic variable types for exactly this sort of reason!). Any applications coded thus would just need recompiling against newer libraries (if that hadn't already happened!).

BOINC is written in C++ and uses the appropriate variable declaration standards. So, as no client software is likely to still be 32-bit by then, where's the problem? The server side might have been a bit more interesting if the database had issues with such dates, but MySQL shouldn't be an issue regarding dates, and the conversion to/from system standard date representations seems solid. So, again, recompile if necessary, and re-link with the latest libraries!

Now, if you want an example of a seemingly unavoidable Y2K-type incident, consider what happens to a lot of [older model?] GPS systems when the (10-bit) week number rolls over once every 20 years or so and there's no defensive coding in the device to deal with that.
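As a rough illustration of the GPS case: a 10-bit week counter wraps every 1024 weeks, counted from the GPS epoch of 6 January 1980. The sketch below shows one possible kind of defensive coding, which is to resolve the ambiguous 10-bit value against a date the device knows it cannot be earlier than. The `not_before` parameter (for example a firmware build date) and both function names are assumptions for illustration, not taken from any real receiver.

```python
from datetime import date, timedelta

GPS_EPOCH = date(1980, 1, 6)  # start of GPS week 0
WEEKS_PER_ERA = 1024          # a 10-bit week counter wraps after this many weeks

def resolve_week(broadcast_week: int, not_before: date) -> int:
    """Return the full week number matching a 10-bit value, no earlier than the week of not_before."""
    floor_week = (not_before - GPS_EPOCH).days // 7
    era_start = floor_week - floor_week % WEEKS_PER_ERA
    full_week = era_start + broadcast_week
    if full_week < floor_week:
        # The counter has wrapped since the era that contains not_before.
        full_week += WEEKS_PER_ERA
    return full_week

def week_start(full_week: int) -> date:
    """Calendar date on which a full (unwrapped) GPS week begins."""
    return GPS_EPOCH + timedelta(weeks=full_week)

print(week_start(1024))                                 # first rollover
print(week_start(2048))                                 # second rollover
print(week_start(resolve_week(0, date(2018, 1, 1))))    # week 0 seen by a device built in 2018
```

A device without some such reference date has no way to tell week 0 of one era from week 0 of the next, which is why the older units jump back about 20 years.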

Cheers - Al.
ID: 73974 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 73977 - Posted: 21 Jul 2022, 15:37:27 UTC - in response to Message 73974.  

I can't decide whether Peter was trying to make a joke or not :-).
It's a joke that programmers make these limitations in the first place, but I bet you Boinc will screw up like it did when the security certificates expired. Nobody will fix it until the last minute.

However, this one is probably less of a problem than Y2K,
Y2K was never a problem, it was a panic over nothing and stupid expenditure over nothing. Computers don't blow up because they can't add one to the date. So a few things get confused and something has to be adjusted, it's not going to cause the end of the world. I actually read in that article that ABS brakes on cars would be affected. Seriously?! Come on. They might compare the time between now and a millisecond later when the next tooth passes on the wheel, but the worst that might happen is for one millisecond it thinks the car broke the speed of light in reverse. Either nothing will happen or your brakes will judder once, just as if you'd hit a patch of loose grit on the road.

(And I suspect we're more likely to have had a mass extinction event by 2038 than this being a problem...)
No, but we might have no resources left. We'll be all running around naked living outdoors, which will be fun.

BOINC is written in C++ and uses the appropriate variable declaration standards. So, as no client software is likely to still be 32-bit by then, where's the problem?
I can guarantee you it's limited now, because I got a secondary Boinc account banned until precisely that date. I'm guessing the system wouldn't let the admin ban me after that date.

Now, if you want an example of a seemingly unavoidable Y2K-type incident, consider what happens to a lot of [older model?] GPS systems when the (10-bit) week number rolls over once every 20 years or so and there's no defensive coding in the device to deal with that.
GPSes I've used in my car never last more than 5 years anyway since the hot sun buggers them up. And that's in Scotland. I'm surprised the Li Ion batteries don't explode actually. The worst that will happen is a plane doesn't know where it is and the pilot has to actually fly it for a bit.
ID: 73977 · Rating: 0
.clair.
Joined: 3 Mar 13
Posts: 84
Credit: 779,527,603
RAC: 22,637
Message 73986 - Posted: 22 Jul 2022, 0:16:43 UTC
Last modified: 22 Jul 2022, 1:16:35 UTC

The show ain't over until the fat lady runs out of time :-),
one from my archives.
If any of it is wrong, I blame someone else...
And will place a firm leap second in your integers:
------------------------
Remember Y2k
Well, it's not over yet

2020: January 1: Systems still using 1920 as pivot date fail; Macintoshes running System 6.0.4 or earlier - correct date can no longer be set in Date & Time Control Panel
2030: January 1: Systems still using 1930 as pivot date fail.
2036: January 1: Burroughs Unisys A Series system date fails?
2036: February 7: 2^32 seconds from Jan 1, 1900.
2037: January 1: Rollover date for NTP systems
2038: January 19: Unix: 2^31 seconds from Jan 1, 1970
2040: February 6: At 06:28:16, old Macs' longword seconds from Jan 1, 1904 overflow.
2042: September 17: IBM 370 TOD clock overflows. One source lists this as the 18th (?)
2044: January 1: MS-DOS: 2^6 years from 1980, setting the most significant bit (MSB). Signed variables using this get a negative date.
2046: January 1: Amiga system date failure
2046: June 8: Some Unix password aging fails; 62^2 weeks from 1970.
2049: December 31: Microsoft Project 95 limit.
2078: December 31: MS Excel 7.0 - the last day
2079: June 6: 2^16 days from January 1, 1900
2080: January 1: MS-DOS file dates, displayed with two-digit years, become ambiguous.
2100: Y2.1K; most current PC BIOSes run out of dates; MS-DOS <DIR> renders the file-date years 2100 through 2107 as 99.
2100: February 28: last day of February - NOT a leap year
2106: February 7: Unix: 2^32 seconds from Jan 1, 1970; time overflows at 06:28:16.
2108: January 1: MS-DOS 2^7 years from 1980; file date overflows
2738: November 28: Approximate day of A.D. 1 million (days)
4338: November 28: COBOL-85 integer day 1,000,000 (10^6) exceeds six-digit field
9999: HTTP caching fails.
10000: January 1: Y10K!! Four-digit years fail. More time will elapse between the time this document was written and this date than has elapsed from the beginning of modern human civilization until now.
29602: January 1: MS Windows NT File Systems (NTFS) fails.
29940: New Macs' signed 64-bit time fails (has been OK since 30,081 B.C.!!)
31086: July 31: Internal DEC VMS time fails at 02:48:05.47
60056: Win32 64-bit time fails (started from Jan 1, 1601)
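Several of the entries in a list like this are plain arithmetic (an epoch plus 2^n seconds), so they can be checked directly. A small sketch, treating all times as UTC and ignoring leap seconds; the function name `overflow_moment` and the labels are just for illustration.

```python
from datetime import datetime, timedelta

def overflow_moment(epoch: datetime, bits: int) -> datetime:
    """First instant that a seconds counter with 2**bits distinct non-negative values cannot represent."""
    return epoch + timedelta(seconds=2**bits)

# (label, epoch, number of usable bits for non-negative seconds)
COUNTERS = [
    ("1900-based unsigned 32-bit", datetime(1900, 1, 1), 32),
    ("Unix signed 32-bit", datetime(1970, 1, 1), 31),
    ("1904-based unsigned 32-bit", datetime(1904, 1, 1), 32),
    ("Unix unsigned 32-bit", datetime(1970, 1, 1), 32),
]

for label, epoch, bits in COUNTERS:
    print(f"{label}: {overflow_moment(epoch, bits)}")
```

Entries further down the list (beyond year 9999) are outside what Python's `datetime` can represent, so this only covers the nearer dates.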
ID: 73986 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 73987 - Posted: 22 Jul 2022, 2:01:11 UTC - in response to Message 73986.  
Last modified: 22 Jul 2022, 2:01:36 UTC

Remember Y2k
I remember a big fuss over nothing.

Why such stupid programmers setting a limit on things?

2046: January 1: Amiga system date failure
Well that's the end of the world. Oooh lemmings!

60056: Win32 64-bit time fails (started from Jan 1, 1601)
Looks like Windows looked furthest ahead.
ID: 73987 · Rating: 0
Nuadormrac
Joined: 11 Sep 08
Posts: 22
Credit: 9,081,761
RAC: 0
Message 74130 - Posted: 9 Sep 2022, 0:55:49 UTC

I'm now getting a lot of this validation inconclusive on a brand new laptop I installed a couple days ago. Oddly enough, my old laptop (the graphics card on it has recently died outside the Intel one on the CPU), didn't seem to pop these up. I'm sure the new laptop isn't fatally flawed, so waiting to see what happens...

Other projects I've tried, have been validating without issue....
ID: 74130 · Rating: 0
Nuadormrac
Joined: 11 Sep 08
Posts: 22
Credit: 9,081,761
RAC: 0
Message 74131 - Posted: 9 Sep 2022, 0:55:51 UTC
Last modified: 9 Sep 2022, 0:57:40 UTC

Post delay; I thought it hadn't posted. Please delete the duplicate.
ID: 74131 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 74132 - Posted: 9 Sep 2022, 1:07:18 UTC - in response to Message 74130.  
Last modified: 9 Sep 2022, 1:13:25 UTC

I'm now getting a lot of this validation inconclusive on a brand new laptop I installed a couple days ago. Oddly enough, my old laptop (the graphics card on it has recently died outside the Intel one on the CPU), didn't seem to pop these up. I'm sure the new laptop isn't fatally flawed, so waiting to see what happens...

Other projects I've tried, have been validating without issue....
On this project "validation inconclusive" should read "validation pending". For some reason they go in the wrong list. It just means they've not been checked yet. Since your laptop is powerful, you're going to finish them first and have to wait for someone slower.

However I do see some which have been checked by someone's Nvidia against your CPU, and it's asked for a third check. Might be something up there. I know Einstein are having problems with different types of chips being compared, but that's a new program they've written.

However, looking through my tasks, it seems my valid ones are not checked. Looks like they only check some of them at random? Or perhaps they trust a machine after a certain time? If you've just attached to the project, MilkyWay needs to see lots of successful tasks before it stops checking them with someone else.

By the way, to delete a post, edit it and change it to contain two spaces and nothing else. For some reason this makes it go away.
ID: 74132 · Rating: 0
alanb1951
Joined: 16 Mar 10
Posts: 208
Credit: 105,461,806
RAC: 36,122
Message 74133 - Posted: 9 Sep 2022, 3:41:59 UTC

@Nuadormrac (and Peter too...)

For information regarding the phases of validation...

The validator has two duties -- when a result is returned it checks it for obvious errors (which can be marked Invalid immediately) and either validates the work unit at once (see below) or marks the result as needing verification. If multiple results are required for verification, the validator gets invoked again once there are enough results to perform a full verification. (Oversimplification!)

As Peter mentioned, sometimes it only needs one result -- it uses a mechanism called Adaptive Replication. Once a user has passed a pre-defined count [20, I think] of consecutive successful results for a specific application, the validator does a little calculation which (by default) will result in about 90% of tasks for said user being passed without needing any wingmen. So until you've racked up enough consecutive successfully validated Separation tasks it will always go to Validation Inconclusive (and if you get a bad result it'll clear the count...)
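The behaviour described above can be mimicked with a toy model. This is only a sketch of the idea as described in this post, not the actual BOINC validator or scheduler logic: the threshold of 20 and the 90% figure are the numbers quoted above, and the class and method names are invented.

```python
import random
from dataclasses import dataclass

TRUST_THRESHOLD = 20  # assumed: consecutive valid results needed before trust kicks in
SKIP_FRACTION = 0.9   # assumed: share of a trusted user's tasks passed without a wingman

@dataclass
class TrustRecord:
    """Per-user, per-application count of consecutive valid results (hypothetical)."""
    consecutive_valid: int = 0

    def record_result(self, valid: bool) -> None:
        # A bad result clears the count; a good one extends the streak.
        self.consecutive_valid = self.consecutive_valid + 1 if valid else 0

    def needs_wingman(self, rng: random.Random) -> bool:
        # Untrusted: always replicate. Trusted: replicate only a random ~10% of tasks.
        if self.consecutive_valid < TRUST_THRESHOLD:
            return True
        return rng.random() >= SKIP_FRACTION

rng = random.Random(0)
record = TrustRecord()
for _ in range(TRUST_THRESHOLD):
    record.record_result(True)
checked = sum(record.needs_wingman(rng) for _ in range(10_000))
print(f"{checked} of 10000 tasks still sent to a wingman")
```

In this model a new host sees every task wait for a wingman until the streak reaches the threshold, which matches what a freshly attached laptop would show.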

In theory, both applications at MilkyWay use Adaptive Replication, but it [currently] seems to be broken for N-body, so there'll always be a wingman [eventually]. However, it works for Separation, but for some reason the replication count for the wingman case seems to be three (instead of the two used for N-body). I presume the count is set higher because of the way the results are compared for verification purposes; it doesn't require an exact match...

Cheers - Al.

P.S. When MilkyWay had the problems after the disk crash earlier this year, it was quite common to see work units stuck with one or more results at Validation Inconclusive when there should've been enough results to complete validation and declare a canonical result. This was because another part of the system got so bottlenecked that it wasn't calling the validator for the verification phase!
ID: 74133 · Rating: 0
Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 74149 - Posted: 12 Sep 2022, 13:04:32 UTC - in response to Message 74133.  

Is something wrong somewhere? The number of WUs waiting for validation is now over 13000; it's not been that high for a very long time.
ID: 74149 · Rating: 0
Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 74151 - Posted: 12 Sep 2022, 13:05:10 UTC - in response to Message 74133.  

Is something wrong somewhere? The number of WUs waiting for validation is now over 13000; it's not been that high for a very long time. It took ages to even post this message.
ID: 74151 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 74159 - Posted: 13 Sep 2022, 3:04:09 UTC
Last modified: 13 Sep 2022, 3:05:05 UTC

I only have 4 GPUs on this project just now, but they're downloading the max per host easily and crunching through them just fine with immediate server response. I also posted this easily. Validation queue is now down to 901, so I guess whatever it was was fixed very quickly.
ID: 74159 · Rating: 0
Septimus
Joined: 8 Nov 11
Posts: 205
Credit: 2,882,881
RAC: 267
Message 74195 - Posted: 18 Sep 2022, 10:16:57 UTC - in response to Message 74159.  

Is something stuck somewhere? The validation queue is hovering at over 800,000 now, and has been like that for a couple of days.
ID: 74195 · Rating: 0
Speedy51
Joined: 12 Jun 10
Posts: 57
Credit: 6,163,587
RAC: 156
Message 74198 - Posted: 19 Sep 2022, 0:22:44 UTC - in response to Message 74195.  

Is something stuck somewhere? The validation queue is hovering at over 800,000 now, and has been like that for a couple of days.

I believe the best way we can help is to keep processing the work we are given, and to process anything with a _2 or higher at the end of the task name first.
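For anyone who wants to pick out those resends from a list of task names, here is a small sketch. The task names used are made up, and how you would then bump those tasks in the BOINC client is not covered; this only shows the selection by the trailing number.

```python
import re

def replica_index(task_name: str) -> int:
    """Return the trailing _N number of a task name, or 0 if there is none."""
    match = re.search(r"_(\d+)$", task_name)
    return int(match.group(1)) if match else 0

def resends_first(task_names):
    """Order tasks so that those ending in _2 or higher come first, otherwise keeping the original order."""
    # sorted() is stable, and False sorts before True, so resends move to the front.
    return sorted(task_names, key=lambda name: replica_index(name) < 2)

# Hypothetical task names, for illustration only.
tasks = ["wu_a_0", "wu_b_2", "wu_c_1", "wu_d_3"]
print(resends_first(tasks))
```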
ID: 74198 · Rating: 0
HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,037,238
RAC: 35,415
Message 74199 - Posted: 19 Sep 2022, 3:11:16 UTC - in response to Message 74198.  

Is something stuck somewhere? The validation queue is hovering at over 800,000 now, and has been like that for a couple of days.

The best way I believe we can help with the situation is keep processing the work we are given and process anything with a _2 or higher at the end of a task first.
Yep, that is exactly what I do, when I can. Hopefully that clears completed workunits off of the servers sooner rather than later. Which would imply less server thrashing.
ID: 74199 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 74200 - Posted: 19 Sep 2022, 4:14:19 UTC

Server needs an SSD.
ID: 74200 · Rating: 0
San-Fernando-Valley
Joined: 13 Apr 17
Posts: 256
Credit: 604,411,638
RAC: 0
Message 74201 - Posted: 19 Sep 2022, 4:57:26 UTC - in response to Message 74200.  

Server needs an SSD.

Peter:
ONE won't do ....
Bad joke - I know.

Have a nice week!
S-F-V
ID: 74201 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 74202 - Posted: 19 Sep 2022, 5:29:51 UTC - in response to Message 74201.  

Server needs an SSD.
Peter:
ONE won't do ....
Bad joke - I know.
I have 8 of my own.

Have a nice week!
S-F-V
I will, I'm going on holiday to visit friends and relatives. "7.5 hour" drive according to satnav, so that's about 4.5 hours then :-)
ID: 74202 · Rating: 0
San-Fernando-Valley
Joined: 13 Apr 17
Posts: 256
Credit: 604,411,638
RAC: 0
Message 74203 - Posted: 19 Sep 2022, 7:38:03 UTC - in response to Message 74202.  

Server needs an SSD.
Peter:
ONE won't do ....
Bad joke - I know.
I have 8 of my own.

Have a nice week!
S-F-V
I will, I'm going on holiday to visit friends and relatives. "7.5 hour" drive according to satnav, so that's about 4.5 hours then :-)


Oops, I thought you meant MilkyWay's server ...

The response times are sometimes really bad.
Not yours, I again mean milkyway.

JOKE ON:
Oh, 7.5 hours is the calculation with HDDs and the 4.5 hours with SSDs.
I wonder what the driving time calculation would be with NVMes?
JOKE OFF.

Have a nice time on your holiday!
ID: 74203 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,142,956
RAC: 2
Message 74204 - Posted: 19 Sep 2022, 9:08:08 UTC - in response to Message 74203.  

Server needs an SSD.
Peter:
ONE won't do ....
Bad joke - I know.
I have 8 of my own.
Oops, I thought you meant MilkyWay's server ...

The response times are sometimes really bad.
Not yours, I again mean milkyway.
I did mean the MilkyWay server. I meant I have 8 here, yet MW has none. Mechanical drives in the 21st century are unworkable. Rosetta, for example, actually has a bank of 72 SSDs. And a lot more data goes through MW.

Have a nice week!
S-F-V
I will, I'm going on holiday to visit friends and relatives. "7.5 hour" drive according to satnav, so that's about 4.5 hours then :-)
JOKE ON:
Oh, 7.5 hours is the calculation with HDDs and the 4.5 hours with SSDs.
I wonder what the driving time calculation would be with NVMes?
JOKE OFF.
Well, a Renault Scenic goes 120mph in a 70 limit so....

Have a nice time on your holiday!
Thanks, I will try to.
ID: 74204 · Rating: 0

©2024 Astroinformatics Group