Validation inconclusive
Author | Message |
---|---|
alanb1951 Joined: 16 Mar 10 Posts: 168 Credit: 97,104,435 RAC: 64,576 |
Quote: "Uh oh. In 2038 Boinc will cease to be."
I can't decide whether Peter was trying to make a joke or not :-). However, this one is probably less of a problem than Y2K, and a lot of diligence avoided most of the problems that could've caused... (And I suspect we're more likely to have had a mass extinction event by 2038 than this being a problem...)
In this case the only real issue is how the operating system returns date/time information. As per the Wikipedia item, some O/S flavours return a 32-bit integer number of seconds since the "UNIX Epoch", and that would be problematic! Other systems may return a floating-point number of days or seconds since some base date; as long as those return a double rather than a float, there's already no problem in getting date/time data after the 2038 barrier. More recent Linux versions with 64-bit system libraries already return a 64-bit integer instead, so no problems there either.
It then depends on how the system libraries make the date/time available to applications -- all it needs is for routines not to be constrained to 32-bit (or less) precision...
The remaining issue would be whether applications are coded to meet the standards expected by various system libraries -- for instance, using a time_t variable rather than a native C data type for a date/time numeric value (and competent programmers use those size-agnostic variable types for exactly this sort of reason!) Any applications coded thus would only need a recompilation against newer libraries (if that hadn't already happened!)
BOINC is written in C++ and uses the appropriate variable declaration standards. So, as no client software is likely to still be 32-bit by then, where's the problem? The server side might have been a bit more interesting if the database had issues with such dates, but MySQL shouldn't be an issue regarding dates, and the conversion to/from system standard date representations seems solid. So, again, recompile if necessary, and re-link with the latest libraries!
Now, if you want an example of a seemingly unavoidable Y2K-type incident, consider what happens to a lot of [older model?] GPS systems when the (10-bit) week number rolls over every 1024 weeks (about 19.6 years) and there's no defensive coding in the device to deal with that.
Cheers - Al. |
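To make the 32-bit point in the post above concrete, here is a minimal Python sketch of what happens when a seconds-since-1970 counter is squeezed into a signed 32-bit integer, and of how a 10-bit GPS week number becomes ambiguous. The helper names (`to_int32`, `as_utc`, `gps_week_start`) are made up for illustration and are not BOINC or operating-system APIs; leap seconds are ignored.

```python
import struct
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)  # start of GPS week 0


def to_int32(seconds: int) -> int:
    """Wrap an integer the way a signed 32-bit counter would (hypothetical helper)."""
    return struct.unpack("<i", struct.pack("<I", seconds & 0xFFFFFFFF))[0]


def as_utc(seconds: int) -> datetime:
    """Interpret a count of seconds since the Unix epoch as a UTC datetime."""
    return UNIX_EPOCH + timedelta(seconds=seconds)


def gps_week_start(week_10bit: int, rollovers: int) -> datetime:
    """Start of a GPS week, given the 10-bit week number and an assumed rollover count."""
    return GPS_EPOCH + timedelta(weeks=week_10bit + 1024 * rollovers)


last_ok = 2**31 - 1
print(as_utc(last_ok))                # 2038-01-19 03:14:07+00:00
print(as_utc(to_int32(last_ok + 1)))  # 1901-12-13 20:45:52+00:00 (wrapped)
print(as_utc(last_ok + 1))            # 2038-01-19 03:14:08+00:00 (wide integer)

# The same 10-bit week number 0 maps to three different dates so far.
for era in range(3):
    print(era, gps_week_start(0, era).date())
```

A receiver that stores only the 10-bit week has no way to tell these eras apart, which is why the defensive coding mentioned in the post has to supply the rollover count from somewhere else (for example, a firmware build date).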
Mr P Hucker Joined: 5 Jul 11 Posts: 759 Credit: 361,871,315 RAC: 4 |
Quote: "I can't decide whether Peter was trying to make a joke or not :-)."
It's a joke that programmers make these limitations in the first place, but I bet you Boinc will screw up like it did when the security certificates expired. Nobody will fix it until the last minute.
Quote: "However, this one is probably less of a problem than Y2K,"
Y2K was never a problem, it was a panic over nothing and stupid expenditure over nothing. Computers don't blow up because they can't add one to the date. So a few things get confused and something has to be adjusted, it's not going to cause the end of the world. I actually read in that article that ABS brakes on cars would be affected. Seriously?! Come on. They might compare the time between now and a millisecond later when the next tooth passes on the wheel, but the worst that might happen is for one millisecond it thinks the car broke the speed of light in reverse. Either nothing will happen or your brakes will judder once, just as if you'd hit a patch of loose grit on the road.
Quote: "(And I suspect we're more likely to have had a mass extinction event by 2038 than this being a problem...)"
No, but we might have no resources left. We'll be all running around naked living outdoors, which will be fun.
Quote: "BOINC is written in C++ and uses the appropriate variable declaration standards. So, as no client software is likely to still be 32-bit by then, where's the problem?"
I can guarantee you it's limited now, because I got a secondary Boinc account banned until precisely that date. I'm guessing the system wouldn't let the admin ban me after that date.
Quote: "Now, if you want an example of a seemingly unavoidable Y2K-type incident, consider what happens to a lot of [older model?] GPS systems when the (10-bit) week number rolls over once every 20 years or so and there's no defensive coding in the device to deal with that."
GPSes I've used in my car never last more than 5 years anyway since the hot sun buggers them up. And that's in Scotland. I'm surprised the Li-ion batteries don't explode actually. The worst that will happen is a plane doesn't know where it is and the pilot has to actually fly it for a bit. |
.clair. Joined: 3 Mar 13 Posts: 63 Credit: 770,392,945 RAC: 1,640 |
The show ain't over until the fat lady runs out of time :-). One from my archives. If any of it is wrong, I blame someone else... and will place a firm leap second in your integers:
Remember Y2K? Well, it's not over yet.
2020: January 1: Systems still using 1920 as pivot date fail; Macintoshes running System 6.0.4 or earlier - correct date can no longer be set in Date & Time Control Panel
2030: January 1: Systems still using 1930 as pivot date fail.
2036: January 1: Burroughs Unisys A Series system date fails?
2036: February 7: 2^32 seconds from Jan 1, 1900.
2037: January 1: Rollover date for NTP systems
2038: January 19: Unix: 2^31 seconds from Jan 1, 1970
2040: February 6: At 06:28:16, old Macs' longword seconds from Jan 1, 1904 overflow.
2042: September 17: IBM 370 TOD clock overflows. One source lists this as the 18th (?)
2044: January 1: MS-DOS: 2^6 years from 1980, setting the most significant bit (MSB). Signed variables using this get a negative date.
2046: January 1: Amiga system date failure
2046: June 8: Some Unix password aging fails; 62^2 weeks from 1970.
2049: December 31: Microsoft Project 95 limit.
2078: December 31: MS Excel 7.0 - the last day
2079: June 6: 2^16 days from January 1, 1900
2080: January 1: MS-DOS file dates, displayed with two-digit years, become ambiguous.
2100: Y2.1K; most current PC BIOSes run out of dates; MS-DOS <DIR> renders the file-date years 2100 through 2107 as 99.
2100: February 28: last day of February - NOT a leap year
2106: February 7: Unix: 2^32 seconds from Jan 1, 1970; time overflows at 06:28:16.
2108: January 1: MS-DOS 2^7 years from 1980; file date overflows
2738: November 28: Approximate day of A.D. 1 million (days)
4338: November 28: COBOL-85 integer day 1,000,000 (10^6) exceeds six-digit field
9999: HTTP caching fails.
10000: January 1: Y10K!! Four-digit years fail. More time will elapse between the time this document was written and this date than has elapsed from the beginning of modern human civilization until now.
29602: January 1: MS Windows NT File Systems (NTFS) fails.
29940: New Macs' signed 64-bit time fails (has been OK since 30,081 B.C.!!)
31086: July 31: Internal DEC VMS time fails at 02:48:05.47
60056: Win32 64-bit time fails (started from Jan 1, 1601) |
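Several entries in the list above are simply "epoch plus 2^n ticks", so they can be checked with a few lines of Python. This is a rough sketch: `overflow_moment` is a hypothetical helper, leap seconds are ignored, and the result is the first instant the counter can no longer represent, so a list that quotes the last valid day may differ by one.

```python
from datetime import datetime, timedelta


def overflow_moment(epoch: datetime, bits: int,
                    unit: timedelta = timedelta(seconds=1)) -> datetime:
    """First instant an unsigned counter of `bits` bits, counting `unit`s
    from `epoch`, can no longer hold (hypothetical helper)."""
    return epoch + unit * (2 ** bits)


# Signed 32-bit Unix time has 31 usable bits.
print(overflow_moment(datetime(1970, 1, 1), 31))  # 2038-01-19 03:14:08
# Unsigned 32-bit Unix time.
print(overflow_moment(datetime(1970, 1, 1), 32))  # 2106-02-07 06:28:16
# 32-bit seconds counted from 1904 (classic Mac OS).
print(overflow_moment(datetime(1904, 1, 1), 32))  # 2040-02-06 06:28:16
# 32-bit seconds counted from 1900 (NTP-style).
print(overflow_moment(datetime(1900, 1, 1), 32))  # 2036-02-07 06:28:16
# 16-bit day counter from 1900: day 65535 (June 6) is the last valid day.
print(overflow_moment(datetime(1900, 1, 1), 16, timedelta(days=1)))  # 2079-06-07 00:00:00
```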
Mr P Hucker Joined: 5 Jul 11 Posts: 759 Credit: 361,871,315 RAC: 4 |
Quote: "Remember Y2k"
I remember a big fuss over nothing. Why do such stupid programmers set a limit on things?
Quote: "2046: January 1: Amiga system date failure"
Well that's the end of the world. Oooh lemmings!
Quote: "60056: Win32 64-bit time fails (started from Jan 1, 1601)"
Looks like Windows looked furthest ahead. |
Nuadormrac Joined: 11 Sep 08 Posts: 22 Credit: 8,585,635 RAC: 15,918 |
I'm now getting a lot of these "validation inconclusive" results on a brand new laptop I set up a couple of days ago. Oddly enough, my old laptop (its graphics card recently died, leaving only the Intel one on the CPU) didn't seem to produce these. I'm sure the new laptop isn't fatally flawed, so I'm waiting to see what happens... Other projects I've tried have been validating without issue. |
Mr P Hucker Joined: 5 Jul 11 Posts: 759 Credit: 361,871,315 RAC: 4 |
Quote: "I'm now getting a lot of this validation inconclusive on a brand new laptop I installed a couple days ago. Oddly enough, my old laptop (the graphics card on it has recently died outside the Intel one on the CPU), didn't seem to pop these up. I'm sure the new laptop isn't fatally flawed, so waiting to see what happens..."
On this project "validation inconclusive" should read "validation pending". For some reason they go in the wrong list. It just means they've not been checked yet. Since your laptop is powerful, you're going to finish them first and have to wait for someone slower. However I do see some which have been checked by someone's Nvidia against your CPU, and it's asked for a third check. Might be something up there. I know Einstein are having problems with different types of chips being compared, but that's a new program they've written.
However, looking through my tasks, it seems my valid ones are not checked. Looks like they only check some of them at random? Or perhaps they trust a machine after a certain time? If you've just attached to the project, MilkyWay needs to see lots of successful tasks before it stops checking them with someone else.
By the way, to delete a post, edit it and change it to contain two spaces and nothing else. For some reason this makes it go away. |
alanb1951 Joined: 16 Mar 10 Posts: 168 Credit: 97,104,435 RAC: 64,576 |
@Nuadormrac (and Peter too...)
For information regarding the phases of validation...
The validator has two duties -- when a result is returned it checks it for obvious errors (which can be marked Invalid immediately) and either validates the work unit at once (see below) or marks the result as needing verification. If multiple results are required for verification, the validator gets invoked again once there are enough results to perform a full verification. (Oversimplification!)
As Peter mentioned, sometimes it only needs one result -- it uses a mechanism called Adaptive Replication. Once a user has passed a pre-defined count [20, I think] of consecutive successful results for a specific application, the validator does a little calculation which (by default) will result in about 90% of tasks for said user being passed without needing any wingmen. So until you've racked up enough consecutive successfully validated Separation tasks it will always go to Validation Inconclusive (and if you get a bad result it'll clear the count...)
In theory, both applications at MilkyWay use Adaptive Replication, but it [currently] seems to be broken for N-body, so there'll always be a wingman [eventually]. However, it works for Separation, but for some reason the replication count for the wingman case seems to be three (instead of the two used for N-body). I presume the count is set higher because of the way the results are compared for verification purposes; it doesn't require an exact match...
Cheers - Al.
P.S. When MilkyWay had the problems after the disk crash earlier this year, it was quite common to see work units stuck with one or more results at Validation Inconclusive when there should've been enough results to complete validation and declare a canonical result. This was because another part of the system got so bottlenecked that it wasn't calling the validator for the verification phase! |
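The Adaptive Replication behaviour described in the post above can be sketched as a toy model. This is only an illustration of the description given here, not BOINC's server code: the threshold of 20 is the poster's "I think" figure, the flat 90% skip rate is the poster's approximation, and `HostAppRecord`, `record_result` and `needs_wingman` are names invented for the example.

```python
import random
from dataclasses import dataclass

TRUST_THRESHOLD = 20     # assumption: "20, I think" from the post
TRUSTED_SKIP_RATE = 0.9  # assumption: "about 90%" from the post


@dataclass
class HostAppRecord:
    """Per-user, per-application validation history (hypothetical structure)."""
    consecutive_valid: int = 0

    def record_result(self, valid: bool) -> None:
        # A bad result clears the count, as the post describes.
        self.consecutive_valid = self.consecutive_valid + 1 if valid else 0

    def needs_wingman(self, rng: random.Random) -> bool:
        # Below the threshold every task is replicated; above it,
        # roughly 10% are still spot-checked against another host.
        if self.consecutive_valid < TRUST_THRESHOLD:
            return True
        return rng.random() >= TRUSTED_SKIP_RATE


record = HostAppRecord()
rng = random.Random(42)
print(record.needs_wingman(rng))  # True: a new host is always checked
for _ in range(TRUST_THRESHOLD):
    record.record_result(True)
checked = sum(record.needs_wingman(rng) for _ in range(1000))
print(f"{checked} of 1000 trusted tasks still got a wingman")
```

Under this model a freshly attached machine sees every task wait for a wingman, which matches the "Validation Inconclusive" pile-up reported on the new laptop.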
Septimus Joined: 8 Nov 11 Posts: 205 Credit: 2,792,317 RAC: 2,728 |
Is something wrong somewhere? The number of WUs waiting for validation is now over 13,000; it has not been that high for a very long time. |
Septimus Joined: 8 Nov 11 Posts: 205 Credit: 2,792,317 RAC: 2,728 |
Is something wrong somewhere? The number of WUs waiting for validation is now over 13,000; it has not been that high for a very long time. It took ages to even post this message. |
Mr P Hucker Joined: 5 Jul 11 Posts: 759 Credit: 361,871,315 RAC: 4 |
I only have 4 GPUs on this project just now, but they're downloading the max per host easily and crunching through them just fine with immediate server response. I also posted this easily. The validation queue is now down to 901, so I guess whatever it was got fixed very quickly. |
Septimus Joined: 8 Nov 11 Posts: 205 Credit: 2,792,317 RAC: 2,728 |
Is something stuck somewhere? The validation queue is hovering at over 800,000 now and has been like that for a couple of days. |
Speedy51 Joined: 12 Jun 10 Posts: 41 Credit: 5,313,268 RAC: 7,118 |
Quote: "Is something stuck somewhere the validation Queue is hovering at over 800,000 now, been like that for a couple of days."
I believe the best way we can help with the situation is to keep processing the work we are given, and to process anything with a _2 or higher at the end of the task name first. |
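The "_2 or higher" suggestion above refers to the replica number that BOINC appends to a task name. A small sketch of picking out such resends follows; the task names are invented, the helper names are hypothetical, and it assumes the first two copies of a work unit are numbered _0 and _1, so anything from _2 up is a resend that other hosts are already waiting on.

```python
def replica_index(task_name: str) -> int:
    """Trailing replica number of a task name, or 0 if there is none."""
    _, _, suffix = task_name.rpartition("_")
    return int(suffix) if suffix.isdigit() else 0


def resends_first(task_names):
    """Stable ordering with resends (replica index >= 2) ahead of the rest."""
    return sorted(task_names, key=lambda name: replica_index(name) < 2)


tasks = ["wu_alpha_0", "wu_beta_2", "wu_gamma_1", "wu_delta_3"]  # made-up names
print(resends_first(tasks))
# ['wu_beta_2', 'wu_delta_3', 'wu_alpha_0', 'wu_gamma_1']
```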
Joined: 12 Nov 21 Posts: 222 Credit: 520,018,550 RAC: 2,186,172 |
Quote: "Is something stuck somewhere the validation Queue is hovering at over 800,000 now, been like that for a couple of days."
Yep, that is exactly what I do, when I can. Hopefully that clears completed workunits off of the servers sooner rather than later. Which would imply less server thrashing. |
Mr P Hucker Joined: 5 Jul 11 Posts: 759 Credit: 361,871,315 RAC: 4 |
Server needs an SSD. |
San-Fernando-Valley Joined: 13 Apr 17 Posts: 237 Credit: 576,421,357 RAC: 2,261,206 |
Quote: "Server needs an SSD."
Peter: ONE won't do .... Bad joke - I know. Have a nice week! S-F-V |
Mr P Hucker Joined: 5 Jul 11 Posts: 759 Credit: 361,871,315 RAC: 4 |
Quote: "Server needs an SSD." "Peter:"
I have 8 of my own.
Quote: "Have a nice week!"
I will, I'm going on holiday to visit friends and relatives. "7.5 hour" drive according to satnav, so that's about 4.5 hours then :-) |
San-Fernando-Valley Joined: 13 Apr 17 Posts: 237 Credit: 576,421,357 RAC: 2,261,206 |
Quote: "Server needs an SSD." "Peter:" "I have 8 of my own."
Oops, I thought you meant MilkyWay's server ... The response times are sometimes really bad. Not yours, I again mean MilkyWay's.
JOKE ON: Oh, 7.5 hours is the calculation with HDDs and the 4.5 hours with SSDs. I wonder what the driving time calculation would be with NVMes? JOKE OFF.
Have a nice time on your holiday! |
Mr P Hucker Joined: 5 Jul 11 Posts: 759 Credit: 361,871,315 RAC: 4 |
Quote: "Server needs an SSD." "Peter:" "I have 8 of my own." "oops, I thought you meant milkyway's server ..."
I did mean the MilkyWay server. I meant I have 8 here, yet MW has none. Mechanical drives in the 21st century are unworkable. Rosetta for example actually has a bank of 72 SSDs. And there's a lot more data going through MW.
Quote: "Have a nice week!" "i will, i'm going on holiday to visit friends and relatives. '7.5 hour' drive according to satnav, so that's about 4.5 hours then :-)" "JOKE ON:"
Well, a Renault Scenic goes 120mph in a 70 limit so....
Quote: "Have a nice time on your holiday!"
Thanks, I will try to. |
©2023 Astroinformatics Group