Torrents of Invalid WU's
Author | Message |
---|---|
Joined: 27 Nov 09 Posts: 108 Credit: 430,760,953 RAC: 0 |
Over Oct 5-6 I have had at least 100 WU's marked with "Validate Error", "Completed, marked as invalid", and "Completed, can't validate". "Validate Error" seems to be caused by dozens of CPU-only WU's that 'completed' after only 4-8 sec. One WU marked as invalid (161366443) had two other CPU-only wingmen complete it in only 5 sec, yet their results were accepted. Scrolling through my own 'valid' WU's, I also see dozens of 4 sec results accepted as gospel by the validator. The "Completed, can't validate" status appears to be a quorum failure where hundreds of other wingmen are returning invalid CPU results in the same sub-5 sec period. Something seems to be very broken here... |
Joined: 16 Jan 09 Posts: 5 Credit: 400,627,866 RAC: 0 |
|
Joined: 22 Apr 09 Posts: 38 Credit: 27,377,932 RAC: 0 |
Same here (http://milkyway.cs.rpi.edu/milkyway/results.php?userid=23358&offset=0&show_names=0&state=4). I can't access your results; see whether you can access this task. Last validated result. Knight Who says Ni |
Joined: 16 Jan 09 Posts: 5 Credit: 400,627,866 RAC: 0 |
Same for me. But something has changed, as I now have only one invalid task, a 1 GPU x 2 Linux CPU one. |
Joined: 11 Feb 10 Posts: 8 Credit: 11,459,648 RAC: 0 |
The river of invalid WU's unfortunately still hasn't dried out, as the following report shows: http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=213541708 Any clues? |
Joined: 11 Feb 10 Posts: 8 Credit: 11,459,648 RAC: 0 |
The river of invalid WU's unfortunately still hasn't dried out, as the following report shows. The same goes for these too: http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=213541688 http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=213541706 http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=213541707 |
Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
Yesterday evening (GMT+2), after Travis' post, I aborted all MW WU's and loaded the new ones. My main system has crunched a few hundred since then and not a single one failed. Currently I crunch only the ATI app; I stopped the N-body tasks since they also produced errors. Alexander |
Joined: 27 Nov 09 Posts: 108 Credit: 430,760,953 RAC: 0 |
The river of invalid WU's unfortunately still hasn't dried out... Travis apparently did something with the GPU WU's around 04:22 UTC today. So far this has fixed the GPU validation problems for me. But 6 of 17 CPU-only WU's sent out after that time have run for only a few seconds and been rejected with "Validate Error" or "Completed, marked as invalid". This includes several from only 2 hours ago. The remainder look like they will run for the expected time. |
Joined: 11 Feb 10 Posts: 8 Credit: 11,459,648 RAC: 0 |
The remainder look like they will run for the expected time. Affirmative. After half a dozen invalid WUs, the valid ones finally rushed down the line! We're back in CPU-business. |
Joined: 25 Feb 10 Posts: 49 Credit: 10,137,837 RAC: 0 |
213355053 201854 6 Oct 2010 7:36:33 UTC 6 Oct 2010 7:54:19 UTC Completed and validated 83.27 5.83 0.04 213.76 MilkyWay@Home v0.23 (ati13ati); 213369417 182806 6 Oct 2010 8:06:35 UTC 6 Oct 2010 13:42:27 UTC Completed, marked as invalid 1,294.64 406.41 2.15 0.00 Anonymous platform; 213538977 104894 6 Oct 2010 13:47:35 UTC 6 Oct 2010 14:12:59 UTC Completed and validated 0.00 9.05 0.05 213.76 Anonymous platform. The underclocked 3850 WU is invalid, while the CPU WU that has 0 sec run time and 9 sec CPU time is valid. This is getting out of hand; I'm going to move my host to Collatz until this problem is fixed. |
Joined: 27 Nov 09 Posts: 108 Credit: 430,760,953 RAC: 0 |
We're back in CPU-business. Alas, we are not. I still have CPU WU's that are completing in a few seconds. These latest were sent out from 10:53UTC October 6 through 01:09UTC October 7. |
Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
Travis posted over in the other thread that he was going to update the CPU apps as he knew what was going on. http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=1953&nowrap=true#42642 |
Joined: 27 Nov 09 Posts: 108 Credit: 430,760,953 RAC: 0 |
Version 0.40 CPU app seems to have done the trick. |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
Not sure where this fits best. This task ran for 3 hours with no progress and I aborted it; 4 of 5 didn't seem to show any progress either. Yet one task I had ran just fine. Task: de_14_2s_5_1344965_1286666350_1 Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Joined: 27 Nov 09 Posts: 108 Credit: 430,760,953 RAC: 0 |
This task ran for 3 hours with no progress and I aborted it; 4 of 5 didn't seem to show any progress either. All of my 0.40 or 0.04 WU's failed on error -161 after up to 37 CPU hours. All of them were running at least 50% slower than version 0.19. They don't yet know why it runs so slowly under Windows... |
Joined: 23 May 10 Posts: 1 Credit: 43,238,758 RAC: 0 |
Hi, I just started getting the messages below. I am no longer receiving new WU's. What should I do? Thanks. 10/10/2010 7:53:30 PM Milkyway@home Message from server: No work sent 10/10/2010 7:53:30 PM Milkyway@home Message from server: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation. 10/10/2010 7:54:26 PM Milkyway@home Computation for task de_15_2s_5_1888795_1286742812_1 finished 10/10/2010 7:54:28 PM Milkyway@home Started upload of de_15_2s_5_1888795_1286742812_1_0 10/10/2010 7:54:29 PM Milkyway@home Finished upload of de_15_2s_5_1888795_1286742812_1_0 10/10/2010 7:54:36 PM Milkyway@home Sending scheduler request: To fetch work. 10/10/2010 7:54:36 PM Milkyway@home Reporting 2 completed tasks, requesting new tasks for CPU 10/10/2010 7:54:38 PM Milkyway@home Scheduler request completed: got 0 new tasks 10/10/2010 7:54:38 PM Milkyway@home Message from server: No work sent 10/10/2010 7:54:38 PM Milkyway@home Message from server: Your app_info.xml file doesn't have a version of MilkyWay@Home N-Body Simulation. 10/10/2010 7:54:57 PM Milkyway@home Computation for task de_15_2s_5_1948709_1286751518_0 finished |
Joined: 19 Feb 09 Posts: 29 Credit: 5,452,691 RAC: 0 |
Not sure where this fits best. This task ran for 3 hours with no progress and I aborted it; 4 of 5 didn't seem to show any progress either. Yet one task I had ran just fine. Task: de_14_2s_5_1344965_1286666350_1 I have had similar ones. What I notice is that if a task does not start within 2 minutes, it just counts down on the right with no progress until it reaches zero, i.e. after 6 or 7 hours, and then keeps running instead of finishing in the normal 4 to 5 hours. The good ones that work usually show 0.62% progress within 2 minutes. So when I get them, if a task shows the 0.62% before 2 minutes I let it run, and if not I abort it. Over the weekend about 40% were good, but the other 60% would not have started, so these were aborted after 2 minutes. Paul |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
All of these were bad and aborted, first issue also: de_15_2s_5_1435953_1286983876 de_15_2s_5_1435954_1286983876 de_15_2s_5_1435955_1286983876 de_15_2s_5_1435956_1286983876 de_15_2s_5_1435957_1286983876 de_15_2s_5_1435958_1286983876 Added these, same thing: de_16_2s_5_1437807_1286984701 de_16_2s_5_1437817_1286984701 de_16_2s_5_1437818_1286984701 de_16_2s_5_1437819_1286984701 de_16_2s_5_1437821_1286984701 de_16_2s_5_1437822_1286984701 Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
My finding is that all 2s units are bad, and all 3s units run fine (so far). Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Joined: 27 Nov 09 Posts: 108 Credit: 430,760,953 RAC: 0 |
My finding is that all 2s units are bad, and all 3s units run fine. Unfortunately, if you search the message board for "_3s_", you'll find some people reporting that their 3s units errored out on this -161 file transfer error just like the 2s units did. |