Failing workunits
Author | Message |
---|---|
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
We've been experiencing issues with the RPI computer science login servers since sometime over the weekend, so we've been unable to fix the problem with the failing workunits for the separation 0.4 runs. We're waiting for the servers to be restored from a backup before we can fix the issue. |
Send message Joined: 24 Jul 09 Posts: 32 Credit: 18,139,650 RAC: 20 |
Thank you for the update. It's greatly appreciated. |
Send message Joined: 16 Jun 10 Posts: 6 Credit: 7,402,186 RAC: 0 |
Thanks for the update. Will check back to see when the problem is fixed. |
Send message Joined: 3 May 10 Posts: 74 Credit: 1,532,760 RAC: 0 |
Hi Matt. Thanks for the update. You will know how frustrating it is: some of my WUs predicted at c. 16 hours have been running for 38 hours with 15 to go. I presume that these will error out eventually. I've only got an E4700 Core 2 Duo; one core does S@H and the other MW@H. S@H is having server problems and MW@H is erroring out with the 0.4 (or is it 0.04?) software. Why don't we just go back to the previous version? Check out the posts on Number crunching > Computation errors: everybody is having this problem and migrating to other projects. Good luck with the fix. We, out here in the ether, hope that you manage to fix it soon and appreciate all your efforts to supply us with work for free. |
Send message Joined: 13 Feb 09 Posts: 51 Credit: 72,857,382 RAC: 3,730 |
Is that why I've gotten zero credit for two WUs that took over 40 hours to complete? I hope that I eventually get credit, but in the meantime, I'll suspend the project and let my machine crunch on other projects. |
Send message Joined: 16 Mar 10 Posts: 12 Credit: 22,284,745 RAC: 0 |
|
Send message Joined: 6 Apr 09 Posts: 1 Credit: 1,560,491 RAC: 0 |
Hi Folks, I too have been hit with failed work units. I've had a total of 7 failed work units, and it looks like number eight is going to fail as well. It started on: 10/11/10 (19h 15m 55s); 10/12/10 (31h 13m 51s); 10/12/10 (28h 06m 19s); 10/13/10 (19h 26m 52s); 10/14/10 (19h 11m 04s); 10/16/10 (25h 17m 14s); 10/17/10 (35h 02m 48s).
|
Send message Joined: 21 Mar 09 Posts: 11 Credit: 14,806,072 RAC: 0 |
Hi Folks, the WUs with 0.40 still have problems; the result is: Error while computing. Two examples: the first, WU ID 221388801, took over 17 hours of elapsed computing time, and the second, WU ID 219212040, took over 34 hours, on an Intel i7 965 (8 cores) with Vista 64-bit.
Task ID; workunit ID; sent; time reported; status; run time (s); CPU time (s); claimed credit; granted credit; application
221388801; 167364804; 16 Oct 2010 17:51:48 UTC; 17 Oct 2010 11:34:04 UTC; Error while computing; 62,653.59; 60,971.77; 426.59; ---; MilkyWay@Home v0.40
219212040; 165787964; 13 Oct 2010 19:58:58 UTC; 16 Oct 2010 16:57:16 UTC; Error while computing; 124,433.33; 123,003.90; 860.59; ---; MilkyWay@Home v0.40
I hope that can be fixed ASAP. Kind regards, Ronald |
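As a quick sanity check on the durations quoted in the post above, here is a minimal Python sketch that converts the reported run times to hours. It assumes the first large number in each result row is the run time in seconds (the usual BOINC results layout); the helper name `seconds_to_hours` is made up for illustration.

```python
def seconds_to_hours(seconds_text: str) -> float:
    """Convert a run time such as '62,653.59' (seconds) to hours."""
    # Strip the thousands separators used on the results page before parsing.
    return float(seconds_text.replace(",", "")) / 3600.0


# Run times copied from the two failed tasks quoted in the post.
for task_id, run_time in [("221388801", "62,653.59"), ("219212040", "124,433.33")]:
    print(f"task {task_id}: {seconds_to_hours(run_time):.1f} h")
```

Under that assumption the two tasks come out at roughly 17.4 and 34.6 hours, which agrees with the "over 17 hours" and "over 34 hours" in the post.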
Send message Joined: 9 Feb 10 Posts: 1 Credit: 7,857,169 RAC: 0 |
Hi Folks, for the last few days I have been getting only "accounting irregularity" for Milkyway@home tasks that computed to 100%. I changed the main system from XP to Win 7 at the weekend, but this isn't the problem; it happens under both XP and Win 7. Other workunits, e.g. from SIMAP, complete successfully. I hope you can fix this problem soon. Thanks & Good Luck |
Send message Joined: 29 Sep 09 Posts: 18 Credit: 46,059 RAC: 0 |
Does the latest Milkyway WU software use SSE2 or similar? Some older machines do not have those instructions (especially AMD), and that may be the cause of this rash of Compute Errors with Illegal Instruction. |
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
"Does the latest Milkyway WU software use SSE2 or similar?" The N-body is supposed to require SSE2 because, without it, it's a pain to get consistent results with the x87 FPU. The separation isn't supposed to. I think there might have been some 'build system pollution' where the SSE2 flags were infecting the separation build. "Some older machines do not have those instructions (especially AMD), and that may be the cause of this rash of Compute Errors with Illegal Instruction." Intel added SSE2 in 2000 (Pentium 4), and AMD added it in 2003, so really old. |
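For anyone who wants to check whether a machine advertises SSE2, here is a minimal Python sketch. It assumes a Linux host, where the kernel lists CPU feature flags on the `flags` line of `/proc/cpuinfo`; on Windows a different tool would be needed. The helper name `has_cpu_flag` is hypothetical and not part of BOINC or the MilkyWay@Home application.

```python
from pathlib import Path


def has_cpu_flag(cpuinfo_text: str, flag: str) -> bool:
    """Return True if `flag` appears on any 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Each x86 core has a line such as "flags : fpu vme ... sse sse2 ...".
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


# Assumption: Linux only. On other systems the file does not exist and nothing is printed.
cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():
    print("SSE2 supported:", has_cpu_flag(cpuinfo.read_text(), "sse2"))
```

A CPU that reports `sse` but not `sse2` (a Pentium III, for example) would hit an illegal-instruction error in any binary compiled with SSE2 enabled.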
Send message Joined: 16 Mar 10 Posts: 12 Credit: 22,284,745 RAC: 0 |
"Does the latest Milkyway WU software use SSE2 or similar? Some older machines do not have those instructions (especially AMD), and that may be the cause of this rash of Compute Errors with Illegal Instruction." Well, since I am having my failures on an Intel i5-750 system with Windows 7, I don't believe this is the problem. |
Send message Joined: 30 Apr 09 Posts: 101 Credit: 29,874,293 RAC: 0 |
A teammate and I got errors on all WUs of the de_14_2s_5_x and de_16_2s_5_x series. The errors are from a GTX295 (teammate) and from GTX260-216 cards (mine). It looks like this, for example, on the GTX260-216:
Exit status -1073741819 (0xffffffffc0000005) - exit code -1073741819 (0xc0000005)
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00408036 read attempt to address 0x0224B000
Engaging BOINC Windows Runtime Debugger...
and on the GTX295:
Reason: Access Violation (0xc0000005) at address 0x00408036 read attempt to address 0x023AF000
Reason: Access Violation (0xc0000005) at address 0x00408036 read attempt to address 0x023D6000
Is it a problem with the 0.24 cuda23 app, or with the WUs? |
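The negative exit status and the hexadecimal code in the log above are the same 32-bit value written two ways. A minimal Python sketch of the conversion, using a made-up helper name `to_unsigned32`:

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit status as an unsigned 32-bit value."""
    # Masking with 0xFFFFFFFF keeps the low 32 bits (two's complement).
    return code & 0xFFFFFFFF


exit_status = -1073741819  # value copied from the log in the post
print(hex(to_unsigned32(exit_status)))  # prints 0xc0000005
```

On Windows, 0xC0000005 is the access violation status code, which matches the "Access Violation" reason printed in the log.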
Send message Joined: 21 Mar 09 Posts: 11 Credit: 14,806,072 RAC: 0 |
Hi Folks, the WUs with 0.40 still have problems; the result is: Validate error. These WUs take 20 to 33 hours of elapsed time or more. Other WUs are estimated to take much longer. Because the run time is awfully long, up to 80-120 hours or more of elapsed time, I decided to abort all those long runners. My system is: Intel i7 965 (8 cores) with Vista 64-bit.
Task ID; workunit ID; sent; time reported; status; run time (s); CPU time (s); claimed credit; granted credit; application
224837318; 165691129; 21 Oct 2010 5:16:37 UTC; 23 Oct 2010 4:31:04 UTC; Validate error; 53,701.63; 53,047.92; 358.39; ---; MilkyWay@Home v0.40
224801303; 169891267; 20 Oct 2010 23:57:58 UTC; 23 Oct 2010 8:03:05 UTC; Validate error; 120,921.55; 119,557.60; 807.73; ---; MilkyWay@Home v0.40
224863918; 169581572; 21 Oct 2010 16:32:50 UTC; 23 Oct 2010 11:41:07 UTC; Validate error; 78,787.94; 77,470.22; 523.39; ---; MilkyWay@Home v0.40
224818756; 169904633; 21 Oct 2010 1:14:38 UTC; 23 Oct 2010 11:57:10 UTC; Validate error; 91,987.54; 90,332.60; 610.29; ---; MilkyWay@Home v0.40
…; …; … UTC; 23 Oct 2010 14:08:22 UTC; Aborted by user; 87,497.46; 86,091.69; 581.64; ---; MilkyWay@Home v0.40
224776309; 169873920; 20 Oct 2010 23:15:04 UTC; 23 Oct 2010 14:08:22 UTC; Aborted by user; 145,051.26; 143,105.00; 966.82; ---; MilkyWay@Home v0.40
224773732; 169871374; 20 Oct 2010 23:10:36 UTC; 23 Oct 2010 14:08:22 UTC; Aborted by user; 148,016.59; 145,909.30; 985.77; ---; MilkyWay@Home v0.40
224733715; 169771092; 20 Oct 2010 22:02:27 UTC; 23 Oct 2010 14:08:22 UTC; Aborted by user; 154,943.80; 152,746.70; 1,031.96; ---; MilkyWay@Home v0.40
I hope that you can fix this ASAP. Best regards, Ronald |
Send message Joined: 13 Feb 10 Posts: 1 Credit: 743,176 RAC: 0 |
I am getting the following error and BOINC will not process any downloads: "milkyway_0.4_windows_intelx86.exe has encountered a problem and needs to close. We are sorry for the inconvenience." I have this running on several other computers. Does anyone have any ideas? I uninstalled and reinstalled several times, and also deleted all folders and registry entries. Running on a Pentium III with Windows XP. |
Send message Joined: 21 Mar 09 Posts: 11 Credit: 14,806,072 RAC: 0 |
Hi Folks, good news: these 8 de_separation v0.40 tasks went through without any problems. Thanks to Matt, Travis and the team.
Task ID; workunit ID; sent; time reported; status; run time (s); CPU time (s); claimed credit; granted credit; application
226427677; 171024106; 23 Oct 2010 16:32:46 UTC; 24 Oct 2010 9:14:53 UTC; Completed and validated; 60,072.07; 59,266.79; 400.41; 213.78; MilkyWay@Home v0.40
226427676; 171024105; 23 Oct 2010 16:32:46 UTC; 24 Oct 2010 9:32:11 UTC; Completed and validated; 61,135.63; 60,270.72; 407.19; 213.78; MilkyWay@Home v0.40
226427675; 171024104; 23 Oct 2010 16:32:46 UTC; 24 Oct 2010 8:47:22 UTC; Completed and validated; 58,459.32; 57,540.41; 388.74; 213.78; MilkyWay@Home v0.40
226427674; 171024103; 23 Oct 2010 16:32:46 UTC; 24 Oct 2010 8:46:42 UTC; Completed and validated; 58,395.09; 57,585.02; 389.05; 213.78; MilkyWay@Home v0.40
226427673; 171024102; 23 Oct 2010 16:32:46 UTC; 24 Oct 2010 8:45:47 UTC; Completed and validated; 58,291.51; 57,536.37; 388.72; 213.78; MilkyWay@Home v0.40
226427660; 171024089; 23 Oct 2010 16:32:46 UTC; 24 Oct 2010 8:36:12 UTC; Completed and validated; 57,774.64; 57,013.30; 385.18; 213.78; MilkyWay@Home v0.40
226427659; 171024088; 23 Oct 2010 16:32:46 UTC; 24 Oct 2010 9:53:58 UTC; Completed and validated; 58,912.21; 58,050.43; 392.19; 213.78; MilkyWay@Home v0.40
226427658; 171024087; 23 Oct 2010 16:32:46 UTC; 24 Oct 2010 9:03:55 UTC; Completed and validated; 59,452.36; 58,563.46; 395.66; 213.78; MilkyWay@Home v0.40
Best regards, Ronald |
Send message Joined: 13 Jul 09 Posts: 1 Credit: 12,316 RAC: 0 |
Hello, since last week I have been trying to add new work from Milkyway, but every time a new task starts downloading, an error occurs. When I check the message page in the BOINC manager, it shows this:
work fetch resumed by user
update requested by user
sending scheduler request: Requested by user. Requesting new tasks
Scheduler request completed: got 1 new tasks
Started download of stars-td82-2stream-30.txt
Started download of de_separation_82_3s_00_1_1669219_1287930653_search_parameters
Finished download of de_separation_82_3s_00_1_1669219_1287930653_search_parameters
Finished download of stars-td82-2stream-30.txt
After that, nothing happens, and after restarting the BOINC manager I get this error message: "milkyway_0.4_windows_intelx86.exe has encountered a problem and needs to close. We are sorry for the inconvenience." And here is the signature of this error:
AppName: milkyway_0.4_windows_intel86.exe
AppVer: 0.0.0.0
ModName: milkyway_0.4_windows_intel86.exe
ModVer: 0.0.0.0
Offset: 000198f9
Can anyone tell me how to fix this? It has been going on for more than 2 weeks now. Thanks, John |
Send message Joined: 3 May 10 Posts: 74 Credit: 1,532,760 RAC: 0 |
Ever since the new software I have been having problems. At first all I got was calculation errors, but now they seem to have disappeared, to be replaced by endless calculations. I have two WUs behaving as follows: de_12_3s_5_606570_1287603352_0, elapsed 36:41, to completion 6:30; de_12_3s_5_606558_1287603352_0, elapsed 35:47, to completion 7:41. These are running with high priority, have kicked two other MW WUs off the processors, and are preventing a S@H WU from starting. I have another 6 MW WUs waiting to run, and time is running down before they are due to be reported. Although I am sure that something is wrong, I am allowing these WUs to continue to see the outcome. I would appreciate any comment on my endless calculation problem: with only an E4700 processor with two cores and a weak GPU, I do not feel that I am making progress or contributing anything worthwhile as long as this persists. I have not managed to successfully complete a single MW WU in the last 14 days. HELP |
Send message Joined: 16 Mar 10 Posts: 12 Credit: 22,284,745 RAC: 0 |
|
Send message Joined: 3 May 10 Posts: 74 Credit: 1,532,760 RAC: 0 |
"Ever since the new software I have been having problems. At first all I got was calculation errors but now it seems that they have disappeared to be replaced by endless calculations." The first unit has completed and validated, but the run time of 158,069 secs seems excessive compared to previous MW runs and will reduce my throughput dramatically. Can anybody tell me if this is the expected run time, or am I doing something wrong? At this rate I will be unable to finish the WUs I have by the required reporting time. |