Abnormal WU time
Author | Message |
---|---|
Joined: 14 Feb 09 Posts: 5 Credit: 157,994,413 RAC: 0 |
I'm facing a computation problem on my CrossFireX system (2x HD5970). Often, when I am AFK, my computer becomes very slow: CPU load increases dramatically (near 100% on each core), the GPU becomes idle or sits at just 50% load, and WU time increases dramatically, with a record of 10,000 s for some (normally 100 s, and 150 s for DE_13). It only happens on the new long WUs (de_13_3s_const, for example): http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=75789084 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=75789083 http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=75788633 A screen capture of the problem: it happens at the beginning of the WU. I can use pause/resume or stop/start to correct it, but I'm not always at my computer. Normally I can expect around 30,000 credits per hour, but because of this, some days it is less than 10,000. I have tried: App 0.20b, 0.21, 0.22; BOINC 6.10.18/35/37. My config: Win 7 Pro 64-bit, Q9450 @ 3.2 GHz, quad CrossFire HD5970 @ stock, ATI 10.2. Is there some way to fix it? |
Joined: 12 Aug 09 Posts: 172 Credit: 645,240,165 RAC: 0 |
I, too, am having the same problem. I have three machines, and they all do it. One: BOINC 6.10.34, CAL 10.1. Two: BOINC 6.10.34, CAL 10.1. Three: BOINC 6.10.24, CAL 10.2. I have tried different versions of BOINC and the CAL drivers to no avail. The screen refreshes VERY slowly, so aborting the offending files takes several minutes each. Once the slow file is gone, things return to normal. Shutting BOINC down and restarting it sometimes fixes it, sometimes not. Likewise, shutting down the whole computer and rebooting sometimes fixes it, sometimes not. These days I just abort when I notice the Afterburner monitors sticking at 30% or flatlining. A major nuisance, and very hands-on. |
Joined: 19 Mar 08 Posts: 5 Credit: 232,926,469 RAC: 0 |
Same problem for me. Same effect. |
Joined: 19 Mar 08 Posts: 5 Credit: 232,926,469 RAC: 0 |
Maybe a lead to follow. In the WUs that have the problem and take a long time to finish, I found this: 'dividing each iteration in 6 parts'. With normal WUs it is 'dividing each iteration in 5 parts'. Edit: error in my conclusion; there are other WUs with 'dividing each iteration in 6 parts' that show no problem. |
Joined: 11 Sep 08 Posts: 22 Credit: 9,081,761 RAC: 0 |
I'm not sure if this is related or not, but lately I've been noticing WUs that never seem to progress beyond 0.000% completion. I've aborted them, thinking they were bad WUs, but as it has happened more often, I'm wondering what is going on with WUs that appear to hang or not progress for a fair amount of time. By all appearances they are not slow, but stalled. |
Joined: 10 Mar 10 Posts: 1 Credit: 28,822,623 RAC: 0 |
I've also had a few work units that sit at 0% for ages, which I've canceled (some after 28 minutes). Others sit at 0% for a bit, then jump to over 100%; right now I have one at 128% and still running after 7 minutes. Average WU time was 4-6 minutes before this started happening (I've only been running for a week...). Edit: whoa, I just sat here watching it work. That same WU stayed at 128% until around 17 minutes, then started dropping percentage points until it read 100% (which took just over a minute), then uploaded normally... |
Joined: 11 Sep 08 Posts: 22 Credit: 9,081,761 RAC: 0 |
Perhaps checkpointing got messed up; I'm not sure. But I have to run it on a CPU, not a GPU, and I've got 2 more sitting here at 0% for over 1.5 hours now. I can let these sit if that's what's being seen, but I have no idea. When the thing doesn't budge for hours, it really does look stuck. |
Joined: 12 Aug 09 Posts: 172 Credit: 645,240,165 RAC: 0 |
Yep, that is a different problem, and it has been fixed. Basically, the WU sits at 0%, then jumps to 143% and works backwards to 100%, then uploads. You can download a manual app (.22) if it bothers you; otherwise just wait a few days and normality will be restored. |
Joined: 14 Feb 09 Posts: 5 Credit: 157,994,413 RAC: 0 |
I'm trying to use this command line in an automated task that runs every 10 minutes: boinccmd.exe --set_run_mode never 1 It suspends BOINC for 1 s and then lets it resume, every 10 min. |
Joined: 14 Feb 09 Posts: 5 Credit: 157,994,413 RAC: 0 |
Re: "I'm trying to use this command line in an automated task every 10 minutes": it works, but in the worst case I will lose around 10,000 points per hour (it's better than 0, in fact), so ... |
Joined: 11 Sep 08 Posts: 22 Credit: 9,081,761 RAC: 0 |
Both new ones have now been over 6 hours, and still at 0% |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
It would be nice if someone would give an update on what is going on and what is being done, as well as whether the WUs are fine or should be aborted. Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Joined: 12 Aug 09 Posts: 172 Credit: 645,240,165 RAC: 0 |
You can fix the progress issue now; see here: http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=1600&nowrap=true#37333 and here for the apps: http://www.arkayn.us/milkyway/index.html As for the other issue, the slowdown, I have also noticed that it only happens on my Windows 7 boxes, and not on the Vista one. |
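
The workaround described in the thread (running `boinccmd.exe --set_run_mode never 1` every 10 minutes so that a stalled GPU task gets suspended and resumed) could also be scripted instead of set up as a scheduled task. The following is only a minimal sketch, not something posted by the thread's participants. It assumes `boinccmd.exe` is on the PATH (otherwise `BOINCCMD` must be set to its full path), and the function names `build_nudge_command`, `nudge`, and `run_periodically` are hypothetical names chosen for illustration.

```python
import subprocess
import time

# Assumption: boinccmd.exe is on the PATH; otherwise use its full path here.
BOINCCMD = "boinccmd.exe"
INTERVAL_SECONDS = 600   # 10 minutes between nudges, as in the thread
SUSPEND_SECONDS = 1      # BOINC reverts to its previous run mode after this


def build_nudge_command(boinccmd=BOINCCMD, suspend_seconds=SUSPEND_SECONDS):
    """Return the argument list for: boinccmd --set_run_mode never <seconds>."""
    return [boinccmd, "--set_run_mode", "never", str(suspend_seconds)]


def nudge(run=subprocess.run):
    """Suspend computation briefly and return boinccmd's exit code.

    `run` is injectable so the function can be exercised without BOINC.
    """
    result = run(build_nudge_command(), check=False)
    return result.returncode


def run_periodically(max_nudges=None, run=subprocess.run, sleep=time.sleep):
    """Nudge BOINC every INTERVAL_SECONDS; loops forever if max_nudges is None."""
    count = 0
    while max_nudges is None or count < max_nudges:
        nudge(run=run)
        count += 1
        sleep(INTERVAL_SECONDS)
```

Calling `run_periodically()` with no arguments would repeat the nudge indefinitely. As noted in the thread, this costs some throughput, because a task that stalls just after a nudge stays stalled until the next one.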