Posts by pippen

1) Message boards : Number crunching : Bunch of new computational errors (Message 58582)
Posted 9 Jun 2013 by pippen
Post:
All of my "Milkyway@home 0.82 (ATI14)" projects are generating "computational errors" at the start of the run (1-2 seconds). This just started in the last couple of days. Other Milkyway projects are running just fine.

I suspect, based on the title, it is not detecting that I have a single precision processor, but am not sure. Normally GPU stuff is not even submitted and I have the message notification saying that double precision is needed.

No other projects are causing problems. I recently (within the last week or so) updated to the latest BOINC.

FYI...

-Mike
2) Message boards : Number crunching : Clear Error Runs? (Message 57238)
Posted 13 Feb 2013 by pippen
Post:
Thanks for the responses! I'll ignore them then... I just hate the shame <g>
3) Message boards : Number crunching : Clear Error Runs? (Message 57223)
Posted 12 Feb 2013 by pippen
Post:
Hi,

I had 2 runs pop up with "computation error" a week or so ago. I am quite sure they were caused by some system problems I was having at the time (a couple of lockups, unrelated to BOINC). However, I can't make those two entries in my task list go away, though of course they are no longer running. They were both ps_nbody_100k_1_30_13 runs. Other runs have done fine both before and since.

So, I did some searching, and see a lot about how to deal with computation errors, but not this particular issue.

Is there a trick to purging them from my list? Or even better get them to clear out and start over?

I would assume the client reported the errors back to the server: am I correct? Or do I need to do something?

I see they have a deadline of Feb 14th. Will they go away then?

Thanks!

-Mike
4) Message boards : Number crunching : All WUs Failing? (Message 56619)
Posted 23 Dec 2012 by pippen
Post:
Milkyway needs a double precision card; the one you are running, I think, is not!

The errors occur on the CPU and that has DP for sure. So that's not the reason.

Looking into the std_err... looks like some permissions issue. I'd try excluding the BOINC data directory from antivirus scanning; if that does not help, reinstall BOINC (just install it once again over the old installation).


The permission issue tipped me off... My McAfee Access Protection service had gone a bit bonkers, and I forgot that turning off "real time protection" does not turn that off. My Windows Update even failed with the same "permission" error, and I saw the "shield up" thing in the McAfee icon, so I checked those logs. It looks like it was blocking access to something to do with the ATI driver.

Disabled Access Protection for a while and everything started running fine. Finished the patches, rebooted and did a bit of cleanup on my temp dirs and it has been running fine ever since. Thanks to everyone for the help!!

-Mike
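
For anyone who wants a quick way to test for this kind of permission problem, here is a minimal sketch, assuming Python is installed. It only tries to create and remove a temporary file in a directory you name. The function name `can_write` is hypothetical and not part of BOINC or MilkyWay@home, and the probe may not reproduce every kind of blocking a security tool can apply (for example, blocking tied to a driver).

```python
import os
import tempfile


def can_write(directory):
    """Return True if a temporary file can be created and removed in directory.

    Hypothetical helper for illustration only; not part of BOINC.
    """
    try:
        # The file is deleted automatically when the with-block exits.
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"probe")
        return True
    except OSError:
        # Covers PermissionError, FileNotFoundError and similar failures.
        return False


if __name__ == "__main__":
    # Assumption: replace this with the path of your own BOINC data directory.
    target = tempfile.gettempdir()
    print(target, "writable:", can_write(target))
```

Run it as the same user the BOINC client runs as; a False result for the data directory would point toward a permissions or security-software problem rather than the project itself.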
5) Message boards : Number crunching : All WUs Failing? (Message 56584)
Posted 20 Dec 2012 by pippen
Post:
First thing I'd check is if you have sufficient cooling and also do a system-wide check, i.e., check RAM, disk, GPU, CPU for errors. There are diagnostic programs that can do that, but I can't recommend one as I don't use Windows.

If all comes out clean, then I'd suspect problems on M@H's side of things.


I run temp and process monitors all the time, and nothing is showing any issues. SETI and WCG are both running fine, so I lean against a hardware issue per se. Since it fails at the start of the run, 100% of the time, anything hardware related would more likely be showing in the other programs as well. I'll watch these things, and if I don't see anything else to explain it, I'll track down some diagnostic programs and run them. I don't overclock or anything (gave that stuff up in the late 80's and early 90's after seeing how badly that worked in retrospect) and in general I try not to stress the system.

Thanks!!

-Mike


6) Message boards : Number crunching : All WUs Failing? (Message 56582)
Posted 20 Dec 2012 by pippen
Post:
Milkyway needs a double precision card; the one you are running, I think, is not!

The errors occur on the CPU and that has DP for sure. So that's not the reason.

Looking into the std_err... looks like some permissions issue. I'd try excluding the BOINC data directory from antivirus scanning; if that does not help, reinstall BOINC (just install it once again over the old installation).


Yes, I always get the message at startup from MW@H that my GPU does not support double precision and can't be used, so MW does not use it and just runs on the CPUs, as you indicate.

I'll check permissions and AV, though I have done nothing new (at least deliberately) and certainly have not had any popups. Since every run is aborting within 15 seconds (and just on this project), I was thinking I might need to check/purge the temp files. Where does MW@H store those in Windows 7? Any I should NOT delete? Is there a more detailed log file I can check?

Again, things have been processing just fine up until the 18th.

Thanks! -Mike
7) Message boards : Number crunching : All WUs Failing? (Message 56568)
Posted 19 Dec 2012 by pippen
Post:
Hello,

I have had no real problems for many months, but as of the 18th of December at some point, every work unit for MW is failing, all at about 14 seconds into the run ("computational error"). Thoughts or suggestions? I run 2 other projects that are showing no issues. I do have an AMD/ATI GPU, but MW does not use it because it does not support double precision. The failure rate has been 100% on these.

The system is a Windows 7, i7-based machine with plenty of memory and disk space available. I did a project refresh, and the new tasks failed just like the others.

Thanks! -Mike



