Fix it or I'm gone

Message boards : MilkyWay@home Science : Fix it or I'm gone

Chris Rampson
Joined: 14 May 11
Posts: 6
Credit: 52,881,178
RAC: 5,389

Message 66155 - Posted: 3 Feb 2017, 22:14:37 UTC

It's been months since one particular MilkyWay program started failing. I have lost millions of points! If MilkyWay@home is interested in keeping my considerable computing resources working for them, you have until the end of February to fix YOUR problem (hint: there is no "inf" variable type), or I'm going somewhere else.

https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4087

alanb1951
Joined: 16 Mar 10
Posts: 31
Credit: 27,241,641
RAC: 11,157

Message 66160 - Posted: 5 Feb 2017, 8:44:44 UTC - in response to Message 66155.

You might be interested in another thread in the Linux forum, titled "Consistent "Validate error" status", in which I mentioned some research I'd done into why the client was apparently building bad GPU kernels:

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4091

To summarize: inf is what the C/C++ run-time print functions produce for a floating-point infinity; it is not a variable type. The user in that other thread was running CPU tasks that reported "q is 0" for the first task of a batch of 5. It appears that the kernel constructor computes 1/q and its square root, and passes the latter value into the kernel compiler with the label Q_INV_SQR. If, for some reason, q is zero (which it should not be, of course!), it will pass in Q_INV_SQR=inf. Problem explained, but not solved.
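For anyone who wants to see the effect in isolation, here is a minimal sketch in C. It is not the project's code: the helper names inverse_of_q and format_define are made up for illustration, and the "-DQ_INV_SQR=" define format is my assumption about how such a value might be handed to a kernel compiler. It only performs the division, since a later step such as a square root would leave an infinity unchanged.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the MilkyWay source): the division step
 * described above. With IEEE 754 doubles, 1.0 / 0.0 is +infinity. */
static double inverse_of_q(double q)
{
    return 1.0 / q;
}

/* Hypothetical helper: writes a "-DQ_INV_SQR=<value>" style option into buf.
 * The option format is an assumption made for this sketch.
 * Returns 0 if the value is finite, -1 if it is infinite or NaN. */
static int format_define(char *buf, size_t len, double q)
{
    assert(buf != NULL && len > 0);

    double v = inverse_of_q(q);

    /* printf-family functions print an infinity as "inf" (or "infinity"),
     * which is where the text in the failing build line comes from. */
    snprintf(buf, len, "-DQ_INV_SQR=%.15f", v);

    return isfinite(v) ? 0 : -1;
}
```

Calling format_define with q = 0.0 fills the buffer with an option ending in inf and returns -1, while a sensible q such as 4.0 gives an ordinary decimal value and returns 0. So the "inf" in the log is just text produced by a print call, and the return value shows where a check could flag a bad q before the kernel is built.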

Now, I found it interesting that folks who were seeing "q is 0" or this Q_INV_SQR problem seemed to be using old versions of the BOINC client, so I wondered if perhaps there's something in the old BOINC client libraries that messes up reading the parameter files on occasion - certainly, it would seem that a change to a much more recent client resolved the issue for other users.

And if that is the cause, I don't think there's anything much that the project programmer can realistically do about it. After all, if you know the data files are valid and you are reading them properly, putting in lots of defensive code to allow for system errors is not really productive!

I believe you're using a non-Debian-based Linux, so I'm afraid I don't know how you can resolve the client issue (unless you're willing to build your own and, perhaps, feed it back to your community?). I do realize this doesn't help you much, but I hope you are now at least informed as to the likely cause.

By the way, I have no association with this project other than that of user - I just got so irritated at seeing "it doesn't work" posts that I decided to go to GitHub and grab the source to find out what was going on!

I hope you do eventually manage to find a newer client anyway - amongst other things, the later clients are better at detecting GPUs and offer more logging options.

Cheers - Al.

Chris Rampson
Joined: 14 May 11
Posts: 6
Credit: 52,881,178
RAC: 5,389

Message 66180 - Posted: 12 Feb 2017, 1:04:20 UTC

The "official" version of BOINC for x86_64 is too old. I downloaded the latest (development?) version, 7.7.0, and the problem went away. I will look into providing BOINC RPMs for my distro.




Copyright © 2018 AstroInformatics Group