Tons of failed jobs
Message boards : Application Code Discussion : Tons of failed jobs

Chris Rampson
Joined: 14 May 11
Posts: 6
Credit: 52,876,764
RAC: 5,287

Message 66127 - Posted: 21 Jan 2017, 5:38:14 UTC

Here is my log from a failed job (I get a hundred every day). It appears that the developer typo-ed a variable declaration and used inf instead of int.

How can I get this info to the right people? Thanks. BTW, I'm losing tons of credit and CPU cycles over this.

Build log (https://milkyway.cs.rpi.edu/milkyway/result.php?resultid=1944345652):
--------------------------------------------------------------------------------
<kernel>:176:72: warning: unknown attribute 'max_constant_size' ignored
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^
<kernel>:178:62: warning: unknown attribute 'max_constant_size' ignored
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^
<kernel>:179:67: warning: unknown attribute 'max_constant_size' ignored
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^
<kernel>:227:26: error: use of undeclared identifier 'inf'
tmp = mad((real) Q_INV_SQR, z * z, tmp); /* (q_invsqr * z^2) + (x^2 + y^2) */
^
<built-in>:33:19: note: expanded from here
#define Q_INV_SQR inf
^

--------------------------------------------------------------------------------
clBuildProgram: Build failure (-11): CL_BUILD_PROGRAM_FAILURE
Error building program from source (-11): CL_BUILD_PROGRAM_FAILURE
Error creating integral program from source

captainjack
Joined: 22 Jun 13
Posts: 40
Credit: 43,700,032
RAC: 0

Message 66128 - Posted: 21 Jan 2017, 13:09:19 UTC

Chris,

It appears that something is missing with your GPU setup. In your computer description, the driver version is not being shown. For example mine shows:

[2] NVIDIA GeForce GTX 970 (4036MB) driver: 367.57 OpenCL: 1.2


Your setup shows
[2] NVIDIA GeForce GTX 690 (1998MB) OpenCL: 1.2


Note that in your setup the "driver: 367.57" is missing.

Suggest that you upgrade to the latest version of BOINC (7.6.33) and verify that your GPU drivers are installed correctly. If that doesn't get you going, please post the first 50 (approximate) lines of your event log after you start up BOINC so we can see what else might be amiss.

Your issues are similar to another cruncher's. You can read down this thread to see what he did to solve the problem.
http://milkyway.cs.rpi.edu/milkyway/show_user.php?userid=113092

Hope that helps.

Chris Rampson
Joined: 14 May 11
Posts: 6
Credit: 52,876,764
RAC: 5,287

Message 66129 - Posted: 21 Jan 2017, 15:15:43 UTC - in response to Message 66128.

Seems that the Linux 64 bit version of BOINC has been stuck at 7.2.42 for 3 years now.

So a variable declaration of type "inf" is a valid statement?

captainjack
Joined: 22 Jun 13
Posts: 40
Credit: 43,700,032
RAC: 0

Message 66130 - Posted: 21 Jan 2017, 16:40:18 UTC

Seems that the Linux 64 bit version of BOINC has been stuck at 7.2.42 for 3 years now.


I was able to get 7.6.33 using the Ubuntu software installer.

Chris Rampson
Joined: 14 May 11
Posts: 6
Credit: 52,876,764
RAC: 5,287

Message 66140 - Posted: 25 Jan 2017, 0:55:12 UTC

Speaking about the "#define Q_INV_SQR inf" Milkyway error:

I have been running BOINC/SETI for over 20 years. I have projects in Einstein, SETI, GPUGrid, Rosetta and Milkyway. Within the Milkyway project, I have noticed 3 different types of jobs. ONLY ONE OF THOSE IS FAILING.

I do not subscribe to the idea that there is a configuration problem when all my other projects are screaming along (including alternate Milkyway jobs).

The most reasonable explanation is a TYPO in the project files. Someone fat-fingered a "t" into an "f" - which is not too hard to do since they are next to each other on the keyboard.

What needs to happen is for a Milkyway developer to explain why this happens - and hopefully fix his own code.

wwei25
Joined: 9 Feb 16
Posts: 3
Credit: 46,570,865
RAC: 128,926

Message 66156 - Posted: 3 Feb 2017, 23:00:18 UTC - in response to Message 66140.

The most reasonable explanation is a TYPO in the project files. Someone fatfingered a "t" into a "f" - which is not too hard to do since they are next to each other on the keyboard.


How careless they are! A number of people are contributing their computational power to MilkyWay@Home without any charge, and MilkyWay@Home just works so casually. They clearly understand that if they lose you, they will have other people. So they don't even care about your feelings. I have already quit, and until they solve the problem, I will not be back.

alanb1951
Joined: 16 Mar 10
Posts: 31
Credit: 27,235,711
RAC: 11,244

Message 66161 - Posted: 5 Feb 2017, 9:02:53 UTC - in response to Message 66156.

Actually, it's not fat-fingering at all, as I discovered by getting the application source from github and trying to remember my C/C++ programming from 20+ years ago!...
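
One way a line like "#define Q_INV_SQR inf" can arise without anyone typing it is sketched below in C. This is a guess at the mechanism, not the project's actual code: it assumes the host program hands a floating-point parameter to the OpenCL compiler as a -D option by printing it with printf-style formatting, and that the parameter came out infinite (for example 1/q^2 with q = 0). The helper names format_define and format_define_checked are made up for illustration.

```c
#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build a "-DNAME=value" compiler option from a double.
 * A non-finite value is printed as text such as "inf" or "nan". */
static int format_define(char *buf, size_t n, const char *name, double value)
{
    return snprintf(buf, n, "-D%s=%.15f", name, value);
}

/* Hypothetical safer variant: refuse non-finite values.
 * Returns -1 and leaves buf empty if value is infinite or NaN. */
static int format_define_checked(char *buf, size_t n, const char *name,
                                 double value)
{
    if (!isfinite(value)) {
        if (n > 0) {
            buf[0] = '\0';
        }
        return -1;
    }
    return format_define(buf, n, name, value);
}
```

With an infinite value, the unchecked helper produces an option such as -DQ_INV_SQR=inf (glibc prints "inf"), which the kernel compiler then reads as an undeclared identifier, matching the error in the build log. The checked variant refuses to build the option, so a bad parameter could be reported before the kernel is ever compiled.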

For more information, I refer you to my response to Chris's post entitled "Fix it or I'm gone" in the MilkyWay@home Science board. Whilst it doesn't make the issue go away, it does try to explain it and point out a possible solution...

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4093

Cheers - Al.

P.S. I'm just a user, like you folks...

P.P.S. How many (full time or otherwise) programmers and technical staff do you think this project has?

wwei25
Joined: 9 Feb 16
Posts: 3
Credit: 46,570,865
RAC: 128,926

Message 66174 - Posted: 10 Feb 2017, 22:01:28 UTC - in response to Message 66161.

Interesting. I have already read that. And it seems that nobody from M@H dares to come and reply.


Copyright © 2018 AstroInformatics Group