might have found the error

Message boards : News : might have found the error

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 55768 - Posted: 13 Oct 2012, 23:40:07 UTC

It looks like the star file was misspecified for the *22_3s_free_1/2 and *22_3s_edge_1/2 searches. I've made a new fixed star file for *22_3s_free_3 and *22_3s_edge_3 searches, and shut down the old ones. Let me know how these are working.
____________

Profile ritterm
Joined: 16 Jun 08
Posts: 92
Credit: 365,629,434
RAC: 488

Message 55772 - Posted: 14 Oct 2012, 2:37:00 UTC - in response to Message 55768.

...I've made a new fixed star file for *22_3s_free_3 and *22_3s_edge_3 searches...Let me know how these are working.

I've run through only about 30 of these, but, so far, so good. Previously, my error rate had been about 10%.

____________

TLSI2000
Joined: 15 Mar 10
Posts: 17
Credit: 427,215,338
RAC: 185,485

Message 55773 - Posted: 14 Oct 2012, 2:58:27 UTC

A Follow-up...

After two hours of the version '3' WUs, I have seen *zero* errors on them.

Still having a few errors on the version '1' and '2' WUs as those batches run their course.

It looks like the problem is solved.

Thx.

Profile ritterm
Joined: 16 Jun 08
Posts: 92
Credit: 365,629,434
RAC: 488

Message 55774 - Posted: 14 Oct 2012, 3:20:10 UTC - in response to Message 55772.
Last modified: 14 Oct 2012, 3:23:37 UTC

...so far, so good.

Or, maybe not so much... This result, 319700259, shows a failure similar to the earlier runs. :-(
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 55775 - Posted: 14 Oct 2012, 5:40:20 UTC - in response to Message 55774.

...so far, so good.

Or, maybe not so much... This result, 319700259, shows a failure similar to the earlier runs. :-(


Looks like the job completed successfully, but for some reason the client marked it as a failure (it does have the output for the likelihood at the end). I wonder if this is because one of the stream-only likelihoods ended up being NaN.

I hope the errors like this are at least less frequent...
____________

Profile dskagcommunity
Joined: 26 Feb 11
Posts: 170
Credit: 183,085,176
RAC: 0

Message 55776 - Posted: 14 Oct 2012, 7:43:20 UTC

Had some errors overnight like this:

http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=319788399
____________
DSKAG Austria Research Team: http://www.research.dskag.at



TJ
Joined: 12 Aug 09
Posts: 262
Credit: 91,881,498
RAC: 24

Message 55779 - Posted: 14 Oct 2012, 10:57:13 UTC

For edge_1, edge_3, free_1 and free_3 I still get many errors (while computing) and even more validate errors, yet my wingmen seem to run the same WUs fine and get validated.
I have the newest driver 1.4.1741 with HD5870 and BOINC 7.0.28 on Win7 x64.
____________
Greetings from,
TJ

TJ
Joined: 12 Aug 09
Posts: 262
Credit: 91,881,498
RAC: 24

Message 55780 - Posted: 14 Oct 2012, 11:08:51 UTC
Last modified: 14 Oct 2012, 11:09:24 UTC

Could it be a graphics driver problem?
I run Albert@home once in a while as well, and those tasks did okay until yesterday; there are now many validate errors with ATI there too, as I saw with my wingmen. Nvidia seems to be okay.
____________
Greetings from,
TJ

TLSI2000
Joined: 15 Mar 10
Posts: 17
Credit: 427,215,338
RAC: 185,485

Message 55782 - Posted: 14 Oct 2012, 11:23:25 UTC

And I have seen a number of errors on the version '3' WUs overnight as well.

These seem to have an #IND in the result as:

<stream_only_likelihood> -3.638176510600306 -10.877692835483177 -1.#IND00000000000 </stream_only_likelihood>
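
For anyone who wants to scan their own result output for this pattern, here is a minimal sketch in Python. It assumes the likelihoods appear on one line in the `<stream_only_likelihood>` element exactly as quoted above; the function names are made up for illustration and are not part of the MilkyWay@home application. Older Microsoft C runtimes print non-finite doubles as `1.#INF`, `1.#IND`, `1.#QNAN` or `1.#SNAN`, which Python's `float()` does not accept, so those tokens are handled separately.

```python
import math
import re

# Older MSVC runtimes print non-finite doubles as 1.#INF / 1.#IND / 1.#QNAN / 1.#SNAN,
# padded with zeros up to the requested precision (e.g. -1.#IND00000000000).
_MSVC_SPECIAL = re.compile(r"^([+-]?)1\.#(INF|IND|QNAN|SNAN)0*$", re.IGNORECASE)

def parse_value(token):
    """Convert one printed likelihood to a float, mapping MSVC specials to inf/nan."""
    m = _MSVC_SPECIAL.match(token)
    if m:
        if m.group(2).upper() == "INF":
            return -math.inf if m.group(1) == "-" else math.inf
        return math.nan
    return float(token)

def stream_only_likelihoods(line):
    """Extract the values inside <stream_only_likelihood>...</stream_only_likelihood>."""
    m = re.search(r"<stream_only_likelihood>(.*?)</stream_only_likelihood>", line)
    if m is None:
        raise ValueError("no <stream_only_likelihood> element found")
    return [parse_value(tok) for tok in m.group(1).split()]

def has_non_finite(line):
    """True if any stream-only likelihood on the line is NaN or infinite."""
    return any(not math.isfinite(v) for v in stream_only_likelihoods(line))
```

Running `has_non_finite` on the line quoted above flags it, because the third value parses to NaN.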

Profile ritterm
Joined: 16 Jun 08
Posts: 92
Credit: 365,629,434
RAC: 488

Message 55783 - Posted: 14 Oct 2012, 11:35:25 UTC - in response to Message 55775.
Last modified: 14 Oct 2012, 11:37:03 UTC

Travis said:

I hope the errors like this are at least less frequent...

A rough calculation with my host's results shows an error rate of about 2%-3% over the last 4 hours, or so. So, they do seem to be less frequent (for me).
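
A rough calculation of this kind can be sketched in a few lines of Python. The outcome labels and the counts in the usage note are hypothetical, chosen only to illustrate the arithmetic; they are not taken from any host's actual results. Results that are still pending or inconclusive are left out of the denominator, which is one reasonable choice among several.

```python
from collections import Counter

# Hypothetical outcome labels, for illustration only.
ERROR_STATES = {"compute_error", "validate_error"}
PENDING_STATES = {"pending", "inconclusive"}

def error_rate(outcomes):
    """Fraction of decided results that ended in an error state."""
    counts = Counter(outcomes)
    decided = sum(n for state, n in counts.items() if state not in PENDING_STATES)
    if decided == 0:
        raise ValueError("no decided results to compute a rate from")
    errors = sum(counts[state] for state in ERROR_STATES)
    return errors / decided
```

For example, 97 valid results, 2 compute errors and 1 validate error (with any number still pending) give a rate of 3 in 100.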
____________

Link
Joined: 19 Jul 10
Posts: 327
Credit: 16,283,020
RAC: 0

Message 55796 - Posted: 14 Oct 2012, 21:32:51 UTC - in response to Message 55775.

I hope the errors like this are at least less frequent...

Yes, errors are less frequent compared to the 22_3s_free_1 and 22_3s_edge_1 searches; today I had just 3 bad *free_3* WUs. So that's way better than those 11 edge/free_1 from yesterday.

What I have noticed is that we have a lot of WUs which need 3 results before they can be validated. Here are some examples: 253847186, 253944881, 254009156, 253771061, 253995223, 253888155. That's just from 1 page of my results, which still contains many "completed, validation inconclusive", some of which will very probably join this club.
____________
.

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 55798 - Posted: 14 Oct 2012, 22:07:31 UTC - in response to Message 55796.

I hope the errors like this are at least less frequent...

Yes, errors are less frequent compared to the 22_3s_free_1 and 22_3s_edge_1 searches; today I had just 3 bad *free_3* WUs. So that's way better than those 11 edge/free_1 from yesterday.

What I have noticed is that we have a lot of WUs which need 3 results before they can be validated. Here are some examples: 253847186, 253944881, 254009156, 253771061, 253995223, 253888155. That's just from 1 page of my results, which still contains many "completed, validation inconclusive", some of which will very probably join this club.


I've upped the quorums for some workunits a little bit because I was noticing we were getting some weird results validated. So there's no problem there, just a little bit extra validation on our end.
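
To illustrate why a higher quorum leaves more workunits sitting at "validation inconclusive" until a third result arrives, here is a minimal Python sketch of quorum-style validation. It is an assumption-laden illustration, not the project's actual validator: the function names and the tolerance are hypothetical. It also shows why a NaN likelihood can never count toward a quorum, since a non-finite value is treated as agreeing with nothing.

```python
import math

def agree(a, b, rel_tol=1e-10):
    """Two likelihoods agree only if both are finite and numerically close.

    rel_tol is a hypothetical tolerance chosen for illustration.
    """
    if not (math.isfinite(a) and math.isfinite(b)):
        return False
    return math.isclose(a, b, rel_tol=rel_tol)

def validate(results, quorum):
    """Return a canonical value once `quorum` results agree, else None (inconclusive)."""
    for candidate in results:
        # A finite candidate agrees with itself, so it counts toward its own quorum.
        matches = sum(1 for other in results if agree(candidate, other))
        if matches >= quorum:
            return candidate
    return None
```

With a quorum of 2, the pair `[-3.1, float("nan")]` is inconclusive, and a third matching result `-3.1` resolves it. With a quorum of 3, even two matching results stay inconclusive until a third one agrees.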
____________




Copyright © 2018 AstroInformatics Group