Welcome to MilkyWay@home

Lots of crunching errors since today

astro-marwil
Joined: 3 Jul 12
Posts: 13
Credit: 7,601,982
RAC: 0
Badges: 5 million credit, 7 year member
Message 55764 - Posted: 13 Oct 2012, 20:40:06 UTC

Hello!
Since today I have been getting lots of crunching errors, all of them exit code 1 (0x1), "unallowed function", mostly for ps_separation_22_3s_free_1_xxx, but also some for ps_separation_22_3s_edge_1_xxxx. So I aborted all ps_separation_22_3s_free_1_xxx tasks. Somewhat peculiarly, some tasks of these types finished fine. Normally I don't have any crunching errors. As everything else is running well, I feel sure the cause lies in the tasks mentioned above, which first appeared today.
Kind regards and happy crunching
Martin

Tex1954
Joined: 22 Apr 11
Posts: 61
Credit: 864,729,667
RAC: 429,437
Badges: 500 million credit, 8 year member, extraordinary contributions
Message 55769 - Posted: 14 Oct 2012, 0:43:07 UTC - in response to Message 55764.  
Last modified: 14 Oct 2012, 0:55:14 UTC

Yup, I am seeing a lot of errors as well and it isn't me... at least, it isn't me NOW...

8-)

And I ain't the only one... for example, mine is 3rd down from the top...

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=253673444

Tex1954
Joined: 22 Apr 11
Posts: 61
Credit: 864,729,667
RAC: 429,437
Badges: 500 million credit, 8 year member, extraordinary contributions
Message 55771 - Posted: 14 Oct 2012, 2:06:21 UTC - in response to Message 55769.  


Link
Joined: 19 Jul 10
Posts: 356
Credit: 16,317,754
RAC: 0
Badges: 10 million credit, 9 year member
Message 55778 - Posted: 14 Oct 2012, 8:31:47 UTC

They are testing new searches. If you have any issues with those, it is better to post in the News part of this forum; there are threads for each of the new searches. The edge_1 and free_1 searches had error rates of about 5-10% according to the posts over there, so there is no reason to abort all of them; that really does not help them find the issue.

Miklos M
Joined: 29 Dec 11
Posts: 22
Credit: 6,696,932
RAC: 68,563
Badges: 5 million credit, 7 year member
Message 55784 - Posted: 14 Oct 2012, 11:58:40 UTC

Errors here too. As well as shorter units with less credit.

Tex1954
Joined: 22 Apr 11
Posts: 61
Credit: 864,729,667
RAC: 429,437
Badges: 500 million credit, 8 year member, extraordinary contributions
Message 55788 - Posted: 14 Oct 2012, 14:44:50 UTC - in response to Message 55778.  

They are testing new searches. If you have any issues with those, it is better to post in the News part of this forum; there are threads for each of the new searches. The edge_1 and free_1 searches had error rates of about 5-10% according to the posts over there, so there is no reason to abort all of them; that really does not help them find the issue.


Well, that is fine except there seemed to be something unusual going on because some of the errors would cause boinc to stop responding somehow in my HD6990/GTX580 box.

In any case, I don't read the news much... just report here for crunching issues...

I'm sure the folks know exactly which WUs error out and are on top of things.

8-)

Miklos M
Joined: 29 Dec 11
Posts: 22
Credit: 6,696,932
RAC: 68,563
Badges: 5 million credit, 7 year member
Message 55814 - Posted: 15 Oct 2012, 17:32:52 UTC

Another day and still errors, although there seem to be fewer. I wonder what they are doing about it?

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Badges: 300 million credit, 8 year member, extraordinary contributions
Message 55848 - Posted: 18 Oct 2012, 3:39:53 UTC - in response to Message 55778.  

They are testing new searches. If you have any issues with those, it is better to post in the News part of this forum; there are threads for each of the new searches. The edge_1 and free_1 searches had error rates of about 5-10% according to the posts over there, so there is no reason to abort all of them; that really does not help them find the issue.

man am i glad i found this thread and this info. errors started showing up on my machine back on the 13th. i started to worry, so i switched over to Collatz Conjecture. now that i know that the errors are expected as of late, and that there's nothing wrong on my end, i'll jump back in the fray.

mikey
Joined: 8 May 09
Posts: 2240
Credit: 290,767,365
RAC: 1,152,428
Badges: 200 million credit, 10 year member, extraordinary contributions
Message 55850 - Posted: 18 Oct 2012, 11:03:48 UTC - in response to Message 55848.  

They are testing new searches. If you have any issues with those, it is better to post in the News part of this forum; there are threads for each of the new searches. The edge_1 and free_1 searches had error rates of about 5-10% according to the posts over there, so there is no reason to abort all of them; that really does not help them find the issue.


man am i glad i found this thread and this info. errors started showing up on my machine back on the 13th. i started to worry, so i switched over to Collatz Conjecture. now that i know that the errors are expected as of late, and that there's nothing wrong on my end, i'll jump back in the fray.


They are crunching thru them, you can stay there and wait it out or jump in and help move thru them. Just don't expect 100% good stuff right now.

Miklos M
Joined: 29 Dec 11
Posts: 22
Credit: 6,696,932
RAC: 68,563
Badges: 5 million credit, 7 year member
Message 55851 - Posted: 18 Oct 2012, 13:26:25 UTC

Yes, still getting about 5 or more bad ones a day.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Badges: 300 million credit, 8 year member, extraordinary contributions
Message 55854 - Posted: 18 Oct 2012, 22:26:18 UTC - in response to Message 55850.  

They are crunching thru them, you can stay there and wait it out or jump in and help move thru them. Just don't expect 100% good stuff right now.

yeah, i think i may have spoken too soon, as my error rate is back up again, this time somewhere between 2% and 3%. nevertheless, i'll stick around and help flush the bad tasks out of the system...

Yes, still getting about 5 or more bad ones a day.

i'm up to 34 errors in the last ~24 hours...

Miklos M
Joined: 29 Dec 11
Posts: 22
Credit: 6,696,932
RAC: 68,563
Badges: 5 million credit, 7 year member
Message 55858 - Posted: 19 Oct 2012, 12:22:59 UTC - in response to Message 55854.  

That is a high percentage. Mine is running at about 1 hour wasted per 24.

Link
Joined: 19 Jul 10
Posts: 356
Credit: 16,317,754
RAC: 0
Badges: 10 million credit, 9 year member
Message 55861 - Posted: 19 Oct 2012, 16:21:14 UTC - in response to Message 55858.  
Last modified: 19 Oct 2012, 16:23:23 UTC

That is a high percentage. Mine is running at about 1 hour wasted per 24.

That would be about 4.167%. But that's pretty close to what I have, my machine however is not running 24/7.
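
The arithmetic behind that figure can be sketched in a few lines. This is only an illustration: the function name `wasted_percentage` is made up here, and it assumes "wasted" time is simply the hours spent on tasks that errored out, divided by the total hours crunched.

```python
def wasted_percentage(wasted_hours: float, total_hours: float = 24.0) -> float:
    """Share of crunching time lost to errored tasks, as a percentage."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return 100.0 * wasted_hours / total_hours


if __name__ == "__main__":
    # 1 hour wasted out of every 24, as reported above.
    print(f"{wasted_percentage(1):.3f}%")  # prints 4.167%
```

For a machine that is not running 24/7, pass the hours it actually crunched as `total_hours`.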

Matthew
Volunteer moderator
Project developer
Project scientist
Joined: 6 May 09
Posts: 217
Credit: 6,856,375
RAC: 0
Badges: 5 million credit, 10 year member
Message 55862 - Posted: 19 Oct 2012, 17:16:00 UTC

In the past, a minority of jobs have errored out due to outdated drivers, BOINC application version, or client code.

Miklos M and Sunny129, it looks as though your GPU drivers are not at the latest versions.

Link - it seems as though your BOINC app is an old version.

I am not sure yet if outdated software is the source of these errors, but it wouldn't hurt for you to update and see if that fixes the problem. Let us know if it does (or does not). I'm still looking into things on our end.

- Matthew N

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Badges: 300 million credit, 8 year member, extraordinary contributions
Message 55864 - Posted: 19 Oct 2012, 17:37:14 UTC - in response to Message 55862.  

In the past, a minority of jobs have errored out due to outdated drivers, BOINC application version, or client code.

Miklos M and Sunny129, it looks as though your GPU drivers are not at the latest versions.

while i'm not running the most current Catalyst drivers (i'm running v12.4), i can't imagine that this driver version would be a problem considering that 1) i've been using v12.4 for months now without any problems, 2) i previously had over 100,000 consecutive valid tasks before this whole fiasco started, and 3) many folks have had more problems w/ recent Catalyst drivers than they have with slightly older versions (i believe it's been said that some of the most recent Catalyst drivers are missing the appropriate OpenCL libraries required to make MW@H work on a GPU).

besides, how do you know exactly what driver version i'm running? all i see on my MW@H web page and in my individual tasks is a v1.4.1720, which is Greek to me.

mikey
Joined: 8 May 09
Posts: 2240
Credit: 290,767,365
RAC: 1,152,428
Badges: 200 million credit, 10 year member, extraordinary contributions
Message 55873 - Posted: 20 Oct 2012, 12:24:38 UTC - in response to Message 55864.  

besides, how do you know exactly what driver version i'm running? all i see on my MW@H web page and in my individual tasks is a v1.4.1720, which is Greek to me.


Click on a username and then, under Computers, click View (unless the user has hidden them). Under yours we can see you have 2 PCs here, one with:
[2] AMD AMD Radeon HD 6900 series (Cayman) (2048MB) driver: 1.4.1720
and the other with:
[2] AMD ATI Radeon HD 5x00 series (Redwood) (1024MB) driver: 1.4.1720

The 1.4.1720 is the driver/Catalyst version. My AMD machine, for instance, says:
AMD AMD Radeon HD 6800 series (Barts) (1024MB) driver: 1.4.1741.
This translates to a driver version of 12.8. That PC has no tasks as it is NOT a double precision GPU.
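
As a rough sketch of that lookup, the snippet below pulls the `driver:` string out of a coprocessor line like the ones quoted above and maps it to a Catalyst release. The function names and the table are illustrative assumptions, not anything the project provides, and the table holds only the two pairings reported by posters in this thread (1.4.1720 with Catalyst 12.4, 1.4.1741 with Catalyst 12.8); any other string comes back as "unknown".

```python
import re

# Only the pairings posters reported in this thread; not an official table.
CATALYST_BY_DRIVER = {
    "1.4.1720": "12.4",
    "1.4.1741": "12.8",
}


def parse_driver(line):
    """Return the dotted version after 'driver:', or None if there is none."""
    match = re.search(r"driver:\s*(\d+(?:\.\d+)*)", line)
    return match.group(1) if match else None


def catalyst_release(line):
    """Map a coprocessor line to a Catalyst release, or 'unknown'."""
    return CATALYST_BY_DRIVER.get(parse_driver(line), "unknown")


if __name__ == "__main__":
    line = "[2] AMD AMD Radeon HD 6900 series (Cayman) (2048MB) driver: 1.4.1720"
    print(parse_driver(line), "->", catalyst_release(line))
```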

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Badges: 300 million credit, 8 year member, extraordinary contributions
Message 55874 - Posted: 20 Oct 2012, 12:31:20 UTC

right, i understand that much...i guess i should have more specifically asked "how do you know which v1.4.xxxx corresponds to which v11.x or v12.x"?

mikey
Joined: 8 May 09
Posts: 2240
Credit: 290,767,365
RAC: 1,152,428
Badges: 200 million credit, 10 year member, extraordinary contributions
Message 55875 - Posted: 20 Oct 2012, 12:33:34 UTC - in response to Message 55874.  

right, i understand that much...i guess i should have more specifically asked "how do you know which v1.4.xxxx corresponds to which v11.x or v12.x"?


I am sure there is a better way but I went to that machine and opened the catalyst and checked.

Miklos M
Joined: 29 Dec 11
Posts: 22
Credit: 6,696,932
RAC: 68,563
Badges: 5 million credit, 7 year member
Message 55878 - Posted: 20 Oct 2012, 14:07:27 UTC - in response to Message 55862.  
Last modified: 20 Oct 2012, 14:10:00 UTC

How do I find out if my GPU driver is outdated? It is an almost new computer with a 580 GPU. Milkyway is the only project being affected.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Badges: 300 million credit, 8 year member, extraordinary contributions
Message 55879 - Posted: 20 Oct 2012, 14:45:48 UTC - in response to Message 55878.  

How do I find out if my GPU driver is outdated? It is an almost new computer with a 580 GPU. Milkyway is the only project being affected.

go to nVidia's official website and go to the drivers link. you'll see that the v266.58 you're running is quite old. i was running v301.42 not too long ago on my dual GTX 560 Ti machine, and that was working just fine for Einstein@Home (can't comment on Milkyway@Home)...but i updated to 306.23 when they released it, and i noticed a slight improvement in efficiency. the newest official release is now v306.97, but i have yet to update my drivers again...
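
To make the "is my driver outdated?" check concrete, here is a minimal sketch that compares two version strings numerically. The helper names are made up for illustration, and it assumes versions are plain dotted numbers with an optional leading "v", like the ones mentioned above; the latest release still has to be looked up by hand on the vendor's site.

```python
def parse_version(text):
    """Turn 'v306.97' or '306.97' into a tuple of ints, e.g. (306, 97)."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def is_outdated(installed, latest):
    """True when the installed version is strictly older than the latest."""
    return parse_version(installed) < parse_version(latest)


if __name__ == "__main__":
    # Versions taken from the post above.
    print(is_outdated("v266.58", "v306.97"))
    print(is_outdated("v306.23", "v306.97"))
```

Comparing tuples of integers rather than raw strings or floats keeps the comparison correct when components have different numbers of digits.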

©2019 Astroinformatics Group