Welcome to MilkyWay@home

Compute Errors


Neil Polson
Joined: 31 Dec 08
Posts: 9
Credit: 1,338,590
RAC: 0
Message 24241 - Posted: 5 Jun 2009, 10:11:57 UTC

Just had this 3s invalidate on my P4 CPU. So it's not just a GPU problem, it seems.
ID: 24241 · Rating: 0
Simplex0
Joined: 11 Nov 07
Posts: 232
Credit: 178,229,009
RAC: 0
Message 24242 - Posted: 5 Jun 2009, 10:21:12 UTC

And same here. I was thinking that it was related to my installation of Folding@home so I removed it, too bad. This project used to be great but now it's just crap with an alpha status that seems to go on forever.
ID: 24242 · Rating: 0
[TiDC] Anlupa
Joined: 17 Nov 08
Posts: 2
Credit: 33,115,365
RAC: 0
Message 24243 - Posted: 5 Jun 2009, 10:32:36 UTC

Hi everyone!
I've got the same problem with my ATI 4850,
with any version of the application.
ID: 24243 · Rating: 0
Phil
Joined: 13 Feb 08
Posts: 1124
Credit: 46,740
RAC: 0
Message 24245 - Posted: 5 Jun 2009, 10:38:49 UTC - in response to Message 24242.  

And same here. I was thinking that it was related to my installation of Folding@home so I removed it, too bad. This project used to be great but now it's just crap with an alpha status that seems to go on forever.

The work runs fine with the stock application.
When people choose the anonymous platform, unforeseen things happen. Be patient, it'll get fixed.
ID: 24245 · Rating: 0
verstapp
Joined: 26 Jan 09
Posts: 589
Credit: 497,834,261
RAC: 0
Message 24247 - Posted: 5 Jun 2009, 10:47:55 UTC

But why wasn't it fixed yesterday! :)
Cheers,

PeterV

.
ID: 24247 · Rating: 0
KWSN imcrazynow
Joined: 22 Nov 08
Posts: 136
Credit: 319,414,799
RAC: 0
Message 24250 - Posted: 5 Jun 2009, 11:56:30 UTC

I've aborted 40-50 of the 3s units this morning to go along with the 100 or so of them last night. All GPU crunching grinds to a halt when these hang up. Please stop sending these things out until there is a fix. What happened to testing the work before sending it out to everybody?

4870 GPU
4870 GPU
ID: 24250 · Rating: 0
Berserk_Tux
Joined: 2 Jan 08
Posts: 79
Credit: 365,471,675
RAC: 0
Message 24251 - Posted: 5 Jun 2009, 12:09:31 UTC - in response to Message 24250.  

Oh my god, is anybody home here? Stop sending out 3s WUs now!!!!!

ID: 24251 · Rating: 0
John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0
Message 24252 - Posted: 5 Jun 2009, 12:10:48 UTC

Like many others posting in this thread, I have a stalled ATI HD3850 due, I surmise, to the problem ps_sgr_208_3s_etc WUs.

Interestingly, the HD3850 seems to be getting lots of work but cannot crunch it, as the _3s_ WUs cause the system to stall for some reason or other. When the GPU does crunch one, it ends up terminating with a compute error.

I am going to detach and reattach to allow Milkyway to do CPU crunching, as I noticed there is no problem with these _3s_ WUs on CPUs. I can continue this way, at 10% of the GPU RAC, until these _3s_ WUs stop.

I hope Travis or Dave can see a way to sort this problem out in this project soon. It has been at least 2 days now living (or not) with this problem.

:((
Go away, I was asleep


ID: 24252 · Rating: 0
banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 24255 - Posted: 5 Jun 2009, 12:48:14 UTC

I didn't have any problems with the six 3s_2 I got. Is it only the 3s_1 giving problems? There could be some still lingering that need to be canceled.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
ID: 24255 · Rating: 0
verstapp
Joined: 26 Jan 09
Posts: 589
Credit: 497,834,261
RAC: 0
Message 24257 - Posted: 5 Jun 2009, 13:24:01 UTC

The current workaround is to abort all the _3s_ WUs, close and restart BOINC, and hope you get some _1s_ or _2s_ WUs next time. It's horribly manual and requires keeping an eye on BOINC, which explains all the requests/demands in this thread for Travis to fix it.
Cheers,

PeterV

.
ID: 24257 · Rating: 0
Temujin
Joined: 12 Oct 07
Posts: 77
Credit: 404,471,187
RAC: 0
Message 24258 - Posted: 5 Jun 2009, 13:36:57 UTC - in response to Message 24257.  

The current workaround is to abort all the _3s_ WUs, close and restart BOINC, and hope you get some _1s_ or _2s_ WUs next time. It's horribly manual and requires keeping an eye on BOINC, which explains all the requests/demands in this thread for Travis to fix it.

Come on guys, deep breaths :)
The problem isn't with the workunits, they're fine, so Travis can't fix it.
It's a bug in Cluster Physik's GPU application, as he has already pointed out.
Give him a chance and he'll sort it.
ID: 24258 · Rating: 0
[KWSN]John Galt 007
Joined: 12 Dec 08
Posts: 56
Credit: 269,889,439
RAC: 0
Message 24259 - Posted: 5 Jun 2009, 13:44:25 UTC - in response to Message 24258.  

The current workaround is to abort all the _3s_ WUs, close and restart BOINC, and hope you get some _1s_ or _2s_ WUs next time. It's horribly manual and requires keeping an eye on BOINC, which explains all the requests/demands in this thread for Travis to fix it.

Come on guys, deep breaths :)
The problem isn't with the workunits, they're fine, so Travis can't fix it.
It's a bug in Cluster Physik's GPU application, as he has already pointed out.
Give him a chance and he'll sort it.


The real problem is that if a WU gets sent out to 2 GPU clients, and both abort it, the WU dies from too many errors, so the project suffers.

Just my $0.02.....
ID: 24259 · Rating: 0
banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 24261 - Posted: 5 Jun 2009, 14:04:58 UTC - in response to Message 24258.  


It's a bug in Cluster Physik's GPU application, as he has already pointed out.

Didn't catch that when I read through. I only have a CPU, so no problems.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
ID: 24261 · Rating: 0
localizer
Joined: 28 Jan 08
Posts: 40
Credit: 379,931,801
RAC: 0
Message 24262 - Posted: 5 Jun 2009, 14:05:25 UTC
Last modified: 5 Jun 2009, 14:05:59 UTC

Hi John - agreed. However, GPU users cannot choose their WUs - and once downloaded these WUs can't be run - so aborting them or leaving them stalled are the only options. Granted, I'm sure that CP will get it sorted, but the project may be best off suspending the generation of this type of WU until it is addressed ... it can't be good for the returned data to have so many aborted WUs or project resets.
ID: 24262 · Rating: 0
John Vickers
Volunteer moderator
Project developer
Project scientist
Joined: 11 May 09
Posts: 30
Credit: 81,093
RAC: 0
Message 24263 - Posted: 5 Jun 2009, 14:10:55 UTC
Last modified: 5 Jun 2009, 14:26:44 UTC

Hello MW@Home,

These *_3s_* runs are 3 stream runs that I started. I will tell Travis that there is a problem with them on the GPUs but not the CPUs and abort said run asap.

Sorry for the inconvenience,
John Vickers
ID: 24263 · Rating: 0
[KWSN]John Galt 007
Joined: 12 Dec 08
Posts: 56
Credit: 269,889,439
RAC: 0
Message 24266 - Posted: 5 Jun 2009, 14:23:44 UTC - in response to Message 24263.  

Hello MW@Home,

These *_3s_* runs are 3 stream runs that I started. I will tell Travis that there is a problem with them on the GPUs but not the CPUs and abort said run asap.

Sorry for the inconvenience,
John Vickers


No probs...

Once CP gets the ATI app sorted out, we will burn thru these like nothing...

And thanks for posting...
ID: 24266 · Rating: 0
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 24268 - Posted: 5 Jun 2009, 14:25:28 UTC - in response to Message 24266.  

Looks like the searches are stopped, we'll not do 3 stream runs until the ATI code is fixed :)
ID: 24268 · Rating: 0
Kevint
Joined: 22 Nov 07
Posts: 285
Credit: 1,076,786,368
RAC: 0
Message 24269 - Posted: 5 Jun 2009, 14:26:44 UTC - in response to Message 24268.  

Looks like the searches are stopped, we'll not do 3 stream runs until the ATI code is fixed :)



Ahhhh,,

Just as I was really starting to enjoy them :)...



.
ID: 24269 · Rating: 0
Blue Northern Software
Joined: 30 Jan 09
Posts: 56
Credit: 85,464
RAC: 0
Message 24270 - Posted: 5 Jun 2009, 14:46:21 UTC


  "booger"                     



_________________
*** BOFH excuse #309:
firewall needs cooling
ID: 24270 · Rating: 0
Lord Tedric
Joined: 9 Nov 07
Posts: 151
Credit: 8,391,608
RAC: 0
Message 24276 - Posted: 5 Jun 2009, 16:29:18 UTC

No 3s to be had, everything seems ok at present.
ID: 24276 · Rating: 0
©2024 Astroinformatics Group