Welcome to MilkyWay@home

Marked as Invalid? (Part 2)

Message boards : Number crunching : Marked as Invalid? (Part 2)

Simplex0
Joined: 11 Nov 07
Posts: 232
Credit: 178,229,009
RAC: 0
Message 38152 - Posted: 5 Apr 2010, 22:43:05 UTC - in response to Message 38143.  

What does this mean:
"errors: Too many success results"

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=86084176

Does that make sense at all?

Edit:

Here's some more "Too many success results"

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=88320323
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=84473249

What the heck does "Too many success results" mean?


Sounds like a mathematician saying 'Too good to be true' :)

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 38156 - Posted: 5 Apr 2010, 23:41:07 UTC

For the time being my 5830 is going over to Collatz as I cannot seem to get above a 66% valid rate.

I can do more over there at the moment.

BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38159 - Posted: 6 Apr 2010, 0:32:21 UTC - in response to Message 38156.  

Right, I already posted an alert over at Collatz so that Slicker is aware of the incoming rush of MW folks looking for a productive home pending Travis issuing the appropriate 'undo' command to the validator 'fix' which has been implemented here.

For the time being my 5830 is going over to Collatz as I cannot seem to get above a 66% valid rate.

I can do more over there at the moment.


Dan T. Morris
Joined: 17 Mar 08
Posts: 165
Credit: 410,228,216
RAC: 0
Message 38164 - Posted: 6 Apr 2010, 1:29:34 UTC

OOPS, I just saw 23 teraflops move off of this project. Bummer.


PeteS
Joined: 19 Mar 09
Posts: 27
Credit: 117,670,452
RAC: 0
Message 38181 - Posted: 6 Apr 2010, 11:08:07 UTC

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=90334482

Some get credit and some don't for the same WU????

PeteS
Joined: 19 Mar 09
Posts: 27
Credit: 117,670,452
RAC: 0
Message 38182 - Posted: 6 Apr 2010, 11:11:02 UTC - in response to Message 38159.  

Right, I already posted an alert over at Collatz so that Slicker is aware of the incoming rush of MW folks looking for a productive home pending Travis issuing the appropriate 'undo' command to the validator 'fix' which has been implemented here.

For the time being my 5830 is going over to Collatz as I cannot seem to get above a 66% valid rate.

I can do more over there at the moment.



Don't forget DNETC@Home which also works on ATI and nVidia.

STE\/E
Joined: 29 Aug 07
Posts: 486
Credit: 576,520,765
RAC: 35,563
Message 38183 - Posted: 6 Apr 2010, 11:24:35 UTC - in response to Message 38182.  

Right, I already posted an alert over at Collatz so that Slicker is aware of the incoming rush of MW folks looking for a productive home pending Travis issuing the appropriate 'undo' command to the validator 'fix' which has been implemented here.

For the time being my 5830 is going over to Collatz as I cannot seem to get above a 66% valid rate.

I can do more over there at the moment.



Don't forget DNETC@Home which also works on ATI and nVidia.


I already posted it over at DNETC yesterday. The project has increased the WU lengths from about 4 minutes for a 58xx card to about 15+ minutes now, to ease the strain on the server.

STE\/E

Emanuel
Joined: 18 Nov 07
Posts: 280
Credit: 2,442,757
RAC: 0
Message 38184 - Posted: 6 Apr 2010, 11:34:47 UTC - in response to Message 38181.  
Last modified: 6 Apr 2010, 11:36:13 UTC

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=90334482

Some get credit and some don't for the same WU????

Why would they get credit? Their results were marked as invalid. Mind you, in this case it looks like the wrong hosts got the credits, as it's the HD5800 cards that are giving the wrong results (this will hopefully be fixed soon, perhaps even today). There is no way for the server to tell which results are the valid ones except by majority vote, and unfortunately the HD5800 cards were in the majority on this one.
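
To make the majority-vote point concrete, here is a minimal Python sketch. It is not the MilkyWay@home validator: the helper name `majority_vote`, the tolerance, the host labels and the numbers are all invented for illustration. It only shows how a correct result can lose when the hosts that share an error outnumber it.

```python
def majority_vote(results, tolerance=1e-10):
    """Group results that agree within `tolerance`; return (winners, losers).

    `results` maps a host label to the value it returned.
    Hypothetical helper, not taken from the BOINC or MilkyWay@home source.
    """
    groups = []  # each group holds (host, value) pairs that agree with its first member
    for host, value in results.items():
        for group in groups:
            if abs(group[0][1] - value) <= tolerance:
                group.append((host, value))
                break
        else:
            # No existing group agrees with this value, so start a new one.
            groups.append([(host, value)])
    largest = max(groups, key=len)
    winners = [host for host, _ in largest]
    losers = [host for host in results if host not in winners]
    return winners, losers


# Invented values: two HD5800 hosts agree with each other but not with the CPU host.
results = {"hd5800_a": -3.0012, "hd5800_b": -3.0012, "cpu": -3.0000}
winners, losers = majority_vote(results)
print(winners)  # ['hd5800_a', 'hd5800_b']
print(losers)   # ['cpu']
```

The vote only measures agreement, not correctness, so it cannot tell that the two agreeing hosts are the ones with the faulty answer.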

banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 38186 - Posted: 6 Apr 2010, 11:40:02 UTC

I'm thinking it's validating whatever result comes first and then rejecting the last one.

In this example I finished and then the others were sent out: a 4800 came in second, then a 5800, then another 4800. The 5800 was rejected.
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=90067556
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 38188 - Posted: 6 Apr 2010, 12:37:10 UTC - in response to Message 38183.  

I already posted it over at DNETC yesterday. The project has increased the WU lengths from about 4 minutes for a 58xx card to about 15+ minutes now, to ease the strain on the server.

You must be using liquid nitrogen if you can complete a current DNETC 48 packet ATI 5K task on a single 58xx card in 15 minutes. :) My 5870 @ 950MHz takes 26+ minutes to complete all 48 packets.

No offence meant, perhaps you were referring to the new longer ATI 5K task times when processed by 2 58xx GPU cores. Considering that there is no progress bar, just letting people know that these new longer DNETC tasks will take much longer than 15 minutes to complete on a single 58xx card.

Gary Roberts
Joined: 1 Mar 09
Posts: 56
Credit: 1,984,937,499
RAC: 0
Message 38192 - Posted: 6 Apr 2010, 13:22:16 UTC - in response to Message 38186.  
Last modified: 6 Apr 2010, 13:23:46 UTC

I'm thinking it's validating whatever result comes first and then rejecting the last one.

With the greatest of respect, your statement is total codswallop! :-).

In this example I finished and then the others were sent out: a 4800 came in second, then a 5800, then another 4800. The 5800 was rejected.
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=90067556

I've made your url clickable. The real reason that one task in the quorum was denied credit is simply that the answer given by that task did not agree with the answers given by the three tasks that did validate.

Take a good look at the WU you pointed to. At the top you will see that a minimum quorum of 3 is required. That simply means that there must be 3 tasks giving answers within a specified very small error range before any credit can be awarded. The answers for tasks crunched on 5800 series cards do not agree closely enough with those of tasks crunched on CPU/3800/4700/4800/CUDA, etc.

So a 4th task had to be sent out and when it returned a quorum of 3 agreeing results could be formed and the 5800 task could be declared invalid and denied credit. Unfortunately, it can happen the other way and enough 'bad' results from 5800 series GPUs can come in first and cause 'good' results from CPU/3800/4700/4800 hosts to be declared invalid. Fortunately it appears that the reason for the problem with 5800 cards may have been identified and a new app fixing the problem may be available shortly. The details are in the active threads in the news forum.
Cheers,
Gary.
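
The quorum behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's validator code: the function name `check_quorum`, the tolerance, the host labels and the result values are hypothetical. Results are added in arrival order, and nothing is decided until three of them agree.

```python
def check_quorum(returned, min_quorum=3, tolerance=1e-10):
    """Return (valid_hosts, invalid_hosts) once `min_quorum` results agree, else None.

    `returned` is a list of (host, value) pairs in arrival order.
    Hypothetical helper for illustration only.
    """
    for _, candidate in returned:
        agreeing = [host for host, value in returned
                    if abs(value - candidate) <= tolerance]
        if len(agreeing) >= min_quorum:
            invalid = [host for host, _ in returned if host not in agreeing]
            return agreeing, invalid
    return None  # no quorum yet, so another task would have to be sent out


# Invented arrival order and values, loosely mirroring the WU discussed above.
arrivals = [("host_1", 2.5), ("hd4800_a", 2.5), ("hd5800", 2.5003), ("hd4800_b", 2.5)]

returned = []
log = []        # (results received so far, quorum reached?)
outcome = None
for arrival in arrivals:
    returned.append(arrival)
    outcome = check_quorum(returned)
    log.append((len(returned), outcome is not None))
    if outcome is not None:
        break

print(log)      # [(1, False), (2, False), (3, False), (4, True)]
print(outcome)  # (['host_1', 'hd4800_a', 'hd4800_b'], ['hd5800'])
```

With three results in hand there are only two that agree, so a fourth task is needed. If three mutually agreeing 'bad' results had arrived first, the same logic would have formed the quorum around them instead.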

BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38196 - Posted: 6 Apr 2010, 15:29:28 UTC

Well, one way to resolve the validator problem -- turn the feeder off....

4/6/2010 8:26:47 AM Milkyway@home Message from server: Server error: feeder not running

Roger Vanderseypen [NTT_BE]
Joined: 6 Jun 09
Posts: 5
Credit: 27,626,738
RAC: 0
Message 38204 - Posted: 6 Apr 2010, 19:23:51 UTC

I do not believe that losing your temper will do any good here. Think about the people who are trying to optimize the code so that those interested only in crunching for numbers, rather than for science, are able to crunch faster.

As far as I have noticed, some are trying to use the psychology of menace. It will not help, unless you want to have your profile excluded for being inappropriate and disrespectful.

I do not mind putting my 48 or my team's 58 to work for MilkyWay, as they are the only ones able to seriously optimize the code of their project (and I take my hat off to them for that), purely for the benefit of science.

If it were only about crunching for credits without caring about the science, you might as well go to Collatz; they give you credits for a calculation that, as far as I know, does not serve humanity as a whole.

I do agree that a warning button or special code should indicate the project and its status (a kind of feedback from the IT masters on the case), so that people do not feel left on their own.

An option on the site to agree to run calculations on beta versions of the code should help too; if you knew what you were getting into, there would be no reason to argue and be rude.

With this, in respect of the religion, culture and nationality of every human and living creature, I remain, with respect for those who make it possible to have the improved code running smoothly.

PS:
- Kind of stupid, but it might be helpful to increase the duration of the work units, with checkpoints every 3 minutes for example. It would relieve your server a lot. Say, 1 work unit of 1 hour would then give roughly 4000 credits or more.
- In our company, and even among friends at home, people have been experiencing weird trouble with a variety of programs since this week, starting with trying to print (mostly driver problems, but it could very well be related to an updated Microsoft patch that isn't behaving well with all programs and therefore affects print and graphics drivers). Might be interesting to know.

PS2: No, I have no IT degree, just a passion for IT and science.

Regards,

Roger
Founder of Performance with Purpose

BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38205 - Posted: 6 Apr 2010, 19:32:59 UTC - in response to Message 38202.  
Last modified: 6 Apr 2010, 19:36:16 UTC

My apologies for my part in your being annoyed. Frankly, it wasn't clear that anyone was reading over here; I see that is not the case.

In the absence of project-generated information (which indeed is one thing that tends to annoy me, if not seemingly many others), perhaps I could make a supposition: the reason the feeder is offline, along with the lack of information today, is that the project folks are busy changing all work units already reported as failed validation over to good/succeeded validation.

That task would certainly take a fair amount of time and effort and explain the need for the project to be temporarily offline. Like I said, just speculation in the absence of other information.

It is also possible that the project folks are very busy releasing the new applications mentioned on the home page -- hopefully those applications have been tested reasonably well and would help in their part to resolve the validator change issues that surfaced suddenly earlier this week.

I do think that the (seemingly) untested and apparently sudden change in the behavior of the validator has kicked up some small amount of angst - I fear that I've done little to assuage that in my postings.

Stop quoting yourself, it's annoying.

BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38207 - Posted: 6 Apr 2010, 19:40:29 UTC - in response to Message 38204.  

Roger, could you clarify what you mean by the psychology of menace? That sounds rather ominous.

I know it has not been my intent to 'menace' anyone here, and if someone has gotten that read from my periodic venting, I do sincerely apologise.




As far as I have noticed, some are trying to use the psychology of menace. It will not help, unless you want to have your profile excluded for being inappropriate and disrespectful.


Roger
Founder of Performance with Purpose


Roger Vanderseypen [NTT_BE]
Joined: 6 Jun 09
Posts: 5
Credit: 27,626,738
RAC: 0
Message 38210 - Posted: 6 Apr 2010, 20:16:35 UTC - in response to Message 38207.  

Some are trying to use the menace of going over to other projects. :):)

Hey, everybody does what he wants; I just do not find it appropriate to put this on a message board. Don't just vent your frustrations because it isn't running perfectly. Nobody is forcing anybody; it is volunteer work.

Some have forgotten this, I believe.

I am just trying to show that if you stay moderate and give some hints or clues, it might help everybody one way or the other.

Especially for MilkyWay, some of my team members have returned their HD5770 card to the distributor and ordered a new 5870 directly from the US, coming over with relatives in a couple of weeks.
All of this because double precision calculation is only handled by the 48 or 58 cards, and because there is more of a global humanity interest here than in some other projects (without putting down other projects, which might have a good cause in science but maybe not for humanity).

I just hope everybody keeps his integrity and respect for others.

One small Belgian guy who is willing to donate and to motivate a lot of people to join projects like MilkyWay, World Community Grid and a lot of other projects.

:): hope I won't get shot for that
:):)

Please bear in mind that written statements might be interpreted differently; you might not get the response or actions you intended to get. :):)

I can speak from experience: I lost some friendship because of 1 SMS message that was misinterpreted while the other person was under too much pressure and in a bad mood. :):)

So no offense to anyone.

My regards,
Roger

Founder of Performance with Purpose.

BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38213 - Posted: 6 Apr 2010, 20:39:54 UTC - in response to Message 38210.  

Ah -- OK -- I wouldn't characterise that as 'menace'. One of the driving concepts behind the BOINC approach is multiple project support.

I've joined something like a dozen projects over the years.

I've participated in the MW project for 18 months or more. Until recently, I didn't have access to the relatively high-end (i.e. double precision) GPU cards that MW required. I did (and do) have a flock of single precision GPU cards (both CUDA and ATI). That means most of my GPU work is done for other projects (Collatz, and to a lesser degree SETI and GPUGrid). Recently I've added some double precision ATI cards -- the bulk of that processing has also been done for Collatz. About 1/4 of the GPU processing for those cards had been directed over here.

With the current validation problems, I've temporarily stopped that, since processing for MW with the current validation issues includes no small amount of wasted effort. My decision to do this is not, as you might suggest, 'menace'. Rather, it is for me a pragmatic allocation of processing resources.

Similarly, historically, my activity here has been with CPUs. I have a large batch of workstations which, lacking double precision GPUs, are doing CPU processing over here. Since the validation issue also applies to CPU-processed work units, I'm currently draining my work there and letting other projects (which I've participated in for as long as 5 years) pick up some of that CPU 'slack'.

Again, the shift to other projects *which are also doing good science* is not a case of 'menace', but rather is a pragmatic allocation of processing resources at this time.

I expect that at some point within the next week or two (perhaps longer, perhaps shorter), MW will resolve the issues causing the validation problems and I will reallocate processing over here.




some are trying to use the menace of going over to other projects, :):)

my regards,
roger

founder of Performance with Purpose.


BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,524,931
RAC: 15
Message 38216 - Posted: 6 Apr 2010, 20:51:45 UTC - in response to Message 38210.  

Roger, one thing to consider here: at the moment, there are only two projects with support for ATI GPUs -- Collatz and MilkyWay. Frankly, *both* are science knowledge projects with perhaps *limited* global humanity interest (to my way of looking at things).

If you are seeking projects which use GPUs and have perhaps more direct potential for 'humanity', you might look to GPUGrid -- it doesn't use ATI cards, but nor does it require double precision GPU cards. Just a thought.

Frankly, it is my hope that other projects which can use both single precision and double precision GPUs (and both CUDA and ATI cards at that) present themselves. More choices certainly cannot be a bad thing.

Lastly, I'd note that historically, MilkyWay has been a 'high credit' project -- both for CPU and supported GPU's. As a result, at least some of the motivation for many of the users here has to do with credit 'payouts'. It is a pretty natural result.




Especially for MilkyWay, some of my team members have returned their HD5770 card to the distributor and ordered a new 5870 directly from the US, coming over with relatives in a couple of weeks.
All of this because double precision calculation is only handled by the 48 or 58 cards, and because there is more of a global humanity interest here than in some other projects (without putting down other projects, which might have a good cause in science but maybe not for humanity).

I just hope everybody keeps his integrity and respect for others.

My regards,
Roger

Founder of Performance with Purpose.


Simplex0
Joined: 11 Nov 07
Posts: 232
Credit: 178,229,009
RAC: 0
Message 38217 - Posted: 6 Apr 2010, 21:09:18 UTC - in response to Message 38216.  

Roger, one thing to consider here, at the moment, there are only two projects with support for ATI GPU's -- Collatz and MilkyWay


Under BOINC, yes, but Folding@home can also be crunched using AMD/ATI cards, and their apps are made by professional programmers.

Emanuel
Joined: 18 Nov 07
Posts: 280
Credit: 2,442,757
RAC: 0
Message 38218 - Posted: 6 Apr 2010, 21:12:10 UTC
Last modified: 6 Apr 2010, 21:13:16 UTC

It's just that, well, this is a forum. Forums tend to have turnaround times of -at least- 6 hours, and that's if you're lucky. There really isn't any point in posting again and again because you're frustrated - it only serves to piss off the people who might actually be able to answer you.

At the moment, I think problems are to be expected. Work on the validator is ongoing, the AMD/ATI CAL applications need to be updated to get accurate results from the HD5800 series (which, yes, means that all results they've returned up until now have been inaccurate and invalid to some degree), -and- work on new science models is being finalized. The first is mostly up to Travis, the second is being worked on by Cluster Physik and will have to be checked by Travis, and the third is being worked on by the other project scientists and will also have to be passed on to Travis. I wouldn't be surprised if the guy is currently a bit too overworked to communicate much, but he has been giving some status updates in the News section of the forum; I hope you've been keeping up with the threads there.