Dissecting the new validator.


The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 38081 - Posted: 5 Apr 2010, 11:22:46 UTC

David Glogau*
Joined: 12 Aug 09
Posts: 172
Credit: 645,240,165
RAC: 0
Message 38084 - Posted: 5 Apr 2010, 11:38:05 UTC


Chris S
Joined: 20 Sep 08
Posts: 1387
Credit: 186,726,858
RAC: 0
Message 38085 - Posted: 5 Apr 2010, 12:41:41 UTC

All of my rigs have suddenly got these error messages ..... Anyone know why?


<core_client_version>6.10.18</core_client_version>
<![CDATA[
<stderr_txt>
Running Milkyway@home ATI GPU application version 0.20b (Win32, SSE2, CAL 1.3) by Gipsel
ignoring unknown input argument in app_info.xml: -np
ignoring unknown input argument in app_info.xml: 20
ignoring unknown input argument in app_info.xml: -p
ignoring unknown input argument in app_info.xml: 0.8204714877234080000000000
ignoring unknown input argument in app_info.xml: 6.2644417249787670000000000
ignoring unknown input argument in app_info.xml: -1.1275059800827940000000000
ignoring unknown input argument in app_info.xml: 171.3489450424705200000000000
ignoring unknown input argument in app_info.xml: 25.5968204295114100000000000
ignoring unknown input argument in app_info.xml: 0.4638059718261400000000000
ignoring unknown input argument in app_info.xml: 6.2831853071795860000000000
ignoring unknown input argument in app_info.xml: 6.7755620233591070000000000
ignoring unknown input argument in app_info.xml: -7.1501757118781240000000000
ignoring unknown input argument in app_info.xml: 179.3048252880004400000000000
ignoring unknown input argument in app_info.xml: 38.0862970666705200000000000
ignoring unknown input argument in app_info.xml: 2.5868948022107680000000000
ignoring unknown input argument in app_info.xml: 4.7261433922311420000000000
ignoring unknown input argument in app_info.xml: 7.0656503094651130000000000
ignoring unknown input argument in app_info.xml: -13.5765285909904500000000000
ignoring unknown input argument in app_info.xml: 211.2459262577329200000000000
ignoring unknown input argument in app_info.xml: 15.0673173424509500000000000
ignoring unknown input argument in app_info.xml: 0.0000000000000000000000000
ignoring unknown input argument in app_info.xml: 6.2592519338112480000000000
ignoring unknown input argument in app_info.xml: 12.94646831325726500000

Don't drink water, that's the stuff that rusts pipes

UBT - Ben
Joined: 8 Mar 08
Posts: 17
Credit: 4,411,459
RAC: 0
Message 38086 - Posted: 5 Apr 2010, 12:43:45 UTC

I'm having that problem also.. :S

Emanuel
Joined: 18 Nov 07
Posts: 280
Credit: 2,442,757
RAC: 0
Message 38088 - Posted: 5 Apr 2010, 13:53:01 UTC - in response to Message 38081.  
Last modified: 5 Apr 2010, 13:57:19 UTC

Werkstatt
Joined: 19 Feb 08
Posts: 350
Credit: 134,618,314
RAC: 13,447
Message 38094 - Posted: 5 Apr 2010, 14:48:11 UTC - in response to Message 38088.  

Many of my WUs have that problem too.
I'm running MW 0.20b and 0.22 from the optimized apps site. Is something wrong with these apps? Which apps work perfectly?

CTAPbIi
Joined: 4 Jan 10
Posts: 86
Credit: 51,753,924
RAC: 0
Message 38097 - Posted: 5 Apr 2010, 15:21:41 UTC - in response to Message 38094.  

Is something wrong with these apps?

I think something is wrong with the validator.

Emanuel
Joined: 18 Nov 07
Posts: 280
Credit: 2,442,757
RAC: 0
Message 38102 - Posted: 5 Apr 2010, 16:24:36 UTC
Last modified: 5 Apr 2010, 16:25:15 UTC

The validator is still very much in flux. Things should get better in the next few days as more of the issues are tracked down and as new application versions are readied for release. Remember that Travis has to sleep too - but at least some of the issues reported last night have now been fixed.

Additionally, there appears to be a problem with the ATI GPU apps which is holding things up - at this point it's not clear whether the issue can be worked around on the application level or if it is due to driver/SDK bugs that ATI will have to fix; let's hope for the former.

If you're worried about losing crunching time, you may want to set Milkyway to No New Tasks until things stabilize. If you want to help out with testing, just keep crunching as normal and report any oddities you see (if they haven't been reported already).

uwe
Joined: 6 Nov 09
Posts: 2
Credit: 1,500,164
RAC: 0
Message 38110 - Posted: 5 Apr 2010, 17:31:10 UTC - in response to Message 38085.  

All of my rigs have suddenly got these error messages ..... Anyone know why?


<core_client_version>6.10.18</core_client_version>
<![CDATA[
<stderr_txt>
Running Milkyway@home ATI GPU application version 0.20b (Win32, SSE2, CAL 1.3) by Gipsel
ignoring unknown input argument in app_info.xml: -np
...
ignoring unknown input argument in app_info.xml: 12.94646831325726500000


In the news thread "testing new validator", Travis answered this question:
That's just getting ready for the new version of the application. The new application will take the parameters it's using from the command line (that way we don't have to generate a new parameter file for each workunit).
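
For anyone curious what "taking the parameters from the command line" could look like, here is a minimal sketch. The log quoted above shows `-np`, then `20`, then `-p`, then 20 numbers, so the sketch assumes `-np` gives the number of parameters and `-p` introduces that many values. That reading is an assumption based only on the log, and `parse_search_parameters` is a hypothetical helper, not code from the MilkyWay application.

```python
def parse_search_parameters(argv):
    """Parse a hypothetical '-np N -p v1 ... vN' argument list into floats."""
    count = None
    values = []
    i = 0
    while i < len(argv):
        if argv[i] == "-np":
            count = int(argv[i + 1])
            i += 2
        elif argv[i] == "-p":
            if count is None:
                raise ValueError("-np must come before -p")
            # Slice by count so negative values are not mistaken for flags.
            raw = argv[i + 1 : i + 1 + count]
            if len(raw) != count:
                raise ValueError(f"expected {count} values after -p, got {len(raw)}")
            values = [float(v) for v in raw]
            i += 1 + count
        else:
            i += 1  # skip arguments this sketch does not know about
    return values


# The first three values from the log above, shortened for readability.
args = ["-np", "3", "-p", "0.820471487723408", "6.264441724978767", "-1.127505980082794"]
params = parse_search_parameters(args)
print(len(params), params)
```

An older application that does not know these flags would simply skip them, which fits the "ignoring unknown input argument" lines in the log.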


Werkstatt
Joined: 19 Feb 08
Posts: 350
Credit: 134,618,314
RAC: 13,447
Message 38112 - Posted: 5 Apr 2010, 18:09:10 UTC - in response to Message 38102.  

Additionally, there appears to be a problem with the ATI GPU apps which is holding things up - at this point it's not clear whether the issue can be worked around on the application level or if it is due to driver/SDK bugs that ATI will have to fix; let's hope for the former.

If you're worried about losing crunching time, you may want to set Milkyway to No New Tasks until things stabilize.


Travis wrote somewhere that it looks like 48xx and 58xx cards produce different results. I use both, and WUs from both cards produce results that are not granted credit. Both cards are in the same machine, so CAL is the same. If it is a driver problem, it can be solved by an update. If it is within CAL, we might have a problem for the next few weeks or so.
I've seen some posts about apps running in single precision. So my question is: have the 0.20b and 0.22 apps been tested and found correct?
It is nice to have credit, but it is much nicer to have a deeper understanding of our Milky Way. The main goal is to have a working computer grid, and sometimes this means trial and error. No, I will not stop running WUs.

Regards,
Alexander

SkyeHunter
Joined: 6 Mar 09
Posts: 41
Credit: 38,856,291
RAC: 0
Message 38115 - Posted: 5 Apr 2010, 18:56:15 UTC

In order to be as 'compliant' as possible, I migrated my 2 systems with a 4870 from the optimized application (v22) to the stock application (v21). This resulted in 160 invalid, 300 valid and 100 pending results. Besides some practical issues (the quadXPC system is highly unstable since I returned airflow to normal), credits are about a third of what they used to be...

I hope we may expect the situation to return to 'normal' sometime this week ...

BarryAZ
Joined: 1 Sep 08
Posts: 519
Credit: 283,151,643
RAC: 272
Message 38118 - Posted: 5 Apr 2010, 19:03:55 UTC

As I see these 'failed validation results' popping up on all of my workstations (4850 GPU and CPU MW clients as well), at this point I plan to set 'no new work' on ALL of my MW workstations and let the current queues flush out.

When the project needs processed work again, hopefully they will post a news update. (The new validation scheme suggests they have more data than they need or want, and that by imposing it they seek to migrate folks out of the project.)


banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 38122 - Posted: 5 Apr 2010, 19:23:53 UTC

I'm thinking it's validating whatever result comes in first and then rejecting the later ones.

In this example, I finished first and then the others were sent out. A 4800 came in second, then a 5800, then another 4800. The 5800 was rejected.
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=90067556
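
To make that guess concrete, here is a rough sketch of a "first result wins" check. It is only an illustration of the hypothesis in this post, not the project's validator: the function name, the tolerance and all the numbers are invented for the example.

```python
def validate_first_come(results, rel_tol=1e-12):
    """Mark each result valid only if it matches the first one returned.

    `results` is a list of (host_label, value) pairs in arrival order.
    """
    reference = results[0][1]
    return {
        host: abs(value - reference) <= rel_tol * abs(reference)
        for host, value in results
    }


# Invented numbers: the two 4800 results match the first result exactly,
# the 5800 result differs in the 11th decimal place.
arrivals = [
    ("first", -3.0123456789012),
    ("4800_a", -3.0123456789012),
    ("5800", -3.0123456789512),
    ("4800_b", -3.0123456789012),
]
verdicts = validate_first_come(arrivals)
print(verdicts)
```

Under a scheme like this the verdict depends on arrival order: had the 5800 result arrived first, the 4800 results would be the ones rejected.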
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Blurf
Volunteer moderator
Project administrator
Joined: 13 Mar 08
Posts: 804
Credit: 26,380,161
RAC: 0
Message 38167 - Posted: 6 Apr 2010, 3:05:25 UTC

Quorum Down to 2

The database is having a bit of trouble keeping up with all the new results due to a quorum of 3, so for the time being I'm dropping it to a quorum of 2.
On another note, we should have source code for the new application available tomorrow.
6 Apr 2010 2:11:10 UTC
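
As a simplified picture of what the quorum change means, the sketch below treats a workunit as validated once at least `quorum` returned results agree within a tolerance. This is an assumed model for illustration, with a hypothetical `find_canonical` helper and made-up values, not the actual server code.

```python
def find_canonical(values, quorum, rel_tol=1e-12):
    """Return a value that at least `quorum` results agree on, or None."""
    for candidate in values:
        agreeing = [
            v for v in values
            if abs(v - candidate) <= rel_tol * abs(candidate)
        ]
        if len(agreeing) >= quorum:
            return candidate
    return None


# Made-up results for one workunit: two hosts agree, one does not.
returned = [1.0, 1.0, 2.0]
print(find_canonical(returned, quorum=2))
print(find_canonical(returned, quorum=3))
```

With a quorum of 2 the two matching results are enough, while a quorum of 3 would need another result to be sent out and stored, which is the extra load the news item refers to.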


Emanuel
Joined: 18 Nov 07
Posts: 280
Credit: 2,442,757
RAC: 0
Message 38180 - Posted: 6 Apr 2010, 10:58:41 UTC

It also looks like the problem with HD5800 series cards has been tracked down. It's not a problem with the cards or the SDK, but a poorly publicized change in recommended programming practices for how to properly load floating point values (hope I'm saying that right) that only applies to the newer cards. It'll hopefully be fixed soon (I'm also hoping the CUDA applications will get the same accuracy from project-side, but they're at least well within the required range).
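
For a feel of why accuracy matters to a validator, here is a small sketch comparing a sum accumulated in double precision with the same sum rounded to single precision after every step. It does not reproduce the HD5800 issue described above; it only shows how precision differences alone can move a result, and the input data is arbitrary.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(terms, single_precision):
    """Sum the terms, optionally rounding the running total to single precision."""
    total = 0.0
    for t in terms:
        total += t
        if single_precision:
            total = to_float32(total)
    return total


terms = [0.1] * 100_000
double_sum = accumulate(terms, single_precision=False)
single_sum = accumulate(terms, single_precision=True)
rel_diff = abs(single_sum - double_sum) / abs(double_sum)
print(f"double: {double_sum!r}")
print(f"single: {single_sum!r}")
print(f"relative difference: {rel_diff:.3e}")
```

If a validator compares results with a tight relative tolerance, a gap like the one printed here would be enough to mark one of the two results invalid.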

©2020 Astroinformatics Group