Message boards :
Number crunching :
New series of invalids
Author | Message |
---|---|
Send message Joined: 11 Apr 15 Posts: 58 Credit: 63,291,127 RAC: 0 |
Hello, a new series of invalids is building up. Again, all invalids are from the same run: de_modfit_85_bundle4_4s_south4s_bgset_2_1564052102_xxxxxxx. See here: https://milkyway.cs.rpi.edu/milkyway/results.php?userid=1043858&offset=0&show_names=0&state=5&appid= Aloha, Uli |
Send message Joined: 30 Dec 09 Posts: 21 Credit: 75,540,465 RAC: 0 |
Yep, I just discovered it too. Again an "85" run. The other time it was "84" AND "85" runs. I will crunch some Einstein until they figure out their stuff and communicate clearly that it has been definitively fixed. |
Send message Joined: 16 Mar 10 Posts: 208 Credit: 105,173,734 RAC: 51,221 |
This has been anticipated for stripes 84 and 85. As they approach optimization, the tiny differences in accuracy of various GPUs compared with one another (and with CPUs) are causing discrepancies that are large enough to cause failure to validate. It is presumably a characteristic of the particular data set(s), and it is quite possible that it can't be "fixed" (as [most?] GPUs don't do calculations and rounding on very small numbers in quite the same way that modern CPUs do...) By the way, Tom Donlon mentioned that these stripes might start to show errors in his News post that introduced the latest set of batches - see https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4487 |
Send message Joined: 29 Dec 11 Posts: 26 Credit: 1,456,736,094 RAC: 0 |
Getting several invalid results, such as this one: name: de_modfit_85_bundle4_4s_south4s_bgset_2_1564052102_10340141; application: Milkyway@home Separation; created: 26 Aug 2019, 3:00:26 UTC; canonical result: 299838550; granted credit: 244.01; minimum quorum: 1; initial replication: 4; max # of error/total/success tasks: 2, 9, 6. What can I do about them? Thank you |
Send message Joined: 29 Dec 11 Posts: 26 Credit: 1,456,736,094 RAC: 0 |
Makes sense to me. |
Send message Joined: 5 Nov 17 Posts: 4 Credit: 2,795,498 RAC: 0 |
My CPU is slow. When I see a batch that has fairly high levels of invalid responses, as soon as they download I abort them and chuck them back into the "available for sending" pool. That's about it. It's a CPU thing, which means ultimately it's a software issue, either in the code or depending on some difference in how the different CPUs crunch numbers. |
Send message Joined: 8 May 09 Posts: 3315 Credit: 519,758,715 RAC: 27,871 |
"Getting several invalid results, such as this one:" Read the News section maybe? From the 27th: "Hi everyone, Stripes 84 and 85 are beginning to return invalids, as was expected. I will pull them down in a couple days when I am back in the lab. Best, Tom" |
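
The point made earlier in the thread, that tiny accuracy differences between devices can be enough to fail validation, can be illustrated with a minimal Python sketch. It is not the project's validator: the function names and both tolerance values are hypothetical, chosen only to show that the same numbers added in a different order can give slightly different floating-point results, and that whether two results "agree" then depends on how strict the comparison is.

```python
import math

def sum_in_order(values):
    """Add the values left to right, as one device might."""
    total = 0.0
    for v in values:
        total += v
    return total

def results_agree(a, b, rel_tol):
    """Hypothetical validation check: do two results match within rel_tol?"""
    return math.isclose(a, b, rel_tol=rel_tol)

values = [0.1, 0.2, 0.3]

# Same inputs, different order of operations (a stand-in for two devices).
forward = sum_in_order(values)
backward = sum_in_order(list(reversed(values)))

print(forward, backward)
print("strict check:", results_agree(forward, backward, rel_tol=1e-17))
print("loose check: ", results_agree(forward, backward, rel_tol=1e-12))
```

With these inputs the two sums differ in the last bit, so the strict check rejects the pair while the loose one accepts it. A real validator has to pick a tolerance somewhere between those extremes, which is one way a particular data set could end up producing more invalids than others.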
©2024 Astroinformatics Group