New Separation Modfit Version 1.36

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 413
Credit: 7,486,492
RAC: 0

Message 62499 - Posted: 6 Oct 2014, 20:22:15 UTC
Last modified: 8 Oct 2014, 14:02:47 UTC

Hey Everyone,

I released a new version of Separation Modfit today (v1.36). If you have any issues with it please post them here.


Thank you for your continued support,

Jake W.

Martin
Joined: 22 Sep 14
Posts: 9
Credit: 2,076,031
RAC: 6,813

Message 62500 - Posted: 7 Oct 2014, 0:02:30 UTC

Sorry Jake.

I am getting exactly the same result as with V1.34. Immediate (0.00 sec CPU time) fail as a computational error, reporting:

(unknown error) - exit code -1073741515

My computer is a Dell XP machine with a 2-core Intel CPU, and no GPU. So far, no ModFit task under 1.34 or 1.36 has even looked as though it might run.
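
For anyone trying to decode the exit code above: Windows reports NTSTATUS values as signed 32-bit integers, so -1073741515 is easier to look up once it is converted to unsigned hex. The minimal sketch below does that conversion. The helper name `to_ntstatus` and the two-entry lookup table are illustrative assumptions, not part of BOINC or Modfit. The value comes out as 0xC0000135, which Windows defines as STATUS_DLL_NOT_FOUND. That would suggest a missing DLL on the host rather than a fault in the science code, although nothing in this thread confirms which library, if any, was missing.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    # Masking with 0xFFFFFFFF maps negative values to their
    # two's-complement unsigned 32-bit equivalent.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Deliberately tiny lookup table, for illustration only.
KNOWN_STATUS = {
    "0xC0000135": "STATUS_DLL_NOT_FOUND",
    "0xC0000005": "STATUS_ACCESS_VIOLATION",
}

if __name__ == "__main__":
    code = to_ntstatus(-1073741515)
    print(code, KNOWN_STATUS.get(code, "unknown"))
```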

Robert Meckley
Joined: 17 Feb 13
Posts: 5
Credit: 382,584,844
RAC: 367,841

Message 62505 - Posted: 7 Oct 2014, 18:09:08 UTC

Jake,

I have been running Modified Fit off and on for the past 30 days or so, stopping because of high error rates and restarting each time a new version appeared. Unfortunately, v1.36 doesn't seem any better than the previous versions (v1.32 and v1.34). Currently I'm seeing a ~6% VALIDATE ERROR rate over the past 20 hours.

I checked the top 100 hosts to see if anyone else was having a problem. Most of the top 100 hosts use the HD 7970 card with the driver identified by BOINC as 1.4.1848, as do I. The only hosts not showing a significant VALIDATE ERROR rate were those not running MOD-FIT. Without exception, it seems that everyone running it is having a problem with VALIDATE ERRORs.

I also made another observation that may or may not be related to another recently identified problem. The top hosts running MOD-FIT also have a large number of COMPLETED, CAN'T VALIDATE tasks. A lot of these will ultimately validate, but some of them probably won't. A number of these tasks have been downloaded to other hosts only to be ABORTED BY USER. Further research on these hosts shows that in some cases literally thousands of tasks have been manually aborted by the user. And it's not just one or two hosts that show such behavior: with very little effort I've identified at least a dozen. Since I can't for the life of me posit a sane motive for such behavior, I'm guessing it may be frustration over the errors volunteers are currently experiencing. (O.K., maybe not.)

At any rate, could you please fix this? Last month at this time I had a RAC of ~325,000. Now it's barely 200,000 and falling. At 325,000 I felt I was contributing to the effort. Now I'm starting to feel like an outsider.

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 413
Credit: 7,486,492
RAC: 0

Message 62508 - Posted: 7 Oct 2014, 20:43:10 UTC

We had a similar problem last year when I tried to release the Win32 version. I will look into which platforms have the highest abort rates and see if dropping those platforms helps. We might have to deprecate Win32 again.

Jake W.
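
The kind of per-platform check described in the post above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample records are made up, the outcome labels (`"valid"`, `"error"`, `"aborted"`) are hypothetical, and the function `error_rate_by_platform` is not a project tool. Only the platform names follow BOINC's naming (`windows_intelx86` for 32-bit Windows, `windows_x86_64` for 64-bit Windows).

```python
from collections import Counter

# Hypothetical sample records: (platform, outcome). Not real project data.
SAMPLE_RESULTS = [
    ("windows_intelx86", "error"),
    ("windows_intelx86", "aborted"),
    ("windows_intelx86", "error"),
    ("windows_intelx86", "valid"),
    ("windows_x86_64", "valid"),
    ("windows_x86_64", "valid"),
    ("windows_x86_64", "error"),
    ("windows_x86_64", "valid"),
]


def error_rate_by_platform(records, bad_outcomes=("error", "aborted")):
    """Return {platform: fraction of tasks whose outcome is in bad_outcomes}."""
    totals = Counter()
    bad = Counter()
    for platform, outcome in records:
        totals[platform] += 1
        if outcome in bad_outcomes:
            bad[platform] += 1
    return {platform: bad[platform] / totals[platform] for platform in totals}


if __name__ == "__main__":
    rates = error_rate_by_platform(SAMPLE_RESULTS)
    # List the worst platform first.
    for platform, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{platform}: {rate:.0%} failed or aborted")
```

Sorting by rate puts the platform most worth investigating, or deprecating, at the top of the list.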

Michael Bennett
Joined: 10 Mar 09
Posts: 13
Credit: 4,061,021
RAC: 788

Message 62511 - Posted: 8 Oct 2014, 2:15:01 UTC - in response to Message 62505.

Jake, sorry, but I just lost 7.5 hours of computer time working on a ModFit 1.36 task. That was one of 2 that self-aborted. I aborted 12 more because I didn't want them eating up my processor's time.
Keep up the good work. I'm confident you'll find the answer.
Mike

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 413
Credit: 7,486,492
RAC: 0

Message 62512 - Posted: 8 Oct 2014, 2:25:07 UTC

Looks like the issue is confined to Win32, so I cut those hosts off. Sorry for the inconvenience, everyone. I will continue to look into the issue, but for now we are dropping Win32 support so we can get everyone else's runs validating.

Thank you for your continued support,

Jake W.

Michael Bennett
Joined: 10 Mar 09
Posts: 13
Credit: 4,061,021
RAC: 788

Message 62539 - Posted: 11 Oct 2014, 14:44:38 UTC - in response to Message 62499.

Jake,
I don't know if you need this sort of feedback or not.

Lost 3h 12m on one of the Modfit ver 1.36.
I aborted the remaining 3 units of work.

It's every day/night. I can wind up losing hours of processing time due to these units that bomb out. The project is losing a lot of computer time, as well.

Mike

greg_be
Joined: 18 Aug 09
Posts: 83
Credit: 3,306,312
RAC: 6,739

Message 62540 - Posted: 11 Oct 2014, 22:25:48 UTC

It's yet another session of failed tasks and me aborting the rest.
Exit 0 again!
Gawd....what is with this Exit 0 crud anyway?
Blocked Modfit yet again, can't run it.

Michael Bennett
Joined: 10 Mar 09
Posts: 13
Credit: 4,061,021
RAC: 788

Message 62542 - Posted: 11 Oct 2014, 23:29:00 UTC - in response to Message 62511.

Jake,
5 Modfit 1.36 suffered the dreaded Computation error.
1 Modfit 1.36 was aborted by me.
Mike

greg_be
Joined: 18 Aug 09
Posts: 83
Credit: 3,306,312
RAC: 6,739

Message 62544 - Posted: 12 Oct 2014, 8:25:14 UTC - in response to Message 62540.

I checked my firewall; Modfit was being trapped by it. I'm changing some settings and will try again.


greg_be
Joined: 18 Aug 09
Posts: 83
Credit: 3,306,312
RAC: 6,739

Message 62546 - Posted: 12 Oct 2014, 14:25:46 UTC

Well what do you know....it was the firewall.
But now I have a few tasks where validation was inconclusive and one that could not validate. What's with that?

I think (hope) I have finally gotten through the buggy tasks and that my firewall can behave now.

Rymorea
Joined: 6 Oct 14
Posts: 45
Credit: 10,006,899
RAC: 4

Message 62548 - Posted: 12 Oct 2014, 17:33:06 UTC
Last modified: 12 Oct 2014, 17:36:47 UTC

Hi, 7 of my MilkyWay@Home Separation (Modified Fit) v1.36 (opencl_ati_101) WUs have completed but are waiting. This is one of them:
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=849329177

Another 8 of them have problems like this one: http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=849307241

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 413
Credit: 7,486,492
RAC: 0

Message 62558 - Posted: 13 Oct 2014, 15:22:57 UTC

Hey Guys,

While the Win32 version was running, the error rate was a little high. I took it down, so everything should now be running more smoothly and validating more often and more quickly.

Jake W.

Rymorea
Joined: 6 Oct 14
Posts: 45
Credit: 10,006,899
RAC: 4

Message 62559 - Posted: 13 Oct 2014, 18:35:19 UTC - in response to Message 62558.

I saw your message and resumed MilkyWay again. Let's see how it goes.

Michael Bennett
Joined: 10 Mar 09
Posts: 13
Credit: 4,061,021
RAC: 788

Message 62561 - Posted: 14 Oct 2014, 14:17:08 UTC - in response to Message 62559.

Jake, yep, it's running smoother. I only had to delete 10 WUs last night, due to the one that processed and errored out after taking 5 hours of computer time. Not a good show. I've not seen any difference on my side of the modem.
All the MilkyWay@Home 1.02 (opencl_nvidia) tasks work great. It's 100%.
All the ModFit 1.36 tasks are a total bust. 0% completion.
Thanks,
Mike

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 413
Credit: 7,486,492
RAC: 0

Message 62565 - Posted: 14 Oct 2014, 16:58:22 UTC
Last modified: 14 Oct 2014, 16:58:39 UTC

Michael Bennett,

Looks like you are running Win32 so you should not be seeing any new work units for Modfit. I had to pull the Win32 application for now to keep everything stable. I will be looking into a fix for the future but no guarantees on how long it will take. Until then you should still be able to run nbody and our stable version of MilkyWay@home just like before.

Sorry for all of the issues.

Jake W.

Martin
Joined: 22 Sep 14
Posts: 9
Credit: 2,076,031
RAC: 6,813

Message 62566 - Posted: 14 Oct 2014, 21:33:57 UTC

Jake

I am also running a Win32 system, and just had 3 more ModFit 1.36 tasks downloaded. So far there have been no 1.34 or 1.36 tasks which have run on my system.

I am just about to put a block on receiving more until you can find what is wrong with it and correct the problem.

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 413
Credit: 7,486,492
RAC: 0

Message 62575 - Posted: 15 Oct 2014, 12:53:34 UTC

Hey Martin,

Feel free to block them; you should stop getting them soon anyway. You will still get nbody and normal separation runs. It is just the new separation that is affected.

Sorry,

Jake W.

Stick
Joined: 8 Oct 07
Posts: 31
Credit: 270,742
RAC: 3

Message 62580 - Posted: 16 Oct 2014, 13:24:52 UTC

I've got 3 computers, and all have had problems with Modfit v1.36, but the symptoms differ. Two computers have 32-bit Intel processors and run Win XP. Modfit v1.36 units fail immediately with computation errors on these 2 computers. The third computer has a 64-bit AMD processor and runs Win 7. Modfit v1.36 units run to completion on this computer but immediately go to "validation inconclusive".

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 413
Credit: 7,486,492
RAC: 0

Message 62582 - Posted: 16 Oct 2014, 15:46:01 UTC
Last modified: 16 Oct 2014, 15:46:26 UTC

Stick,

Your Win32 machines will no longer be receiving Modfit work units, so that won't be a problem any more. As for your 64-bit computer, that is completely normal. We require ~3 computers to return the same result for every work unit we send out. "Validation inconclusive" just means we are waiting for others to return their results for the work unit before we award credit. This ensures people aren't trying to game the system just for credits, and it ensures we are getting reliable results for our optimizations.

Sorry for the issues with win32 and confusion,

Jake W.
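
The quorum check described in the post above can be sketched as follows. This is a simplified illustration, not the project's actual validator. The quorum of 3 comes from the "~3 computers" figure in the post. Treating each result as a single floating-point number and comparing with a relative tolerance are assumptions, and `validate` is a hypothetical name.

```python
import math


def validate(results, quorum=3, rel_tol=1e-9):
    """Return ("valid", value) once `quorum` results agree, else ("inconclusive", None).

    `results` holds the numbers returned so far for one work unit.
    """
    for candidate in results:
        # Count every returned result that matches this candidate within tolerance.
        agreeing = [r for r in results if math.isclose(r, candidate, rel_tol=rel_tol)]
        if len(agreeing) >= quorum:
            return "valid", candidate
    return "inconclusive", None


if __name__ == "__main__":
    print(validate([-3.14]))                       # only one host has reported
    print(validate([-3.14, -3.14, -3.14]))         # three hosts agree
    print(validate([-3.14, -2.71, -3.14]))         # one host disagrees, still waiting
    print(validate([-3.14, -2.71, -3.14, -3.14]))  # a fourth result settles it
```

In this sketch a task that has finished on one host stays "inconclusive" until enough matching results arrive from other hosts, which matches what Stick saw on the 64-bit machine.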


Copyright © 2017 AstroInformatics Group