Running Modfit on MilkyWay@home

TRuEQ & TuVaLu Joined: 2 Feb 10 Posts: 16 Credit: 57,910,399 RAC: 0	Message 64716 - Posted: 21 Jun 2016, 18:01:21 UTC I am running arkayn's app (ati14), "milkyway_separation_0.82_windows_intelx86__ati14.exe", for an ATI 5850, and all of a sudden I am getting Modfit WUs, and they all error out. I logged in to my account and deselected the Modfit WUs. I am running anonymous platform, and I seem to get Modfit tasks even though they are deselected in preferences. BOINC Manager keeps telling me I have no app for them in app_info.xml. I don't think I should get these tasks at all. But still they come.... http://milkyway.cs.rpi.edu/milkyway/hosts_user.php

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 64718 - Posted: 21 Jun 2016, 18:16:00 UTC I will look at recompiling the Mac binary tomorrow morning. Maybe for some reason it was compiled with the debug flag on. Strange that it took this long to realize. TRuEQ, you are getting Modfit work units now because those are the only work units being sent out. Modfit is the new officially supported application on MilkyWay@home. I think the new AMD OpenCL application should run on your computer, though, considering you have OpenCL 1.2 compute capabilities on your GPUs (as long as they support double precision). Peciak, are you crunching more than 40 work units per minute? Jake

Peciak Joined: 27 Jun 09 Posts: 12 Credit: 148,038,330 RAC: 0	Message 64719 - Posted: 21 Jun 2016, 19:15:18 UTC "Are you crunching more than 40 work units per minute?" No, but the server lets the BOINC client contact it only once per 60 seconds and sends 9-10 WUs, then imposes another 60-second wait, so the ATI card sits idle. My ATI crunches 10-11 WUs per 60 seconds.

Bif74 [Lombardia] Joined: 25 May 09 Posts: 6 Credit: 23,564,491 RAC: 5,678	Message 64720 - Posted: 21 Jun 2016, 20:47:03 UTC - in response to Message 64716. Last modified: 21 Jun 2016, 20:47:38 UTC "I am running arkayn's app (ati14), "milkyway_separation_0.82_windows_intelx86__ati14.exe", for an ATI 5850, and all of a sudden I am getting Modfit WUs, and they all error out. I logged in to my account and deselected the Modfit WUs. I am running anonymous platform, and I seem to get Modfit tasks even though they are deselected in preferences. BOINC Manager keeps telling me I have no app for them in app_info.xml. I don't think I should get these tasks at all. But still they come.... http://milkyway.cs.rpi.edu/milkyway/hosts_user.php" Me too. Same conditions on Win XP SP2. My ATI HD3850 is still receiving only Modfit WUs, and they end immediately with a computation error. How can I resolve this and restart crunching? Thanks, Marco

Jean-Pierre HARLE Joined: 25 Sep 08 Posts: 15 Credit: 145,544,797 RAC: 0	Message 64721 - Posted: 21 Jun 2016, 22:41:28 UTC - in response to Message 64714. "I'm running into the same problem as Nigel Garvey - tasks take twice as long and the credit earned is a fourth of what it used to be. I don't know if it is some sort of artifact of running Macs or something else. I have Core 2 Duo and i5 computers running OS X from Snow Leopard to El Capitan. I am curious as to what is going on." Exactly the same problem on my MacBook Pro: tasks take twice as long and the credit is a fourth of what it used to be (26.74 vs 106.88).

Thunder Joined: 9 Jul 08 Posts: 85 Credit: 44,842,651 RAC: 0	Message 64727 - Posted: 22 Jun 2016, 13:56:59 UTC - in response to Message 64719. "Are you crunching more than 40 work units per minute? no, but the server contact client boinc 1 per 60 second and sends 9-10 WU then deducts another 60 seconds -> ATI sleeps my ATI crunching WU 10-11 per 60 sec" I'm having exactly the same problem, and it was only exacerbated by the change to all Modfit units. I can do 17-18 WU per minute, but since the scheduler typically only has 8-11 WUs available to send at any given moment, it takes me 2-3 minutes of updates to get 1 minute of work. It's irrelevant to ask if a machine is crunching more than 40 WU per minute (and the limit is actually 25, since the scheduler will not send more tasks than that even if it has them available) when the server rarely has that much even waiting to send. My problem is made worse by the fact that, despite making every configuration change I can think of, my clients refuse to update (on their own) more often than every 60 minutes. I've asked for help everywhere I can think of, but since MW@H is the only (major) project that dribbles out teeny, weeny amounts of work at a time, it's not a problem that anyone else I can find has needed to solve.
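
A rough sketch of the arithmetic in the post above, in Python. The figures (17-18 WU per minute, 8-11 WUs per reply, a 25-task cap, one request per minute at best and one per 60 minutes at worst) come from this thread; the function names and the simple model, in which the GPU idles from the moment a batch is finished until the next reply arrives, are assumptions made for illustration and do not describe how the BOINC scheduler is actually implemented.

```python
import math


def requests_per_minute_of_work(tasks_per_reply, tasks_per_minute):
    """Scheduler replies needed to collect one minute of GPU work."""
    return math.ceil(tasks_per_minute / tasks_per_reply)


def gpu_busy_fraction(tasks_per_reply, tasks_per_minute, request_interval_min):
    """Fraction of each request interval the GPU spends crunching.

    Assumes one reply per interval and an idle GPU once the batch is done.
    """
    work_minutes = tasks_per_reply / tasks_per_minute
    return min(1.0, work_minutes / request_interval_min)


# Figures quoted in the thread: roughly 17.5 WU per minute on the GPU.
rate = 17.5

print(requests_per_minute_of_work(8, rate))    # 3 replies when only 8 WUs arrive
print(requests_per_minute_of_work(11, rate))   # 2 replies when 11 WUs arrive

# One reply per minute with 8-11 WUs each: the GPU is busy about half the time.
print(round(gpu_busy_fraction(8, rate, 1.0), 2))
print(round(gpu_busy_fraction(11, rate, 1.0), 2))

# One reply per 60 minutes, even at the 25-task cap: the GPU is almost always idle.
print(round(gpu_busy_fraction(25, rate, 60.0), 3))
```

Under these assumptions the "2-3 minutes of updates for 1 minute of work" figure falls out directly, and a client that only asks once an hour keeps the GPU busy for only a couple of percent of the time.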

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 64728 - Posted: 22 Jun 2016, 14:56:02 UTC Hey everyone, Update on the work I've been doing today. Working on improving the Mac application's run time. Still unsure why it takes so long compared to the old version. As for the not-enough-work problem, I am really confused as to why it won't send you guys more work when you ask for it. Our server status page constantly says there are 1,200+ workunits ready to send. Maybe there is a bug in how that number is calculated? Not sure yet. Thunder, which of your hosts is having an issue getting the right number of work units? Jake

Thunder Joined: 9 Jul 08 Posts: 85 Credit: 44,842,651 RAC: 0	Message 64730 - Posted: 22 Jun 2016, 17:42:15 UTC - in response to Message 64728. The same one I referenced earlier in this thread, 691866. The only way it comes close to keeping the GPU "fed" is if I sit at it and hit the update button every 1-2 minutes. (Even then it's likely to run out and switch to another project after a dozen updates or so) If you have MW@H set to use "Locality Scheduling" (which is probably a really good thing for both the project and volunteers, depending on the size of input files), then that might explain the disparity between what you're seeing as available tasks vs what's available for any given host. Depending on how many variations of input files are active at any time, there might be 1,000 total tasks available, but (and this is somewhat random of course) there might only be 10 or 20 available for any given host (depending on what files it already has downloaded). This is just a guess of course, because I've never really dug into whether or not MW@H even needs to use locality scheduling.

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 64731 - Posted: 22 Jun 2016, 18:35:07 UTC Thunder, can you try updating your BOINC client version for that host? My records show you are using 7.6.6 and the newest version is 7.6.22. Maybe that will fix the issue. I will still keep looking into whether the issue is on our end in the meantime. Jake

Thunder Joined: 9 Jul 08 Posts: 85 Credit: 44,842,651 RAC: 0	Message 64732 - Posted: 22 Jun 2016, 18:56:14 UTC I can try. I've not run BOINC on Linux from anything but a package installation in as long as I can remember, and unfortunately 7.6.6 is the package for Ubuntu 15.10. Considering how difficult it was to get what I have working, I have a feeling there will be a lot of colorful language in my future. This will definitely be a weekend project. :-/

Peciak Joined: 27 Jun 09 Posts: 12 Credit: 148,038,330 RAC: 0	Message 64733 - Posted: 22 Jun 2016, 19:25:08 UTC My host's BOINC client version is 7.6.22 :-(

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 64735 - Posted: 22 Jun 2016, 20:59:40 UTC Thunder, If it makes you feel any better, I am pretty sure the version in the official Ubuntu 16.04 package system has a memory leak that some users were complaining about. Hopefully updating the client will let you at least ask for work a little more often than once an hour. This is the next thing on my list to fix after I figure out why new compilations of MW@home take 7 times longer to run on Mac than old compilations. Jake

Super Nova Nerd Joined: 17 Feb 16 Posts: 14 Credit: 11,121,737 RAC: 0	Message 64736 - Posted: 22 Jun 2016, 23:30:33 UTC Last modified: 22 Jun 2016, 23:50:27 UTC I will wait until the bugs are worked out to hit MW hard again.

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 64738 - Posted: 23 Jun 2016, 2:21:33 UTC Super Nova Nerd, The MilkyWay@home application is stable for all supported systems at this time. I am looking at a way to improve run times for the Mac users and looking into how many work units users can download. Other than that, not much is going to change in the near future. Jake

paris Joined: 26 Apr 08 Posts: 87 Credit: 64,801,496 RAC: 0	Message 64740 - Posted: 23 Jun 2016, 12:38:43 UTC I don't know if it will help track down the problem or not, but I noticed that the last few MW work units (after switching to 1.36 but still not running the ModFit units) took about 50,000 sec as opposed to the usual 6500 sec or so under 1.01. Credit for those was 106.88. Thank you for your efforts. Plus SETI Classic = 21,082 WUs

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 64741 - Posted: 23 Jun 2016, 13:02:03 UTC Hey Everyone, I think I found a fix to make Modfit actually run faster than the old application. Just running some last minute tests. Hopefully have a new version out for Macs by the end of the day. Jake

Super Nova Nerd Joined: 17 Feb 16 Posts: 14 Credit: 11,121,737 RAC: 0	Message 64742 - Posted: 23 Jun 2016, 17:24:41 UTC - in response to Message 64741. I had trouble with the N-body apps before. Are those no longer being used? I had a very high failure rate on those.

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 64743 - Posted: 23 Jun 2016, 17:53:04 UTC Super Nova Nerd, N-body is not my area of expertise. I am the scientist in charge of Separation and Sidd is the scientist in charge of N-body. I know he is in the process of building a new version that he wants to ship out today, but it's a beta project, so expect it to be a bumpy road and still a little buggy. Jake

Thunder Joined: 9 Jul 08 Posts: 85 Credit: 44,842,651 RAC: 0	Message 64747 - Posted: 24 Jun 2016, 17:02:19 UTC - in response to Message 64735. "Thunder, If it makes you feel any better, I am pretty sure the version in the official Ubuntu 16.04 package system has a memory leak that some users were complaining about. Hopefully updating the client will let you at least ask for work a little more often than once an hour. This is the next thing on my list to fix after I figure out why new compilations of MW@home take 7 times longer to run on Mac than old compilations. Jake" So before going to the extreme of installing a new client (a royal pain in the behind on Linux), I tried a little experiment today and figured out what's going on. Since I run 3 projects on this machine, the simple thing to try was to set the other two projects to "no new work" and see what happens. :-) As soon as the others were (nearly) out of work, the client started hitting up MW@H about once every minute or two for new work. (I got this idea after seeing the same behavior on a Windows machine with the latest version that also has a reasonably fast GPU) So the client is basically getting work (but only a very little), finishing it, then seeing it has work for other projects and essentially deciding that since it ran out of work for MW@H, it will switch over. Then, after sufficient time has passed, it goes to see if MW@H has more work available. (And it repeats this cycle ad infinitum) I'm guessing this is why MW@H has seen a precipitous drop in credit since the switch was made (and also a pretty severe drop in active participants). It doesn't really affect those that run only MW@H, but if, as the majority of BOINC users do, you run multiple projects, it's only going to do as many tasks for MW@H as it can get in one scheduler request, then rotate around to another project. Since the default for BOINC installs is to rotate every 60 minutes, there you go. :-/ Obviously the science dictates the tasks, so you can't just make longer tasks to solve this. However, it would help if you could increase the number of tasks available on the server and allow a much larger number to be issued per scheduler request (if I could wave my magic wand, I'd ask for 250 instead of the current limit of 25). Either way, you need to figure out how to get more work available from the server, because even if I set MW@H as the only project, it's going to run out of work once in a while due to:
WRF1 10172 Milkyway@Home 6/24/2016 11:56:38 AM Scheduler request completed: got 0 new tasks
WRF1 10200 Milkyway@Home 6/24/2016 11:57:42 AM Scheduler request completed: got 0 new tasks
(Two back-to-back scheduler requests in which the server had zero tasks available to send for GPU work) As I've said in a few posts before... IF you're doing the science as fast as you can handle already (as in, the users are solving the problems faster than you can posit new ones), then don't sweat it. We'll just keep on keepin' on as the work is available. :-)
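
To see how often the scheduler comes back empty, the two log lines quoted in the post above can be tallied with a short Python script. This is a sketch only: the regular expression assumes the line layout exactly as pasted in the post (a date, a 12-hour time, then the message text), and `parse_scheduler_replies` and `longest_empty_streak` are hypothetical helper names, not part of BOINC.

```python
import re
from datetime import datetime

# Assumed layout, copied from the post: "<m/d/yyyy> <h:mm:ss AM|PM> Scheduler request completed: got N new tasks"
REPLY_RE = re.compile(
    r"(?P<stamp>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s+"
    r"Scheduler request completed: got (?P<count>\d+) new tasks"
)


def parse_scheduler_replies(log_text):
    """Return (timestamp, tasks_received) for every scheduler reply found."""
    replies = []
    for match in REPLY_RE.finditer(log_text):
        # %p relies on the locale's AM/PM strings; fine for the default C/English locale.
        when = datetime.strptime(match.group("stamp"), "%m/%d/%Y %I:%M:%S %p")
        replies.append((when, int(match.group("count"))))
    return replies


def longest_empty_streak(replies):
    """Length of the longest run of consecutive replies that delivered 0 tasks."""
    longest = current = 0
    for _, count in replies:
        current = current + 1 if count == 0 else 0
        longest = max(longest, current)
    return longest


# The two lines from the post, verbatim.
SAMPLE = (
    "WRF1 10172 Milkyway@Home 6/24/2016 11:56:38 AM Scheduler request completed: got 0 new tasks\n"
    "WRF1 10200 Milkyway@Home 6/24/2016 11:57:42 AM Scheduler request completed: got 0 new tasks\n"
)

replies = parse_scheduler_replies(SAMPLE)
gap_seconds = (replies[1][0] - replies[0][0]).total_seconds()
print(len(replies), "replies,", longest_empty_streak(replies), "empty in a row,", gap_seconds, "s apart")
```

Run over a longer log, the same two helpers would show whether empty replies are occasional or come in long runs, which is the distinction the post is drawing.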

Thunder Joined: 9 Jul 08 Posts: 85 Credit: 44,842,651 RAC: 0	Message 64748 - Posted: 24 Jun 2016, 17:17:01 UTC Sure enough, exactly what I predicted happened a few minutes later. After repeated requests for more work getting 0 tasks sent, the GPU ran out completely and I had to go on and allow work from other projects so it would at least be doing something productive. :-/