New Separation Runs

Author	Message
Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 68015 - Posted: 16 Jan 2019, 15:48:44 UTC
Hey Everyone,
Just wanted you all to know I put up some new separation runs. These runs are back to fitting 3 streams and are bundled in groups of 5. The names of these runs are:
de_modfit_80_bundle5_3s_NoContraintsWithDisk200_1
de_modfit_81_bundle5_3s_NoContraintsWithDisk200_1
de_modfit_82_bundle5_3s_NoContraintsWithDisk200_1
de_modfit_83_bundle5_3s_NoContraintsWithDisk200_1
de_modfit_84_bundle5_3s_NoContraintsWithDisk200_1
de_modfit_85_bundle5_3s_NoContraintsWithDisk200_2
de_modfit_86_bundle5_3s_NoContraintsWithDisk200_1
If you have any trouble with these runs, please let me know.
Thank you all for your continued support.
Jake

Rantanplan Joined: 19 Aug 11 Posts: 38 Credit: 50,018,810 RAC: 0	Message 68019 - Posted: 16 Jan 2019, 22:21:23 UTC - in response to Message 68015.
OK, I got screwed. Watch my results.

Finn the Human Joined: 23 Dec 18 Posts: 23 Credit: 10,213,119 RAC: 0	Message 68020 - Posted: 17 Jan 2019, 6:33:58 UTC - in response to Message 68019.
Hmm. I see 16 compute errors (all GPU WUs) on your computer equipped with the GTX 670, and half of them failed within 3 seconds of starting. I have yet to see any task fail on your RX560 or on my GPUs. It might just be your GTX 670 acting up.
Everything stays But it still changes Ever so slightly Daily and nightly In little ways When everything stays...

Rantanplan Joined: 19 Aug 11 Posts: 38 Credit: 50,018,810 RAC: 0	Message 68021 - Posted: 17 Jan 2019, 10:11:04 UTC - in response to Message 68020. Last modified: 17 Jan 2019, 10:11:17 UTC
I changed the driver to the newest release, 417.71. That worked well.

Max_Pirx Joined: 13 Dec 17 Posts: 46 Credit: 2,421,362,376 RAC: 0	Message 68023 - Posted: 17 Jan 2019, 11:51:07 UTC
Hello, for a while I've been noticing that my GPUs pause between the individual tasks in a bundle. I'm running 6 Radeon HD5850/5870 cards on Win7. On some of the GPUs these pauses are longer (sometimes up to a minute), on others they are shorter. But now, with the new 5-bundles, a couple of my GPUs make enormous pauses at 20%, 40%, etc., and this more than doubles the completion time. Effectively, the GPUs spend more time idling between the tasks in the bundle than actually computing, and the idle time is still counted as computation when calculating the credit. Is there anything I can do to minimise that?

xtatuk Joined: 16 Jul 18 Posts: 2 Credit: 1,612,198 RAC: 0	Message 68024 - Posted: 17 Jan 2019, 12:05:55 UTC
As a BOINC user I have had too many problems with the MilkyWay database, homepage and BOINC, so I have stopped contributing completely. There are no answers from the admins to support questions about the database problems, and no news on the Message Boards about the database situation or when the problems are going to be solved.

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 68025 - Posted: 17 Jan 2019, 15:09:09 UTC
Hi Max,
There are necessary bookkeeping tasks that have to be performed between workunit runs. These include doing the final parts of our likelihood calculation on all of the stars (after the GPU computes our large integral), recording the results of the current run, cleaning up data from the current run, and formatting data for the next run. Unfortunately, this does take some time to complete. If it makes you feel any better, these same tasks would also be completed during a normal single workunit, so the overall runtime shouldn't be any different for 5 single workunits or 5 bundled workunits. The reason you are seeing different run times for these final parts likely has to do with the different number of stars in different jobs, and not anything to do with your GPUs.
xtatuk,
I understand your frustration with the database. Please understand that we are frustrated with it too. Unfortunately, it is not something that is easy to fix. We are working on making the error messages a little more graceful and on improving caching of our data-driven web pages. This should hopefully make our website run smoother when our database is under heavy load from work requests. It should also allow our database to prioritize handling workunits when it is under heavy load. I doubt this will solve the problem entirely, but we are slowly working to improve the stability of the database through incremental changes.
Thank you all for your continued support.
Jake

mmonnin Joined: 2 Oct 16 Posts: 167 Credit: 1,008,062,758 RAC: 3,336	Message 68027 - Posted: 17 Jan 2019, 15:52:37 UTC - in response to Message 68023.
> Hello, for a while I've been noticing that my GPUs pause between the individual tasks in a bundle. I'm running 6 Radeon HD5850/5870 cards on Win7. On some of the GPUs these pauses are longer (sometimes up to a minute), on others they are shorter. But now, with the new 5-bundles, a couple of my GPUs make enormous pauses at 20%, 40%, etc., and this more than doubles the completion time. Effectively, the GPUs spend more time idling between the tasks in the bundle than actually computing, and the idle time is still counted as computation when calculating the credit. Is there anything I can do to minimise that?
Run multiple tasks at once to keep GPU load high. I run 4x on my 280x.
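Note on running multiple tasks per GPU: in BOINC this is normally configured with an app_config.xml file in the project's data directory, where a gpu_usage of 1/N lets N tasks share one GPU. The Python sketch below only generates such a file's contents. The app name "milkyway" and the one CPU core per task are assumptions for illustration, not values confirmed in this thread, so check the app name your client reports before using it.

```python
import math
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float = 1.0) -> str:
    """Return app_config.xml text that lets `tasks_per_gpu` tasks share one GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")

    # Each task claims 1/N of a GPU. Round down to 3 decimals so that
    # N * gpu_usage never exceeds 1.0 (rounding up could block the Nth task).
    gpu_usage = math.floor(1000 / tasks_per_gpu) / 1000

    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    # CPU cores budgeted for each GPU task (assumed value, tune per host).
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_per_task)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "milkyway" is an assumed app name; 4 matches the "4x" mentioned in the post.
    print(build_app_config("milkyway", tasks_per_gpu=4))
```

After saving the output as app_config.xml, the client has to re-read its config files (or be restarted) before the setting takes effect.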

Max_Pirx Joined: 13 Dec 17 Posts: 46 Credit: 2,421,362,376 RAC: 0	Message 68030 - Posted: 17 Jan 2019, 17:03:54 UTC
Thanks for the replies. I am not particularly bothered about the pauses, since the WUs are completing OK and without errors. I was rather curious whether there was something wrong on my part, but if this is normal then that's OK. And I am running 4 WUs concurrently per GPU. That helps, but on a couple of the GPUs the pauses are still noticeable.

Corla99 [Lombardia] Joined: 9 Jan 17 Posts: 1 Credit: 10,090,996 RAC: 0	Message 68031 - Posted: 17 Jan 2019, 18:46:27 UTC
I have 2 PCs active on this project, one with a Vega 56 and one with a GTX 1080Ti. Most of today's WUs are in "validation inconclusive". In total I have 167 with the Nvidia GPU and 266 with the Vega. I run only one WU per GPU and I don't have any dedicated core for the WU. The other projects that run on those rigs are WCG (CPU) and WUProp (NCI).

bluestang Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0	Message 68032 - Posted: 17 Jan 2019, 19:52:17 UTC Last modified: 17 Jan 2019, 19:53:57 UTC
@Corla99 [Lombardia], "validation inconclusive" means nothing...they will validate eventually. Think of them as pending.
@Jake, I think these "bundle5_3s" are a much better approach, especially when running concurrent WUs. The "bundle6_2s" take longer and pay less than the "bundle4_4s", which makes no sense. So these seem about right as far as runtime and credit received.
As far as database/server issues...if you increase the number of allowed "In Progress" tasks, then maybe that would solve a lot of issues for people running out of work. If the DB is causing WUs to not be issued/downloaded because BOINC can't connect for, let's say, 2 hours, then allow the "In Progress" amount to be somewhere around 2 hours of work. 80 WUs max is too little. I think 200 is a good amount to start testing with. No???
Thanks, blue
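Note on the "2 hours of work" sizing: the arithmetic behind that suggestion can be sketched as below. The 180-second task runtime and the 4 concurrent tasks are hypothetical numbers chosen for illustration, not measurements from this thread; only the 80-task limit and the 2-hour outage come from the post.

```python
import math


def tasks_to_cover_outage(outage_s: float, task_runtime_s: float, concurrent_tasks: int) -> int:
    """Number of queued tasks a host needs to stay busy for `outage_s` seconds."""
    return math.ceil(outage_s * concurrent_tasks / task_runtime_s)


def busy_time_s(queued_tasks: int, task_runtime_s: float, concurrent_tasks: int) -> float:
    """Seconds a host stays busy on `queued_tasks` when it cannot fetch more work."""
    return queued_tasks * task_runtime_s / concurrent_tasks


if __name__ == "__main__":
    runtime_s = 180   # hypothetical wall-clock time per task
    concurrent = 4    # hypothetical tasks running at once on the host

    # Queue needed to ride out a 2-hour outage.
    print(tasks_to_cover_outage(2 * 3600, runtime_s, concurrent))
    # How long the 80-task limit mentioned in the post lasts on the same host.
    print(busy_time_s(80, runtime_s, concurrent) / 3600, "hours")
```

Under these assumed numbers an 80-task queue covers about one hour, so the required limit scales directly with how fast a host finishes work.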

Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 68033 - Posted: 17 Jan 2019, 21:44:30 UTC
Hey Bluestang,
The number of workunits we bundle is directly related to the number of streams we have in our model. More streams means a higher number of parameters to fit on the command line and also a longer integral time because of a more complex model. These choices are scientifically motivated, and sometimes we have to run those types of runs to get the science we need done. We try to compensate for the difference in computation time by changing the number of credits given for each. Unfortunately, our credit algorithm does not always give the perfect compensation for the differing computation times on all machines. Since we do not run these types of runs often, we have not worked on better refining our credit allocation for them.
Everyone,
I am about to try something to better micromanage GPU vs CPU workunits. Not sure if it will work. I will make a separate thread about this.
Jake

bluestang Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0	Message 68037 - Posted: 18 Jan 2019, 1:26:12 UTC - in response to Message 68033. Last modified: 18 Jan 2019, 1:56:54 UTC
Thank you for explaining more and trying to find a solution. Always nice to see project people listening and working to better the project and helping users/volunteers. Cheers!
EDIT: Although, I'm not quite sure about the 600 max limit for GPUs. What if someone has a fast multi-GPU setup? Just asking :)

DAF Joined: 7 Jan 16 Posts: 4 Credit: 16,016,591 RAC: 0	Message 68073 - Posted: 27 Jan 2019, 10:00:36 UTC
I'm not sure, but on my card completing a task on the GPU now takes more CPU time. Fermi is an old architecture, but it generally has good FP64 performance.

bluestang Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0	Message 68074 - Posted: 28 Jan 2019, 14:31:52 UTC - in response to Message 68073.
> I'm not sure, but on my card completing a task on the GPU now takes more CPU time. Fermi is an old architecture, but it generally has good FP64 performance.
OpenCL on Nvidia needs more CPU than on AMD, if I'm not mistaken. Nvidia isn't optimized for OpenCL, as they want people to use their proprietary CUDA instead.

Robby1959 Joined: 1 Feb 13 Posts: 48 Credit: 66,724,440 RAC: 0	Message 68141 - Posted: 12 Feb 2019, 2:55:14 UTC
Are these runs way longer? I went from running 2 GPU units at 3 minutes to running for hours and barely hitting the GPU.

mmonnin Joined: 2 Oct 16 Posts: 167 Credit: 1,008,062,758 RAC: 3,336	Message 68148 - Posted: 12 Feb 2019, 11:49:33 UTC - in response to Message 68141.
> Are these runs way longer? I went from running 2 GPU units at 3 minutes to running for hours and barely hitting the GPU.
That's a driver crash, not a task change. Restart your PC.