Users Auto-Aborting Work Units

Author	Message
Jake Weiss Volunteer moderator Project developer Project tester Project scientist Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0	Message 60126 - Posted: 7 Oct 2013, 18:22:58 UTC Richard, I am working on fixing the plan class so that it ignores GPUs that do not meet a minimum OpenCL requirement for the applications that need it. It was rarely a problem until MWSMF was released, which cannot run on CAL. Hopefully this will be resolved around the same time the small segfault in the MWSMF application is fixed. Jake W

mikey Joined: 8 May 09 Posts: 3321 Credit: 520,504,881 RAC: 26,711	Message 60128 - Posted: 8 Oct 2013, 12:13:47 UTC - in response to Message 60124. "@mikey: Just an idea, which I have not tested myself: have you tried to disable every application in the default settings? Then the server should not send anything on the first request and you could then assign the computer to the preferred venue before the 2nd request." No, I had not thought of that, but I will give it some thought. I have to go to the webpage anyway to assign it, so maybe it could work. Thanks!

GCGZpfuy3zLYVrtDTUhmoccc7Kx4pG... Joined: 4 Oct 11 Posts: 1 Credit: 1,397,192 RAC: 0	Message 60132 - Posted: 9 Oct 2013, 10:47:30 UTC In my case the (modified fit) WUs always abort themselves after 2 seconds of runtime. I tried to opt out of them, but could not find where. It's bad that these checkboxes are not shown unless you are "editing Settings". Now I have found it and disabled the modified fit. Using a 5800 APU.

Toby Broom Joined: 13 Jun 09 Posts: 24 Credit: 137,536,729 RAC: 0	Message 60453 - Posted: 25 Nov 2013, 1:19:37 UTC Thanks for the tips on the Titan config files. I got sick of my ATI card crashing my computer all the time!

Karl De Ruyck Joined: 2 Sep 12 Posts: 5 Credit: 16,610,474 RAC: 0	Message 60527 - Posted: 6 Dec 2013, 0:03:41 UTC Hi everyone, I hope this is appropriate to post here... I was previously manually aborting modfit work units because when I let them run, they would result in a computation error. After some discussion in another thread, it was determined that my C library was outdated. I am running Debian 7.2, which comes with eglibc 2.13, while the modfit units require 2.14. To solve the issue, I switched repos to jessie, updated libc6 & dependents, then switched repos back to wheezy. This allowed me to upgrade to eglibc 2.17, without breaking anything (yet). All my modfit work units are now completing successfully. :-)
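For anyone who wants to check their own machine before letting modfit units run, here is a minimal sketch of the version comparison described above. It assumes a glibc-style system where Python's `platform.libc_ver()` can detect the C library; the 2.14 minimum is taken from the post above, and the helper names `parse_version` and `libc_is_new_enough` are made up for illustration.

```python
import platform
import re

# Minimum C library version quoted in the post above (an assumption
# for any other application or release).
REQUIRED = (2, 14)


def parse_version(text):
    """Turn a dotted version string such as '2.17' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def libc_is_new_enough(version_text, required=REQUIRED):
    """Return True when version_text is at least the required version."""
    return parse_version(version_text) >= required


if __name__ == "__main__":
    # libc_ver() returns empty strings when it cannot identify the library.
    name, version = platform.libc_ver()
    if not version:
        print("Could not detect the C library version on this system.")
    elif libc_is_new_enough(version):
        print(f"{name} {version} meets the {REQUIRED[0]}.{REQUIRED[1]} minimum.")
    else:
        print(f"{name} {version} is older than {REQUIRED[0]}.{REQUIRED[1]}.")
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would rank "2.9" above "2.14".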

mikey Joined: 8 May 09 Posts: 3321 Credit: 520,504,881 RAC: 26,711	Message 60529 - Posted: 6 Dec 2013, 12:56:39 UTC - in response to Message 60527. "Hi everyone, I hope this is appropriate to post here... I was previously manually aborting modfit work units because when I let them run, they would result in a computation error. After some discussion in another thread, it was determined that my C library was outdated. I am running Debian 7.2, which comes with eglibc 2.13, while the modfit units require 2.14. To solve the issue, I switched repos to jessie, updated libc6 & dependents, then switched repos back to wheezy. This allowed me to upgrade to eglibc 2.17, without breaking anything (yet). All my modfit work units are now completing successfully. :-)" As a non-Linux user, I would think the project would recognize the missing files and provide them in a download package, making all that workaround stuff unnecessary. I am glad you are crunching the units successfully again though!!

[TA]Assimilator1 Joined: 22 Jan 11 Posts: 375 Credit: 64,657,871 RAC: 0	Message 60819 - Posted: 26 Jan 2014, 22:16:44 UTC Last modified: 26 Jan 2014, 22:17:04 UTC So is the MW team going to do anything about WUs on the MilkyWay@Home v1.02 (opencl_amd_ati) erroring out on Radeon 5800s & 6900s?? And it has affected at least one 7950 too. There's a thread about it here http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=3400 , but no answer yet :(. We've had to switch off that app so as not to spew out errored WUs. Maybe this is what GCGZpfuy3zLYVrtDTUhmoccc7Kx4pGM6AH was talking about? (daft name btw, oh & APU = audio processing unit ;) ). Team AnandTech - SETI@H, DPAD, F@H, MW@H, A@H, LHC, POGS, R@H, Einstein@H, DHEP, WCG Main rig - Ryzen 5 3600, MSI B450 G.Pro C. AC, RTX 3060Ti 8GB, 32GB DDR4 3200, Win 10 64bit 2nd rig - i7 4930k @4.1 GHz, HD 7870 XT 3GB(DS), 16GB DDR3 1866, Win7

[TA]Assimilator1 Joined: 22 Jan 11 Posts: 375 Credit: 64,657,871 RAC: 0	Message 60828 - Posted: 27 Jan 2014, 17:46:05 UTC Doh! Or Accelerated Processing Unit, but I bet you meant GPU.

Josiah - Images of Heaven Joined: 4 Jan 14 Posts: 3 Credit: 140,563 RAC: 0	Message 60951 - Posted: 4 Feb 2014, 0:04:50 UTC My issue is that I notice the Nbody jobs come in and take over all 8 of my processors, thereby suspending all my other BOINC projects. The only one that doesn't do that is the flagship milkyway@home. Therefore I aborted all of the 'vampire' workunits that suck up all 8 processors and then unchecked them. Sorry folks, but I'm not letting workunits take over all 8 processors.

mikey Joined: 8 May 09 Posts: 3321 Credit: 520,504,881 RAC: 26,711	Message 60953 - Posted: 4 Feb 2014, 13:05:07 UTC - in response to Message 60951. "My issue is that I notice the Nbody jobs come in and take over all 8 of my processors, thereby suspending all my other BOINC projects. The only one that doesn't do that is the flagship milkyway@home. Therefore I aborted all of the 'vampire' workunits that suck up all 8 processors and then unchecked them. Sorry folks, but I'm not letting workunits take over all 8 processors." Supposedly they are going to stop creating those units in the near future anyway, but they are no more intrusive than running 8 different units on your PC at the same time. AND they gave some insight into how to truly share your 8-core processor while crunching a single unit, true supercomputer-style computing. I stopped them a while back too.

Jacob Klein Joined: 22 Jun 11 Posts: 32 Credit: 41,852,496 RAC: 0	Message 60954 - Posted: 4 Feb 2014, 13:12:20 UTC - in response to Message 60953. Last modified: 4 Feb 2014, 13:15:42 UTC If I'm reading this correctly, you are referring to "MT" (multi-threaded) tasks in general, which use multiple virtual cores to get the task done, instead of working as an "ST" (single-threaded) task, which only uses 1 virtual core. The thing is... BOINC is set up to handle this just fine. It won't overcommit your system (unless it must due to high-priority tasks), it won't undercommit your system, and it properly records REC (recent estimated credit) such that your RS (resource share) percentages are honored across your projects. Sure, other projects can't work concurrently with the MT task, but BOINC is constantly keeping track of the work done, to ensure RS is honored before the MT task and afterward. There is nothing inherently wrong with MT tasks. They've just been designed to use multiple threads/cores to get the task done quicker. I'm not sure if it is set up this way, but... if MilkyWay had/has the MT tasks put into their own application, then "disabling" them would be as easy as editing the project preferences to disable that application. Though, I still don't see why you guys don't want to run MT tasks.

[TA]Assimilator1 Joined: 22 Jan 11 Posts: 375 Credit: 64,657,871 RAC: 0	Message 60975 - Posted: 5 Feb 2014, 18:39:33 UTC - in response to Message 60954. Probably because he wants to run more than 1 project at a time, I'd guess. Didn't know there were any DC projects that did true MT!

mikey Joined: 8 May 09 Posts: 3321 Credit: 520,504,881 RAC: 26,711	Message 60977 - Posted: 5 Feb 2014, 19:10:59 UTC - in response to Message 60975. "Probably because he wants to run more than 1 project at a time, I'd guess. Didn't know there were any DC projects that did true MT!" Collatz is doing it as of today, but I do not know if they are doing it the same way or not.

Arivald Ha'gel Joined: 30 Apr 14 Posts: 67 Credit: 160,674,488 RAC: 0	Message 61703 - Posted: 7 May 2014, 14:13:43 UTC Hello, Please look at this computer: http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=573990&offset=0&show_names=0&state=6&appid= Shouldn't he be "banned" for mass abort? Or at least banned from receiving GPU tasks?

Link Joined: 19 Jul 10 Posts: 592 Credit: 18,956,057 RAC: 5,189	Message 61704 - Posted: 7 May 2014, 17:20:38 UTC - in response to Message 61703. "Hello, Please look at this computer: http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=573990 Shouldn't he be 'banned' for mass abort? Or at least banned from receiving GPU tasks?" The quota system will do that in this case. PS: I made your link clickable.

Richard Haselgrove Joined: 4 Sep 12 Posts: 219 Credit: 456,474 RAC: 0	Message 61705 - Posted: 7 May 2014, 18:08:07 UTC - in response to Message 61703. "Hello, Please look at this computer: http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=573990&offset=0&show_names=0&state=6&appid= Shouldn't he be 'banned' for mass abort? Or at least banned from receiving GPU tasks?" Note that the error message (on every task I've looked at) is 201 (0xc9) EXIT_MISSING_COPROC. The card is an NVIDIA Quadro K1000M (2048MB) driver: 296.79, but OpenCL support isn't being reported by BOINC, though the card itself can run OpenCL 1.2. I think the question is more: why does the project keep allocating OpenCL tasks to it?
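To illustrate the kind of check being asked for here, below is a toy sketch of a scheduler-side filter that refuses OpenCL tasks to a host whose client reports no OpenCL support. It is not the project's plan class code: the `Host` type, the `can_send_opencl_task` function and the default minimum version are all hypothetical, and only the exit code value comes from the post above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Exit code quoted in the post above: 201 decimal is 0xc9 hexadecimal.
EXIT_MISSING_COPROC = 201


@dataclass
class Host:
    """Hypothetical summary of what a client reports about its GPU."""
    gpu_model: str
    # None means the client reported no OpenCL support at all,
    # regardless of what the hardware could do in principle.
    opencl_version: Optional[Tuple[int, int]]


def can_send_opencl_task(host, min_version=(1, 0)):
    """Return True only if the host reports OpenCL at or above min_version.

    min_version is an illustrative default, not a real project setting.
    """
    if host.opencl_version is None:
        return False
    return host.opencl_version >= min_version


if __name__ == "__main__":
    reported = Host("Quadro K1000M", opencl_version=None)
    print(can_send_opencl_task(reported))  # the host is skipped
```

The point of the sketch is that the decision rests on what the client reports, so a card that is capable of OpenCL 1.2 but reports nothing would simply stop receiving tasks it cannot start.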

mikey Joined: 8 May 09 Posts: 3321 Credit: 520,504,881 RAC: 26,711	Message 61707 - Posted: 8 May 2014, 10:36:35 UTC - in response to Message 61703. "Hello, Please look at this computer: http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=573990&offset=0&show_names=0&state=6&appid= Shouldn't he be 'banned' for mass abort? Or at least banned from receiving GPU tasks?" Sometimes the words projects use aren't totally accurate. In this case 'aborted by user' can also mean 'timed out', or 'the server already has a valid return and your unit is not needed', and I am sure there are others, such as 'no device found'. The point of my message is that the actual message is not always an accurate representation of what is going on, kind of like the blue screen in Windows: 'something is wrong'...no duh! But no actual clue as to what caused the problems, just a generic message. The BOINC programmers have been 'accused' in the past of learning to write their error messages from Microsoft: generic and meaning little. I THINK they are getting better though.

Arivald Ha'gel Joined: 30 Apr 14 Posts: 67 Credit: 160,674,488 RAC: 0	Message 61745 - Posted: 21 May 2014, 13:36:24 UTC Then I would suggest setting "Max tasks per day" to 100 at the beginning (or when it's reset due to being >100 & validate error). Right now I can see PCs wasting over 10 000 tasks...
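To put rough numbers on this suggestion, here is a toy model of a daily quota. It is only a sketch under stated assumptions: every error lowers the quota by one (never below 1), every valid result doubles it up to a cap, and the host is sent its full quota each day. The real server rules may differ, and the function names `next_quota` and `wasted_tasks` are made up for illustration.

```python
def next_quota(quota, valid, errors, cap):
    """Apply the assumed toy rule to one day's results.

    Each error lowers the quota by one (floor of 1); each valid
    result doubles it, never exceeding cap.
    """
    quota = max(1, quota - errors)
    for _ in range(valid):
        quota = min(cap, quota * 2)
    return quota


def wasted_tasks(initial_quota, days, cap):
    """Count tasks burned by a host that errors on every task it receives."""
    quota = initial_quota
    wasted = 0
    for _ in range(days):
        sent = quota  # assume the host is sent its full quota each day
        wasted += sent
        quota = next_quota(quota, valid=0, errors=sent, cap=cap)
    return wasted


if __name__ == "__main__":
    # Compare a starting quota of 100 with one of 10000 over a week.
    print(wasted_tasks(100, days=7, cap=100))
    print(wasted_tasks(10000, days=7, cap=10000))
```

Under these assumptions almost all of the waste happens on the first day, before the quota has had a chance to drop, which is why the starting value matters so much.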