Attempting to run CPU tasks but get the following. Not requesting tasks: don't need (CPU: ; NVIDIA GPU: job cache full)
Author | Message |
---|---|
Joined: 17 Feb 17 Posts: 21 Credit: 8,511,880 RAC: 0 |
Hello. For the past 2 years I've been pretty much exclusively with WCG, and I've decided to branch out a bit. I've got an interesting problem. Perhaps BOINC is designed to work this way and I've never realized it. I'm very new to running multiple projects and I'm still trying to figure out what I'd like my primary to be. I really like this project for a variety of reasons: it is one of the few I know of that will use multiple CPU cores per WU for some applications (I honestly forget the specific ones), the WUs are short and have decent deadlines, and there are a variety of different applications for just about everything out there. I currently have a PrimeGrid GPU task in progress that has a 10-day estimate. However, I have no other CPU tasks running or in the queue from any project. Does the work cache limit apply to both CPU and GPU combined? I would think that if the CPU has no pending work available I could still get tasks for it. I'm really not a fan of creating a 10-day cache. What am I doing wrong here? I don't want to abort this 10-day task, but I really don't want this machine's processor sitting idle for 10 days. That's a lot of wasted CPU. Granted, I'm sure there are GPUs a lot more powerful than mine that can handle this PrimeGrid task, but I'd just like to know if there is a setting I can change. I've tried suspending it with no luck. I should note that I have this machine set to the home profile, which has CPU and GPU crunching enabled. I did update after hitting apply and have tried updating since. My default profile only allows crunching on the GPU. Any help appreciated! |
Joined: 24 Jan 11 Posts: 715 Credit: 556,288,680 RAC: 55,007 |
Yes, your work cache covers both CPU and GPU work. BOINC determines whether there is room to schedule CPU or GPU work based on the total amount of estimated calculation time spread among all your projects. It is called REC, or Recent Estimated Credit, and that figure gets used along with GFLOPS for each device in the round-robin simulation when you request work. One way to see your total commitment for the CPU is to set work_fetch_debug in the Event Log logging options and then read through the Event Log after the work request. You don't want to leave it enabled for more than one work fetch cycle, though, because it generates a lot of output. A good option to set permanently is sched_op_debug. It doesn't add all that much to the Event Log, but it does show you exactly how many seconds of work you are requesting for both CPU and GPU, which totals up to your days of work cache size. Below is a snippet from my log as an example. My work cache setting is 0.5 days of cache and 0.01 days of additional work cache. Your additional days of work should be set very low to make MW@home request work every 91 seconds.
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | Sending scheduler request: To fetch work.
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | Reporting 15 completed tasks
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | [sched_op] CPU work request: 153076.75 seconds; 0.00 devices
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | [sched_op] NVIDIA GPU work request: 1702516.75 seconds; 0.00 devices
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | Scheduler request completed: got 15 new tasks
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | [sched_op] Server version 709
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | Project requested delay of 303 seconds
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1104 seconds
To see what kind of commitment you have among all your attached projects, you can set rr_simulation. I think that will show you are too overcommitted to the WCG CPU task. I believe that will cause issues, since even with a very small work cache set, just one WCG task in your work cache will swamp any other CPU work. The way to get the MW N-body mt (multi-thread) application to pull some work would be to suspend the WCG task. It may take a while for BOINC to "balance the books" and let you download an mt task. You might have to leave the WCG task suspended for a few days and hope that it doesn't go into High Priority mode once re-enabled. Try setting your work cache to a very small amount. That increases your chance of getting some MW mt tasks. |
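If the Event Log gets long, the [sched_op] work request lines can be pulled out with a short script. The following is only a minimal sketch: it assumes the log text has been copied out of the Event Log with the same line layout as the snippet quoted in the post, and the function name `work_requests` and the regular expression are illustrative, not part of BOINC.

```python
import re

# Sample lines copied from the Event Log snippet in the post.
LOG = """\
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | [sched_op] CPU work request: 153076.75 seconds; 0.00 devices
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | [sched_op] NVIDIA GPU work request: 1702516.75 seconds; 0.00 devices
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1104 seconds
"""

# Hypothetical pattern: matches "[sched_op] <device> work request: <n> seconds".
REQUEST = re.compile(
    r"\[sched_op\] (?P<device>.+?) work request: (?P<seconds>[\d.]+) seconds"
)


def work_requests(log_text):
    """Return {device: requested_seconds} for each [sched_op] work request line."""
    requests = {}
    for line in log_text.splitlines():
        match = REQUEST.search(line)
        if match:
            requests[match.group("device")] = float(match.group("seconds"))
    return requests


if __name__ == "__main__":
    for device, seconds in work_requests(LOG).items():
        # 86400 seconds per day; the figure is summed over all instances of the device.
        print(f"{device}: {seconds:.2f} s requested ({seconds / 86400:.2f} days of work)")
```

A CPU request of 0 seconds in this output would be consistent with the "don't need" message in the thread title, although the full work_fetch_debug output is what explains why.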
Joined: 17 Feb 17 Posts: 21 Credit: 8,511,880 RAC: 0 |
Thank you for that information. I've just set those flags now. I had 0.5 days minimum and an additional 1 day, so I've also changed this. MW seems to be hammering away at the GPU; however, the CPU is completely idle. And this is PrimeGrid, which is sharing the GPU along with MW and SETI. By the CPU being idle, I mean no other project is using it, because I have MW exclusively set to be the one that uses it along with the GPU. I'll wait and see what happens at this point. This is the machine I'm referencing. I forget if I posted that before. https://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=803610 |
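For anyone who would rather set the logging flags in a file than through the Event Log options dialog, the BOINC client also reads them from the `<log_flags>` section of cc_config.xml. The sketch below only builds the XML text; the helper name `build_cc_config` is made up for illustration, the location of cc_config.xml depends on the installation, and the client has to re-read its config files before the change takes effect.

```python
import xml.etree.ElementTree as ET

# Flags discussed in this thread; sched_op_debug is the one suggested for
# permanent use, the other two are noisy and are left off here.
FLAGS = {
    "sched_op_debug": True,
    "work_fetch_debug": False,
    "rr_simulation": False,
}


def build_cc_config(flags):
    """Return cc_config.xml text with a <log_flags> section (illustrative helper)."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name, enabled in flags.items():
        ET.SubElement(log_flags, name).text = "1" if enabled else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the text instead of writing it, so an existing cc_config.xml
    # is never overwritten by accident.
    print(build_cc_config(FLAGS))
```

If a cc_config.xml already exists, the flags should be merged into it by hand rather than replacing the whole file.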
Joined: 17 Feb 17 Posts: 21 Credit: 8,511,880 RAC: 0 |
Okay, this is a strange issue. I decided to enable CPU tasks in the PrimeGrid project settings. And what do you know, that machine is getting tasks like crazy on the CPU side of things. Should I try resetting MW@home? I have little experience doing this, so I'm not sure whether I should remove the project outright or reset it. I'm assuming I should make sure there are no tasks left before doing either one. Interesting and a little frustrating, but hopefully I can figure this out. |
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Resetting will wipe out every workunit from MW that you have on your PC. Are you sure you have the "allow CPU tasks" checkbox checked for MW? Were you running CPU WUs from other projects prior to allowing them from PG? What is your resource share set to? |
Joined: 17 Feb 17 Posts: 21 Credit: 8,511,880 RAC: 0 |
Yes, I do have MW set to receive CPU tasks. Right now I have the resource share set to 150. No, I was not receiving CPU tasks from any other projects. I was waiting on MW to receive them, since I had it exclusively set to be the one running on the CPU and GPU at the same time. |
©2024 Astroinformatics Group