Welcome to MilkyWay@home

Attempting to run CPU tasks but get the following. Not requesting tasks: don't need (CPU: ; NVIDIA GPU: job cache full)


wolfman1360

Joined: 17 Feb 17
Posts: 18
Credit: 968,286
RAC: 26
Message 68637 - Posted: 2 May 2019, 2:37:33 UTC
Last modified: 2 May 2019, 2:40:43 UTC

Hello,
For the past 2 years I've been pretty much exclusively with WCG. I've decided to branch out a bit.

I've got an interesting problem. Perhaps boinc is designed to work this way and I've never realized it.

I'm very new to running multiple projects and I'm still trying to figure out what I'd like my primary to be. I really like this project for a variety of reasons - I love that it is one of the few that I know of that will use multiple CPU cores per WU for some applications (I honestly forget the specific ones), the WUs are short and have decent deadlines, and there are a variety of different applications for just about everything out there.
I currently have a PrimeGrid GPU task in progress that has a 10 day estimate. However, I have no other CPU tasks running or in the queue from any project.

Does the work cache limit apply to both CPU and GPU combined? I would think that if the CPU has no pending work available I could still get tasks for it? I'm really not a fan of creating a 10 day cache.

What am I doing wrong here? I don't want to abort this 10 day task, but I really don't want the processor on this machine sitting idle for 10 days. That's a lot of wasted CPU. Granted, I'm sure there's a GPU a lot more powerful than mine that can handle this PrimeGrid task, but I'd just like to know if there is a setting I can change. I've tried suspending it with no luck.

I should note that I have this machine set to the home profile which has CPU and GPU crunching enabled. I did update after hitting apply and have tried updating since. My default profile only allows crunching on the GPU.

Any help appreciated!
Keith Myers
Joined: 24 Jan 11
Posts: 219
Credit: 108,329,645
RAC: 16,621
Message 68639 - Posted: 2 May 2019, 7:57:27 UTC - in response to Message 68637.  

Yes, your work cache covers both CPU and GPU work. BOINC determines whether there is room to schedule CPU or GPU work based on the total amount of estimated calculation time spread among all your projects. The figure is called REC (Recent Estimated Credit), and it is used along with the GFLOPS of each device in the round-robin simulation when you request work.

One way to see your total commitment for the CPU is to set work_fetch_debug in the Event Log logging options and then read through the Event Log after the work request. You don't want to leave it enabled for more than one work fetch cycle, though, because it generates a lot of output.

A good option to set permanently is sched_op_debug. It doesn't add much to the Event Log, but it shows exactly how many seconds of work you are requesting for both CPU and GPU, which totals up to your work cache size in days.

My work cache settings are 0.5 days of cache and 0.01 days of additional work cache. Your additional days of work should be set very low to make MW@home request work every 91 seconds. This is a snippet out of my Event Log as an example:

Thu 02 May 2019 12:36:36 AM PDT | SETI@home | Sending scheduler request: To fetch work.
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | Reporting 15 completed tasks
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | [sched_op] CPU work request: 153076.75 seconds; 0.00 devices
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | [sched_op] NVIDIA GPU work request: 1702516.75 seconds; 0.00 devices
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | Scheduler request completed: got 15 new tasks
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | [sched_op] Server version 709
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | Project requested delay of 303 seconds
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
Thu 02 May 2019 12:36:47 AM PDT | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1104 seconds
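If you want to pull the requested seconds out of an Event Log like the one above, a small script can do it. This is only a sketch: the regular expression is written against the line format shown in this snippet, and the function name parse_work_requests is made up for illustration. Other BOINC versions may word these lines differently.

```python
import re

# Matches lines such as:
#   "... | [sched_op] CPU work request: 153076.75 seconds; 0.00 devices"
# The format is assumed from the Event Log snippet in this thread.
REQUEST_RE = re.compile(
    r"\[sched_op\] (?P<resource>.+?) work request: "
    r"(?P<seconds>[\d.]+) seconds; (?P<devices>[\d.]+) devices"
)


def parse_work_requests(log_text):
    """Return {resource name: seconds requested} for each request line."""
    requests = {}
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            requests[match.group("resource")] = float(match.group("seconds"))
    return requests


# Two lines copied from the snippet above.
SAMPLE = """\
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | [sched_op] CPU work request: 153076.75 seconds; 0.00 devices
Thu 02 May 2019 12:36:36 AM PDT | SETI@home | [sched_op] NVIDIA GPU work request: 1702516.75 seconds; 0.00 devices
"""

if __name__ == "__main__":
    for resource, seconds in parse_work_requests(SAMPLE).items():
        # 86400 seconds per day, so this shows the request in days as well.
        print(f"{resource}: {seconds:.2f} s ({seconds / 86400:.2f} days)")
```

If the CPU entry never shows up, or shows 0 seconds, the client is not asking that project for CPU work at all, which is a different problem from the server having none to send.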

To see what kind of commitment you have across all your attached projects, you can set rr_simulation. I think it will show that you are overcommitted to the WCG CPU task. I believe that will cause issues: even with a very small work cache set, just one WCG task in your work cache will swamp any other CPU work.

The way to get the MW N-body mt (multi-threaded) application to pull some work would be to suspend the WCG task. It may take a while for BOINC to "balance the books" and let you download an mt task. You might have to leave the WCG task suspended for a few days and hope that it doesn't go into High Priority mode once re-enabled. Also try setting your work cache to a very small amount; that increases your chance of getting some MW mt tasks.
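The logging flags mentioned in this post can also be set in the client's cc_config.xml instead of through the Event Log options dialog. Below is a minimal sketch that builds such a file as a string. The helper name build_cc_config is hypothetical, and where the file lives depends on your install (the BOINC data directory), so the script only prints the XML rather than writing it anywhere.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Build cc_config.xml text with a <log_flags> section.

    `flags` maps a log flag name (e.g. "sched_op_debug") to True/False.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name, enabled in flags.items():
        ET.SubElement(log_flags, name).text = "1" if enabled else "0"
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # sched_op_debug stays on permanently; the two noisy flags are meant
    # to be switched on for a single work fetch cycle and then off again.
    print(build_cc_config({
        "sched_op_debug": True,
        "work_fetch_debug": True,
        "rr_simulation": True,
    }))
```

After saving the output as cc_config.xml in the BOINC data directory, the client should pick it up once you tell it to re-read its config files (from the Manager, or with boinccmd --read_cc_config). Set work_fetch_debug and rr_simulation back to 0 after one fetch cycle to keep the log readable.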
wolfman1360

Message 68653 - Posted: 2 May 2019, 20:54:00 UTC - in response to Message 68639.  

Thank you for that information. I've just set those flags now.
I had the cache set to a 0.5 day minimum plus an additional 1 day, so I've changed that as well.
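For a rough sense of what those cache settings mean in seconds, here is a small sketch. It is a deliberately simplified per-resource model and not BOINC's actual work-fetch algorithm: it assumes each resource (CPU, GPU) is compared on its own against the combined buffer, and the function names are invented for the example.

```python
SECONDS_PER_DAY = 86400


def buffer_seconds(min_days, extra_days):
    """Total work buffer in seconds for the two cache settings."""
    return (min_days + extra_days) * SECONDS_PER_DAY


def looks_full(queued_seconds, min_days, extra_days):
    """Simplified check: is the queued work for one resource already
    at least as large as the buffer? (Assumption, not BOINC's real logic.)"""
    return queued_seconds >= buffer_seconds(min_days, extra_days)


if __name__ == "__main__":
    gpu_queued = 10 * SECONDS_PER_DAY  # the 10 day PrimeGrid GPU estimate
    cpu_queued = 0                     # no CPU tasks in the queue

    for min_days, extra_days in [(0.5, 1.0), (0.5, 0.01)]:
        print(
            f"cache {min_days} + {extra_days} days: "
            f"buffer = {buffer_seconds(min_days, extra_days):.0f} s, "
            f"GPU full: {looks_full(gpu_queued, min_days, extra_days)}, "
            f"CPU full: {looks_full(cpu_queued, min_days, extra_days)}"
        )
```

Under this model a single 10 day GPU task fills the GPU side with either setting, which would fit the "NVIDIA GPU: job cache full" part of the message, while the CPU side stays empty and so should still be eligible for work.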

MW seems to be hammering away at the GPU; however, the CPU is completely idle. And this is with PrimeGrid sharing the GPU along with MW and SETI.

By the cpu being idle, I mean no other project is using it because I have MW exclusively set to be the one that uses it along with the gpu.

I'll wait and see what happens at this point.

This is the machine I'm referencing. I forget if I posted that before. https://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=803610
wolfman1360

Message 68681 - Posted: 4 May 2019, 4:39:26 UTC

Okay, this is a strange issue.

I decided to enable CPU tasks in the PrimeGrid project settings. And what do you know, that machine is getting tasks like crazy on the CPU side of things.

Should I try resetting MW@home? I have little experience in doing this, so I'm not sure whether I should remove the project outright or just reset it. I'm assuming I should make sure there are no tasks left before doing either one.

Interesting and a little frustrating, but hopefully I can figure this out.
mikey
Joined: 8 May 09
Posts: 2236
Credit: 259,703,998
RAC: 44,281
Message 68687 - Posted: 4 May 2019, 11:00:49 UTC - in response to Message 68681.  

Resetting will wipe out every MW workunit you have on your PC.

Are you sure you have the allow CPU tasks checkbox checked for MW? Were you running CPU WUs from other projects prior to allowing them from PG? What is your resource share set to?
wolfman1360

Message 68695 - Posted: 5 May 2019, 2:59:26 UTC - in response to Message 68687.  

Yes, I do have MW set to receive CPU tasks. Right now I have the resource share set to 150. No, I was not receiving CPU tasks from any other projects. I was waiting on MW to send them, since I had it set as the only project running on the CPU and GPU at the same time.