Welcome to MilkyWay@home

MilkyWay hogging processor time.

julianop
Joined: 12 Oct 11
Posts: 7
Credit: 22,232,890
RAC: 2,593
Message 69549 - Posted: 19 Feb 2020, 2:55:03 UTC

I have World Community Grid tasks that are passing their due date, yet MilkyWay is hogging processor time when its tasks aren't due for another five days. There hasn't been a task switch for twenty-four hours (the switch interval is set to 60 minutes).
What's going on?

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 69550 - Posted: 19 Feb 2020, 3:12:39 UTC - in response to Message 69549.  
Last modified: 19 Feb 2020, 3:22:36 UTC

I'm not positive, but I suspect MilkyWay requires OpenCL 1.2 and your GTX 560 Ti supports only 1.1. That might account for all the errors on that system: every work unit errored out.

The other system seems OK. I'm not an expert, but I suspect that CPU tasks yield every 60 minutes only to other CPU tasks, and likewise GPU tasks only to other GPU tasks. A MilkyWay GPU task will not give up time to a WCG CPU task, or vice versa, because they do not share the same resource. That's my opinion, worth 2c.

How many concurrent tasks are you running in the i7 system with the Quadro?

I suspect you should be able to run 4 concurrent tasks easily on the Quadro, and each task can be allocated 0.25 of a CPU. I'm pretty sure MilkyWay can get by with even 0.20 of a CPU. That should allow your other cores to easily run 4-5 WCG tasks or more.
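A minimal sketch of how the "4 concurrent tasks at 0.25 CPU each" suggestion above could be written down, assuming the usual BOINC app_config.xml layout (app, gpu_versions, gpu_usage, cpu_usage). The app name "milkyway" and the helper build_app_config are assumptions for illustration only; check the real app name in your client's event log or client_state.xml before using anything like this.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float) -> str:
    """Return app_config.xml text that runs `tasks_per_gpu` tasks on one GPU."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name  # assumed app name, verify locally
    gpu = ET.SubElement(app, "gpu_versions")
    # Each task claims 1/N of a GPU, so N tasks fit on one GPU at the same time.
    ET.SubElement(gpu, "gpu_usage").text = f"{1 / tasks_per_gpu:g}"
    # Fraction of a CPU core budgeted for each GPU task.
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_per_task:g}"
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


xml_text = build_app_config("milkyway", tasks_per_gpu=4, cpu_per_task=0.25)
print(xml_text)
```

As far as I know, the resulting file goes in the MilkyWay project folder inside the BOINC data directory and is picked up after re-reading the config files or restarting the client. Given the 0.92 CPUs reported later in the thread, the 0.25 figure may well be too optimistic.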

Maybe a project guru here can tell if OpenCL 1.1 works or not.

I keep WCG set to a 0.0 share because otherwise they download so much work that it is not possible to complete it. On my dual Xeon with 24 threads I used to have 2-3 weeks of work and no possible way to finish. I suggest you go to the project website, find the "% share resource" setting, and set it to 0.0 for all their apps.

julianop
Joined: 12 Oct 11
Posts: 7
Credit: 22,232,890
RAC: 2,593
Message 69552 - Posted: 19 Feb 2020, 4:39:03 UTC - in response to Message 69550.  

Thanks for your reply, Joseph. Umm.... I'm going to need a bit of clarification, if you'll bear with me...
First, there are no errors listed at all. Milkyway is plowing ahead with great gusto, completing "0.92CPUs + 1 NVIDIA GPU" tasks in 11 minutes 30 secs.

At this moment, there is a 6 CPU task running, and the aforementioned GPU task - both from Milkyway.

The WCG task uses regular processor cores, so I see no reason why the regular core tasks from MilkyWay should only swap out to each other every hour; otherwise, how would WCG ever get to run?

I'm confused about setting a "0.0" share. How does that result in any activity?

I've been running BOINC for years, but this whole thing makes no sense to me.

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 69553 - Posted: 19 Feb 2020, 8:09:42 UTC - in response to Message 69552.  
Last modified: 19 Feb 2020, 8:13:15 UTC

Your systems are here:
https://milkyway.cs.rpi.edu/milkyway/hosts_user.php?userid=182458
The one with the 560 shows all tasks errored, all 14.

If you set the resource share to 0.0, you only get 1 work unit for each CPU or thread. Additional work units are downloaded only when one uploads.

The resource share must be set at the project, not in BOINC itself.

The 0.92 is a lot higher than I guessed, and I suspect running concurrent tasks would not be useful.

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,349,483
RAC: 22,133
Message 69554 - Posted: 19 Feb 2020, 10:38:03 UTC - in response to Message 69553.  

That 0.0 resource share only works with your version of the BOINC software. For everyone else running GPU tasks here it doesn't work very well, as there is a 10-minute backoff before getting new GPU tasks. I do not believe CPU tasks are affected.

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,349,483
RAC: 22,133
Message 69555 - Posted: 19 Feb 2020, 10:51:48 UTC - in response to Message 69549.  

The easy answer is to set MilkyWay to "No new tasks" and then suspend it until the WCG tasks finish. If you can't finish them in time, though, you might as well abort them, as I don't believe they give any extra time at all.

I think the problem is in how BOINC sees the resource share. It's something to aim for as a daily balance: if you set WCG at 50% and MW at 35%, that's a total of 85% and BOINC will struggle to figure things out, BUT it will try to give you a daily RAC of 50% at WCG and 35% at MW. It sounds like you have MW set at a higher resource share than WCG, or an equal one. Since the WCG units are much longer and you get fewer credits per MW workunit, BOINC has to crunch LOTS more MW workunits to reach the same daily RAC, and with a large workunit cache it's struggling to keep up.

If you have cable internet, I suggest you drop your cache settings to 0.5 in the 1st box and 0.5 in the 2nd box, inside the BOINC Manager or on the website. That will give you slightly more than a 1 day cache; it usually works out to 1.5 to 1.75 days of work depending on the project. That should give BOINC plenty of time to do its thing and swap as it wants to without your workunits being returned late. If you don't have cable internet then you will need different cache settings, as it will take longer to connect.
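A small worked example of the 50 / 35 case above. As far as I know, BOINC treats resource share as a relative weight (a project's share divided by the sum of all shares), so the numbers do not have to add up to 100. The function name normalized_shares is hypothetical, and this is only the arithmetic, not BOINC's scheduling code.

```python
def normalized_shares(shares: dict) -> dict:
    """Turn raw resource-share numbers into fractions of the total."""
    total = sum(shares.values())
    if total == 0:
        # Every project at zero share: no long-term target to split.
        return {name: 0.0 for name in shares}
    return {name: value / total for name, value in shares.items()}


fractions = normalized_shares({"WCG": 50, "MilkyWay": 35})
for name, fraction in fractions.items():
    print(f"{name}: {fraction * 100:.1f}% of the long-term target")
```

Under that assumption, 50 and 35 would behave the same as roughly 58.8% and 41.2%. It is a long-term target rather than a per-hour split, which fits the advice above to keep the cache small so BOINC has room to rebalance.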

Nick Name
Joined: 27 Jul 14
Posts: 23
Credit: 921,261,826
RAC: 0
Message 69557 - Posted: 19 Feb 2020, 18:02:33 UTC - in response to Message 69549.  

None of this - resource share etc. - should matter once work is in the queue and nearing the deadline. If there are tasks that are close to the deadline BOINC should recognize that and start running them. In severe cases of deadline pressure it should also stop running GPU work to free up a thread for these tasks. I can only think of a few reasons why that isn't happening.

1) Project (WCG in this case) is suspended or its tasks are suspended.
2) The number of tasks allowed to run is limited in some way, for example by an app_config using the project_max_concurrent tag. In such a case BOINC will fill the remaining threads it's allowed to use with work from other projects.
3) I've never run the N-body or any other CPU work here. Maybe those jobs are hung in some way and BOINC is unable to finish them. I view this as the least likely cause as BOINC should still be able to pause them and switch to higher priority work.
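To illustrate the behaviour I'd expect under deadline pressure, here is a deliberately simplified sketch: tasks that look likely to miss their deadline jump the queue, earliest deadline first. This is not BOINC's actual scheduler; the Task fields, the pick_tasks function and the 0.9 safety margin are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    hours_remaining: float    # estimated run time still needed
    hours_to_deadline: float  # wall-clock time left; negative if already late


def pick_tasks(tasks, cpu_slots, safety_margin=0.9):
    """Choose which tasks to run: at-risk tasks first, earliest deadline first."""
    def at_risk(task):
        # A task is at risk if it needs most (or more) of the time it has left.
        return task.hours_remaining > safety_margin * task.hours_to_deadline

    urgent = sorted((t for t in tasks if at_risk(t)),
                    key=lambda t: t.hours_to_deadline)
    relaxed = [t for t in tasks if not at_risk(t)]
    return (urgent + relaxed)[:cpu_slots]


tasks = [
    Task("milkyway_nbody", hours_remaining=2.0, hours_to_deadline=120.0),
    Task("wcg_task", hours_remaining=6.0, hours_to_deadline=4.0),
]
running = pick_tasks(tasks, cpu_slots=1)
print([t.name for t in running])
```

With one free slot, the WCG task is the one that should run in this model. If the real client keeps running MilkyWay instead, that points toward one of the three causes listed above rather than toward resource share.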

If none of these is causing the problem, I'd suggest heading over to the general BOINC forums and posting your question there.

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,349,483
RAC: 22,133
Message 69558 - Posted: 19 Feb 2020, 23:32:41 UTC - in response to Message 69557.  

You are welcome to post your thoughts to the Boinc Alpha email list where all the software developers communicate with each other and the alpha testers.

©2024 Astroinformatics Group