Micromanaging CPU vs GPU Workunit Limits


Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist

Message 68034 - Posted: 17 Jan 2019, 21:56:52 UTC

Hey Everyone,

I will be tweaking some config options on the server to improve database stability and to let GPU users stockpile more workunits.

The new workunit limits should be as follows:
Separation:
600 total
200 GPU (Per GPU up to 600)
40 CPU (Per CPU up to 600)
Nbody:
120 total
20 CPU (Per CPU up to 120)
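One way to read these numbers: each limit is a per-device allowance that scales with the number of GPUs or CPUs, up to the total for that application. The Python sketch below illustrates that reading only. The function name `effective_limit` is hypothetical, and treating "up to" as a hard cap is an assumption rather than something confirmed by the server configuration.

```python
def effective_limit(per_device: int, n_devices: int, total_cap: int) -> int:
    """Workunits a host may hold, assuming per-device limits add up to a cap."""
    return min(per_device * n_devices, total_cap)


# Separation on GPU: 200 per GPU, capped at 600 in total (figures from the post).
for gpus in (1, 2, 3, 4):
    print(f"{gpus} GPU(s): {effective_limit(200, gpus, 600)} workunits")

# Nbody on CPU: 20 per CPU, capped at 120 in total (figures from the post).
for cpus in (4, 6, 8):
    print(f"{cpus} CPU(s): {effective_limit(20, cpus, 120)} workunits")
```

Under this reading, a host reaches the Separation total with three GPUs, and adding a fourth GPU does not raise its limit any further.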


Hopefully you guys notice this on your end. If we notice we are running out of workunits more frequently on the server, I will increase the workunit cache a bit.

Let me know what you all think about these numbers and how it is working for you.

Jake

Jake Weiss
Message 68035 - Posted: 17 Jan 2019, 22:13:42 UTC
Last modified: 17 Jan 2019, 23:05:24 UTC

Hey Everyone,

So it is looking like the server is struggling to make enough workunits for everyone. I am going to try increasing the workunit cache size to 10000 temporarily to hopefully help.

Jake

Looks like we are holding stable with this number in reserve. I'm going to leave it at this overnight and hopefully it stays good. I'll check here periodically in case there are any issues.

bluestang
Message 68038 - Posted: 18 Jan 2019, 1:28:49 UTC

Thanks for the hard work!

Link
Message 68040 - Posted: 18 Jan 2019, 10:38:33 UTC
Last modified: 18 Jan 2019, 10:40:38 UTC

Regarding n-body WUs: is there any hope that we might get the option to choose whether we want multi-core or single-core tasks? Right now we have to use anonymous platform for this, but that means no automatic updates for the application.

Jake Weiss
Message 68043 - Posted: 18 Jan 2019, 15:30:29 UTC

Hi Link,

Can you explain a little more about what you want? Is there a reason you would prefer to run say 5 or 6 single-core runs instead of 1 multi-core run on 6 cores?

You can already limit the number of cores you crunch on through the BOINC manager so I am a little unsure what you are asking for, sorry.

Jake

Jim1348
Message 68044 - Posted: 18 Jan 2019, 17:42:02 UTC - in response to Message 68043.  
Last modified: 18 Jan 2019, 17:47:29 UTC

Can you explain a little more about what you want? Is there a reason you would prefer to run say 5 or 6 single-core runs instead of 1 multi-core run on 6 cores?

Let me try. Single core work units are OK, and multi-core (for me) are OK. But mixing them is a mess for the BOINC scheduler. You are left with some odds and ends that don't fit, and have to be finished eventually, usually in panic mode.

I can do one or the other, but not both.

EDIT: And the problem is not panic mode per se, but you are inevitably left running some cores empty. It is not an efficient use of resources.

Link
Message 68045 - Posted: 18 Jan 2019, 18:46:25 UTC - in response to Message 68043.  
Last modified: 18 Jan 2019, 18:50:24 UTC

Can you explain a little more about what you want? Is there a reason you would prefer to run say 5 or 6 single-core runs instead of 1 multi-core run on 6 cores?

No, I'd like to run just one multicore nbody WU on all 4 cores of my i3 and no single core WUs at all.

With BOINC v6, after starting a single core WU, the client wouldn't start another single core WU unless it was the next one in the queue. It let the other cores idle, since there were not enough cores left for the multi-core WU, which wants all of them.

BOINC v7 starts the multi core WU together with the single core one, but with 4 threads (or, for example, 2 threads if I allow it to use only 50% of the CPUs). So it actually uses more cores than it's allowed to: it runs the multi core task as if no single core task were running. Not a real issue in my case, but I can imagine this being an issue for people who need to keep some CPU cores free to feed their GPUs.
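The oversubscription described here can be put into numbers. The Python sketch below models only the behaviour reported in this post (the mt task is started with as many threads as BOINC is allowed to use, regardless of running single core tasks). It is not taken from the BOINC source, and the function names are hypothetical.

```python
def threads_in_use(allowed_cores: int, single_core_tasks: int) -> int:
    """Busy threads when the mt task ignores running single core tasks."""
    mt_threads = allowed_cores  # reported behaviour: mt task takes every allowed core
    return mt_threads + single_core_tasks


def oversubscribed_by(allowed_cores: int, single_core_tasks: int) -> int:
    """How many threads exceed what the client was allowed to use."""
    return threads_in_use(allowed_cores, single_core_tasks) - allowed_cores


# 4-core i3 with all cores allowed, one single core WU already running.
print(threads_in_use(4, 1), "threads on 4 allowed cores")
# Same machine limited to 50% of CPUs (2 cores).
print(threads_in_use(2, 1), "threads on 2 allowed cores")
```

In this model the excess always equals the number of single core tasks running alongside the mt task, which is why it matters most to people who reserve cores for GPU work.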

If you need more details, just ask. I tried getting only multicore WUs with the help of app_config.xml, but it didn't work.

EDIT: like Jim1348 said, nothing wrong with any of the WU types, just mixing them isn't good.

Jake Weiss
Message 68046 - Posted: 18 Jan 2019, 19:50:43 UTC

I am thinking, why do we even have a single threaded application? How would everyone feel about having only the multithreaded one, and if you want to run a WU single threaded you just choose to have 1 CPU used? Would that work?

Jake

Jim1348
Message 68047 - Posted: 18 Jan 2019, 20:29:42 UTC - in response to Message 68046.  

Would that work?

Yes, I was beginning to wonder why bother with single-threaded also.

Link
Message 68048 - Posted: 19 Jan 2019, 10:03:09 UTC - in response to Message 68046.  
Last modified: 19 Jan 2019, 10:08:56 UTC

Would that work?

Yes, at least with BOINC v7 clients (not sure starting from which version exactly); older clients will let the application use all cores. At least that's what I have seen with v6.10.18 and v6.12.34. The BOINC client I use now, v7.12.1, starts the mt application with as many threads as it is allowed to use; it will only use too many cores if some single core tasks are running.

I'm not sure if the server will send the multi core application to computers that are allowed to use just one core, but probably yes, since it was still sending single core WUs to my laptop even though, according to my app_config, they were using more cores than BOINC was allowed to use (this information is passed to the server in sched_request_milkyway.cs.rpi.edu_milkyway.xml). And the BOINC client also didn't care that the WU used 8 CPUs even though I allowed it to use only 2.

So yes, I think that will work, or at least not be worse than now... I mean, if some problems occur, you can always return to two different applications.

mikey
Message 68050 - Posted: 21 Jan 2019, 1:03:37 UTC - in response to Message 68046.  

I am thinking, why do we even have a single threaded application? How would everyone feel about having only the multithreaded one, and if you want to run a WU single threaded you just choose to have 1 CPU used? Would that work?

Jake


If you do I think a lot of people will leave MW. I personally hate running multi-core wu's as they never seem to work well with other projects when you guys, for example, run out of work. I would much rather run 6 workunits on 6 different cpu cores than 1 wu on 6 cpu cores at once. The other problem I have is that the wu's don't work right on at least some people's computers. I know you keep tweaking the program but it just doesn't work right on every pc, BUT the single core wu's work just fine.

Boinc also seems to have problems running multi-core wu's in general...I do run multi-core wu's on another project, running them as a single core wu takes 12 days or more, but since you can't set it for 3.5 cores for each wu I'm stuck with using 3 cores for each wu and then having 2 cores free, on an 8 cpu core machine, when I also run a wu on the gpu. Boinc itself won't fill the remaining cores with wu's because the setting is 3 cores per wu! That's a very inefficient way to run a pc!! I try to always leave one cpu core free for the gpu to use, leaving me 7 cpu cores to crunch cpu wu's, on that 8 core cpu machine. On a 6 cpu core machine it leaves me 5 cpu cores to crunch with again making it a problem. And yes I could set it for 7 or 5 cpu cores per wu but then we are back to the problem of your multi-threaded software not working for everyone.
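The core-packing arithmetic in this post can be sketched in a few lines of Python. The function name `pack_tasks` is hypothetical, and the model is deliberately simple: whole tasks of a fixed width are packed into the cores left over after reserving some for GPU support.

```python
def pack_tasks(total_cores: int, reserved_cores: int, cores_per_task: int):
    """Return (tasks that fit, cores left idle) for fixed-width tasks."""
    usable = total_cores - reserved_cores
    tasks = usable // cores_per_task
    idle = usable % cores_per_task
    return tasks, idle


# 8-core machine, 1 core kept free for the GPU, 3 cores per workunit.
tasks, idle = pack_tasks(8, 1, 3)
print(f"8 cores: {tasks} tasks, {idle} idle core(s) plus 1 reserved")

# 6-core machine with the same settings.
tasks, idle = pack_tasks(6, 1, 3)
print(f"6 cores: {tasks} tasks, {idle} idle core(s) plus 1 reserved")
```

On the 8-core machine this gives two 3-core tasks with one core idle in addition to the reserved one, which matches the "2 cores free" in the post.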

Jim1348
Message 68051 - Posted: 21 Jan 2019, 1:59:06 UTC - in response to Message 68050.  

I would much rather run 6 workunits on 6 different cpu cores than 1 wu on 6 cpu cores at once.

Then you could just set the MW to run on a single core. That is what Jake was saying. With a multi-threaded application, you can set it to run on as many cores as you want, including one.

They would probably include an easy to set option in the preferences page in that case, such as LHC and Cosmology (usually called "Max # CPUs").

Link
Message 68052 - Posted: 21 Jan 2019, 16:08:02 UTC - in response to Message 68050.  

Boinc itself won't fill the remaining cores with wu's because the setting is 3 cores per wu! That's a very inefficient way to run a pc!!

As described above, the current mix of single core and multi core WUs is very inefficient as well; in some cases you may end up with just one core in use.


If you do I think a lot of people will leave MW, I personally hate running multi-core wu's as they never seem to work well with other projects when you guys for example run out of work.

Well, the current situation is even more of a reason to search for some other work for the CPU; I mean, right now milkyway nbody does not even work well with itself. But my initial idea was to give users the possibility to choose the application, since I know the complaints from the number crunching forum about the mt application.

mikey
Message 68053 - Posted: 22 Jan 2019, 13:14:47 UTC - in response to Message 68051.  

I would much rather run 6 workunits on 6 different cpu cores than 1 wu on 6 cpu cores at once.

Then you could just set the MW to run on a single core. That is what Jake was saying. With a multi-threaded application, you can set it to run on as many cores as you want, including one.

They would probably include an easy to set option in the preferences page in that case, such as LHC and Cosmology (usually called "Max # CPUs").


I have 17 desktops that crunch for me, most have different cpu's in them so unless MW is going to be like PrimeGrid and add a whole bunch of venues I can't do that for my pc's. Setting the number of cpu's to 5 works fine for a 6 core pc but won't work for an 8 core, a 16 core or even a 24 core pc! I would very quickly run out of venues and then if I were to crunch Einstein on the gpu in a pc since they automatically reserve a cpu core just for the gpu to use no MW units would run at all. AND as I said before the MT wu's all crash for me, no idea why they just do.

It's not my project and I'm just crunching here, so they can do what they want to do, but if they go to only MT wu's I won't be crunching cpu wu's here any more.

Jim1348
Message 68054 - Posted: 22 Jan 2019, 18:15:12 UTC - in response to Message 68053.  

I have 17 desktops that crunch for me, most have different cpu's in them so unless MW is going to be like PrimeGrid and add a whole bunch of venues I can't do that for my pc's. Setting the number of cpu's to 5 works fine for a 6 core pc but won't work for an 8 core, a 16 core or even a 24 core pc! I would very quickly run out of venues and then if I were to crunch Einstein on the gpu in a pc since they automatically reserve a cpu core just for the gpu to use no MW units would run at all. AND as I said before the MT wu's all crash for me, no idea why they just do.

You can always do it the hard way: use an app_config.xml file in the project folder. You can then set it to use only one core per work unit.

<app_config>
  <app>
    <name>milkyway_nbody</name>
    <max_concurrent>8</max_concurrent>
  </app>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>1</avg_ncpus>
  </app_version>
</app_config>


Then, set "<max_concurrent>" if you want to limit the number of work units running at once, and set "<avg_ncpus>" to the number of cores to use per work unit.
Hopefully, that will prevent the crashing (I have never seen it though, and don't know what might be causing it).
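For someone with many machines of different core counts, the same file could be generated per host instead of edited by hand. The Python sketch below is only an illustration of that idea: the element names come from the file above, while the function name `build_app_config`, the default of one reserved core, and the choice of `max_concurrent` as usable cores divided by cores per work unit are assumptions. Whether the n-body application honors these values on a given host is exactly what is disputed in this thread.

```python
import os
import xml.etree.ElementTree as ET


def build_app_config(total_cores: int, reserved_cores: int = 1,
                     cores_per_wu: int = 1) -> str:
    """Build an app_config.xml string sized to one host (illustrative only)."""
    usable = max(total_cores - reserved_cores, 1)   # always leave at least 1 core
    cores_per_wu = min(cores_per_wu, usable)        # a WU cannot exceed usable cores

    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = "milkyway_nbody"
    ET.SubElement(app, "max_concurrent").text = str(usable // cores_per_wu)

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = "milkyway_nbody"
    ET.SubElement(version, "plan_class").text = "mt"
    ET.SubElement(version, "avg_ncpus").text = str(cores_per_wu)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs and may return None.
    print(build_app_config(os.cpu_count() or 1))
```

The output would still have to be saved as app_config.xml in the project folder on each machine, as described above.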

mikey
Message 68055 - Posted: 23 Jan 2019, 0:39:42 UTC - in response to Message 68054.  

I have 17 desktops that crunch for me, most have different cpu's in them so unless MW is going to be like PrimeGrid and add a whole bunch of venues I can't do that for my pc's. Setting the number of cpu's to 5 works fine for a 6 core pc but won't work for an 8 core, a 16 core or even a 24 core pc! I would very quickly run out of venues and then if I were to crunch Einstein on the gpu in a pc since they automatically reserve a cpu core just for the gpu to use no MW units would run at all. AND as I said before the MT wu's all crash for me, no idea why they just do.

You can always do it the hard way: use an app_config.xml file in the project folder. You can then set it to use only one core per work unit.

<app_config>
  <app>
    <name>milkyway_nbody</name>
    <max_concurrent>8</max_concurrent>
  </app>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>1</avg_ncpus>
  </app_version>
</app_config>


Then, set "<max_concurrent>" if you want to limit the number of work units running at once, and set "<avg_ncpus>" to the number of cores to use per work unit.
Hopefully, that will prevent the crashing (I have never seen it though, and don't know what might be causing it).


Nope, that didn't work for me. I think it's just a Win10 problem with my not-made-for-Win10 hardware. The PCs crunch everything else except Climate Prediction, but I've heard they have a higher than average error rate anyway, so I have no clue. But my point is that if the NBody wu's don't work for everyone, then going to them exclusively means lost crunchers.

For me my gpu's work here just fine, it's just my cpu's that only work one way.

Jim1348
Message 68056 - Posted: 23 Jan 2019, 1:41:15 UTC - in response to Message 68055.  
Last modified: 23 Jan 2019, 2:31:58 UTC

But my point is if the NBody wu's don't work for everyone then going to it exclusively means lost crunchers.

They have already lost me, as I posted a couple of weeks ago. I won't go back until they have all multi-thread, or all single-thread. I don't care which.
But not all projects work for all crunchers. That is always the case.

mikey
Message 68057 - Posted: 23 Jan 2019, 12:21:56 UTC - in response to Message 68056.  

But my point is if the NBody wu's don't work for everyone then going to it exclusively means lost crunchers.


They have already lost me, as I posted a couple of weeks ago. I won't go back until they have all multi-thread, or all single-thread. I don't care which.
But not all projects work for all crunchers. That is always the case.


You do know you can turn each kind off individually in the settings...right?

Run only the selected applications:
MilkyWay@Home: yes
MilkyWay@Home N-Body Simulation: no

Those are my settings here and as you can see I turned the n-body wu's off so I never get any. There are separate settings for the different kinds of gpu's too but they only run one wu at a time anyway.

Jim1348
Message 68058 - Posted: 23 Jan 2019, 15:27:01 UTC - in response to Message 68057.  

You do know you can turn each kind off individually in the settings...right?

Run only the selected applications:
MilkyWay@Home: yes
MilkyWay@Home N-Body Simulation: no

But that doesn't select between single-core and multi-core N-Body, does it?

bluestang
Message 68059 - Posted: 24 Jan 2019, 0:37:08 UTC

Hi Jake,

Even with the 600 max WU limit for GPU, I was about 10 min away from running out of work on my Radeon 280X GPU. Running 3 concurrent WUs at a time, and with daily downtime like this, easily warrants doubling that limit. Maybe increase the 200 per GPU too? If I fire up another 280X or 2 in that rig on MilkyWay, then I'd burn through those WUs way before the server/database comes back for connection again.

I'm not complaining, these are nice improvements you've made. I especially like the bundle5_3s WUs you're using now. But I want to throw some serious GPU horsepower at this project and I'm afraid of running out of cached WUs if I do that.

Thanks,
blue
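How long a cache lasts during server downtime can be estimated with a short calculation. The Python sketch below assumes every WU takes the same time and that a fixed number run concurrently. The function name `hours_of_work` and the 3-minute runtime are hypothetical placeholders, not measurements from this thread.

```python
def hours_of_work(cached_wus: int, concurrent_wus: int,
                  minutes_per_wu: float) -> float:
    """Hours until the cache is empty, assuming uniform WU runtimes."""
    batches = cached_wus / concurrent_wus   # rounds of concurrent WUs
    return batches * minutes_per_wu / 60


# 600 cached WUs, 3 at a time, with a HYPOTHETICAL 3 minutes per WU.
print(f"{hours_of_work(600, 3, 3):.1f} hours with a 600 WU cache")
# Doubling the limit, as suggested in the post, doubles the buffer.
print(f"{hours_of_work(1200, 3, 3):.1f} hours with a 1200 WU cache")
```

Adding more GPUs to the same host raises the number of concurrent WUs without raising the 600 total, so the buffer shrinks in proportion.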