
Credit and CPU utilisation of NBody_104

Freeze_XJ

Joined: 18 Aug 12
Posts: 4
Credit: 14,849,001
RAC: 0
Message 56718 - Posted: 4 Jan 2013, 13:01:33 UTC
Last modified: 4 Jan 2013, 13:14:25 UTC

Looking at the credit awarded for the fresh N-Body runs, the task list shows only CPU time. When I monitored them, however, they appeared to run on the GPU (though lightly) and consumed hardly any CPU (1% or so). As expected, the awarded credit is abysmally low; see the example below. I'm wondering whether this is just a stats bug or a scheduling issue.
[edit] After some checking, the 1.04 tasks DO use a lot of CPU, and no GPU, even though the application is listed as OpenCL. They make quite nice use of multiple cores, though. [/edit]

374707636 291018734 462964 4 Jan 2013 | 10:51:03 UTC 4 Jan 2013 | 11:55:33 UTC Completed and validated 73.66 3.36 213.76 MilkyWay@Home v1.02 (opencl_amd_ati)
374706893 291018208 462964 4 Jan 2013 | 10:53:21 UTC 4 Jan 2013 | 11:59:49 UTC Completed and validated 224.07 224.07 2.04 MilkyWay@Home N-Body Simulation v1.04 (opencl_amd_ati)
(numbers after 'validated' are <total time> <CPU time> <awarded credit> )

PS: It seems that many of the N-Body tasks are either erroring out or returning differing results, so my 'pending' list has grown, although a lot of ordinary v1.02 tasks are in there as well.
Saenger

Joined: 28 Aug 07
Posts: 133
Credit: 29,423,179
RAC: 0
Message 56722 - Posted: 4 Jan 2013, 14:39:07 UTC

I just looked those Nvidia N-Body tasks up in my WU list, and saw that they get extremely poor credit for a GPU WU. Here's an example with parallel CPU results (workunit 290919767):
374569179 	256454 	4 Jan 2013 | 5:35:27 UTC 	4 Jan 2013 | 5:50:05 UTC 	Completed and validated 	198.13 	195.52 	1.31 	MilkyWay@Home N-Body Simulation v1.04
374579051 	449717 	4 Jan 2013 | 5:55:27 UTC 	4 Jan 2013 | 10:04:38 UTC 	Completed and validated 	197.60 	197.60 	1.31 	MilkyWay@Home N-Body Simulation v1.04 (opencl_nvidia)
374676981 	456675 	4 Jan 2013 | 10:07:22 UTC 	4 Jan 2013 | 10:16:00 UTC 	Computation error 	0.00 	0.00 	--- 	MilkyWay@Home N-Body Simulation v1.04
374684677 	475680 	4 Jan 2013 | 10:19:42 UTC 	4 Jan 2013 | 10:53:07 UTC 	Completed and validated 	189.65 	187.81 	1.31 	MilkyWay@Home N-Body Simulation v1.04


They all took about the same amount of CPU time and wall-clock time, but mine probably used the GPU as well, though I can't say for sure, as it crunched while I was away from the machine.

There are three possibilities:

    -> They really don't use the GPU, in which case they should never be sent out as GPU WUs.
    -> They use the GPU, but are so poorly programmed that they can't make any real use of it, which shouldn't happen either.
    -> They really do the crunching on the GPU, in which case they should be awarded far more credit than the CPU-only results.

If all results of one WU are supposed to do the same work, the GPU ones are a waste of resources.


Greetings from Sänger
Jake Bauer
Project developer
Project tester
Project scientist

Joined: 20 Aug 12
Posts: 66
Credit: 406,916
RAC: 0
Message 56728 - Posted: 4 Jan 2013, 17:54:21 UTC - in response to Message 56722.  

N-Body does not use the GPU at this time, as the GPU code is still in development.

Jake
Miklos M

Joined: 29 Dec 11
Posts: 26
Credit: 1,456,736,094
RAC: 0
Message 56729 - Posted: 4 Jan 2013, 17:59:15 UTC - in response to Message 56722.  

Mine was trying to run on the CPU, but after 8 hours it was taking too long.
Saenger

Joined: 28 Aug 07
Posts: 133
Credit: 29,423,179
RAC: 0
Message 56735 - Posted: 4 Jan 2013, 22:39:21 UTC - in response to Message 56728.  

N-Body does not use the GPU at this time, as the GPU code is still in development.

Jake


Then why do you send them out, and block the GPU from doing real work?
Greetings from Sänger
pvh

Joined: 8 Feb 10
Posts: 23
Credit: 513,143,911
RAC: 0
Message 56744 - Posted: 5 Jan 2013, 8:33:46 UTC

I see the same thing here. No GPU usage and dismally low credits. One example:

run time: 1,037.00 s
CPU time: 11,426.18 s
credit: 2.93

This is a bad joke. I have the project set up to receive _only_ GPU tasks, so even if the credit had been OK I would not want these work units, as they are CPU-only. I will disable this project until this is sorted out.

Please stop sending these work units as fake GPU tasks immediately!
pvh

Joined: 8 Feb 10
Posts: 23
Credit: 513,143,911
RAC: 0
Message 56745 - Posted: 5 Jan 2013, 9:09:48 UTC

Another problem with these N-Body tasks is that they are parallel tasks (presumably OpenMP) and by default use all the cores they can get. However, my version of BOINC (7.0.28 for Linux) doesn't realize this is happening and assumes an N-Body WU uses a single core, so it loads the remaining N-1 cores with other WUs. That is bad news for the parallel task: having it compete with other jobs for the same cores is generally a Really Bad Idea and can slow the parallel job down significantly. How much depends on how the code is written, but in most cases there will be a slow-down.

So in my opinion these parallel N-Body WUs should be kept on ice until BOINC is able to handle parallel WUs correctly.
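[For context, here is a minimal C/OpenMP sketch, not the actual N-Body source, illustrating the default behaviour described above: unless the application or the client limits it, OpenMP sizes its thread pool to every core it can see, so a client that budgets only one CPU for the task ends up oversubscribing the machine.]

#include <omp.h>
#include <stdio.h>

int main(void)
{
    /* OpenMP defaults to one thread per logical processor unless
       OMP_NUM_THREADS or omp_set_num_threads() limits it. */
    int procs   = omp_get_num_procs();
    int threads = omp_get_max_threads();

    printf("System has %d processors; OpenMP will start up to %d threads\n",
           procs, threads);

    /* Stand-in for the real work: every thread gets a share of the loop.
       If the BOINC client budgets only one CPU for this task and fills the
       remaining cores with other work units, these threads have to
       time-slice against them, which is the slow-down described above. */
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < 10000000; i++) {
        sum += (double)i * 1e-7;
    }

    printf("done, sum = %f\n", sum);
    return 0;
}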
Richard Haselgrove

Joined: 4 Sep 12
Posts: 219
Credit: 456,474
RAC: 0
Message 56748 - Posted: 5 Jan 2013, 17:06:39 UTC - in response to Message 56745.  

Another problem with these N-Body tasks is that they are parallel tasks (presumably OpenMP) and by default use all the cores they can get. However, my version of BOINC (7.0.28 for Linux) doesn't realize this is happening and assumes an N-Body WU uses a single core, so it loads the remaining N-1 cores with other WUs. That is bad news for the parallel task: having it compete with other jobs for the same cores is generally a Really Bad Idea and can slow the parallel job down significantly. How much depends on how the code is written, but in most cases there will be a slow-down.

So in my opinion these parallel N-Body WUs should be kept on ice until BOINC is able to handle parallel WUs correctly.

That seems to be a Linux-specific set of problems.

The Windows version is clearly coded with multi-core usage in mind, but as the line in stderr_txt says,

Using OpenMP 1 max threads on a system with 8 processors

only one thread is actually being used.

Some of the tasks could clearly use some extra threads in support (I have one running on a 4.5 GHz i7 which looks as if it's going to take about 30 hours), but that would need to be plumbed into the BOINC infrastructure properly first: they would certainly need an _mt_ <plan_class> to tell BOINC how to schedule N-Body against single-core tasks from other projects, and I suspect they would also need a <cmdline>--nthreads setting to tell OpenMP how many cores to use.
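[To illustrate the suggestion above, here is a hypothetical sketch, not MilkyWay's actual code, of how an OpenMP application could honour a --nthreads command-line option and report its thread count in the same style as the stderr line quoted earlier. The flag parsing shown here is invented for illustration only.]

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    /* Default: let OpenMP decide, which normally means all cores. */
    int nthreads = omp_get_max_threads();

    /* Hypothetical handling of the --nthreads flag discussed above. */
    for (int i = 1; i < argc - 1; i++) {
        if (strcmp(argv[i], "--nthreads") == 0) {
            nthreads = atoi(argv[i + 1]);
        }
    }

    if (nthreads > 0) {
        omp_set_num_threads(nthreads);
    }

    /* Report in the style of the stderr line quoted above. */
    fprintf(stderr, "Using OpenMP %d max threads on a system with %d processors\n",
            omp_get_max_threads(), omp_get_num_procs());

    return 0;
}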
