GPU app teaser

Author	Message
Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 11018 - Posted: 16 Feb 2009, 14:49:21 UTC - in response to Message 11017.
Also, I can raise the workunit-per-cpu limit. What would be a good value?
3600*24/9 is up to ~10K WUs per day on HD4870. Too bad BOINC is still far from ready for GPUs. I would have suggested raising the WU limit only for hosts with GPUs and distributing WUs with a pretty short deadline, or extra-large ones, to such hosts...
I think one of the recent updates to the BOINC server code allows for a separate daily WU queue for GPUs. I'll do a little looking into it, and if that's the case then we can give the GPUs a 10k daily limit without touching the other one.
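The "3600*24/9" figure in the post above can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic: the function name and the `gpus` parameter are made up for illustration and are not part of BOINC or the MilkyWay app.

```python
SECONDS_PER_DAY = 3600 * 24

def wus_per_day(seconds_per_wu, gpus=1):
    """Upper bound on the WUs one host can finish in 24 hours of nonstop crunching.

    `gpus` is a hypothetical knob: the thread says multi-GPU support
    did not exist yet, so it defaults to a single card.
    """
    return gpus * SECONDS_PER_DAY / seconds_per_wu

print(wus_per_day(9))  # 9600.0, i.e. the "~10K WUs per day" quoted for the HD4870
```

A daily quota of 10k per GPU host therefore sits just above what a single HD4870 can return at about 9 seconds per WU.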

Daniel Joined: 25 Nov 07 Posts: 25 Credit: 54,443,893 RAC: 0	Message 11019 - Posted: 16 Feb 2009, 14:54:51 UTC
Sounds like a good idea to me.

Honza Joined: 28 Aug 07 Posts: 31 Credit: 86,152,236 RAC: 0	Message 11023 - Posted: 16 Feb 2009, 15:09:39 UTC - in response to Message 11018. Last modified: 16 Feb 2009, 15:10:41 UTC
I think one of the recent updates to the BOINC server code allows for a separate daily WU queue for GPUs.
Well, it may (I don't know). But I know that even the latest BOINC client, 6.6.7, still doesn't recognize GPUs in general (meaning both nVidia and ATI/AMD GPUs), only CUDA-capable devices. MW (and not only MW) would benefit a lot from support for ATI GPUs under BOINC, especially those capable of double precision...
BOINC Project specifications and hardware requirements

Cluster Physik Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0	Message 11024 - Posted: 16 Feb 2009, 15:09:56 UTC - in response to Message 11007. Last modified: 16 Feb 2009, 15:11:29 UTC
avg_ncpus set to 0.1, max_ncpus set to 3 (Core2Duo). Seems to work very well, the Milkyway units are still completed in 8 or 10 seconds. It's the same time as when I used 0.50 core, is that "normal"? Too bad we have the 1000 workunit-per-cpu limit, is it possible to send a request somewhere to remove this limit when using an optimized app? This limit is obsolete now that you can calculate 3 or 4 times more units in a day with an optimized app. The only reason I'm moving a computer to Milkyway is to help you with your excellent work on ATI graphics cards, but the credits are not very interesting ^^ Thank you for this app, anyway ^^
You should not set max_ncpus to any value other than exactly 1. It is the maximum number of cores a single WU can use. As the app is single-threaded, it can't use more than one core. That the WUs take the same time no matter how many WUs are running concurrently is perfectly normal. There is probably a slight increase in efficiency (maybe 5%) if two WUs are running compared to a single one. The reason is that the few calculations still necessary on the CPU can be carried out while another WU is waiting for the GPU. But more than two WUs won't help more (though they don't hurt either). You will have a throughput of about one WU per 9.x seconds either way on a HD4870.
But there is a limit on the number of concurrent WUs. If you try to run more than 12-16 (~30) WUs on a 512MB (1GB) card, it starts to get slower and finally breaks, because there is not enough memory on the card. At the moment there is no mechanism to check for available RAM on the card. To avoid this situation, you shouldn't set avg_ncpus to very low values.
PS: I guess the credit situation gets better if the limits are lifted by Travis ;)
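To see why a very low avg_ncpus can run into the memory limit described above, here is a rough Python sketch. It rests on a loud assumption that is not confirmed anywhere in the thread: that the BOINC client, when only this app has work, keeps starting tasks until the sum of their avg_ncpus values reaches the number of logical CPUs. The concurrency ceilings are the rough numbers from the post (taking the lower end of "12-16"), not measured limits, and all names are invented for illustration.

```python
# Approximate ceilings from the post above: concurrent WUs a card can hold
# before it slows down and finally breaks. Keys are card memory in MB.
SAFE_CONCURRENT_WUS = {512: 12, 1024: 30}

def estimated_concurrent_wus(logical_cpus, avg_ncpus):
    """ASSUMPTION: tasks are started until their avg_ncpus sum fills all CPUs."""
    return int(logical_cpus / avg_ncpus + 1e-9)  # small epsilon guards float rounding

def fits_on_card(concurrent_wus, card_mb):
    """True if the WU count stays within the rough ceiling quoted for this card size."""
    return concurrent_wus <= SAFE_CONCURRENT_WUS[card_mb]

for avg in (0.5, 0.25, 0.1):
    n = estimated_concurrent_wus(8, avg)  # 8 logical CPUs, e.g. a Core i7 with HT
    print(avg, n, fits_on_card(n, 1024))
```

Under that assumption, an 8-thread CPU with avg_ncpus at 0.1 would try to run about 80 WUs at once, far beyond the roughly 30 that a 1GB card is said to tolerate.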

banditwolf Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0	Message 11031 - Posted: 16 Feb 2009, 16:52:04 UTC - in response to Message 11018.
I think one of the recent updates to the BOINC server code allows for a separate daily WU queue for GPUs. I'll do a little looking into it, and if that's the case then we can give the GPUs a 10k daily limit without touching the other one.
I think you need to make sure that there is plenty of work to do if many of these are run. Might be time for server #2.
Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it.

Exar Kun [HoloNet] Joined: 12 Nov 08 Posts: 26 Credit: 1,542,686 RAC: 0	Message 11038 - Posted: 16 Feb 2009, 19:28:27 UTC
max_ncpus back to "1". avg_ncpus set to 0.25. If I keep the default value of 0.50, I have only three WUs running at the same time: two optimized MW and one World Community Grid. With avg_ncpus at 0.25 I have four workunits (two for each project).
Travis > Thank you very much for your answer about the credits, that is very good news. For the maximum number of units per core, I can say this: with a Core2Duo and a HD4850 512 MB, I crunched 2000 workunits in 6 hours (with no other project running at the same time, so the CPU wasn't fully used). So 4,000 workunits per core and day will be enough for this configuration (my computer is working "only" 12 hours per day or so, so 4,000 WUs per core and day is the very, very maximum). I don't know how many workunits can be crunched with another model of ATI card.
Cluster > If I understand correctly, your optimization uses "only" one core at a time, is that right? Is it possible to use more cores, so we can run only Milkyway on one computer with more than one core? Do you need something special to help you with your tests?
PS: While writing this I reached my 2,000 workunit limit - later than before, probably because I tried avg_ncpus at 0.10 ...
Star Wars BOINC Team
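The numbers in this post can be turned into a per-WU time and a daily total with a short Python sketch. The variable and function names are illustrative only; the inputs (2000 WUs in 6 hours, about 12 hours of uptime per day, a two-core CPU) are the ones reported above.

```python
def seconds_per_wu(wus_done, hours):
    """Average wall-clock seconds per WU over an observed run."""
    return hours * 3600 / wus_done

def wus_in_uptime(wus_done, observed_hours, uptime_hours):
    """Scale an observed run linearly to the daily uptime of the host."""
    return wus_done * uptime_hours / observed_hours

rate = seconds_per_wu(2000, 6)         # about 10.8 s per WU on the HD4850
per_host = wus_in_uptime(2000, 6, 12)  # WUs the host returns in a 12-hour day
per_core = per_host / 2                # quota is counted per core; Core2Duo has 2
print(rate, per_host, per_core)
```

Read this way, the host returns about 4,000 WUs in a 12-hour day, which is 2,000 per core on a dual-core machine, so the suggested 4,000 per core leaves headroom for running the full 24 hours.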

bobgoblin Joined: 8 Dec 07 Posts: 60 Credit: 67,028,931 RAC: 0	Message 11041 - Posted: 16 Feb 2009, 20:16:42 UTC - in response to Message 11038.
max_ncpus back to "1". avg_ncpus set to 0.25. If I keep the default value of 0.50, I have only three WUs running at the same time: two optimized MW and one World Community Grid. With avg_ncpus at 0.25 I have four workunits (two for each project).
Travis > Thank you very much for your answer about the credits, that is very good news. For the maximum number of units per core, I can say this: with a Core2Duo and a HD4850 512 MB, I crunched 2000 workunits in 6 hours (with no other project running at the same time, so the CPU wasn't fully used). So 4,000 workunits per core and day will be enough for this configuration (my computer is working "only" 12 hours per day or so, so 4,000 WUs per core and day is the very, very maximum). I don't know how many workunits can be crunched with another model of ATI card.
Cluster > If I understand correctly, your optimization uses "only" one core at a time, is that right? Is it possible to use more cores, so we can run only Milkyway on one computer with more than one core? Do you need something special to help you with your tests?
PS: While writing this I reached my 2,000 workunit limit - later than before, probably because I tried avg_ncpus at 0.10 ...
With the GPU, I can crunch 16 at a time with the i7, so a 10k limit per core, or 80,000 in my case, would be more realistic than 4000.

Cluster Physik Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0	Message 11055 - Posted: 16 Feb 2009, 22:26:08 UTC - in response to Message 11038.
Cluster > If I understand correctly, your optimization uses "only" one core at a time, is that right? Is it possible to use more cores, so we can run only Milkyway on one computer with more than one core?
The goal is actually to use not a full core (or even more), but maybe only 10% of a core or so. This way your CPU would be free to crunch something else. If it is really wanted I could put in support for simultaneous crunching of MW on GPU and CPU. But this would have a low priority on my list.

Cluster Physik Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0	Message 11057 - Posted: 16 Feb 2009, 22:33:20 UTC - in response to Message 11041.
With the GPU, I can crunch 16 at a time with the i7, so a 10k limit per core, or 80,000 in my case, would be more realistic than 4000.
But the throughput would still be one WU every 9 seconds or so with a HD4870. It does not get faster with more concurrent WUs. So with a HD4870 a limit of 10,000 WUs a day would be enough as long as there is no multi-GPU support implemented (or massive overclocking involved). I would say 10,000 WUs per host and day are needed now. When multiple cards are working and/or newer GPUs are available, this needs to be raised again.

GalaxyIce Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0	Message 11058 - Posted: 16 Feb 2009, 22:38:37 UTC - in response to Message 11055.
Cluster > If I understand correctly, your optimization uses "only" one core at a time, is that right? Is it possible to use more cores, so we can run only Milkyway on one computer with more than one core?
The goal is actually to use not a full core (or even more), but maybe only 10% of a core or so. This way your CPU would be free to crunch something else. If it is really wanted I could put in support for simultaneous crunching of MW on GPU and CPU. But this would have a low priority on my list.
What I would like to see is the ability to use my Nvidia in MW. I know you are using an ATI since it is faster, but I only have the Nvidia which I'd be interested in transferring from crunching on GPUGRID to MW.

Cluster Physik Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0	Message 11060 - Posted: 16 Feb 2009, 22:42:37 UTC - in response to Message 11058. Last modified: 16 Feb 2009, 22:43:38 UTC
Cluster > If I understand correctly, your optimization uses "only" one core at a time, is that right? Is it possible to use more cores, so we can run only Milkyway on one computer with more than one core?
The goal is actually to use not a full core (or even more), but maybe only 10% of a core or so. This way your CPU would be free to crunch something else. If it is really wanted I could put in support for simultaneous crunching of MW on GPU and CPU. But this would have a low priority on my list.
What I would like to see is the ability to use my Nvidia in MW. I know you are using an ATI since it is faster, but I only have the Nvidia which I'd be interested in transferring from crunching on GPUGRID to MW.
AFAIK, there is already a student starting to work on a CUDA app. As this is easier to work with, I guess we could see some results soon ;) But don't expect times much below 25s per WU for Nvidia's GTX line. And older ones won't work at all (lack of double precision units).

GalaxyIce Joined: 6 Apr 08 Posts: 2018 Credit: 100,142,856 RAC: 0	Message 11061 - Posted: 16 Feb 2009, 22:51:43 UTC - in response to Message 11060.
What I would like to see is the ability to use my Nvidia in MW. I know you are using an ATI since it is faster, but I only have the Nvidia which I'd be interested in transferring from crunching on GPUGRID to MW.
AFAIK, there is already a student starting to work on a CUDA app. As this is easier to work with, I guess we could see some results soon ;) But don't expect times much below 25s per WU for Nvidia's GTX line. And older ones won't work at all (lack of double precision units).
25 secs? Blimey, hurry up student :)

Cluster Physik Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0	Message 11064 - Posted: 16 Feb 2009, 23:03:04 UTC - in response to Message 11061.
25 secs? Blimey, hurry up student :)
But that is soooo slooooow compared to the less than 10 seconds on ATI's HD4870 ;)

Temujin Joined: 12 Oct 07 Posts: 77 Credit: 404,471,187 RAC: 0	Message 11067 - Posted: 16 Feb 2009, 23:16:07 UTC - in response to Message 11064.
25 secs? Blimey, hurry up student :)
But that is soooo slooooow compared to the less than 10 seconds on ATI's HD4870 ;)
25 sec is good enough for me, hurry up student :-)

Cori Joined: 27 Aug 07 Posts: 647 Credit: 27,592,547 RAC: 0	Message 11068 - Posted: 16 Feb 2009, 23:17:58 UTC - in response to Message 11067.
25 secs? Blimey, hurry up student :)
But that is soooo slooooow compared to the less than 10 seconds on ATI's HD4870 ;)
25 sec is good enough for me, hurry up student :-)
Rats, I finally need a new graphics card!
Lovely greetings, Cori

bobgoblin Joined: 8 Dec 07 Posts: 60 Credit: 67,028,931 RAC: 0	Message 11081 - Posted: 17 Feb 2009, 0:38:40 UTC - in response to Message 11057.
With the GPU, I can crunch 16 at a time with the i7, so a 10k limit per core, or 80,000 in my case, would be more realistic than 4000.
But the throughput would still be one WU every 9 seconds or so with a HD4870. It does not get faster with more concurrent WUs. So with a HD4870 a limit of 10,000 WUs a day would be enough as long as there is no multi-GPU support implemented (or massive overclocking involved). I would say 10,000 WUs per host and day are needed now. When multiple cards are working and/or newer GPUs are available, this needs to be raised again.
Oh, I agree with that too. The turnaround time for the GPU app was about 2 1/2 minutes. I've been running the op app this week and it's crunching 8 WUs in 6 minutes since the .19s came out, so that limit needs to go much higher as well.

Cluster Physik Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0	Message 11087 - Posted: 17 Feb 2009, 0:54:19 UTC - in response to Message 11081.
With the GPU, I can crunch 16 at a time with the i7, so a 10k limit per core, or 80,000 in my case, would be more realistic than 4000.
But the throughput would still be one WU every 9 seconds or so with a HD4870. It does not get faster with more concurrent WUs. So with a HD4870 a limit of 10,000 WUs a day would be enough as long as there is no multi-GPU support implemented (or massive overclocking involved). I would say 10,000 WUs per host and day are needed now. When multiple cards are working and/or newer GPUs are available, this needs to be raised again.
Oh, I agree with that too. The turnaround time for the GPU app was about 2 1/2 minutes. I've been running the op app this week and it's crunching 8 WUs in 6 minutes since the .19s came out, so that limit needs to go much higher as well.
It should be enough for your i7, as the current limit is 1,000 WUs per day and core/thread. That means on your i7 you actually have 8,000 WUs a day to play with. You won't come close to that limit with the CPU alone, but on the GPU it will last for only 21 hours a day ;)
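The "21 hours" estimate above follows from the per-thread quota and the GPU throughput. A minimal Python sketch of that arithmetic is below; the 9.5 seconds per WU is an assumed midpoint for the "9.x seconds" quoted earlier in the thread, and the function name is illustrative.

```python
def hours_until_quota_exhausted(threads, quota_per_thread, seconds_per_wu):
    """How long a host can crunch before the daily WU quota runs out."""
    daily_quota = threads * quota_per_thread
    return daily_quota * seconds_per_wu / 3600

# i7 with 8 threads, 1,000 WUs per thread and day, HD4870 at an ASSUMED 9.5 s per WU
hours = hours_until_quota_exhausted(8, 1000, 9.5)
print(round(hours, 1))  # 21.1
```

With exactly 9 seconds per WU the same quota would last 20 hours, so either way the GPU runs dry a few hours before the quota resets.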

Travis Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0	Message 11101 - Posted: 17 Feb 2009, 2:04:30 UTC - in response to Message 11060.
Cluster > If I understand correctly, your optimization uses "only" one core at a time, is that right? Is it possible to use more cores, so we can run only Milkyway on one computer with more than one core?
The goal is actually to use not a full core (or even more), but maybe only 10% of a core or so. This way your CPU would be free to crunch something else. If it is really wanted I could put in support for simultaneous crunching of MW on GPU and CPU. But this would have a low priority on my list.
What I would like to see is the ability to use my Nvidia in MW. I know you are using an ATI since it is faster, but I only have the Nvidia which I'd be interested in transferring from crunching on GPUGRID to MW.
AFAIK, there is already a student starting to work on a CUDA app. As this is easier to work with, I guess we could see some results soon ;) But don't expect times much below 25s per WU for Nvidia's GTX line. And older ones won't work at all (lack of double precision units).
Yeah, hopefully within the next week or two we'll have an alpha CUDA application for you guys to crash :D

Paul D. Buck Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0	Message 11116 - Posted: 17 Feb 2009, 6:54:30 UTC - in response to Message 11101. Last modified: 17 Feb 2009, 6:55:37 UTC
Yeah, hopefully within the next week or two we'll have an alpha CUDA application for you guys to crash :D
Well, I have one GTX 280 and 2 GTX 295s ... start your engines ... Of course we will need a setting on the site to get only CPU work, only CUDA work ... or both ...

[AF>HFR>RR] ThierryH Joined: 2 Jan 08 Posts: 23 Credit: 495,882,464 RAC: 0	Message 11132 - Posted: 17 Feb 2009, 10:02:06 UTC - in response to Message 11116.
Yeah, hopefully within the next week or two we'll have an alpha CUDA application for you guys to crash :D
Well, I have one GTX 280 and 2 GTX 295s ... start your engines ... Of course we will need a setting on the site to get only CPU work, only CUDA work ... or both ...
It is indeed important to have both options.