Message boards : Number crunching : Now that we have native ATI GPU support, how about longer tasks?

zombie67 [MM]
Joined: 29 Aug 07
Posts: 115
Credit: 257,878,412
RAC: 3

Message 35703 - Posted: 16 Jan 2010, 6:20:12 UTC
Last modified: 16 Jan 2010, 6:22:38 UTC

Thanks for implementing native ATI support!

And now that we have it, how about issuing tasks exclusively for ATI GPUs that run (say) for an hour (on a 4870)?

No change in credits/hour. That way we can fill up a normal queue of work lasting (say) a day or two.

In addition to letting us weather downtime or network issues, it would DRAMATICALLY reduce the load on the project server and the network.
____________

Bigred
Joined: 23 Nov 07
Posts: 33
Credit: 300,042,542
RAC: 0

Message 35705 - Posted: 16 Jan 2010, 18:48:32 UTC - in response to Message 35703.

Thanks for implementing native ATI support!

And now that we have it, how about issuing tasks exclusively for ATI GPUs that run (say) for an hour (on a 4870)?

No change in credits/hour. That way we can fill up a normal queue of work lasting (say) a day or two.

In addition to letting us weather downtime or network issues, it would DRAMATICALLY reduce the load on the project server and the network.


Sounds like a good idea to me but it should be for all GPUs not just the ATIs.

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 35718 - Posted: 17 Jan 2010, 4:03:12 UTC - in response to Message 35703.

Thanks for implementing native ATI support!

And now that we have it, how about issuing tasks exclusively for ATI GPUs that run (say) for an hour (on a 4870)?

No change in credits/hour. That way we can fill up a normal queue of work lasting (say) a day or two.

In addition to letting us weather downtime or network issues, it would DRAMATICALLY reduce the load on the project server and the network.


I think that might really take some new science from the astronomers. We do have a change in the works that should increase the compute time by (hopefully) another factor of 2 - 4. Once we get the server side GPU issues settled, we'll be releasing that.
____________

Brian Silvers
Joined: 21 Aug 08
Posts: 625
Credit: 558,425
RAC: 0

Message 35721 - Posted: 17 Jan 2010, 4:43:02 UTC - in response to Message 35718.

Thanks for implementing native ATI support!

And now that we have it, how about issuing tasks exclusively for ATI GPUs that run (say) for an hour (on a 4870)?

No change in credits/hour. That way we can fill up a normal queue of work lasting (say) a day or two.

In addition to letting us weather downtime or network issues, it would DRAMATICALLY reduce the load on the project server and the network.


I think that might really take some new science from the astronomers. We do have a change in the works that should increase the compute time by (hopefully) another factor of 2 - 4. Once we get the server side GPU issues settled, we'll be releasing that.


In my opinion, you should consider looking at whether there is a way to use Homogeneous Redundancy classes to separate GPUs from CPUs, giving GPUs the longer tasks and leaving the shorter tasks to CPUs.

Profile arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0

Message 35725 - Posted: 17 Jan 2010, 5:12:43 UTC

I also think we should ask Gipsel/Cluster Physik nicely if he will implement checkpointing in the app as longer tasks will definitely need it.
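
For reference, the usual pattern with the BOINC API looks roughly like the sketch below. This is only an illustration of the checkpointing idea, not the actual Milkyway application code; the file name, state variables, and work loop are all made up.

#include <cstdio>
#include "boinc_api.h"

// Hypothetical helper: write whatever state the search needs in order to resume.
static int write_checkpoint(const char* path, int iteration, double best_fitness) {
    FILE* f = fopen(path, "w");
    if (!f) return 1;
    fprintf(f, "%d %.17g\n", iteration, best_fitness);
    fclose(f);
    return 0;
}

int main(int argc, char** argv) {
    boinc_init();

    int iteration = 0;
    const int total_iterations = 100000;   // placeholder for the real workload size
    double best_fitness = 0.0;

    // ...read an existing checkpoint file here and resume from it, if one is present...

    for (; iteration < total_iterations; iteration++) {
        // ...one slice of the actual CPU/GPU computation goes here...

        if (boinc_time_to_checkpoint()) {      // the client says now is a good moment
            write_checkpoint("checkpoint.txt", iteration, best_fitness);
            boinc_checkpoint_completed();      // tell the client the checkpoint finished
        }
        boinc_fraction_done((double)iteration / total_iterations);
    }

    boinc_finish(0);                           // report completion; does not return
    return 0;
}

The client decides how often boinc_time_to_checkpoint() returns true, based on the user's "write to disk at most every N seconds" preference, so the app itself doesn't have to hard-code fixed percentages.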
____________

Profile David Glogau*
Joined: 12 Aug 09
Posts: 172
Credit: 645,240,165
RAC: 0

Message 35737 - Posted: 17 Jan 2010, 11:05:20 UTC - in response to Message 35725.

I also think we should ask Gipsel/Cluster Physik nicely if he will implement checkpointing in the app as longer tasks will definitely need it.


I second that request. 25%, 50% and 75% would be my wish, thanks.
____________

zombie67 [MM]
Joined: 29 Aug 07
Posts: 115
Credit: 257,878,412
RAC: 3

Message 35751 - Posted: 17 Jan 2010, 17:04:33 UTC - in response to Message 35721.


In my opinion, you should consider looking at whether there is a way to use Homogeneous Redundancy classes to separate GPUs from CPUs, giving GPUs the longer tasks and leaving the shorter tasks to CPUs.


HR isn't a factor here. With a quorum of 1, you don't use multiple replications for validation.
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 35758 - Posted: 17 Jan 2010, 19:36:06 UTC - in response to Message 35751.


In my opinion, you should consider looking at whether there is a way to use Homogeneous Redundancy classes to separate GPUs from CPUs, giving GPUs the longer tasks and leaving the shorter tasks to CPUs.


HR isn't a factor here. With a quorum of 1, you don't use multiple replications for validation.


We actually use a different validation method than any of the typical BOINC redundancy strategies. My thesis goes into this in a little bit of detail, and I'm actually working on a paper about it right now.

We don't need to validate EVERY workunit, unlike other projects. Since we're doing evolutionary algorithms, which are based on populations of solutions (newly generated work is based on different recombinations of a known population of best solutions), when we get a result back that could potentially improve the population, we validate it before we put it into the population. This keeps us from generating new work from potentially invalid results.

What we've been testing lately is a more optimistic validation strategy. Since most of your results are correct, waiting for results to be validated before putting them into the population can slow our search progress down quite a bit. So I've been trying out a validation strategy that uses potentially good results immediately and then reverts to the previously validated results if they turn out to be invalid. So far it's working out really well, so that's what the paper is about.
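
In rough C++ terms, the idea is something like the sketch below. This is only a conceptual illustration with made-up types and names, not our actual server code:

#include <algorithm>
#include <map>
#include <vector>

// Illustrative types only.
struct Result {
    int id;
    std::vector<double> parameters;   // candidate solution returned by a volunteer
    double fitness;                   // reported fitness (higher is better here)
};

class OptimisticPopulation {
public:
    explicit OptimisticPopulation(size_t max_size) : max_size_(max_size) {}

    // A result that would improve the population is inserted immediately and
    // remembered as "pending" until its verification copy comes back.
    void insert_optimistically(const Result& r) {
        population_.push_back(r);
        pending_[r.id] = r;
        trim();
        // ...a verification workunit for r would be generated here...
    }

    // Called when the verification of result `id` finishes.
    void verification_finished(int id, bool valid) {
        if (pending_.erase(id) == 0) return;   // not an optimistic member
        if (valid) return;                     // it really was an improvement; keep it
        // Revert: drop the invalid member, leaving only previously validated
        // solutions, so new work is again generated from a trusted population.
        for (size_t i = 0; i < population_.size(); i++) {
            if (population_[i].id == id) {
                population_.erase(population_.begin() + i);
                break;
            }
        }
    }

private:
    void trim() {
        // Keep only the best max_size_ members, sorted by fitness.
        std::sort(population_.begin(), population_.end(),
                  [](const Result& a, const Result& b) { return a.fitness > b.fitness; });
        if (population_.size() > max_size_) population_.resize(max_size_);
    }

    std::vector<Result> population_;   // current best solutions (verified + optimistic)
    std::map<int, Result> pending_;    // optimistically inserted, awaiting verification
    size_t max_size_;
};

The difference from the conservative scheme is just the order: the population is updated first and corrected later, instead of waiting for verification before every update.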
____________

Brian Silvers
Joined: 21 Aug 08
Posts: 625
Credit: 558,425
RAC: 0

Message 35760 - Posted: 17 Jan 2010, 20:26:00 UTC - in response to Message 35751.
Last modified: 17 Jan 2010, 20:28:57 UTC


In my opinion, you should consider looking at whether there is a way to use Homogeneous Redundancy classes to separate GPUs from CPUs, giving GPUs the longer tasks and leaving the shorter tasks to CPUs.


HR isn't a factor here. With a quorum of 1, you don't use multiple replications for validation.


Perhaps I'm getting ahead of the curve with trying to segregate tasks, regardless of quorum. Not sure if there's already a way to do that, but the whole point is that GPU users need to be placed in a different classification category than CPU users. You folks can exclusively have the 3-stream (longer-running) tasks, and leave CPU users with the 1-stream, 2-stream, or other shorter-running tasks.

Perhaps I am phrasing the BOINC equivalent wrong, and there is something there already, but if the planned "2 to 4 times increase" in runtime happens again, then that will undo the increase in deadline and will cause people with CPUs to start howling again...

I'm advocating making everyone happier, not just a few. Same as I've been doing all along... I think if something like what I'm suggesting is done, it will improve total project throughput and maybe, just maybe, allow you all to have a larger cache. Might not, but it is certainly worth a try if there is a way to do that already or if it is a minimal change.
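
To make the idea concrete, the dispatch rule I have in mind would look something like the sketch below; this is purely illustrative, not actual BOINC scheduler code, and all of the names are made up:

// Purely illustrative: a dispatch rule, not actual BOINC scheduler code.
enum class Device { CPU, GPU };

struct WorkUnit {
    int id;
    int n_streams;      // 3-stream searches run much longer than 1- or 2-stream ones
};

struct WorkRequest {
    Device device;      // which resource the host is requesting work for
};

// Long (3-stream) tasks go only to GPU requests; shorter ones stay with CPUs.
bool suitable_for(const WorkUnit& wu, const WorkRequest& req) {
    if (wu.n_streams >= 3) return req.device == Device::GPU;
    return req.device == Device::CPU;
}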

Crab
Joined: 6 Oct 09
Posts: 25
Credit: 4,849,998
RAC: 0

Message 35771 - Posted: 18 Jan 2010, 4:00:10 UTC
Last modified: 18 Jan 2010, 4:14:22 UTC

In short: yes, we need longer GPU tasks, checkpoints (for the not-so-fast GPUs), and shorter tasks for CPUs.

And regarding WU validation: what if I overclock my GPU and it starts producing results with errors? Would you and I never know about those errors?

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 35776 - Posted: 18 Jan 2010, 4:49:16 UTC - in response to Message 35771.

And regarding WU validation: what if I overclock my GPU and it starts producing results with errors? Would you and I never know about those errors?


The majority of errors tend to give us good-looking results, i.e., they're false positives, especially from overclocked CPUs and GPUs. So a couple of results might get validated, but they don't harm our searches, and the majority get caught.
____________

Profile Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 501,817,790
RAC: 0

Message 35783 - Posted: 18 Jan 2010, 17:48:59 UTC - in response to Message 35718.
Last modified: 18 Jan 2010, 17:49:25 UTC

I think that might really take some new science from the astronomers. We do have a change in the works that should increase the compute time by (hopefully) another factor of 2 - 4. Once we get the server side GPU issues settled, we'll be releasing that.

That should be very helpful. Anything to get the queue time up for GPUs is most appreciated. Thanks!

The majority of errors tend to give us good-looking results, i.e., they're false positives, especially from overclocked CPUs and GPUs. So a couple of results might get validated, but they don't harm our searches, and the majority get caught.

Good to hear; I'd like to know more about this when you get the chance.

