Message boards : Application Code Discussion : CUDA for Milkyway@Home
Send message Joined: 4 Jul 08 Posts: 165 Credit: 364,966 RAC: 0 |
Thanks for the info. Not such an easy task, it would appear...
Send message Joined: 27 Feb 09 Posts: 45 Credit: 305,963 RAC: 0 |
Glenn, it may well be very easy. It's been almost 10 years since I did anything related to Unix commands, so getting CUDA itself to work is probably far easier than I found it. I found an idiot's guide. I take my hat off to Travis and CP for building any GPU app. Mars rules this confectionery war!
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
Well, I've finally managed to get CUDA working properly on the Mac Pro. Not bad considering it's a slow old 8800GT. Do you have any performance figures to share? trisf told us a 9600GT on a C2D 6750 took about 15 minutes for the wedge 20 test unit. These test WUs are quite small, so the execution time may well be limited by the CPU and all the calling overhead for the GPU stuff. Nevertheless, it would be interesting to have a comparison with the 8800GT.
Send message Joined: 30 Nov 08 Posts: 11 Credit: 25,658 RAC: 0 |
I tried to run ps_sgr_214F_2s* on my 9600GT with a self-compiled linux64 binary...
1) insane desktop performance slowdown
2) after running 3 hours I had to kill it
3) CPU load at 100%
Send message Joined: 4 Jul 08 Posts: 165 Credit: 364,966 RAC: 0 |
G'day Satan, I don't have any code-writing experience or I would have a go at it myself, and my ATI X1300 only handles single precision, so it looks like I have to upgrade my graphics card... May have to go trolling for some info on what my card is actually capable of... Absolutely hats off to Cluster and Travis; they have done an outstanding job getting the app up and running. Glenn
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
I tried to run ps_sgr_214F_2s* on my 9600GT with a self-compiled linux64 binary...

Yes, the production WUs are quite a bit larger than the test WUs. As the code for MW_GPU does quite a bit more with one WU than the legacy MW@home code (roughly 300 or 400 times as much for the WU you tried to run; I would have to check it to give an exact number), it is normal for them to take several hours. The fastest GPUs out there complete these WUs in about 50 seconds with the "classic" algorithm, albeit in double precision. Multiplying that time by 400 gives about 5.5 hours. Such long WUs were actually one of the goals of MW_GPU.

The slow and sluggish behaviour of the GUI is a side effect of GPU apps with a very high utilization of the GPU. The ATI app also suffered (and still does to some extent) from this. One has to limit the duration of the GPU kernels somehow; that creates short opportunities for other tasks (like the screen refresh) to execute, which results in a smoother experience.

The high CPU load should be easy to cure. One only has to send the application to sleep (a millisecond is enough) while it busy-waits for the completion of a GPU kernel. That should be one line in the code (at least I hope so).
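For illustration only, a minimal sketch of that one-line cure, assuming the application polls a CUDA event for kernel completion; the function name, the event name and the 1 ms value are placeholders and not the actual MW_GPU code:

    #include <cuda_runtime.h>
    #include <unistd.h>          /* usleep */

    /* Yield the CPU for about a millisecond between polls instead of
       spinning at 100% until the kernel finishes. */
    static void wait_for_kernel(cudaEvent_t kernel_done)
    {
        while (cudaEventQuery(kernel_done) == cudaErrorNotReady)
            usleep(1000);        /* sleep 1 millisecond */
    }

The usleep just hands the time slice back to the OS; the kernel's runtime is unaffected because the GPU keeps running asynchronously while the CPU sleeps.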
Send message Joined: 27 Feb 09 Posts: 45 Credit: 305,963 RAC: 0 |
Cluster, I haven't dared mess with the Milkyway stuff; it gave me a big enough headache just making sure CUDA was installed correctly. Will have a go over the next couple of days. I keep screwing something up because it keeps telling me that no target has been set. Will need to go through and take a slow, careful look at what I'm screwing up. I doubt I'll notice a slowdown with the desktop though, as I run the 8800 purely on its own without a monitor connected. Will post back if/when I finally get the damn thing working properly. Mars rules this confectionery war!
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
Will have a go over the next couple of days. I keep screwing something up because it keeps telling me that no target has been set. Will need to go through and take a slow, careful look at what I'm screwing up. That could be the problem. I don't know how it works on a Mac, but under Windows and Linux you have to attach a monitor to the card. Otherwise it is not active and one can't run anything on the GPU.
Send message Joined: 27 Feb 09 Posts: 45 Credit: 305,963 RAC: 0 |
I had no trouble getting it to run under Boot Camp without a display connected. I don't know whether it is something in the Apple drivers or not, but I can run the CUDA examples such as oceanFFT with no problems and they display perfectly fine. Arkayn might have a better idea of why it works. [Screenshot: http://img44.imageshack.us/img44/7240/cudascreenshot.th.png] Mars rules this confectionery war!
Send message Joined: 18 Nov 07 Posts: 280 Credit: 2,442,757 RAC: 0 |
According to Nvidia, having to attach a monitor is a strange Microsoft requirement that they could work around, but not without breaking WHQL certification. I don't know the deal with Linux, though.
Send message Joined: 14 Feb 09 Posts: 999 Credit: 74,932,619 RAC: 0 |
I had no trouble getting it to run under Boot Camp without a display connected. I don't know whether it is something in the Apple drivers or not, but I can run the CUDA examples such as oceanFFT with no problems and they display perfectly fine. Not really, I hardly know anything about software/driver development. I am pretty good with app_info's, up to when they added all those flops to the mix.
Send message Joined: 21 Feb 09 Posts: 180 Credit: 27,806,824 RAC: 0 |
There is a way around the monitor bug thing in Windows without using a second monitor or a dummy plug. Go to your display settings and enable the second monitor as an extension of your desktop AND as the primary monitor. When you click Apply, you'll be left with a screen which is just your background. Now unplug the monitor cable from its current graphics card and plug it into the one you just enabled. You should be back to your desktop, albeit able to move your mouse off to the left. This enables both cards. The one drawback is that sometimes (not often) windows will pop up on the other screen; I had it with my MSN Messenger, until I dragged the window over and then it was fine.
Send message Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
Will have a go over the next couple of days. I keep screwing something up because it keeps telling me that no target has been set. Will need to go through and take a slow, careful look at what I'm screwing up. On the new MacBook Pros, you need to go into System Preferences -> Energy Saver and select Higher Performance to use the other (faster) GPU. If you don't want to use that, there's a line in evaluation_gpuX.cu which sets the device (it's at 1; I think it should be changed to 0 to use the on-chip GPU).
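For reference, a hedged sketch of what such a device-selection call looks like; the wrapper function is made up for illustration, and only cudaSetDevice itself is the real CUDA runtime call:

    #include <cuda_runtime.h>

    /* cudaSetDevice picks which GPU the CUDA context is created on.
       Per the post above, the shipped source uses index 1 (the discrete GPU);
       changing it to 0 selects the on-chip GPU instead. */
    static void select_gpu(void)
    {
        cudaSetDevice(0);
    }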
Send message Joined: 30 Nov 08 Posts: 11 Credit: 25,658 RAC: 0 |
Trying to obtain results for the linux_x86_64 CUDA GPU build: http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=81905297 After ~4 hours I got this output for result ps_sgr_214F5_2s_hiw_470211_1245248961_0_0 and some stderr.txt. The WU is still running.
Send message Joined: 27 Feb 09 Posts: 45 Credit: 305,963 RAC: 0 |
I'll give it a go when Travis posts the updated code files. I can't say that I will have any success though. Mars rules this confectionery war!
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
Trying to obtain results for the linux_x86_64 CUDA GPU build: http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=81905297

As this is a 2-stream WU with a double-sized wedge, I can give some comparison with an HD3870 (overclocked to 860 MHz). I've not run a whole WU yet (takes too long ;)), but I know the time for a single evaluation. As the number of evaluations is given in the output file, I can say that the HD3870 would take about 8000 seconds (2:15 hours) for the 447 evaluations (roughly 18 seconds per evaluation). This number is deduced from a normal-sized wedge, so there is some uncertainty to it (maybe 20%).

What graphics card do you use? Was it a 9600GT? It has 64 stream processors, as opposed to the 112 to 128 of the 8800GT/GTX and 9800GT/GTX series. That would mean a G92-based graphics card is roughly as fast as an HD3870 with the current code, and depending on the clock and the exact number of enabled units perhaps a bit faster. The GT200 would battle it out with the HD4800 series then ;)
Send message Joined: 30 Nov 08 Posts: 11 Credit: 25,658 RAC: 0 |
Thanks CP. Yes, it was a 9600GT. Strange behavior: when you stop the project, the WUs don't stop and continue to run; only killing BOINC helps.
Send message Joined: 4 Jul 08 Posts: 165 Credit: 364,966 RAC: 0 |
In the BOINC Manager's Options menu, check the "enable manager exit menu" checkbox, then OK. Then File -> Exit: the dialog box should have the checkbox "stop science applications when exiting manager". Make sure this is checked and click OK. That should be it.
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
Trying to obtain results for the linux_x86_64 CUDA GPU build: http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=81905297

By the way, there may be a bug in the CUDA version when initializing the stream_c parameters. If one compares the init_constants function from the CPU version

    if (ap->sgr_coordinates == 0) {
        atGCToEq(ap->stream_parameters[i][0], 0, &ra, &dec, get_node(), wedge_incl(ap->wedge));
        atEqToGal(ra, dec, &l, &b);
    } else if (ap->sgr_coordinates == 1) {
        gcToSgr(ap->stream_parameters[i][0], 0, ap->wedge, &lamda, &beta); //vickej2
        sgrToGal(lamda, beta, &l, &b); //vickej2
    } else {
        printf("Error: sgr_coordinates not valid");
    }
    lbr[0] = l;
    lbr[1] = b;
    lbr[2] = ap->stream_parameters[i][1];
    lbr2xyz(lbr, stream_c[i]);

with the beginning of gpu__likelihood

    gc_to_gal(wedge, stream_parameters(i,0) * D_DEG2RAD, 0 * D_DEG2RAD, &(lbr[0]), &(lbr[1]));
    lbr[2] = stream_parameters(i,1);
    d_lbr2xyz(lbr, stream_c);

one sees the CUDA version lacks the if statement for the SGR coordinates. Actually, the CUDA version assumes that no SGR coordinates are used. At least this is how I read the code: the rotation matrix used in gc_to_gal is the same as in atEqToGal. I will stay with the CPU code version of that for the time being ;)
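Purely as an untested sketch of what carrying that branch over might look like, and assuming the application_parameters struct (ap) and the CPU-side helpers gcToSgr/sgrToGal were made available at this point of gpu__likelihood (which may well not be the case in the real code):

    /* Untested sketch only; ap, gcToSgr and sgrToGal are assumed to be
       usable here, mirroring the CPU version's coordinate selection. */
    if (ap->sgr_coordinates == 0) {
        gc_to_gal(wedge, stream_parameters(i,0) * D_DEG2RAD, 0 * D_DEG2RAD,
                  &(lbr[0]), &(lbr[1]));
    } else if (ap->sgr_coordinates == 1) {
        double lamda, beta;
        gcToSgr(stream_parameters(i,0), 0, wedge, &lamda, &beta);
        sgrToGal(lamda, beta, &(lbr[0]), &(lbr[1]));
    } else {
        printf("Error: sgr_coordinates not valid");
    }
    lbr[2] = stream_parameters(i,1);
    d_lbr2xyz(lbr, stream_c);

Whether gcToSgr/sgrToGal would also need the D_DEG2RAD conversion here is something only the actual code can answer, so treat this as a pointer, not a patch.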