Welcome to MilkyWay@home

milkyway & milkywayGPU makefile


Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 23829 - Posted: 1 Jun 2009, 8:26:58 UTC

Here's a thread for discussing (and improving) the makefile we're using for milkyway. The newest code release has a combined Linux, OS X and GPU makefile. I've tested it on OS X and it works fine -- unfortunately, AFAIK there is no 64-bit or PPC version of CUDA for OS X, so it will only compile an i686 binary.

I don't have a Linux machine with a GPU to test the makefile, so let me know if it works there (I'm pretty sure it should, as it shouldn't be doing anything different than on OS X).

KSMarksPsych
Joined: 9 Sep 07
Posts: 22
Credit: 320,035
RAC: 0
Message 23831 - Posted: 1 Jun 2009, 10:13:02 UTC

I have a 64 bit Linux machine with a CUDA capable card. Getting it to work with BOINC hasn't gone well.

If you can give a general idea of what you'd like me to do, I'd be happy to get compiling ;)
Kathryn :o)

Travis
Message 23832 - Posted: 1 Jun 2009, 10:39:30 UTC - in response to Message 23831.  

> I have a 64 bit Linux machine with a CUDA capable card. Getting it to work with BOINC hasn't gone well.
>
> If you can give a general idea of what you'd like me to do, I'd be happy to get compiling ;)


Welp, your guess is as good as mine :) JK.

Well, you'd need to download the CUDA driver and toolkit: http://www.nvidia.com/object/cuda_get.html

You can test and see if it works with the samples.

After that, you should be able to just download the latest GPU code from: http://milkyway.cs.rpi.edu/milkyway/download/code_release/

After unzipping, you should be able to go to the /milkyway/bin/ directory and try running the makefile:

make linux_x86_64_gpu

You'll probably need to edit the makefile so that its directories point to where you have BOINC and CUDA installed.
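
Before running make, it can save time to check that the build tools are actually on PATH. The following is a minimal sketch, not part of the code release: the helper name `missing_tools` is made up, and the assumption that the makefile calls `g++` alongside `nvcc` (the compiler shipped with the CUDA toolkit) is a guess based on the error messages in this thread.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Assumed tool list: make itself, a C++ compiler, and the CUDA compiler.
REQUIRED = ["make", "g++", "nvcc"]

if __name__ == "__main__":
    missing = missing_tools(REQUIRED)
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all tools found; try: make linux_x86_64_gpu")
```

If `nvcc` is reported missing, the CUDA toolkit's bin directory probably still needs to be added to PATH (or referenced in the makefile).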

trisf
Joined: 30 Nov 08
Posts: 11
Credit: 25,658
RAC: 0
Message 23838 - Posted: 1 Jun 2009, 12:27:29 UTC
Last modified: 1 Jun 2009, 13:15:01 UTC

Some questions about compiling with make linux_x86_64_gpu:
1) What is "evaluator.h", which is referenced in evaluation/simple_evaluator.c and searches/[hessian,line_search,gradient].c? Is it evaluation/simple_evaluator.h?
2) The evaluate function has no visible declaration in searches/[gradient,hessian,regression,line_search].c:

../searches/hessian.c:127: error: 'evaluate' was not declared in this scope
../searches/hessian.c: In function 'void get_hessian(int, double*, double*, double**)':
../searches/hessian.c:188: error: 'evaluate' was not declared in this scope
../searches/hessian.c:196: error: 'evaluate' was not declared in this scope
make: *** [../searches/hessian.o] Error 1

PS: sorry for my English

jedirock
Joined: 8 Nov 08
Posts: 178
Credit: 6,140,854
RAC: 0
Message 23857 - Posted: 1 Jun 2009, 18:25:06 UTC - in response to Message 23829.  

> I've tested it on OS X and it works fine -- unfortunately, AFAIK there is no 64-bit or PPC version of CUDA for OS X, so it will only compile an i686 binary.

AFAIK, there is no CUDA library for OS X, period. Nvidia doesn't release separate drivers, so they have to work with Apple to get them into an OS update. With Apple pushing OpenCL though, they may have to go with OpenCL first. I don't know what Apple's schedule is on that.

Travis
Message 23888 - Posted: 1 Jun 2009, 23:09:06 UTC - in response to Message 23857.  

>> I've tested it on OS X and it works fine -- unfortunately, AFAIK there is no 64-bit or PPC version of CUDA for OS X, so it will only compile an i686 binary.
>
> AFAIK, there is no CUDA library for OS X, period. Nvidia doesn't release separate drivers, so they have to work with Apple to get them into an OS update. With Apple pushing OpenCL though, they may have to go with OpenCL first. I don't know what Apple's schedule is on that.


There's a 32-bit CUDA library for Intel Macs. That's what I've been using.

Travis
Message 23889 - Posted: 1 Jun 2009, 23:10:04 UTC - in response to Message 23838.  
Last modified: 1 Jun 2009, 23:17:49 UTC

> Some questions about compiling with make linux_x86_64_gpu:
> 1) What is "evaluator.h", which is referenced in evaluation/simple_evaluator.c and searches/[hessian,line_search,gradient].c? Is it evaluation/simple_evaluator.h?
> 2) The evaluate function has no visible declaration in searches/[gradient,hessian,regression,line_search].c:
>
> ../searches/hessian.c:127: error: 'evaluate' was not declared in this scope
> ../searches/hessian.c: In function 'void get_hessian(int, double*, double*, double**)':
> ../searches/hessian.c:188: error: 'evaluate' was not declared in this scope
> ../searches/hessian.c:196: error: 'evaluate' was not declared in this scope
> make: *** [../searches/hessian.o] Error 1
>
> PS: sorry for my English


Looks like I missed yet another file :( I'll update the v0.05 release.

*update*

Ok it should be in there now.

trisf
Message 23911 - Posted: 2 Jun 2009, 5:13:05 UTC

bin/Makefile, line 147 is missing a space between the output file name and $(APP_OBJS).

Current:
$(OBJ_CXX) $(OBJ_CXXFLAGS) $(LDFLAGS_x86_64) -o milkywayGPU_$(APP_VERSION)_x86_64-pc-linux-gnu$(APP_OBJS) $(SEARCH_OBJS) $(UTIL_OBJS) $(GPU_APP_OBJS) -lboinc -lboinc_api -lcudart
Fixed:
$(OBJ_CXX) $(OBJ_CXXFLAGS) $(LDFLAGS_x86_64) -o milkywayGPU_$(APP_VERSION)_x86_64-pc-linux-gnu $(APP_OBJS) $(SEARCH_OBJS) $(UTIL_OBJS) $(GPU_APP_OBJS) -lboinc -lboinc_api -lcudart
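
To see what the missing space does, one can mimic make's textual substitution of $(NAME) references. This is only a rough sketch: the `expand` helper handles plain $(NAME) references and nothing else, `g++` stands in for whatever $(OBJ_CXX) is set to, and the object file names are invented. Only the 0.18 version string appears elsewhere in this thread.

```python
import re
import shlex

# Hypothetical values for illustration; the real makefile defines its own.
VARIABLES = {
    "APP_VERSION": "0.18",
    "APP_OBJS": "first.o second.o",
}

BROKEN = "g++ -o milkywayGPU_$(APP_VERSION)_x86_64-pc-linux-gnu$(APP_OBJS) -lcudart"
FIXED = "g++ -o milkywayGPU_$(APP_VERSION)_x86_64-pc-linux-gnu $(APP_OBJS) -lcudart"


def expand(line, variables):
    """Replace each $(NAME) with its value by plain text substitution."""
    return re.sub(r"\$\((\w+)\)", lambda m: variables[m.group(1)], line)


def output_and_inputs(line, variables=VARIABLES):
    """Return the -o target and the object files left over as link inputs."""
    args = shlex.split(expand(line, variables))
    target = args[args.index("-o") + 1]
    objects = [a for a in args if a.endswith(".o") and a != target]
    return target, objects


if __name__ == "__main__":
    for label, line in (("broken", BROKEN), ("fixed", FIXED)):
        target, objects = output_and_inputs(line)
        print(f"{label}: output={target} inputs={objects}")
```

With the space missing, the first object file name is glued onto the output name, so that object is no longer passed to the linker as an input.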

Travis
Message 23912 - Posted: 2 Jun 2009, 5:22:28 UTC - in response to Message 23911.  

> bin/Makefile, line 147 is missing a space between the output file name and $(APP_OBJS).
>
> Current:
> $(OBJ_CXX) $(OBJ_CXXFLAGS) $(LDFLAGS_x86_64) -o milkywayGPU_$(APP_VERSION)_x86_64-pc-linux-gnu$(APP_OBJS) $(SEARCH_OBJS) $(UTIL_OBJS) $(GPU_APP_OBJS) -lboinc -lboinc_api -lcudart
> Fixed:
> $(OBJ_CXX) $(OBJ_CXXFLAGS) $(LDFLAGS_x86_64) -o milkywayGPU_$(APP_VERSION)_x86_64-pc-linux-gnu $(APP_OBJS) $(SEARCH_OBJS) $(UTIL_OBJS) $(GPU_APP_OBJS) -lboinc -lboinc_api -lcudart


Nice catch, it'll be in the next update.

trisf
Message 23917 - Posted: 2 Jun 2009, 7:24:51 UTC
Last modified: 2 Jun 2009, 7:38:52 UTC

linux_x86_64_gpu: maybe it's a problem on my side, but linking against libboinc_api.a fails without OpenSSL. I just added -lssl to line 147.

How do I run the test units with milkywayGPU_0.18_x86_64-pc-linux-gnu?

Update: I renamed *-20.txt to *txt and ran it; it looks like it works.

trisf
Message 23919 - Posted: 2 Jun 2009, 8:02:27 UTC
Last modified: 2 Jun 2009, 8:03:22 UTC

Output from linux_x86_64_gpu; sorry for the huge post.
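
The log that follows prints every Hessian and gradient entry together with the expression it was computed from, so individual entries can be re-derived by hand. Here is a small Python sketch that does this for three of them. The function names are made up for illustration and are not the ones in searches/hessian.c. Reading the four values as the shifted evaluations f(++), f(+-), f(-+), f(--) of a central difference is an assumption about what the code does; the arithmetic itself is taken directly from the printed expressions.

```python
def hessian_entry(e1, e2, e3, e4, step_i, step_j):
    """Evaluate (e1 - e2 - e3 + e4) / (4 * step_i * step_j), as printed per entry."""
    return (e1 - e2 - e3 + e4) / (4 * step_i * step_j)


def gradient_entry(e1, e2, step):
    """Evaluate (e1 - e2) / (2 * step), as printed per gradient component."""
    return (e1 - e2) / (2 * step)


# Values copied from the hessian[0][0], hessian[0][1] and gradient[0] lines.
h00 = hessian_entry(-2.98530682678311976019, -2.98530684176687044484,
                    -2.98530684176687044484, -2.98530680027340666882,
                    0.000004, 0.000004)
h01 = hessian_entry(-2.98530681295196531622, -2.98530680373119539084,
                    -2.98530676800071281818, -2.98530676684811657751,
                    0.000004, 0.00008)
g0 = gradient_entry(-2.98530684061427420417, -2.98530678990004094686, 0.000004)

print(f"hessian[0][0] = {h00:.8f}")  # log reports 882.45647594797912915965
print(f"hessian[0][1] = {h01:.8f}")  # log reports -6.30326069117614817827
print(f"gradient[0]   = {g0:.14f}")  # log reports -0.00633927915716370194
```

Note that each numerator is a difference of order 1e-8 between values near -2.985, so small rounding differences in the likelihood are magnified considerably by the division by 4 * step_i * step_j.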


initial likelihood: -2.98530684176687044484

point[14]: 0.57171300000000002672, 12.31211899999999914712, -3.30518700000000009709, 148.01025699999999574175, 22.45390199999999936153, 0.42035000000000000142, -0.46885799999999999699, 0.76057900000000000507, -1.36164400000000007651, 177.88423800000001051558, 23.88289199999999823376, 1.21063900000000002066, -1.61197400000000001796, 8.53437800000000024170
step[14]: 0.00000400000000000000, 0.00008000000000000001, 0.00000100000000000000, 0.00003000000000000000, 0.00004000000000000000, 0.00006000000000000000, 0.00004000000000000000, 0.00000400000000000000, 0.00000100000000000000, 0.00003000000000000000, 0.00004000000000000000, 0.00006000000000000000, 0.00004000000000000000, 0.00000400000000000000

hessian[0][0] = 882.45647594797912915965, (-2.98530682678311976019 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530680027340666882)/(4 * 0.00000400000000000000 * 0.00000400000000000000)
hessian[0][1] = hessian[1][0] = -6.30326069117614817827, (-2.98530681295196531622 - -2.98530680373119539084 - -2.98530676800071281818 + -2.98530676684811657751)/(4 * 0.00000400000000000000 * 0.00008000000000000001)
hessian[0][2] = hessian[2][0] = -792.40988770656883843913, (-2.98530683369869720423 - -2.98530681525715779756 - -2.98530677952667522490 + -2.98530677376369402154)/(4 * 0.00000400000000000000 * 0.00000100000000000000)
hessian[0][3] = hessian[3][0] = -69.63602102357431533619, (-2.98530682908831224154 - -2.98530681756235027891 - -2.98530676800071281818 + -2.98530678990004094686)/(4 * 0.00000400000000000000 * 0.00003000000000000000)
hessian[0][4] = hessian[4][0] = 54.02794739373106125413, (-2.98530681525715779756 - -2.98530682908831224154 - -2.98530679335782966888 + -2.98530677261109778087)/(4 * 0.00000400000000000000 * 0.00004000000000000000)
hessian[0][5] = hessian[5][0] = 55.22856894035754748984, (-2.98530680488379163151 - -2.98530681525715779756 - -2.98530681525715779756 + -2.98530677261109778087)/(4 * 0.00000400000000000000 * 0.00006000000000000000)
hessian[0][6] = hessian[6][0] = 21.61117881871454571296, (-2.98530682447792727885 - -2.98530684061427420417 - -2.98530678298446350283 + -2.98530678528965598417)/(4 * 0.00000400000000000000 * 0.00004000000000000000)
hessian[0][7] = hessian[7][0] = -198.10247886553611351701, (-2.98530682793571600087 - -2.98530683139350472288 - -2.98530678298446350283 + -2.98530679912081087224)/(4 * 0.00000400000000000000 * 0.00000400000000000000)
hessian[0][8] = hessian[8][0] = -1152.59621291663461306598, (-2.98530683024090848221 - -2.98530682102013855683 - -2.98530677261109778087 + -2.98530678183186726216)/(4 * 0.00000400000000000000 * 0.00000100000000000000)
hessian[0][9] = hessian[9][0] = 36.01863252100656609400, (-2.98530681986754231616 - -2.98530683485129344490 - -2.98530678644225222484 + -2.98530678413705974350)/(4 * 0.00000400000000000000 * 0.00003000000000000000)
hessian[0][10] = hessian[10][0] = 10.80558975630196805184, (-2.98530681525715779756 - -2.98530681871494651958 - -2.98530678759484846552 + -2.98530678413705974350)/(4 * 0.00000400000000000000 * 0.00004000000000000000)
hessian[0][11] = hessian[11][0] = -22.81180013404456374815, (-2.98530680949417659420 - -2.98530678413705974350 - -2.98530681525715779756 + -2.98530681179936907554)/(4 * 0.00000400000000000000 * 0.00006000000000000000)
hessian[0][12] = hessian[12][0] = 90.04657991473762024270, (-2.98530678874744470619 - -2.98530683830908216692 - -2.98530680834158035353 + -2.98530680027340666882)/(4 * 0.00000400000000000000 * 0.00004000000000000000)
hessian[0][13] = hessian[13][0] = 108.05589062412579437478, (-2.98530683254610096355 - -2.98530684176687044484 - -2.98530678874744470619 + -2.98530679105263718753)/(4 * 0.00000400000000000000 * 0.00000400000000000000)
hessian[1][1] = 2.70139735233931821412, (-2.98530683485129344490 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530677952667522490)/(4 * 0.00008000000000000001 * 0.00008000000000000001)
hessian[1][2] = hessian[2][1] = 3.60186325210065616531, (-2.98530682678311976019 - -2.98530682102013855683 - -2.98530681410456155689 + -2.98530680718898411286)/(4 * 0.00008000000000000001 * 0.00000100000000000000)
hessian[1][3] = hessian[3][1] = 3.36173894277536033925, (-2.98530682908831224154 - -2.98530684522465916686 - -2.98530680027340666882 + -2.98530678413705974350)/(4 * 0.00008000000000000001 * 0.00003000000000000000)
hessian[1][4] = hessian[4][1] = -1.44074526614579290218, (-2.98530684868244788888 - -2.98530682102013855683 - -2.98530679681561839089 + -2.98530678759484846552)/(4 * 0.00008000000000000001 * 0.00004000000000000000)
hessian[1][5] = hessian[5][1] = -0.42021737941174319708, (-2.98530682563052351952 - -2.98530680718898411286 - -2.98530678874744470619 + -2.98530677837407898423)/(4 * 0.00008000000000000001 * 0.00006000000000000000)
hessian[1][6] = hessian[6][1] = 3.42176998541221477623, (-2.98530681295196531622 - -2.98530684983504412955 - -2.98530680142600290949 + -2.98530679451042590955)/(4 * 0.00008000000000000001 * 0.00004000000000000000)
hessian[1][7] = hessian[7][1] = 0.90046581302516404133, (-2.98530683485129344490 - -2.98530683485129344490 - -2.98530679105263718753 + -2.98530678990004094686)/(4 * 0.00008000000000000001 * 0.00000400000000000000)
hessian[1][8] = hessian[8][1] = 64.83353715003303818776, (-2.98530682102013855683 - -2.98530682908831224154 - -2.98530681986754231616 + -2.98530680718898411286)/(4 * 0.00008000000000000001 * 0.00000100000000000000)
hessian[1][9] = hessian[9][1] = 0.24012417054741772016, (-2.98530683369869720423 - -2.98530683024090848221 - -2.98530680142600290949 + -2.98530679566302215022)/(4 * 0.00008000000000000001 * 0.00003000000000000000)
hessian[1][10] = hessian[10][1] = 0.09004658130251640136, (-2.98530683369869720423 - -2.98530683485129344490 - -2.98530679335782966888 + -2.98530679335782966888)/(4 * 0.00008000000000000001 * 0.00004000000000000000)
hessian[1][11] = hessian[11][1] = 0.42021735628209683222, (-2.98530683139350472288 - -2.98530681179936907554 - -2.98530682217273479750 + -2.98530679451042590955)/(4 * 0.00008000000000000001 * 0.00006000000000000000)
hessian[1][12] = hessian[12][1] = 2.07107130056893806724, (-2.98530677261109778087 - -2.98530682563052351952 - -2.98530676915330905885 + -2.98530679566302215022)/(4 * 0.00008000000000000001 * 0.00004000000000000000)
hessian[1][13] = hessian[13][1] = -3.60186325210065616531, (-2.98530683254610096355 - -2.98530683139350472288 - -2.98530678528965598417 + -2.98530678874744470619)/(4 * 0.00008000000000000001 * 0.00000400000000000000)
hessian[2][2] = -1152.59635169451257752371, (-2.98530684868244788888 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530683946167840759)/(4 * 0.00000100000000000000 * 0.00000100000000000000)
hessian[2][3] = hessian[3][2] = 96.04968302194076557043, (-2.98530682217273479750 - -2.98530684637725540753 - -2.98530682217273479750 + -2.98530683485129344490)/(4 * 0.00000100000000000000 * 0.00003000000000000000)
hessian[2][4] = hessian[4][2] = -36.01863252100655898857, (-2.98530683715648592624 - -2.98530682102013855683 - -2.98530683254610096355 + -2.98530682217273479750)/(4 * 0.00000100000000000000 * 0.00004000000000000000)
hessian[2][5] = hessian[5][2] = 28.81490786717695939956, (-2.98530682447792727885 - -2.98530681756235027891 - -2.98530681756235027891 + -2.98530680373119539084)/(4 * 0.00000100000000000000 * 0.00006000000000000000)
hessian[2][6] = hessian[6][2] = 0.00000000000000000000, (-2.98530682678311976019 - -2.98530684061427420417 - -2.98530681410456155689 + -2.98530682793571600087)/(4 * 0.00000100000000000000 * 0.00004000000000000000)
hessian[2][7] = hessian[7][2] = 288.14903241247691312310, (-2.98530683715648592624 - -2.98530683139350472288 - -2.98530684291946668552 + -2.98530683254610096355)/(4 * 0.00000100000000000000 * 0.00000400000000000000)
hessian[2][8] = hessian[8][2] = 3169.63955082627535375650, (-2.98530683254610096355 - -2.98530684868244788888 - -2.98530683369869720423 + -2.98530683715648592624)/(4 * 0.00000100000000000000 * 0.00000100000000000000)
hessian[2][9] = hessian[9][2] = 19.20993364379341983295, (-2.98530683946167840759 - -2.98530683139350472288 - -2.98530682908831224154 + -2.98530681871494651958)/(4 * 0.00000100000000000000 * 0.00003000000000000000)
hessian[2][10] = hessian[10][2] = -93.64844177905949607066, (-2.98530684868244788888 - -2.98530680488379163151 - -2.98530683946167840759 + -2.98530681064677283487)/(4 * 0.00000100000000000000 * 0.00004000000000000000)
hessian[2][11] = hessian[11][2] = 0.00000185037170770859, (-2.98530682102013855683 - -2.98530679451042590955 - -2.98530681525715779756 + -2.98530678874744470619)/(4 * 0.00000100000000000000 * 0.00006000000000000000)
hessian[2][12] = hessian[12][2] = 7.20372650420131233062, (-2.98530676454292409616 - -2.98530681295196531622 - -2.98530676339032785549 + -2.98530681064677283487)/(4 * 0.00000100000000000000 * 0.00004000000000000000)
hessian[2][13] = hessian[13][2] = -432.22359025207879312802, (-2.98530683715648592624 - -2.98530682678311976019 - -2.98530682678311976019 + -2.98530682332533103818)/(4 * 0.00000100000000000000 * 0.00000400000000000000)
hessian[3][3] = 3.52182172314030594862, (-2.98530682793571600087 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530684291946668552)/(4 * 0.00003000000000000000 * 0.00003000000000000000)
hessian[3][4] = hessian[4][3] = -4.56235993429032671287, (-2.98530684176687044484 - -2.98530681871494651958 - -2.98530684522465916686 + -2.98530684407206292619)/(4 * 0.00003000000000000000 * 0.00004000000000000000)
hessian[3][5] = hessian[5][3] = 1.28066248963578899200, (-2.98530682332533103818 - -2.98530681986754231616 - -2.98530683369869720423 + -2.98530682102013855683)/(4 * 0.00003000000000000000 * 0.00006000000000000000)
hessian[3][6] = hessian[6][3] = -1.20062099151496659566, (-2.98530684637725540753 - -2.98530685098764037022 - -2.98530683254610096355 + -2.98530684291946668552)/(4 * 0.00003000000000000000 * 0.00004000000000000000)
hessian[3][7] = hessian[7][3] = -7.20372650420131321880, (-2.98530683254610096355 - -2.98530682793571600087 - -2.98530684637725540753 + -2.98530684522465916686)/(4 * 0.00003000000000000000 * 0.00000400000000000000)
hessian[3][8] = hessian[8][3] = -67.23477700513551269523, (-2.98530684407206292619 - -2.98530682332533103818 - -2.98530685444542909224 + -2.98530684176687044484)/(4 * 0.00003000000000000000 * 0.00000100000000000000)
hessian[3][9] = hessian[9][3] = 1.92099373445368359903, (-2.98530683369869720423 - -2.98530683024090848221 - -2.98530685214023661089 + -2.98530684176687044484)/(4 * 0.00003000000000000000 * 0.00003000000000000000)
hessian[3][10] = hessian[10][3] = -1.44074520832167696227, (-2.98530685098764037022 - -2.98530681756235027891 - -2.98530685329283285157 + -2.98530682678311976019)/(4 * 0.00003000000000000000 * 0.00004000000000000000)
hessian[3][11] = hessian[11][3] = -0.32016562240894724800, (-2.98530682908831224154 - -2.98530681525715779756 - -2.98530682102013855683 + -2.98530680949417659420)/(4 * 0.00003000000000000000 * 0.00006000000000000000)
hessian[3][12] = hessian[12][3] = -0.48024834109483544031, (-2.98530678298446350283 - -2.98530683139350472288 - -2.98530677952667522490 + -2.98530683024090848221)/(4 * 0.00003000000000000000 * 0.00004000000000000000)
hessian[3][13] = hessian[13][3] = 0.00000000000000000000, (-2.98530683254610096355 - -2.98530683024090848221 - -2.98530684868244788888 + -2.98530684637725540753)/(4 * 0.00003000000000000000 * 0.00000400000000000000)
hessian[4][4] = 1.08055890624125772170, (-2.98530683600388968557 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530684061427420417)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[4][5] = hessian[5][4] = -2.64136629235522901737, (-2.98530684061427420417 - -2.98530681986754231616 - -2.98530681756235027891 + -2.98530682217273479750)/(4 * 0.00004000000000000000 * 0.00006000000000000000)
hessian[4][6] = hessian[6][4] = 1.62083853283423429126, (-2.98530684176687044484 - -2.98530685905581405493 - -2.98530683715648592624 + -2.98530684407206292619)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[4][7] = hessian[7][4] = -1.80093162605032808266, (-2.98530685444542909224 - -2.98530685214023661089 - -2.98530683715648592624 + -2.98530683600388968557)/(4 * 0.00004000000000000000 * 0.00000400000000000000)
hessian[4][8] = hessian[8][4] = 7.20372927975887389351, (-2.98530684291946668552 - -2.98530683715648592624 - -2.98530683600388968557 + -2.98530682908831224154)/(4 * 0.00004000000000000000 * 0.00000100000000000000)
hessian[4][9] = hessian[9][4] = 5.28273267722904371624, (-2.98530684983504412955 - -2.98530684637725540753 - -2.98530683830908216692 + -2.98530680949417659420)/(4 * 0.00004000000000000000 * 0.00003000000000000000)
hessian[4][10] = hessian[10][4] = 6.30326055239827010013, (-2.98530684061427420417 - -2.98530683139350472288 - -2.98530684407206292619 + -2.98530679451042590955)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[4][11] = hessian[11][4] = -2.88149055542123200269, (-2.98530682908831224154 - -2.98530680142600290949 - -2.98530681756235027891 + -2.98530681756235027891)/(4 * 0.00004000000000000000 * 0.00006000000000000000)
hessian[4][12] = hessian[12][4] = 0.18009316260503280271, (-2.98530678644225222484 - -2.98530683254610096355 - -2.98530678528965598417 + -2.98530683024090848221)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[4][13] = hessian[13][4] = 5.40279487815098402592, (-2.98530683830908216692 - -2.98530683715648592624 - -2.98530683024090848221 + -2.98530682563052351952)/(4 * 0.00004000000000000000 * 0.00000400000000000000)
hessian[5][5] = 3.92202875115148996699, (-2.98530683254610096355 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530679451042590955)/(4 * 0.00006000000000000000 * 0.00006000000000000000)
hessian[5][6] = hessian[6][5] = -7.32378842756749648402, (-2.98530684291946668552 - -2.98530684176687044484 - -2.98530677145850154020 + -2.98530684061427420417)/(4 * 0.00006000000000000000 * 0.00004000000000000000)
hessian[5][7] = hessian[7][5] = 7.20372696679423984989, (-2.98530682793571600087 - -2.98530683946167840759 - -2.98530681756235027891 + -2.98530682217273479750)/(4 * 0.00006000000000000000 * 0.00000400000000000000)
hessian[5][8] = hessian[8][5] = 33.61738850256775634762, (-2.98530683715648592624 - -2.98530682447792727885 - -2.98530683600388968557 + -2.98530681525715779756)/(4 * 0.00006000000000000000 * 0.00000100000000000000)
hessian[5][9] = hessian[9][5] = 4.16215302963725708452, (-2.98530682793571600087 - -2.98530683830908216692 - -2.98530684061427420417 + -2.98530682102013855683)/(4 * 0.00006000000000000000 * 0.00003000000000000000)
hessian[5][10] = hessian[10][5] = 1.08055892937090414208, (-2.98530683139350472288 - -2.98530682217273479750 - -2.98530683254610096355 + -2.98530681295196531622)/(4 * 0.00006000000000000000 * 0.00004000000000000000)
hessian[5][11] = hessian[11][5] = 0.80041402518283966128, (-2.98530681640975403823 - -2.98530680142600290949 - -2.98530680718898411286 + -2.98530678067927102148)/(4 * 0.00006000000000000000 * 0.00006000000000000000)
hessian[5][12] = hessian[12][5] = -0.72037265042013121086, (-2.98530679220523342821 - -2.98530680949417659420 - -2.98530678644225222484 + -2.98530681064677283487)/(4 * 0.00006000000000000000 * 0.00004000000000000000)
hessian[5][13] = hessian[13][5] = 1.20062108403355227715, (-2.98530683830908216692 - -2.98530683369869720423 - -2.98530682563052351952 + -2.98530681986754231616)/(4 * 0.00006000000000000000 * 0.00000400000000000000)
hessian[6][6] = 1.08055883685231868263, (-2.98530683830908216692 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530683830908216692)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[6][7] = hessian[7][6] = -12.60652068846290596582, (-2.98530684176687044484 - -2.98530683600388968557 - -2.98530685214023661089 + -2.98530685444542909224)/(4 * 0.00004000000000000000 * 0.00000400000000000000)
hessian[6][8] = hessian[8][6] = -108.05589756301968407115, (-2.98530685329283285157 - -2.98530682102013855683 - -2.98530684752985164820 + -2.98530683254610096355)/(4 * 0.00004000000000000000 * 0.00000100000000000000)
hessian[6][9] = hessian[9][6] = -2.88149050916193960603, (-2.98530685329283285157 - -2.98530683369869720423 - -2.98530685098764037022 + -2.98530684522465916686)/(4 * 0.00004000000000000000 * 0.00003000000000000000)
hessian[6][10] = hessian[10][6] = -3.24167685750165146530, (-2.98530683600388968557 - -2.98530681525715779756 - -2.98530684291946668552 + -2.98530684291946668552)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[6][11] = hessian[11][6] = 4.08211154693619882039, (-2.98530683139350472288 - -2.98530682102013855683 - -2.98530686481879481420 + -2.98530681525715779756)/(4 * 0.00004000000000000000 * 0.00006000000000000000)
hessian[6][12] = hessian[12][6] = 1.08055897563019676078, (-2.98530680027340666882 - -2.98530684522465916686 - -2.98530680373119539084 + -2.98530684176687044484)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[6][13] = hessian[13][6] = 14.40745231451323427052, (-2.98530683600388968557 - -2.98530684637725540753 - -2.98530683830908216692 + -2.98530683946167840759)/(4 * 0.00004000000000000000 * 0.00000400000000000000)
hessian[7][7] = 36.01862558211266218677, (-2.98530684176687044484 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530683946167840759)/(4 * 0.00000400000000000000 * 0.00000400000000000000)
hessian[7][8] = hessian[8][7] = -648.33538537811818969203, (-2.98530684868244788888 - -2.98530683369869720423 - -2.98530684291946668552 + -2.98530683830908216692)/(4 * 0.00000400000000000000 * 0.00000100000000000000)
hessian[7][9] = hessian[9][7] = -2.40124216806710455430, (-2.98530684752985164820 - -2.98530683715648592624 - -2.98530684522465916686 + -2.98530683600388968557)/(4 * 0.00000400000000000000 * 0.00003000000000000000)
hessian[7][10] = hessian[10][7] = -45.02328926347941973063, (-2.98530684868244788888 - -2.98530681179936907554 - -2.98530683600388968557 + -2.98530682793571600087)/(4 * 0.00000400000000000000 * 0.00004000000000000000)
hessian[7][11] = hessian[11][7] = -25.21304230211167052289, (-2.98530683946167840759 - -2.98530680488379163151 - -2.98530681640975403823 + -2.98530680603638787218)/(4 * 0.00000400000000000000 * 0.00006000000000000000)
hessian[7][12] = hessian[12][7] = -19.81024719266421740826, (-2.98530679220523342821 - -2.98530683139350472288 - -2.98530677722148274356 + -2.98530682908831224154)/(4 * 0.00000400000000000000 * 0.00004000000000000000)
hessian[7][13] = hessian[13][7] = -36.01863252100656609400, (-2.98530684637725540753 - -2.98530683946167840759 - -2.98530684061427420417 + -2.98530683600388968557)/(4 * 0.00000400000000000000 * 0.00000400000000000000)
hessian[8][8] = 6051.13004148449817876099, (-2.98530682332533103818 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530683600388968557)/(4 * 0.00000100000000000000 * 0.00000100000000000000)
hessian[8][9] = hessian[9][8] = -76.83974937814734573749, (-2.98530685559802533291 - -2.98530684061427420417 - -2.98530683485129344490 + -2.98530682908831224154)/(4 * 0.00000100000000000000 * 0.00003000000000000000)
hessian[8][10] = hessian[10][8] = 201.70433656652161857892, (-2.98530685559802533291 - -2.98530684868244788888 - -2.98530684868244788888 + -2.98530680949417659420)/(4 * 0.00000100000000000000 * 0.00004000000000000000)
hessian[8][11] = hessian[11][8] = -24.01241983029933635407, (-2.98530682678311976019 - -2.98530681064677283487 - -2.98530681295196531622 + -2.98530680257859915017)/(4 * 0.00000100000000000000 * 0.00006000000000000000)
hessian[8][12] = hessian[12][8] = 151.27825103711242604732, (-2.98530678874744470619 - -2.98530684061427420417 - -2.98530676800071281818 + -2.98530679566302215022)/(4 * 0.00000100000000000000 * 0.00004000000000000000)
hessian[8][13] = hessian[13][8] = -216.11179512603939656401, (-2.98530684752985164820 - -2.98530683369869720423 - -2.98530684061427420417 + -2.98530683024090848221)/(4 * 0.00000100000000000000 * 0.00000400000000000000)
hessian[9][9] = 1.28066236627767526812, (-2.98530685098764037022 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530682793571600087)/(4 * 0.00003000000000000000 * 0.00003000000000000000)
hessian[9][10] = hessian[10][9] = -1.68086951764697278833, (-2.98530685675062157358 - -2.98530683024090848221 - -2.98530684061427420417 + -2.98530682217273479750)/(4 * 0.00003000000000000000 * 0.00004000000000000000)
hessian[9][11] = hessian[11][9] = 2.08107648397910027782, (-2.98530682908831224154 - -2.98530680373119539084 - -2.98530682908831224154 + -2.98530678874744470619)/(4 * 0.00003000000000000000 * 0.00006000000000000000)
hessian[9][12] = hessian[12][9] = -6.96360219487601650457, (-2.98530678874744470619 - -2.98530682447792727885 - -2.98530676800071281818 + -2.98530683715648592624)/(4 * 0.00003000000000000000 * 0.00004000000000000000)
hessian[9][13] = hessian[13][9] = 14.40745300840262643760, (-2.98530684752985164820 - -2.98530683369869720423 - -2.98530684868244788888 + -2.98530682793571600087)/(4 * 0.00003000000000000000 * 0.00000400000000000000)
hessian[10][10] = 5.76298106458317160872, (-2.98530684061427420417 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530680603638787218)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[10][11] = hessian[11][10] = 0.24012421680671039437, (-2.98530680142600290949 - -2.98530678990004094686 - -2.98530683024090848221 + -2.98530681640975403823)/(4 * 0.00004000000000000000 * 0.00006000000000000000)
hessian[10][12] = hessian[12][10] = 2.88149053229158580436, (-2.98530679335782966888 - -2.98530684637725540753 - -2.98530677837407898423 + -2.98530681295196531622)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[10][13] = hessian[13][10] = -10.80558975630196805184, (-2.98530685790321781425 - -2.98530684983504412955 - -2.98530681871494651958 + -2.98530681756235027891)/(4 * 0.00004000000000000000 * 0.00000400000000000000)
hessian[11][11] = 3.28169753717312406849, (-2.98530682678311976019 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530680949417659420)/(4 * 0.00006000000000000000 * 0.00006000000000000000)
hessian[11][12] = hessian[12][11] = 2.28117996714516335643, (-2.98530679335782966888 - -2.98530682217273479750 - -2.98530677145850154020 + -2.98530677837407898423)/(4 * 0.00006000000000000000 * 0.00004000000000000000)
hessian[11][13] = hessian[13][11] = -2.40124216806710455430, (-2.98530682447792727885 - -2.98530682563052351952 - -2.98530680718898411286 + -2.98530681064677283487)/(4 * 0.00006000000000000000 * 0.00000400000000000000)
hessian[12][12] = 20.35052688864613301689, (-2.98530677145850154020 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530678183186726216)/(4 * 0.00004000000000000000 * 0.00004000000000000000)
hessian[12][13] = hessian[13][12] = -16.20838463445295474230, (-2.98530677145850154020 - -2.98530677030590529952 - -2.98530682102013855683 + -2.98530683024090848221)/(4 * 0.00004000000000000000 * 0.00000400000000000000)
hessian[13][13] = 198.10247192664220960978, (-2.98530684407206292619 - -2.98530684176687044484 - -2.98530684176687044484 + -2.98530682678311976019)/(4 * 0.00000400000000000000 * 0.00000400000000000000)
gradient[0]: -0.00633927915716370194, (-2.98530684061427420417 - -2.98530678990004094686)/(2 * 0.000004)
gradient[1]: -0.00025933414860013215, (-2.98530683254610096355 - -2.98530679105263718753)/(2 * 0.000080)
gradient[2]: -0.00576298120336105058, (-2.98530683946167840759 - -2.98530682793571600087)/(2 * 0.000001)
gradient[3]: 0.00028814905276656572, (-2.98530683024090848221 - -2.98530684752985164820)/(2 * 0.000030)
gradient[4]: -0.00011525961851610587, (-2.98530684176687044484 - -2.98530683254610096355)/(2 * 0.000040)
gradient[5]: -0.00010565465539495258, (-2.98530683369869720423 - -2.98530682102013855683)/(2 * 0.000060)
gradient[6]: 0.00011525961851610587, (-2.98530683715648592624 - -2.98530684637725540753)/(2 * 0.000040)
gradient[7]: 0.00086444718050415759, (-2.98530684061427420417 - -2.98530684752985164820)/(2 * 0.000004)
gradient[8]: -0.00518668286098034059, (-2.98530684291946668552 - -2.98530683254610096355)/(2 * 0.000001)
gradient[9]: -0.00013446955401027102, (-2.98530684407206292619 - -2.98530683600388968557)/(2 * 0.000030)
gradient[10]: -0.00037459377266735311, (-2.98530684983504412955 - -2.98530681986754231616)/(2 * 0.000040)
gradient[11]: -0.00012486458903874600, (-2.98530682332533103818 - -2.98530680834158035353)/(2 * 0.000060)
gradient[12]: 0.00048985339118345905, (-2.98530678528965598417 - -2.98530682447792727885)/(2 * 0.000040)
gradient[13]: -0.00057629806482495383, (-2.98530684291946668552 - -2.98530683830908216692)/(2 * 0.000004)
line search starting at fitness: -2.985306841766870
initial point: [14]: 0.57171300000000002672, 12.31211899999999914712, -3.30518700000000009709, 148.01025699999999574175, 22.45390199999999936153, 0.42035000000000000142, -0.46885799999999999699, 0.76057900000000000507, -1.36164400000000007651, 177.88423800000001051558, 23.88289199999999823376, 1.21063900000000002066, -1.61197400000000001796, 8.53437800000000024170
loop 1, evaluations: 1, step: 1.000000000000000, fitness: -2.985306747253981
loop 2, evaluations: 2, step: 2.000000000000000, fitness: -2.985306733422826
loop 2, evaluations: 3, step: 4.000000000000000, fitness: -2.985306850987640
loop 3, evaluations: 4, step: 1.785714283530075, fitness: -2.985306755322155
loop 3, evaluations: 5, step: 2.595721102268431, fitness: -2.985306818714947
loop 3, evaluations: 6, step: 2.061540494505907, fitness: -2.985306771458502
loop 3, evaluations: 7, step: 1.912425577977664, fitness: -2.985306748406577
loop 3, evaluations: 8, step: 1.972377618796003, fitness: -2.985306729965038
loop 3, evaluations: 9, step: 1.973523614802404, fitness: -2.985306733422826
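
For anyone reading the log above: the expression printed next to each gradient[i] and hessian[i][j] value is a central-difference quotient with one step size per parameter. A minimal Python sketch of that arithmetic follows. It is reconstructed only from the printed formulas, so the function names and the test function are assumptions, not the project's code.

```python
def gradient(f, x, steps):
    """Central difference per parameter: (f(x + h_i) - f(x - h_i)) / (2 * h_i)."""
    grad = []
    for i, h in enumerate(steps):
        plus, minus = list(x), list(x)
        plus[i] += h
        minus[i] -= h
        grad.append((f(plus) - f(minus)) / (2.0 * h))
    return grad


def hessian(f, x, steps):
    """(f(++) - f(+-) - f(-+) + f(--)) / (4 * h_i * h_j), as printed in the log.

    For i == j the two mixed terms land back on x (up to rounding), which is
    why the diagonal lines in the log show the starting fitness twice.
    """
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            def shifted(sign_i, sign_j):
                p = list(x)
                p[i] += sign_i * steps[i]
                p[j] += sign_j * steps[j]
                return f(p)

            value = (shifted(1, 1) - shifted(1, -1)
                     - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * steps[i] * steps[j])
            hess[i][j] = hess[j][i] = value
    return hess


def quadratic(p):
    # Hypothetical test function with a known gradient and Hessian.
    return p[0] ** 2 + 3.0 * p[0] * p[1] + 2.0 * p[1] ** 2


grad = gradient(quadratic, [1.0, 2.0], [1e-3, 2e-3])
hess = hessian(quadratic, [1.0, 2.0], [1e-3, 2e-3])
print(grad)
print(hess)
```

For a quadratic the central differences are exact up to rounding, so the printed values should sit very close to [8, 11] and [[2, 3], [3, 4]].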
trisf

Joined: 30 Nov 08
Posts: 11
Credit: 25,658
RAC: 0
Message 23920 - Posted: 2 Jun 2009, 8:04:38 UTC

out file

hessian [14 x 14]:
882.45647594797912915965 -6.30326069117614817827 -792.40988770656883843913 -69.63602102357431533619 54.02794739373106125413 55.22856894035754748984 21.61117881871454571296 -198.10247886553611351701 -1152.59621291663461306598 36.01863252100656609400 10.80558975630196805184 -22.81180013404456374815 90.04657991473762024270 108.05589062412579437478
-6.30326069117614817827 2.70139735233931821412 3.60186325210065616531 3.36173894277536033925 -1.44074526614579290218 -0.42021737941174319708 3.42176998541221477623 0.90046581302516404133 64.83353715003303818776 0.24012417054741772016 0.09004658130251640136 0.42021735628209683222 2.07107130056893806724 -3.60186325210065616531
-792.40988770656883843913 3.60186325210065616531 -1152.59635169451257752371 96.04968302194076557043 -36.01863252100655898857 28.81490786717695939956 0.00000000000000000000 288.14903241247691312310 3169.63955082627535375650 19.20993364379341983295 -93.64844177905949607066 0.00000185037170770859 7.20372650420131233062 -432.22359025207879312802
-69.63602102357431533619 3.36173894277536033925 96.04968302194076557043 3.52182172314030594862 -4.56235993429032671287 1.28066248963578899200 -1.20062099151496659566 -7.20372650420131321880 -67.23477700513551269523 1.92099373445368359903 -1.44074520832167696227 -0.32016562240894724800 -0.48024834109483544031 0.00000000000000000000
54.02794739373106125413 -1.44074526614579290218 -36.01863252100655898857 -4.56235993429032671287 1.08055890624125772170 -2.64136629235522901737 1.62083853283423429126 -1.80093162605032808266 7.20372927975887389351 5.28273267722904371624 6.30326055239827010013 -2.88149055542123200269 0.18009316260503280271 5.40279487815098402592
55.22856894035754748984 -0.42021737941174319708 28.81490786717695939956 1.28066248963578899200 -2.64136629235522901737 3.92202875115148996699 -7.32378842756749648402 7.20372696679423984989 33.61738850256775634762 4.16215302963725708452 1.08055892937090414208 0.80041402518283966128 -0.72037265042013121086 1.20062108403355227715
21.61117881871454571296 3.42176998541221477623 0.00000000000000000000 -1.20062099151496659566 1.62083853283423429126 -7.32378842756749648402 1.08055883685231868263 -12.60652068846290596582 -108.05589756301968407115 -2.88149050916193960603 -3.24167685750165146530 4.08211154693619882039 1.08055897563019676078 14.40745231451323427052
-198.10247886553611351701 0.90046581302516404133 288.14903241247691312310 -7.20372650420131321880 -1.80093162605032808266 7.20372696679423984989 -12.60652068846290596582 36.01862558211266218677 -648.33538537811818969203 -2.40124216806710455430 -45.02328926347941973063 -25.21304230211167052289 -19.81024719266421740826 -36.01863252100656609400
-1152.59621291663461306598 64.83353715003303818776 3169.63955082627535375650 -67.23477700513551269523 7.20372927975887389351 33.61738850256775634762 -108.05589756301968407115 -648.33538537811818969203 6051.13004148449817876099 -76.83974937814734573749 201.70433656652161857892 -24.01241983029933635407 151.27825103711242604732 -216.11179512603939656401
36.01863252100656609400 0.24012417054741772016 19.20993364379341983295 1.92099373445368359903 5.28273267722904371624 4.16215302963725708452 -2.88149050916193960603 -2.40124216806710455430 -76.83974937814734573749 1.28066236627767526812 -1.68086951764697278833 2.08107648397910027782 -6.96360219487601650457 14.40745300840262643760
10.80558975630196805184 0.09004658130251640136 -93.64844177905949607066 -1.44074520832167696227 6.30326055239827010013 1.08055892937090414208 -3.24167685750165146530 -45.02328926347941973063 201.70433656652161857892 -1.68086951764697278833 5.76298106458317160872 0.24012421680671039437 2.88149053229158580436 -10.80558975630196805184
-22.81180013404456374815 0.42021735628209683222 0.00000185037170770859 -0.32016562240894724800 -2.88149055542123200269 0.80041402518283966128 4.08211154693619882039 -25.21304230211167052289 -24.01241983029933635407 2.08107648397910027782 0.24012421680671039437 3.28169753717312406849 2.28117996714516335643 -2.40124216806710455430
90.04657991473762024270 2.07107130056893806724 7.20372650420131233062 -0.48024834109483544031 0.18009316260503280271 -0.72037265042013121086 1.08055897563019676078 -19.81024719266421740826 151.27825103711242604732 -6.96360219487601650457 2.88149053229158580436 2.28117996714516335643 20.35052688864613301689 -16.20838463445295474230
108.05589062412579437478 -3.60186325210065616531 -432.22359025207879312802 0.00000000000000000000 5.40279487815098402592 1.20062108403355227715 14.40745231451323427052 -36.01863252100656609400 -216.11179512603939656401 14.40745300840262643760 -10.80558975630196805184 -2.40124216806710455430 -16.20838463445295474230 198.10247192664220960978
gradient[14]: -0.00633927915716370194, -0.00025933414860013215, -0.00576298120336105058, 0.00028814905276656572, -0.00011525961851610587, -0.00010565465539495258, 0.00011525961851610587, 0.00086444718050415759, -0.00518668286098034059, -0.00013446955401027102, -0.00037459377266735311, -0.00012486458903874600, 0.00048985339118345905, -0.00057629806482495383
initial_fitness: -2.98530684176687044484
inital_parameters[14]: 0.57171300000000002672, 12.31211899999999914712, -3.30518700000000009709, 148.01025699999999574175, 22.45390199999999936153, 0.42035000000000000142, -0.46885799999999999699, 0.76057900000000000507, -1.36164400000000007651, 177.88423800000001051558, 23.88289199999999823376, 1.21063900000000002066, -1.61197400000000001796, 8.53437800000000024170
result_fitness: -2.98530673342282648619
result_parameters[14]: 0.57171023310301738451, 12.31159045077501801302, -3.30517495098843516743, 148.01059727942629251629, 22.45401028483425420745, 0.42010611491115568139, -0.46892261685864594645, 0.76055531511989238336, -1.36164776621840832860, 177.88429089670174221283, 23.88286053550067578044, 1.21053790484657564086, -1.61181317710548888122, 8.53439162571984688554
number_evaluations: 444
metadata: it: 5, ev: 588
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 23935 - Posted: 2 Jun 2009, 9:44:28 UTC - in response to Message 23920.  

Sweet. Looks like you're getting what I'm getting. Is that for stripe 20?
trisf

Joined: 30 Nov 08
Posts: 11
Credit: 25,658
RAC: 0
Message 23953 - Posted: 2 Jun 2009, 11:40:52 UTC

Yes, for stripe 20.
Cluster Physik

Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 23998 - Posted: 2 Jun 2009, 21:12:16 UTC - in response to Message 23935.  

initial likelihood: -2.98530684176687044484

Sweet. Looks like you're getting what I'm getting. Is that for stripe 20?

If that is okay, I guess my single precision ATI implementation will do it, too.

I'm getting a fitness of -2.985312812926748 for the stripe20 test unit.
As a reference point, the stock CPU app (using a complete DP calculation) arrives at a value of -2.985312797571472.

So it appears my approach is actually a bit (two digits, i.e. 6 bits) more precise :D
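
The comparison can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes the quoted initial likelihood comes from the single precision GPU build and treats the stock CPU value as the reference. The variable names are made up for illustration.

```python
import math

reference_dp = -2.985312797571472      # stock CPU app, full double precision
quoted_gpu_sp = -2.98530684176687044484  # "initial likelihood" quoted above (assumed SP GPU result)
ati_sp_kahan = -2.985312812926748      # single precision ATI result with Kahan sums

error_quoted = abs(quoted_gpu_sp - reference_dp)
error_kahan = abs(ati_sp_kahan - reference_dp)

gain_digits = math.log10(error_quoted / error_kahan)
gain_bits = math.log2(error_quoted / error_kahan)
print(error_quoted, error_kahan)  # roughly 6e-6 and 1.5e-8
print(gain_digits, gain_bits)     # roughly 2.6 decimal digits, about 8.6 bits
```

Under those assumptions the gain comes out at a little over two decimal digits, in line with the estimate above.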

At the moment I'm using the integration layout mentioned in the other thread (mu-r plane) and doing all summations with the Kahan method. This includes the convolution loop, the summation of the values across the different mu-r planes, and the final reduction (done on the GPU in SP as a tree-like Kahan sum). That way I have to transfer virtually nothing (16 bytes or so for the whole integral) back from the GPU to the CPU. It appears to me that doing the reduction on the CPU is unnecessary. I should mention, though, that I do all CPU operations (including the likelihood computation) in DP. I still have to test whether one loses precision there (or have you tried that already, Travis?).
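
For readers who have not met Kahan summation: it carries a second variable holding the rounding error of each addition and feeds it back into the next one. Below is a minimal CPU-side sketch in Python that emulates single precision by rounding through `struct`. It is not the GPU kernel, and the helper names and the test data are assumptions chosen for illustration.

```python
import math
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_sum_f32(values):
    """Plain running sum, rounded to single precision after every addition."""
    total = 0.0
    for v in values:
        total = f32(total + v)
    return total


def kahan_sum_f32(values):
    """Kahan (compensated) sum with every intermediate rounded to single precision."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = f32(v - comp)               # apply the correction to the next term
        t = f32(total + y)              # low-order bits of y are lost here
        comp = f32(f32(t - total) - y)  # recover what was lost
        total = t
    return total


values = [f32(0.1)] * 100_000
exact = math.fsum(values)  # correctly rounded double precision reference
naive_error = abs(naive_sum_f32(values) - exact)
kahan_error = abs(kahan_sum_f32(values) - exact)
print(naive_error, kahan_error)
```

The tree-like reduction mentioned above would combine partial (sum, compensation) pairs pairwise instead of walking one long loop. That part is not sketched here.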

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 24013 - Posted: 3 Jun 2009, 0:54:44 UTC - in response to Message 23998.  

I'm working on getting the Kahan summation working to see how much that improves the accuracy.

The general plan for milkyway_gpu is to have 2 types of applications:

1. single precision (probably using Kahan summation in the kernel for highest accuracy) for GPUs that don't support double precision

2. double precision for GPUs that do support double precision

The server will have different validation for comparing single <-> single, single <-> double, and double <-> double precision results, and we'll update our searches using the single precision values for exploration and the double precision values for highly accurate exploitation.
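
A rough sketch of what precision-aware validation could look like, in Python. This is not the server's validator: the tolerance values, the precision labels and the function name are all hypothetical, picked only to show one tolerance per pairing.

```python
import math

# Hypothetical relative tolerances per precision pairing (keys are sorted names).
TOLERANCES = {
    ("double", "double"): 1e-12,
    ("double", "single"): 1e-5,
    ("single", "single"): 1e-5,
}


def results_agree(fitness_a, precision_a, fitness_b, precision_b):
    """Compare two fitness values with a tolerance chosen by their precisions."""
    key = tuple(sorted((precision_a, precision_b)))
    return math.isclose(fitness_a, fitness_b, rel_tol=TOLERANCES[key], abs_tol=0.0)


# Values posted earlier in this thread for the stripe 20 test unit.
mixed = results_agree(-2.98530684176687, "single", -2.985312797571472, "double")
strict = results_agree(-2.98530684176687, "double", -2.985312797571472, "double")
print(mixed, strict)
```

With these made-up tolerances the two posted values pass as a single/double pair but would be rejected as a double/double pair.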

I'm hoping to have the code updated tomorrow with double and single precision kernels (with the single precision kernel doing a Kahan summation). I'll post the values I'm getting for the different streams made available in the code package and we can work from there.

After that the main priority will be getting workunits available on milkyway_gpu.
trisf

Joined: 30 Nov 08
Posts: 11
Credit: 25,658
RAC: 0
Message 24210 - Posted: 4 Jun 2009, 22:34:26 UTC
Last modified: 4 Jun 2009, 22:52:23 UTC

CPU (Intel Core 2 E6750) vs. GPU (NVIDIA GeForce 9600 GT) computation times on linux_x86_64 for stripe-20. Could you explain this?

CPU:
real 4m15.913s
user 4m6.327s
sys 0m0.800s

GPU:
real 15m45.243s
user 15m30.878s
sys 0m0.484s
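
To put a number on the gap, the wall-clock times above can be converted to seconds and divided. A small Python sketch follows. The parsing helper is an assumption that only handles the `XmY.YYYs` format shown here.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style value such as '4m15.913s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


cpu_real = to_seconds("4m15.913s")
gpu_real = to_seconds("15m45.243s")
slowdown = gpu_real / cpu_real
print(cpu_real, gpu_real, slowdown)  # the GPU run takes about 3.7 times as long
```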

About the makefile: what is the difference between these two lines?
48: LINUX_LDFLAGS_i686 = -L/usr/X11R6/lib -L/usr/local/lib
53: LINUX_LDFLAGS_x86_64 = -L/usr/local/lib
speedimic

Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 24314 - Posted: 5 Jun 2009, 23:04:39 UTC - in response to Message 24210.  

looks like the cpu is faster...


CPU(intel core2 e6750) vs GPU(nv geforce 9600gt) computation times linux_x86_64 on stripe-20. Could you explain this?

CPU:
real 4m15.913s
user 4m6.327s
sys 0m0.800s

GPU:
real 15m45.243s
user 15m30.878s
sys 0m0.484s

about makefile
what the difference?
48: LINUX_LDFLAGS_i686 = -L/usr/X11R6/lib -L/usr/local/lib
53: LINUX_LDFLAGS_x86_64 = -L/usr/local/lib


mic.


Cluster Physik

Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 24325 - Posted: 6 Jun 2009, 0:00:30 UTC - in response to Message 24314.  

looks like the cpu is faster...

Or the GPU calculates quite a bit more ;)
speedimic

Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 24391 - Posted: 6 Jun 2009, 17:58:59 UTC - in response to Message 24325.  

looks like the cpu is faster...

Or the GPU calculates quite a bit more ;)


;))
mic.


