Separation Project Coming To An End

Author	Message
kotenok2000 Send message Joined: 22 May 11 Posts: 75 Credit: 5,758,579 RAC: 115	Message 75534 - Posted: 14 Jun 2023, 17:28:30 UTC - in response to Message 75533. There are two stale branches https://github.com/Milkyway-at-home/milkywayathome_client/tree/meyerGPU https://github.com/Milkyway-at-home/milkywayathome_client/tree/sheilsGPU ID: 75534 · Rating: 0 · rate: / Reply Quote

ahorek's team Send message Joined: 8 Sep 07 Posts: 8 Credit: 2,556,455 RAC: 20	Message 75535 - Posted: 14 Jun 2023, 18:05:28 UTC - in response to Message 75533. the code is available here https://github.com/Milkyway-at-home/milkywayathome_client/blob/master/nbody/kernels/nbody_kernels.cl I wanted to test the CPU vs GPU difference by myself enable OpenCL -DNBODY_OPENCL=ON + specify libraries -DOPENCL_LIBRARIES=C:/mingw/msys64/mingw64/bin/OpenCL.dll -DOPENCL_INCLUDE_DIRS=C:/mingw/msys64/mingw64/include/CL the GPU load rises, but after a few seconds, it crashes on a driver timeout. It could be a problem with my environment and I'm on Windows, so... milkyway_nbody -f settings.lua -o output_0gy.out -h correct_hist.hist -z hist_test.hist -n 32 -b -i 3.0 1.0 0.2 0.2 12 0.2 -p 0 -d 0 Using OpenMP 32 max threads on a system with 32 processors Found 1 platform Platform 0 information: Name: AMD Accelerated Parallel Processing Version: OpenCL 2.1 AMD-APP (3516.0) Vendor: Advanced Micro Devices, Inc. 
Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices Profile: FULL_PROFILE Using device 0 on platform 0 Found 1 CL device Device 'gfx1100' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Board: AMD Radeon RX 7900 XTX Driver version: 3516.0 (PAL,LC) Version: OpenCL 2.0 AMD-APP (3516.0) Compute capability: 0.0 Max compute units: 48 Clock frequency: 2482 Mhz Global mem size: 25753026560 Local mem size: 65536 Max const buf size: 25753026560 Double extension: cl_khr_fp64 Running MilkyWay@home Nbody v1.85 Optimal Softening Length = 0.112929680735593 kpc Dwarf Initial Position: [-34.375055953159666,104.152234974946268,-20.716946238453204] Dwarf Initial Velocity: [9.468491711421557,92.047887884146391,-59.319309910185652] Initial LMC position: [82.245240275588799,509.425796198476291,-150.407601305508365] Initial LMC velocity: [-16.741582632867392,-129.695727529343941,7.340647956328183] Kernel Compile Flags: -DDEBUG=0 -DDOUBLEPREC=1 -cl-mad-enable -DNBODY=40000 -DEFFNBODY=40000 -DNNODE=79999 -DWARPSIZE=64 -DNOSORT=0 -DTHREADS1=256 -DTHREADS2=256 -DTHREADS3=256 -DTHREADS4=256 -DTHREADS5=256 -DTHREADS6=256 -DTHREADS7=256 -DTHREADS8=256 -DMAXDEPTH=128 -DTIMESTEP=0x1.eed840c14c795p-12 -DEPS2=0x1.a1e4dd2e0b9bdp-7 -DTHETA=0x1p+0 -DUSE_QUAD=1 -DTREECODE=1 -DSW93=0 -DBH86=0 -DEXACT=0 -DUSE_EXTERNAL_POTENTIAL=1 -DDISK_TYPE=1 -DDISK_2_TYPE=0 -DHALO_TYPE=1 -DSPHERE_TYPE=1 -DSPHERICAL_MASS=0x1.2abd3374bc6a8p+17 -DSPHERICAL_SCALE=0x1.6666666666666p-1 -DDISK_MASS=0x1.b36a78d4fdf3bp+18 -DDISK_SCALE_LENGTH=0x1.ap+2 -DDISK_SCALE_HEIGHT=0x1.0a3d70a3d70a4p-2 -DHALO_VHALO=0x1.2a70a3d70a3d7p+6 -DHALO_SCALE_LENGTH=0x1.8p+3 -DHALO_FLATTEN_Z=0x1p+0 -DHALO_FLATTEN_Y=0x0p+0 -DHALO_FLATTEN_X=0x0p+0 -DHALO_TRIAX_ANGLE=0x0p+0 -DHALO_C1=0x0p+0 -DHALO_C2=0x0p+0 -DHALO_C3=0x0p+0 -DHALO_MASS=0x0p+0 -DHALO_GAMMA=0x0p+0 -DHALO_LAMBDA=0x0p+0 -DHALO_RHO0=0x0p+0 -DHAVE_INLINE_PTX=0 -DHAVE_CONSISTENT_MEMORY=0 19:47:17: 
Process 136840 created scene instance 0
--------------------------------------------------------------------------------
Total timing over 6357 steps:
                        Average            Total          Fraction
                    ----------------  ----------------  ----------------
      boundingBox:      0.000000          0.000000           nan%
        buildTree:      0.000000          0.000000           nan%
    summarization:      0.000000          0.000000           nan%
             sort:      0.000000          0.000000           nan%
     quad moments:      0.000000          0.000000           nan%
 forceCalculation:      0.000000          0.000000           nan%
      integration:      0.000000          0.000000           nan%
==============================================================================
            total       0.000000          0.000000           nan%
--------------------------------------------------------------------------------
19:47:37: Making final checkpoint
Running MilkyWay@home Nbody v1.85
Running MilkyWay@home Nbody v1.85
Error opening histogram file 'correct_hist.hist'
19:47:38: Removing checkpoint file 'nbody_checkpoint'
ID: 75535 · Rating: 0 · rate: / Reply Quote
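The -D values on the "Kernel Compile Flags" line in the log above are written as C99 hexadecimal floating-point literals, which are exact but hard to read by eye. Here is a minimal Python sketch that decodes a few of them with the standard float.fromhex. The flag values are copied from the log; the helper name decode_flags is only illustrative, and the reading of EPS2 as the squared softening length is an assumption, not something the log states.

```python
import math

# A few values copied verbatim from the "Kernel Compile Flags" line in the log.
FLAGS = {
    "TIMESTEP": "0x1.eed840c14c795p-12",
    "EPS2": "0x1.a1e4dd2e0b9bdp-7",
    "THETA": "0x1p+0",
    "DISK_SCALE_LENGTH": "0x1.ap+2",
    "HALO_SCALE_LENGTH": "0x1.8p+3",
}


def decode_flags(flags):
    """Convert C99 hex-float strings to Python floats (hypothetical helper)."""
    return {name: float.fromhex(text) for name, text in flags.items()}


decoded = decode_flags(FLAGS)

# Assumption: EPS2 is the square of the softening length printed in the log.
softening_guess = math.sqrt(decoded["EPS2"])

if __name__ == "__main__":
    for name, value in decoded.items():
        print(f"{name:>18} = {value!r}")
    print(f"{'sqrt(EPS2)':>18} = {softening_guess!r}")
```

THETA decodes to 1.0, DISK_SCALE_LENGTH to 6.5 and HALO_SCALE_LENGTH to 12.0. sqrt(EPS2) lands very close to the "Optimal Softening Length" of about 0.1129 kpc printed in the same log, which is at least consistent with the assumption above.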

Cesium_133* Send message Joined: 22 Jan 09 Posts: 2 Credit: 3,574,972 RAC: 1,238	Message 75536 - Posted: 14 Jun 2023, 18:33:41 UTC - in response to Message 75485. We would appreciate your input on this because we expect that it will probably take some time for GPU-oriented users to swap that hardware over to different projects. How long would you like us to wait before we shut down Separation? You have a loyal following of GPU users, including this isotope, who like using their GPU's on this project. Einstein@Home is my backup for GPU usage. How long to keep this going? Until you can squeeze not a single additional useful WU out of your GPU coding. Not a second less. You have a great thing going here. I would hope you'd run it as long as your equipment, time (given your research paper coming out), and funds allow- Best to you and your great project- :) ID: 75536 · Rating: 0 · rate: / Reply Quote

Ian&Steve C. Send message Joined: 18 Nov 22 Posts: 84 Credit: 653,454,842 RAC: 17,912	Message 75537 - Posted: 14 Jun 2023, 19:06:16 UTC - in response to Message 75535. the code is available here https://github.com/Milkyway-at-home/milkywayathome_client/blob/master/nbody/kernels/nbody_kernels.cl I wanted to test the CPU vs GPU difference by myself enable OpenCL -DNBODY_OPENCL=ON + specify libraries -DOPENCL_LIBRARIES=C:/mingw/msys64/mingw64/bin/OpenCL.dll -DOPENCL_INCLUDE_DIRS=C:/mingw/msys64/mingw64/include/CL the GPU load rises, but after a few seconds, it crashes on a driver timeout. It could be a problem with my environment and I'm on Windows, so... how did you compile the binary? I tried using the included build.py script but it just complains that it can't find boinc and gives an error. and there seem to be no other specific instructions from the github page. can you post exactly what you did? as well as how you ran it. i dont see this error in your task list. did you run it manually? or with anonymous platform? ID: 75537 · Rating: 0 · rate: / Reply Quote

HRFMguy Send message Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0	Message 75538 - Posted: 14 Jun 2023, 19:28:11 UTC Question for Tom. In this video, https://www.youtube.com/watch?v=ma44b8-SLcA Professor Heidi mentions wanting to run simulations on multiple dwarf galaxies at the same time. Starts at about 11:40 in the video. Has this been done already? And will it be part of the paper? This is my mostedest favoritedest project ever! Thanks a bunch. ID: 75538 · Rating: 0 · rate: / Reply Quote

ahorek's team Send message Joined: 8 Sep 07 Posts: 8 Credit: 2,556,455 RAC: 20	Message 75539 - Posted: 14 Jun 2023, 19:31:43 UTC - in response to Message 75537. Last modified: 14 Jun 2023, 19:32:55 UTC
> how did you compile the binary?
I'm testing the standalone version because dealing with BOINC dependencies is painful... it's easier to build it on Linux:
1/ install prerequisites (depends on your system, not sure if this is the full list)
    apt-get install -y git build-essential cmake ocl-icd-opencl-dev ninja-build
2/ git clone
    git clone https://github.com/Milkyway-at-home/milkywayathome_client.git
    cd milkywayathome_client
3/ enable OPENCL build
    sed -i 's/DNBODY_OPENCL=OFF/DNBODY_OPENCL=ON/g' make_nbody_lite.sh
4/ compile it
    sh make_nbody_lite.sh
    cd ../../build
5/ the binary should be here
    ./milkyway_nbody
also, I had to change the code to use the right device (the CPU OpenCL platform is preferred for some reason), and the [-p|--platform=INT] [-d|--device=INT] options don't seem to work as expected.
ID: 75539 · Rating: 0 · rate: / Reply Quote
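Step 3 above is a plain text substitution. A small Python sketch of the same edit follows, assuming the build script contains the literal text DNBODY_OPENCL=OFF. The sample string is a made-up stand-in, not the real contents of make_nbody_lite.sh, and enable_opencl is a hypothetical helper name.

```python
OLD = "DNBODY_OPENCL=OFF"
NEW = "DNBODY_OPENCL=ON"


def enable_opencl(script_text):
    """Mimic sed 's/DNBODY_OPENCL=OFF/DNBODY_OPENCL=ON/g' on a string.

    Returns the patched text plus the number of replacements, so the caller
    can notice when nothing matched (sed itself would stay silent).
    """
    count = script_text.count(OLD)
    return script_text.replace(OLD, NEW), count


# Made-up stand-in line; the real make_nbody_lite.sh will look different.
sample = "cmake -DNBODY_OPENCL=OFF -DEXAMPLE_OPTION=ON .."

patched, replaced = enable_opencl(sample)
# Running it a second time changes nothing, just like re-running the sed command.
repatched, replaced_again = enable_opencl(patched)

if __name__ == "__main__":
    print(patched)
    print("replacements:", replaced, "then", replaced_again)
```

Returning the count is a design choice for this sketch: if the flag is ever renamed upstream, a count of zero tells you the OpenCL build was not actually switched on.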

Neil Haste Send message Joined: 8 May 22 Posts: 3 Credit: 4,640,216 RAC: 0	Message 75540 - Posted: 14 Jun 2023, 19:33:09 UTC Hi all, I have no idea what you lot are talking about. I just like to provide the use of my humble laptop to my chosen project. I'm happy for the notice period to be what ever someone deems it to be. What would be more useful for me is how to get onboard the nbody number crunching. I've never been one to chase credits so that side of it is unimportant. Being a lorry driver my laptop sits on my bunk for up to 15 hours a day so feel it should be able to help out with this nbody thingy. All I need to know is when and how to switch to it. Thank you. Oh, and congratulations on getting what you needed and the submission of all relevant papers. ID: 75540 · Rating: 0 · rate: / Reply Quote

Xterelle Send message Joined: 7 Aug 22 Posts: 24 Credit: 21,752,596 RAC: 18,406	Message 75541 - Posted: 14 Jun 2023, 20:09:19 UTC Is this something that can help? https://developer.nvidia.com/gpugems/gpugems3/part-v-physics-simulation/chapter-31-fast-n-body-simulation-cuda ID: 75541 · Rating: 0 · rate: / Reply Quote

sjmielh Send message Joined: 9 Aug 09 Posts: 10 Credit: 6,530,063 RAC: 0	Message 75542 - Posted: 14 Jun 2023, 20:24:31 UTC I'm curious as well whether Macs will be supported again in Nbody. Sjmielh ID: 75542 · Rating: 0 · rate: / Reply Quote

Ian&Steve C. Send message Joined: 18 Nov 22 Posts: 84 Credit: 653,454,842 RAC: 17,912	Message 75543 - Posted: 14 Jun 2023, 22:08:43 UTC - in response to Message 75539.
> how did you compile the binary? I'm testing the standalone version because dealing with BOINC dependencies is painful... it's easier to build it on Linux: 1/ install prerequisites (depends on your system, not sure if this is the full list) apt-get install -y git build-essential cmake ocl-icd-opencl-dev ninja-build 2/ git clone git clone https://github.com/Milkyway-at-home/milkywayathome_client.git cd milkywayathome_client 3/ enable OPENCL build sed -i 's/DNBODY_OPENCL=OFF/DNBODY_OPENCL=ON/g' make_nbody_lite.sh 4/ compile it sh make_nbody_lite.sh cd ../../build 5/ the binary should be here ./milkyway_nbody also, I had to change the code to use the right device (the CPU OpenCL platform is preferred for some reason), and the [-p|--platform=INT] [-d|--device=INT] options don't seem to work as expected.
thanks, I was able to compile it with these instructions and a fresh git pull. I did not make any changes to the code except the DNBODY_OPENCL=ON flag modification. running standalone with your same command line arguments seems to run fine. it loads up the GPU to 100%, but I can't be sure that it's actually doing anything useful or it's just spinning its wheels. been running over an hour now on a low power RTX 3050 GPU. not sure what an expected runtime would or should be. this GPU is pretty weak and has pretty bad FP64 performance. but I just started another run on a Titan V to compare.
ID: 75543 · Rating: 0 · rate: / Reply Quote

Skillz Send message Joined: 28 May 17 Posts: 76 Credit: 4,406,297,772 RAC: 58,184	Message 75544 - Posted: 14 Jun 2023, 22:16:01 UTC - in response to Message 75516. https://sech.me/boinc/Amicable/ http://asteroidsathome.net/boinc/ https://einsteinathome.org/ https://www.gpugrid.net/ https://numberfields.asu.edu/NumberFields/ https://www.primegrid.com/ https://srbase.my-firewall.org/sr5/ https://www.worldcommunitygrid.org/ https://foldingathome.org/ http://boincvm.proxyma.ru:30080/test4vm/ invite code is "PrimeGrid" http://gerasim.boinc.ru/ While those projects do use GPUs, none of them benefit from having high FP64 compute. Which means the P100s would be essentially wasting electricity running those projects when there are far better alternatives that can crunch more work with the same amount of power used. So unless Separation continues then there are no other projects that benefit from FP64. Which would ultimately mean any old GPU that has good FP64 would most likely not run other projects that well. Such as the AMD 200 series cards, S9000 series cards and the P100s. The Titan Vs still do pretty decent at other projects, but for the price of them newer GPUs would be a better option. ID: 75544 · Rating: 0 · rate: / Reply Quote

kotenok2000 Send message Joined: 22 May 11 Posts: 75 Credit: 5,758,579 RAC: 115	Message 75545 - Posted: 14 Jun 2023, 22:20:16 UTC - in response to Message 75544. I think https://foldingathome.org/ uses double precision. ID: 75545 · Rating: 0 · rate: / Reply Quote

Ian&Steve C. Send message Joined: 18 Nov 22 Posts: 84 Credit: 653,454,842 RAC: 17,912	Message 75546 - Posted: 14 Jun 2023, 22:36:24 UTC - in response to Message 75545. I think https://foldingathome.org/ uses double precision. several projects "use" DP, like Einstein and Asteroids. but it's not a large portion of the total computation time like Milkyway Separation is/was. so even though a P100 has decent FP64 performance, its relatively low FP32 performance makes it fall behind in these hybrid FP32/FP64 situations. far more worth it to go with a card with better FP32 since that's where most of the time is spent. ID: 75546 · Rating: 0 · rate: / Reply Quote
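The hybrid FP32/FP64 argument above can be put into rough numbers. The sketch below uses a deliberately simple model in which run time is FP32 work divided by FP32 throughput plus FP64 work divided by FP64 throughput. Every figure in it is a made-up illustration, not a measured spec of a P100 or any other card, and the model ignores memory bandwidth, overlap and everything else that matters in practice.

```python
def runtime(fp32_work, fp64_work, fp32_rate, fp64_rate):
    """Toy model: time = FP32 work / FP32 rate + FP64 work / FP64 rate."""
    return fp32_work / fp32_rate + fp64_work / fp64_rate


# Hypothetical cards (arbitrary throughput units, NOT real specifications).
FP64_STRONG_CARD = {"fp32_rate": 10.0, "fp64_rate": 5.0}   # strong FP64, modest FP32
FP32_STRONG_CARD = {"fp32_rate": 40.0, "fp64_rate": 0.6}   # strong FP32, weak FP64

# Hypothetical workloads, 100 units of work each.
HYBRID = {"fp32_work": 99.0, "fp64_work": 1.0}       # mostly FP32, a little FP64
FP64_HEAVY = {"fp32_work": 10.0, "fp64_work": 90.0}  # dominated by FP64

hybrid_on_fp64_card = runtime(**HYBRID, **FP64_STRONG_CARD)
hybrid_on_fp32_card = runtime(**HYBRID, **FP32_STRONG_CARD)
heavy_on_fp64_card = runtime(**FP64_HEAVY, **FP64_STRONG_CARD)
heavy_on_fp32_card = runtime(**FP64_HEAVY, **FP32_STRONG_CARD)

if __name__ == "__main__":
    print("hybrid workload:     FP64-strong card", hybrid_on_fp64_card,
          "vs FP32-strong card", hybrid_on_fp32_card)
    print("FP64-heavy workload: FP64-strong card", heavy_on_fp64_card,
          "vs FP32-strong card", heavy_on_fp32_card)
```

With these invented numbers the FP32-strong card finishes the hybrid workload sooner, while the FP64-strong card is far ahead on the FP64-heavy one, which is the shape of the argument made in the post.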

Toby Broom Send message Joined: 13 Jun 09 Posts: 24 Credit: 161,800,997 RAC: 190,760	Message 75547 - Posted: 14 Jun 2023, 22:39:55 UTC - in response to Message 75545. Not really: "Consumer GPUs are really bad at double precision calculations (so manufacturers can sell more expensive enterprise units to researchers). Luckily, molecular dynamics really only requires single precision to be useful. Folding@Home uses single precision." https://fahbench.github.io/details.html ID: 75547 · Rating: 0 · rate: / Reply Quote

Skillz Send message Joined: 28 May 17 Posts: 76 Credit: 4,406,297,772 RAC: 58,184	Message 75548 - Posted: 14 Jun 2023, 22:48:38 UTC - in response to Message 75545. I think https://foldingathome.org/ uses double precision. It does not. As Ian said, even though some projects do have some FP64 in their code, it's very minimal where higher DP doesn't really benefit that well compared to having higher FP32. F@H is not one of them. ID: 75548 · Rating: 0 · rate: / Reply Quote

ahorek's team Send message Joined: 8 Sep 07 Posts: 8 Credit: 2,556,455 RAC: 20	Message 75549 - Posted: 14 Jun 2023, 23:15:43 UTC - in response to Message 75543.
it loads up the GPU to 100%, but I can't be sure that it's actually doing anything useful or it's just spinning its wheels
yeah, I've retested it on my Linux + NVIDIA (Turing) PC, but it seems the current nbody OpenCL version doesn't work at all. Maybe the developers here have some insight into whether it ever worked? Anyway, if the current code is buggy, it would take much more effort to fix it and optimize it. It would be great to have a GPU version, but I don't think I'll be able to fix it...
ID: 75549 · Rating: 0 · rate: / Reply Quote

S984s5KN6muKjYePgfqf7F37RiXw5f... Send message Joined: 8 May 09 Posts: 3339 Credit: 524,374,462 RAC: 2,651	Message 75550 - Posted: 15 Jun 2023, 3:22:19 UTC - in response to Message 75540. Last modified: 15 Jun 2023, 3:30:50 UTC
Hi all, I have no idea what you lot are talking about. I just like to provide the use of my humble laptop to my chosen project. I'm happy for the notice period to be what ever someone deems it to be. What would be more useful for me is how to get onboard the nbody number crunching. I've never been one to chase credits so that side of it is unimportant. Being a lorry driver my laptop sits on my bunk for up to 15 hours a day so feel it should be able to help out with this nbody thingy. All I need to know is when and how to switch to it. Thank you. Oh, and congratulations on getting what you needed and the submission of all relevant papers.
Your PC is hidden, so it's hard to be exact, but something like this should work for you:
<app_config>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>2</avg_ncpus>
    <cmdline>--nthreads 2</cmdline>
  </app_version>
  <project_max_concurrent>3</project_max_concurrent>
</app_config>
Copy that into Notepad and save it as app_config.xml in your project folder, which on Windows is C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway. Be sure it is saved exactly under that name and NOT with a '.txt' on the end of it. Then go into the BOINC Manager, click Options, then Read config files, and you will get NBody tasks that use 2 CPU cores per task, with only 3 tasks running at one time on your PC. If you need to change that, be sure to use Notepad, as a word processing program adds hidden stuff that BOINC can't read. Any further requests for help should go in the Number Crunching thread so we don't get bogged down here.
ID: 75550 · Rating: 0 · rate: / Reply Quote
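For anyone who would rather generate the file than paste it by hand, here is a minimal Python sketch that builds the same app_config.xml with the standard library. The element names and values come from the post above; the function name build_app_config and its parameters are hypothetical, and where the file gets written is left to you.

```python
import xml.etree.ElementTree as ET


def build_app_config(threads=2, max_concurrent=3):
    """Return app_config.xml text matching the post above (hypothetical helper)."""
    root = ET.Element("app_config")
    app_version = ET.SubElement(root, "app_version")
    ET.SubElement(app_version, "app_name").text = "milkyway_nbody"
    ET.SubElement(app_version, "plan_class").text = "mt"
    ET.SubElement(app_version, "avg_ncpus").text = str(threads)
    ET.SubElement(app_version, "cmdline").text = f"--nthreads {threads}"
    ET.SubElement(root, "project_max_concurrent").text = str(max_concurrent)
    if hasattr(ET, "indent"):  # pretty-printing helper, available in Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


xml_text = build_app_config()

if __name__ == "__main__":
    print(xml_text)
    # To save it, write xml_text to a file named exactly "app_config.xml"
    # (no ".txt" suffix) inside your own MilkyWay project folder.
```

Keeping avg_ncpus and --nthreads tied to one threads parameter avoids the two values drifting apart when you later change the core count.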

Neil Haste Send message Joined: 8 May 22 Posts: 3 Credit: 4,640,216 RAC: 0	Message 75552 - Posted: 15 Jun 2023, 4:29:45 UTC - in response to Message 75550. Last modified: 15 Jun 2023, 4:30:47 UTC
Your PC is hidden, so it's hard to be exact, but something like this should work for you: <app_config> <app_version> <app_name>milkyway_nbody</app_name> <plan_class>mt</plan_class> <avg_ncpus>2</avg_ncpus> <cmdline>--nthreads 2</cmdline> </app_version> <project_max_concurrent>3</project_max_concurrent> </app_config> Copy that into Notepad and save it as app_config.xml in your project folder, which on Windows is C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway. Be sure it is saved exactly under that name and NOT with a '.txt' on the end of it. Then go into the BOINC Manager, click Options, then Read config files, and you will get NBody tasks that use 2 CPU cores per task, with only 3 tasks running at one time on your PC. If you need to change that, be sure to use Notepad, as a word processing program adds hidden stuff that BOINC can't read. Any further requests for help should go in the Number Crunching thread so we don't get bogged down here.
Thank you for the reply. I have a MacBook Air. I wasn't directly asking for help on this thread; it was more a general point about when and how, when the time comes.
ID: 75552 · Rating: 0 · rate: / Reply Quote

Chooka Send message Joined: 13 Dec 12 Posts: 101 Credit: 1,782,952,901 RAC: 0	Message 75553 - Posted: 15 Jun 2023, 7:23:48 UTC - in response to Message 75514. Last modified: 15 Jun 2023, 7:24:35 UTC ?? All the ones you are using Crashtech :) ID: 75553 · Rating: 0 · rate: / Reply Quote

Chooka Send message Joined: 13 Dec 12 Posts: 101 Credit: 1,782,952,901 RAC: 0	Message 75554 - Posted: 15 Jun 2023, 7:30:21 UTC - in response to Message 75544. Last modified: 15 Jun 2023, 7:32:54 UTC https://sech.me/boinc/Amicable/ http://asteroidsathome.net/boinc/ https://einsteinathome.org/ https://www.gpugrid.net/ https://numberfields.asu.edu/NumberFields/ https://www.primegrid.com/ https://srbase.my-firewall.org/sr5/ https://www.worldcommunitygrid.org/ https://foldingathome.org/ http://boincvm.proxyma.ru:30080/test4vm/ invite code is "PrimeGrid" http://gerasim.boinc.ru/ While those projects do use GPUs, none of them benefit from having high FP64 compute. Which means the P100s would be essentially wasting electricity running those projects when there are far better alternatives that can crunch more work with the same amount of power used. So unless Separation continues then there are no other projects that benefit from FP64. Which would ultimately mean any old GPU that has good FP64 would most likely not run other projects that well. Such as the AMD 200 series cards, S9000 series cards and the P100s. The Titan Vs still do pretty decent at other projects, but for the price of them newer GPUs would be a better option. I've been switching over to NGREEDIA GPUs anyway lately. They are now pretty good at Einstein and excellent at PrimeGrid. Quite power efficient the 40 series as well. I asked a while ago on this forum whether anyone was worried about crunchers dropping off, since newer GPUs don't use FP64 like in the old days. I don't even bother with Milkyway & NVIDIA cards. Not worth it. ID: 75554 · Rating: 0 · rate: / Reply Quote