Separation Project Coming To An End
Author | Message |
---|---|
Send message Joined: 22 May 11 Posts: 71 Credit: 5,685,114 RAC: 0 |
|
Send message Joined: 8 Sep 07 Posts: 7 Credit: 2,363,377 RAC: 295 |
The code is available here: https://github.com/Milkyway-at-home/milkywayathome_client/blob/master/nbody/kernels/nbody_kernels.cl

I wanted to test the CPU vs GPU difference by myself.

Enable OpenCL: -DNBODY_OPENCL=ON
+ specify libraries:
-DOPENCL_LIBRARIES=C:/mingw/msys64/mingw64/bin/OpenCL.dll
-DOPENCL_INCLUDE_DIRS=C:/mingw/msys64/mingw64/include/CL

The GPU load rises, but after a few seconds, it crashes on a driver timeout. It could be a problem with my environment and I'm on Windows, so...

milkyway_nbody -f settings.lua -o output_0gy.out -h correct_hist.hist -z hist_test.hist -n 32 -b -i 3.0 1.0 0.2 0.2 12 0.2 -p 0 -d 0

Using OpenMP 32 max threads on a system with 32 processors
Found 1 platform
Platform 0 information:
  Name: AMD Accelerated Parallel Processing
  Version: OpenCL 2.1 AMD-APP (3516.0)
  Vendor: Advanced Micro Devices, Inc.
  Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
  Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'gfx1100' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
  Board: AMD Radeon RX 7900 XTX
  Driver version: 3516.0 (PAL,LC)
  Version: OpenCL 2.0 AMD-APP (3516.0)
  Compute capability: 0.0
  Max compute units: 48
  Clock frequency: 2482 Mhz
  Global mem size: 25753026560
  Local mem size: 65536
  Max const buf size: 25753026560
  Double extension: cl_khr_fp64
Running MilkyWay@home Nbody v1.85
Optimal Softening Length = 0.112929680735593 kpc
Dwarf Initial Position: [-34.375055953159666,104.152234974946268,-20.716946238453204]
Dwarf Initial Velocity: [9.468491711421557,92.047887884146391,-59.319309910185652]
Initial LMC position: [82.245240275588799,509.425796198476291,-150.407601305508365]
Initial LMC velocity: [-16.741582632867392,-129.695727529343941,7.340647956328183]
Kernel Compile Flags: -DDEBUG=0 -DDOUBLEPREC=1 -cl-mad-enable -DNBODY=40000 -DEFFNBODY=40000 -DNNODE=79999 -DWARPSIZE=64 -DNOSORT=0 -DTHREADS1=256 -DTHREADS2=256 -DTHREADS3=256 -DTHREADS4=256 -DTHREADS5=256 -DTHREADS6=256 -DTHREADS7=256 -DTHREADS8=256 -DMAXDEPTH=128 -DTIMESTEP=0x1.eed840c14c795p-12 -DEPS2=0x1.a1e4dd2e0b9bdp-7 -DTHETA=0x1p+0 -DUSE_QUAD=1 -DTREECODE=1 -DSW93=0 -DBH86=0 -DEXACT=0 -DUSE_EXTERNAL_POTENTIAL=1 -DDISK_TYPE=1 -DDISK_2_TYPE=0 -DHALO_TYPE=1 -DSPHERE_TYPE=1 -DSPHERICAL_MASS=0x1.2abd3374bc6a8p+17 -DSPHERICAL_SCALE=0x1.6666666666666p-1 -DDISK_MASS=0x1.b36a78d4fdf3bp+18 -DDISK_SCALE_LENGTH=0x1.ap+2 -DDISK_SCALE_HEIGHT=0x1.0a3d70a3d70a4p-2 -DHALO_VHALO=0x1.2a70a3d70a3d7p+6 -DHALO_SCALE_LENGTH=0x1.8p+3 -DHALO_FLATTEN_Z=0x1p+0 -DHALO_FLATTEN_Y=0x0p+0 -DHALO_FLATTEN_X=0x0p+0 -DHALO_TRIAX_ANGLE=0x0p+0 -DHALO_C1=0x0p+0 -DHALO_C2=0x0p+0 -DHALO_C3=0x0p+0 -DHALO_MASS=0x0p+0 -DHALO_GAMMA=0x0p+0 -DHALO_LAMBDA=0x0p+0 -DHALO_RHO0=0x0p+0 -DHAVE_INLINE_PTX=0 -DHAVE_CONSISTENT_MEMORY=0
19:47:17: Process 136840 created scene instance 0
--------------------------------------------------------------------------------
Total timing over 6357 steps:
                            Average            Total         Fraction
                   ---------------- ---------------- ----------------
      boundingBox:         0.000000         0.000000             nan%
        buildTree:         0.000000         0.000000             nan%
    summarization:         0.000000         0.000000             nan%
             sort:         0.000000         0.000000             nan%
     quad moments:         0.000000         0.000000             nan%
 forceCalculation:         0.000000         0.000000             nan%
      integration:         0.000000         0.000000             nan%
==============================================================================
            total          0.000000         0.000000             nan%
--------------------------------------------------------------------------------
19:47:37: Making final checkpoint
Running MilkyWay@home Nbody v1.85
Running MilkyWay@home Nbody v1.85
Error opening histogram file 'correct_hist.hist'
19:47:38: Removing checkpoint file 'nbody_checkpoint' |
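For anyone reading the "Kernel Compile Flags" line in the log above: several values are passed as C99 hexadecimal floats, which are hard to read by eye. The sketch below decodes a few of them with Python's built-in `float.fromhex`. The selection of flags is arbitrary, and the reading of `EPS2` as the squared softening length is an assumption based only on the numbers lining up with the "Optimal Softening Length" printed in the same log, not on the project's source.

```python
import math

# Hex-float values copied from the "Kernel Compile Flags" line in the post above.
flags = {
    "TIMESTEP": "0x1.eed840c14c795p-12",
    "EPS2": "0x1.a1e4dd2e0b9bdp-7",
    "THETA": "0x1p+0",
    "DISK_SCALE_LENGTH": "0x1.ap+2",
    "HALO_SCALE_LENGTH": "0x1.8p+3",
}

# float.fromhex parses the same notation that C's %a format produces.
decoded = {name: float.fromhex(text) for name, text in flags.items()}

for name, value in decoded.items():
    print(f"{name:>18} = {value!r}")

# Assumption: EPS2 is the square of the softening length reported in the log.
softening = math.sqrt(decoded["EPS2"])
print(f"sqrt(EPS2) = {softening:.6f} kpc")
```

If that assumption holds, the last line should agree with the softening length in the log to the printed precision, which is a quick way to confirm that the flags were generated from the intended settings.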
Send message Joined: 22 Jan 09 Posts: 2 Credit: 3,431,971 RAC: 169 |
> We would appreciate your input on this because we expect that it will probably take some time for GPU-oriented users to swap that hardware over to different projects. How long would you like us to wait before we shut down Separation?
You have a loyal following of GPU users, including this isotope, who like using their GPUs on this project. Einstein@Home is my backup for GPU usage. How long to keep this going? Until you can squeeze not a single additional useful WU out of your GPU coding. Not a second less. You have a great thing going here. I would hope you'd run it as long as your equipment, time (given your research paper coming out), and funds allow. Best to you and your great project. :) |
Send message Joined: 18 Nov 22 Posts: 84 Credit: 640,530,847 RAC: 0 |
> the code is available here
how did you compile the binary? I tried using the included build.py script, but it just complains that it can't find BOINC and gives an error, and there seem to be no other specific instructions on the GitHub page. can you post exactly what you did, as well as how you ran it? I don't see this error in your task list. did you run it manually, or with anonymous platform? |
Send message Joined: 12 Nov 21 Posts: 236 Credit: 575,038,236 RAC: 0 |
Question for Tom. In this video, https://www.youtube.com/watch?v=ma44b8-SLcA Professor Heidi mentions wanting to run simulations on multiple dwarf galaxies at the same time. Starts at about 11:40 in the video. Has this been done already? And will it be part of the paper? This is my mostedest favoritedest project ever! Thanks a bunch. |
Send message Joined: 8 Sep 07 Posts: 7 Credit: 2,363,377 RAC: 295 |
> how did you compile the binary?
I'm testing the standalone version because dealing with BOINC dependencies is painful... it's easier to build it on Linux:
1/ install prerequisites (depends on your system, not sure if this is the full list)
apt-get install -y git build-essential cmake ocl-icd-opencl-dev ninja-build
2/ git clone
git clone https://github.com/Milkyway-at-home/milkywayathome_client.git
cd milkywayathome_client
3/ enable OPENCL build
sed -i 's/DNBODY_OPENCL=OFF/DNBODY_OPENCL=ON/g' make_nbody_lite.sh
4/ compile it
sh make_nbody_lite.sh
cd ../../build
5/ the binary should be here
./milkyway_nbody
Also, I had to change the code to use the right device (the CPU OpenCL platform is preferred for some reason), and the [-p|--platform=INT] [-d|--device=INT] options don't seem to work as expected. |
Send message Joined: 8 May 22 Posts: 3 Credit: 4,640,216 RAC: 0 |
Hi all, I have no idea what you lot are talking about. I just like to provide the use of my humble laptop to my chosen project. I'm happy for the notice period to be whatever someone deems it to be. What would be more useful for me is how to get on board with the nbody number crunching. I've never been one to chase credits, so that side of it is unimportant. Being a lorry driver, my laptop sits on my bunk for up to 15 hours a day, so I feel it should be able to help out with this nbody thingy. All I need to know is when and how to switch to it. Thank you. Oh, and congratulations on getting what you needed and the submission of all relevant papers. |
Send message Joined: 7 Aug 22 Posts: 9 Credit: 20,033,952 RAC: 0 |
Is this something that can help? https://developer.nvidia.com/gpugems/gpugems3/part-v-physics-simulation/chapter-31-fast-n-body-simulation-cuda |
Send message Joined: 9 Aug 09 Posts: 10 Credit: 6,530,063 RAC: 0 |
I'm curious as well whether Macs will be supported again in Nbody. Sjmielh |
Send message Joined: 18 Nov 22 Posts: 84 Credit: 640,530,847 RAC: 0 |
> how did you compile the binary?
thanks, I was able to compile it with these instructions and a fresh git pull. I did not make any changes to the code except the DNBODY_OPENCL=ON flag modification. running standalone with your same command line arguments seems to run fine. it loads up the GPU to 100%, but I can't be sure whether it's actually doing anything useful or just spinning its wheels. it's been running for over an hour now on a low power RTX 3050 GPU. not sure what the expected runtime would or should be. this GPU is pretty weak and has pretty bad FP64 performance, but I just started another run on a Titan V to compare. |
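One rough way to check whether a standalone run did any GPU work is to look at the per-kernel timing summary that the earlier log in this thread printed at the end of the run. The sketch below parses rows of that shape and flags a run where every kernel reports zero total time. It assumes the output format matches that earlier log; the helper names `parse_timings` and `looks_idle` are made up for illustration, and an all-zero table could also just mean that timing was not collected, so treat it as a hint rather than proof.

```python
import re

# Timing rows as printed in the log earlier in this thread (abridged).
sample = """\
boundingBox: 0.000000 0.000000 nan%
buildTree: 0.000000 0.000000 nan%
quad moments: 0.000000 0.000000 nan%
forceCalculation: 0.000000 0.000000 nan%
integration: 0.000000 0.000000 nan%
"""

# name: <average> <total> <fraction>%
ROW = re.compile(r"^\s*([A-Za-z ]+):\s+([0-9.]+)\s+([0-9.]+)\s+(\S+)%\s*$")


def parse_timings(text):
    """Return {kernel name: total time} for each timing row found."""
    totals = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            totals[match.group(1).strip()] = float(match.group(3))
    return totals


def looks_idle(totals):
    """True when rows were found but every kernel reports zero total time."""
    return bool(totals) and all(t == 0.0 for t in totals.values())


totals = parse_timings(sample)
print(totals)
print("suspicious run" if looks_idle(totals) else "kernels reported time")
```

Feeding the tail of a real run's output to `parse_timings` instead of `sample` would give the same check for that run.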
Send message Joined: 28 May 17 Posts: 76 Credit: 4,398,910,125 RAC: 24 |
https://sech.me/boinc/Amicable/ While those projects do use GPUs, none of them benefit from having high FP64 compute. That means the P100s would essentially be wasting electricity running those projects when there are far better alternatives that can crunch more work with the same amount of power. So unless Separation continues, there are no other projects that benefit from FP64, which would ultimately mean that older GPUs with good FP64, such as the AMD 200 series cards, S9000 series cards and the P100s, would most likely not run other projects that well. The Titan Vs still do pretty decently at other projects, but for the price of them, newer GPUs would be a better option. |
Send message Joined: 22 May 11 Posts: 71 Credit: 5,685,114 RAC: 0 |
I think https://foldingathome.org/ uses double precision. |
Send message Joined: 18 Nov 22 Posts: 84 Credit: 640,530,847 RAC: 0 |
> I think https://foldingathome.org/ uses double precision.
several projects "use" DP, like Einstein and Asteroids, but it's not a large portion of the total computation time like it is/was for Milkyway Separation. so even though a P100 has decent FP64 performance, its relatively low FP32 performance makes it fall behind in these hybrid FP32/FP64 situations. it's far more worthwhile to go with a card with better FP32, since that's where most of the time is spent. |
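The argument above can be put into a back-of-the-envelope model: if a job's FP32 and FP64 work run one after the other, its runtime is the sum of each amount of work divided by the card's rate in that precision. The sketch below uses entirely hypothetical cards and workloads (the names and numbers are illustrative, not measured specs of a P100 or any other GPU) to show how the ranking of two cards can flip with the FP32/FP64 mix.

```python
def runtime_seconds(fp32_tflop, fp64_tflop, fp32_rate, fp64_rate):
    """Time to finish a job if FP32 and FP64 work run back to back."""
    return fp32_tflop / fp32_rate + fp64_tflop / fp64_rate


# Hypothetical cards (rates in TFLOP/s); illustrative numbers, not real specs.
cards = {
    "fp64_strong": {"fp32_rate": 10.0, "fp64_rate": 5.0},
    "fp32_strong": {"fp32_rate": 30.0, "fp64_rate": 0.5},
}

# Hypothetical jobs (TFLOP of work in each precision).
jobs = {
    "mostly_fp64": {"fp32_tflop": 1.0, "fp64_tflop": 99.0},
    "mostly_fp32": {"fp32_tflop": 99.0, "fp64_tflop": 1.0},
}

results = {
    (job, card): runtime_seconds(**work, **rates)
    for job, work in jobs.items()
    for card, rates in cards.items()
}

for (job, card), seconds in results.items():
    print(f"{job:>12} on {card:<12}: {seconds:7.2f} s")
```

With these made-up numbers, the FP64-strong card wins clearly on the mostly-FP64 job and loses on the mostly-FP32 job, which is the pattern described for Separation versus the hybrid projects. The model ignores memory bandwidth and overlap between the two kinds of work.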
Send message Joined: 13 Jun 09 Posts: 24 Credit: 140,147,805 RAC: 125,159 |
Not really: "Consumer GPUs are really bad at double precision calculations (so manufacturers can sell more expensive enterprise units to researchers). Luckily, molecular dynamics really only requires single precision to be useful. Folding@Home uses single precision." https://fahbench.github.io/details.html |
Send message Joined: 28 May 17 Posts: 76 Credit: 4,398,910,125 RAC: 24 |
> I think https://foldingathome.org/ uses double precision.
It does not. As Ian said, some projects do have some FP64 in their code, but it's so minimal that higher DP performance doesn't really help compared to having higher FP32. F@H is not one of them. |
Send message Joined: 8 Sep 07 Posts: 7 Credit: 2,363,377 RAC: 295 |
> it loads up the GPU to 100%, but I can't be sure that it's actually doing anything useful or it's just spinning its wheels
yeah, I've retested it on my Linux + NVIDIA (Turing) PC, and it seems the current nbody OpenCL version doesn't work at all. Maybe the developers here have some insight into whether it ever worked? Anyway, if the current code is buggy, it would take much more effort to fix it and optimize it. It would be great to have a GPU version, but I don't think I'll be able to fix it... |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
Hi all, Your PC is hidden so it's hard to be exact, but something like this should work for you:
<app_config>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>2</avg_ncpus>
    <cmdline>--nthreads 2</cmdline>
  </app_version>
  <project_max_concurrent>3</project_max_concurrent>
</app_config>
Copy that into Notepad and, on Windows, save it as app_config.xml in your C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway folder. Be sure it is saved exactly as I wrote it and NOT with a '.txt' on the end. Then go into the BOINC Manager and click Options, then Read config files. After that you will get the NBody tasks using 2 CPU cores per task, with only 3 tasks running at one time on your PC. If you need to change that, just be sure to use Notepad, as a word processing program adds hidden stuff that BOINC can't read. Any further requests for help should go in the Number Crunching thread so we don't get bogged down here. |
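As a small aid for the app_config.xml above, the following sketch builds the same file with Python's standard `xml.etree.ElementTree` and works out how many CPU threads the settings allow at once. The function name `build_app_config` is hypothetical, and the thread count assumes every running task is a 2-thread NBody task, as in the post; `ET.indent` needs Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def build_app_config(threads_per_task, max_tasks):
    """Return app_config.xml text matching the layout in the post above."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = "milkyway_nbody"
    ET.SubElement(version, "plan_class").text = "mt"
    ET.SubElement(version, "avg_ncpus").text = str(threads_per_task)
    ET.SubElement(version, "cmdline").text = f"--nthreads {threads_per_task}"
    ET.SubElement(root, "project_max_concurrent").text = str(max_tasks)
    ET.indent(root)  # pretty-print; available since Python 3.9
    return ET.tostring(root, encoding="unicode")


xml_text = build_app_config(threads_per_task=2, max_tasks=3)
print(xml_text)

# Parse it back to confirm it is well-formed and to total the CPU usage.
parsed = ET.fromstring(xml_text)
threads = int(parsed.findtext("app_version/avg_ncpus"))
tasks = int(parsed.findtext("project_max_concurrent"))
print(f"Up to {threads * tasks} CPU threads busy at once")
```

Keeping `avg_ncpus` and the `--nthreads` value in step, as the function does, avoids telling the scheduler one thread count while the application uses another.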
Send message Joined: 8 May 22 Posts: 3 Credit: 4,640,216 RAC: 0 |
Thank you for the reply. I have a MacBook Air. I wasn't directly asking for help on this thread; it was more a general question of when and how, once the time comes. |
Send message Joined: 13 Dec 12 Posts: 101 Credit: 1,782,758,310 RAC: 0 |
?? All the ones you are using Crashtech :) |
Send message Joined: 13 Dec 12 Posts: 101 Credit: 1,782,758,310 RAC: 0 |
https://sech.me/boinc/Amicable/ I've been switching over to NGREEDIA GPUs lately anyway. They are now pretty good at Einstein and excellent at Primegrid. The 40 series is quite power efficient as well. A while ago I asked on this forum whether anyone was worried about crunchers dropping off, as newer GPUs don't do FP64 like in the old days. I don't even bother with Milkyway on NVIDIA cards. Not worth it. |
©2024 Astroinformatics Group