Welcome to MilkyWay@home

Separation Project Coming To An End

kotenok2000
Joined: 22 May 11
Posts: 71
Credit: 5,685,114
RAC: 0
Message 75534 - Posted: 14 Jun 2023, 17:28:30 UTC - in response to Message 75533.  

ID: 75534 · Rating: 0
ahorek's team
Joined: 8 Sep 07
Posts: 7
Credit: 2,363,377
RAC: 295
Message 75535 - Posted: 14 Jun 2023, 18:05:28 UTC - in response to Message 75533.  

the code is available here
https://github.com/Milkyway-at-home/milkywayathome_client/blob/master/nbody/kernels/nbody_kernels.cl

I wanted to test the CPU vs GPU difference by myself

enable OpenCL
-DNBODY_OPENCL=ON
+ specify libraries
-DOPENCL_LIBRARIES=C:/mingw/msys64/mingw64/bin/OpenCL.dll -DOPENCL_INCLUDE_DIRS=C:/mingw/msys64/mingw64/include/CL

the GPU load rises, but after a few seconds, it crashes on a driver timeout.

It could be a problem with my environment and I'm on Windows, so...
milkyway_nbody -f settings.lua -o output_0gy.out -h correct_hist.hist -z hist_test.hist -n 32 -b  -i 3.0 1.0 0.2 0.2 12 0.2 -p 0 -d 0
Using OpenMP 32 max threads on a system with 32 processors
Found 1 platform
Platform 0 information:
  Name:       AMD Accelerated Parallel Processing
  Version:    OpenCL 2.1 AMD-APP (3516.0)
  Vendor:     Advanced Micro Devices, Inc.
  Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
  Profile:    FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'gfx1100' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon RX 7900 XTX
Driver version:      3516.0 (PAL,LC)
Version:             OpenCL 2.0 AMD-APP (3516.0)
Compute capability:  0.0
Max compute units:   48
Clock frequency:     2482 Mhz
Global mem size:     25753026560
Local mem size:      65536
Max const buf size:  25753026560
Double extension:    cl_khr_fp64
Running MilkyWay@home Nbody v1.85
Optimal Softening Length = 0.112929680735593 kpc
Dwarf Initial Position: [-34.375055953159666,104.152234974946268,-20.716946238453204]
Dwarf Initial Velocity: [9.468491711421557,92.047887884146391,-59.319309910185652]
Initial LMC position: [82.245240275588799,509.425796198476291,-150.407601305508365]
Initial LMC velocity: [-16.741582632867392,-129.695727529343941,7.340647956328183]
Kernel Compile Flags: -DDEBUG=0 -DDOUBLEPREC=1 -cl-mad-enable -DNBODY=40000 -DEFFNBODY=40000 -DNNODE=79999 -DWARPSIZE=64 -DNOSORT=0 -DTHREADS1=256 -DTHREADS2=256 -DTHREADS3=256 -DTHREADS4=256 -DTHREADS5=256 -DTHREADS6=256 -DTHREADS7=256 -DTHREADS8=256 -DMAXDEPTH=128 -DTIMESTEP=0x1.eed840c14c795p-12 -DEPS2=0x1.a1e4dd2e0b9bdp-7 -DTHETA=0x1p+0 -DUSE_QUAD=1 -DTREECODE=1 -DSW93=0 -DBH86=0 -DEXACT=0 -DUSE_EXTERNAL_POTENTIAL=1 -DDISK_TYPE=1 -DDISK_2_TYPE=0 -DHALO_TYPE=1 -DSPHERE_TYPE=1 -DSPHERICAL_MASS=0x1.2abd3374bc6a8p+17 -DSPHERICAL_SCALE=0x1.6666666666666p-1 -DDISK_MASS=0x1.b36a78d4fdf3bp+18 -DDISK_SCALE_LENGTH=0x1.ap+2 -DDISK_SCALE_HEIGHT=0x1.0a3d70a3d70a4p-2 -DHALO_VHALO=0x1.2a70a3d70a3d7p+6 -DHALO_SCALE_LENGTH=0x1.8p+3 -DHALO_FLATTEN_Z=0x1p+0 -DHALO_FLATTEN_Y=0x0p+0 -DHALO_FLATTEN_X=0x0p+0 -DHALO_TRIAX_ANGLE=0x0p+0 -DHALO_C1=0x0p+0 -DHALO_C2=0x0p+0 -DHALO_C3=0x0p+0 -DHALO_MASS=0x0p+0 -DHALO_GAMMA=0x0p+0 -DHALO_LAMBDA=0x0p+0 -DHALO_RHO0=0x0p+0   -DHAVE_INLINE_PTX=0 -DHAVE_CONSISTENT_MEMORY=0
19:47:17: Process 136840 created scene instance 0

--------------------------------------------------------------------------------
Total timing over 6357 steps:
                         Average             Total            Fraction
                    ----------------   ----------------   ----------------
  boundingBox:              0.000000           0.000000               nan%
  buildTree:                0.000000           0.000000               nan%
  summarization:            0.000000           0.000000               nan%
  sort:                     0.000000           0.000000               nan%
  quad moments:             0.000000           0.000000               nan%
  forceCalculation:         0.000000           0.000000               nan%
  integration:              0.000000           0.000000               nan%
  ==============================================================================
  total                     0.000000           0.000000               nan%

--------------------------------------------------------------------------------

19:47:37: Making final checkpoint
Running MilkyWay@home Nbody v1.85
Running MilkyWay@home Nbody v1.85
Error opening histogram file 'correct_hist.hist'
19:47:38: Removing checkpoint file 'nbody_checkpoint'
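
Two details in the log above can be cross-checked with a few lines of Python. This is only a reading aid, not project code. The hex-float values are copied from the "Kernel Compile Flags" line; reading EPS2 as the squared softening length, and the first value after -i (3.0) as the evolution time, are assumptions suggested by the numbers. The nan% column is what C-style floating-point division yields when every timer is zero (0.0 / 0.0), which hints that no kernel timings were recorded at all.

```python
import math

# Values copied from the "Kernel Compile Flags" line in the log above.
TIMESTEP = float.fromhex("0x1.eed840c14c795p-12")
EPS2 = float.fromhex("0x1.a1e4dd2e0b9bdp-7")
STEPS = 6357  # from "Total timing over 6357 steps"

# Assumption: EPS2 is the squared softening length, so its square root
# should match the "Optimal Softening Length" printed in the log.
softening_length = math.sqrt(EPS2)

# Steps times timestep gives the simulated time span. Assumption: this
# corresponds to the first value passed after -i (3.0).
simulated_time = STEPS * TIMESTEP


def fraction_percent(part: float, total: float) -> float:
    """Percentage with IEEE-754 semantics, as a C program would compute it.

    Python raises ZeroDivisionError for 0.0 / 0.0, so the zero-total case
    is handled explicitly to mirror the NaN (or infinity) C would produce.
    """
    if total == 0.0:
        return math.nan if part == 0.0 else math.copysign(math.inf, part)
    return 100.0 * part / total


print(f"softening length: {softening_length:.6f} kpc")
print(f"simulated time:   {simulated_time:.4f}")
print(f"fraction with all-zero timers: {fraction_percent(0.0, 0.0)}")
```

If the two derived values agree with the log, the kernel was at least configured with the expected parameters; the all-zero timing table is then the more suspicious part of the output.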
ID: 75535 · Rating: 0
Cesium_133*
Joined: 22 Jan 09
Posts: 2
Credit: 3,431,971
RAC: 169
Message 75536 - Posted: 14 Jun 2023, 18:33:41 UTC - in response to Message 75485.  

We would appreciate your input on this because we expect that it will probably take some time for GPU-oriented users to swap that hardware over to different projects. How long would you like us to wait before we shut down Separation?


You have a loyal following of GPU users, including this isotope, who like using their GPUs on this project. Einstein@Home is my backup for GPU usage. How long to keep this going? Until you can squeeze not a single additional useful WU out of your GPU coding. Not a second less. You have a great thing going here. I would hope you'd run it as long as your equipment, time (given your research paper coming out), and funds allow-

Best to you and your great project- :)
ID: 75536 · Rating: 0
Ian&Steve C.
Joined: 18 Nov 22
Posts: 84
Credit: 640,530,847
RAC: 0
Message 75537 - Posted: 14 Jun 2023, 19:06:16 UTC - in response to Message 75535.  

the code is available here
https://github.com/Milkyway-at-home/milkywayathome_client/blob/master/nbody/kernels/nbody_kernels.cl

I wanted to test the CPU vs GPU difference by myself

enable OpenCL
-DNBODY_OPENCL=ON
+ specify libraries
-DOPENCL_LIBRARIES=C:/mingw/msys64/mingw64/bin/OpenCL.dll -DOPENCL_INCLUDE_DIRS=C:/mingw/msys64/mingw64/include/CL

the GPU load rises, but after a few seconds, it crashes on a driver timeout.

It could be a problem with my environment and I'm on Windows, so...


how did you compile the binary? I tried using the included build.py script but it just complains that it can't find BOINC and gives an error, and there seem to be no other specific instructions on the GitHub page.

can you post exactly what you did, as well as how you ran it? I don't see this error in your task list. did you run it manually, or with anonymous platform?

ID: 75537 · Rating: 0
HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,038,236
RAC: 0
Message 75538 - Posted: 14 Jun 2023, 19:28:11 UTC

Question for Tom. In this video, https://www.youtube.com/watch?v=ma44b8-SLcA Professor Heidi mentions wanting to run simulations on multiple dwarf galaxies at the same time. Starts at about 11:40 in the video. Has this been done already? And will it be part of the paper?

This is my mostedest favoritedest project ever! Thanks a bunch.
ID: 75538 · Rating: 0
ahorek's team
Joined: 8 Sep 07
Posts: 7
Credit: 2,363,377
RAC: 295
Message 75539 - Posted: 14 Jun 2023, 19:31:43 UTC - in response to Message 75537.  
Last modified: 14 Jun 2023, 19:32:55 UTC

> how did you compile the binary?
I'm testing the standalone version because dealing with Boinc dependencies is painful... it's easier to build it on Linux:
1/ install prerequisites (depends on your system, not sure if this is the full list)
apt-get install -y git build-essential cmake ocl-icd-opencl-dev ninja-build
2/ git clone
git clone https://github.com/Milkyway-at-home/milkywayathome_client.git
cd milkywayathome_client
3/ enable OPENCL build
sed -i 's/DNBODY_OPENCL=OFF/DNBODY_OPENCL=ON/g' make_nbody_lite.sh
4/ compile it
sh make_nbody_lite.sh
cd ../../build
5/ the binary should be here
./milkyway_nbody


also, I had to change the code to use the right device (the CPU OpenCL platform is preferred for some reason), and the [-p|--platform=INT][-d|--device=INT] options don't seem to work as expected.
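
The device-selection behaviour one would expect here can be illustrated without touching OpenCL at all. The sketch below is a hypothetical, pure-Python model: the `Device` record, the `pick_device` function, and the sample device list are all made up for illustration and are not the client's code. It honours an explicit platform/device request when one is given and otherwise prefers a GPU over a CPU.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Device:
    """Hypothetical description of one OpenCL device (not a real API type)."""
    platform: int
    index: int
    kind: str  # "GPU" or "CPU"
    name: str


def pick_device(devices: Sequence[Device],
                platform: Optional[int] = None,
                index: Optional[int] = None) -> Device:
    """Return the explicitly requested device, else the first GPU, else any."""
    if not devices:
        raise ValueError("no devices available")
    if platform is not None or index is not None:
        # An explicit request (like -p / -d) always wins over the default.
        for dev in devices:
            if ((platform is None or dev.platform == platform)
                    and (index is None or dev.index == index)):
                return dev
        raise ValueError("requested platform/device not found")
    # No explicit request: prefer GPUs so that a CPU OpenCL platform that
    # happens to be listed first does not win by accident.
    gpus = [dev for dev in devices if dev.kind == "GPU"]
    return gpus[0] if gpus else devices[0]


# Made-up example: a CPU runtime enumerated before the GPU.
example_devices = [
    Device(platform=0, index=0, kind="CPU", name="example CPU runtime"),
    Device(platform=1, index=0, kind="GPU", name="example GPU"),
]
print(pick_device(example_devices).name)
print(pick_device(example_devices, platform=0, index=0).name)
```

Whether the real client picks its default by enumeration order is not confirmed by anything in this thread; the sketch only shows the behaviour a user would expect from the -p and -d options.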
ID: 75539 · Rating: 0
Neil Haste
Joined: 8 May 22
Posts: 3
Credit: 4,640,216
RAC: 0
Message 75540 - Posted: 14 Jun 2023, 19:33:09 UTC

Hi all,

I have no idea what you lot are talking about.

I just like to provide the use of my humble laptop to my chosen project.

I'm happy for the notice period to be whatever someone deems it to be.

What would be more useful for me is how to get onboard the nbody number crunching.

I've never been one to chase credits so that side of it is unimportant.

Being a lorry driver my laptop sits on my bunk for up to 15 hours a day so feel it should be able to help out with this nbody thingy.

All I need to know is when and how to switch to it.

Thank you.

Oh, and congratulations on getting what you needed and the submission of all relevant papers.
ID: 75540 · Rating: 0
Xterelle
Joined: 7 Aug 22
Posts: 9
Credit: 20,033,952
RAC: 0
Message 75541 - Posted: 14 Jun 2023, 20:09:19 UTC

Is this something that can help? https://developer.nvidia.com/gpugems/gpugems3/part-v-physics-simulation/chapter-31-fast-n-body-simulation-cuda
ID: 75541 · Rating: 0
sjmielh
Joined: 9 Aug 09
Posts: 10
Credit: 6,530,063
RAC: 0
Message 75542 - Posted: 14 Jun 2023, 20:24:31 UTC

I'm curious as well whether Macs will be supported again in Nbody.

Sjmielh
ID: 75542 · Rating: 0
Ian&Steve C.
Joined: 18 Nov 22
Posts: 84
Credit: 640,530,847
RAC: 0
Message 75543 - Posted: 14 Jun 2023, 22:08:43 UTC - in response to Message 75539.  

> how did you compile the binary?
I'm testing the standalone version because dealing with Boinc dependencies is painful... it's easier to build it on Linux:
1/ install prerequisites (depends on your system, not sure if this is the full list)
apt-get install -y git build-essential cmake ocl-icd-opencl-dev ninja-build
2/ git clone
git clone https://github.com/Milkyway-at-home/milkywayathome_client.git
cd milkywayathome_client
3/ enable OPENCL build
sed -i 's/DNBODY_OPENCL=OFF/DNBODY_OPENCL=ON/g' make_nbody_lite.sh
4/ compile it
sh make_nbody_lite.sh
cd ../../build
5/ the binary should be here
./milkyway_nbody


also, I had to change the code to use the right device (the CPU OpenCL platform is preferred for some reason), and the [-p|--platform=INT][-d|--device=INT] options don't seem to work as expected.


thanks, I was able to compile it with these instructions and a fresh git pull. I did not make any changes to the code except the DNBODY_OPENCL=ON flag modification.

running standalone with your same command line arguments seems to run fine. it loads up the GPU to 100%, but I can't be sure that it's actually doing anything useful or just spinning its wheels. it has been running over an hour now on a low power RTX 3050 GPU. not sure what an expected runtime would or should be. this GPU is pretty weak and has pretty bad FP64 performance. but I just started another run on a Titan V to compare.

ID: 75543 · Rating: 0
Skillz
Joined: 28 May 17
Posts: 76
Credit: 4,398,910,125
RAC: 24
Message 75544 - Posted: 14 Jun 2023, 22:16:01 UTC - in response to Message 75516.  

https://sech.me/boinc/Amicable/
http://asteroidsathome.net/boinc/
https://einsteinathome.org/
https://www.gpugrid.net/
https://numberfields.asu.edu/NumberFields/
https://www.primegrid.com/
https://srbase.my-firewall.org/sr5/
https://www.worldcommunitygrid.org/
https://foldingathome.org/
http://boincvm.proxyma.ru:30080/test4vm/ invite code is "PrimeGrid"
http://gerasim.boinc.ru/


While those projects do use GPUs, none of them benefit from having high FP64 compute. Which means the P100s would be essentially wasting electricity running those projects when there are far better alternatives that can crunch more work with the same amount of power used.

So unless Separation continues then there are no other projects that benefit from FP64. Which would ultimately mean any old GPU that has good FP64 would most likely not run other projects that well. Such as the AMD 200 series cards, S9000 series cards and the P100s. The Titan Vs still do pretty decent at other projects, but for the price of them newer GPUs would be a better option.
ID: 75544 · Rating: 0
kotenok2000
Joined: 22 May 11
Posts: 71
Credit: 5,685,114
RAC: 0
Message 75545 - Posted: 14 Jun 2023, 22:20:16 UTC - in response to Message 75544.  

I think https://foldingathome.org/ uses double precision.
ID: 75545 · Rating: 0
Ian&Steve C.
Joined: 18 Nov 22
Posts: 84
Credit: 640,530,847
RAC: 0
Message 75546 - Posted: 14 Jun 2023, 22:36:24 UTC - in response to Message 75545.  

I think https://foldingathome.org/ uses double precision.


several projects "use" DP, like Einstein and Asteroids. but it's not a large portion of the total computation time like Milkyway Separation is/was. so even though a P100 has decent FP64 performance, its relatively low FP32 performance makes it fall behind in these hybrid FP32/FP64 situations. far more worth it to go with a card with better FP32 since that's where most of the time is spent.
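
The trade-off described here can be made concrete with a rough back-of-the-envelope model. Everything below is illustrative: the throughput figures are invented round numbers, not specifications of any real card, and the model ignores memory bandwidth and every other factor except raw arithmetic rate. Runtime is taken as the FP32 work divided by the FP32 rate plus the FP64 work divided by the FP64 rate.

```python
def runtime(total_work: float, fp64_share: float,
            fp32_rate: float, fp64_rate: float) -> float:
    """Time to finish a job where fp64_share of the work is double precision."""
    fp64_work = total_work * fp64_share
    fp32_work = total_work - fp64_work
    return fp32_work / fp32_rate + fp64_work / fp64_rate


# Made-up cards in arbitrary units of work per second (not real hardware).
fp64_strong_card = {"fp32_rate": 10.0, "fp64_rate": 5.0}
fp32_strong_card = {"fp32_rate": 40.0, "fp64_rate": 1.0}

for share in (0.01, 0.05, 0.50, 1.00):
    t_fp64_card = runtime(100.0, share, **fp64_strong_card)
    t_fp32_card = runtime(100.0, share, **fp32_strong_card)
    winner = "FP64-strong" if t_fp64_card < t_fp32_card else "FP32-strong"
    print(f"FP64 share {share:4.0%}: "
          f"{t_fp64_card:6.2f} s vs {t_fp32_card:6.2f} s -> {winner} card wins")
```

With these invented rates the FP32-strong card wins whenever only a few percent of the work is double precision, and the FP64-strong card only pulls ahead once double precision dominates, which is the situation Separation was in.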

ID: 75546 · Rating: 0
Toby Broom
Joined: 13 Jun 09
Posts: 24
Credit: 140,166,483
RAC: 124,516
Message 75547 - Posted: 14 Jun 2023, 22:39:55 UTC - in response to Message 75545.  

Not really:

"Consumer GPUs are really bad at double precision calculations (so manufacturers can sell more expensive enterprise units to researchers). Luckily, molecular dynamics really only requires single precision to be useful. Folding@Home uses single precision."

https://fahbench.github.io/details.html
ID: 75547 · Rating: 0
Skillz
Joined: 28 May 17
Posts: 76
Credit: 4,398,910,125
RAC: 24
Message 75548 - Posted: 14 Jun 2023, 22:48:38 UTC - in response to Message 75545.  

I think https://foldingathome.org/ uses double precision.


It does not.

As Ian said, even though some projects do have some FP64 in their code, it's so minimal that higher FP64 performance doesn't help much compared to having higher FP32 performance.

F@H is not one of them.
ID: 75548 · Rating: 0
ahorek's team
Joined: 8 Sep 07
Posts: 7
Credit: 2,363,377
RAC: 295
Message 75549 - Posted: 14 Jun 2023, 23:15:43 UTC - in response to Message 75543.  

it loads up the GPU to 100%, but I can't be sure that it's actually doing anything useful or just spinning its wheels


yeah, I've retested it on my Linux + NVIDIA (Turing) PC, and it seems the current nbody OpenCL version doesn't work at all. Maybe the developers here can say whether it ever worked? Anyway, if the current code is buggy, it would take much more effort to fix and optimize it. It would be great to have a GPU version, but I don't think I'll be able to fix it...
ID: 75549 · Rating: 0
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 75550 - Posted: 15 Jun 2023, 3:22:19 UTC - in response to Message 75540.  
Last modified: 15 Jun 2023, 3:30:50 UTC

Hi all,

I have no idea what you lot are talking about.

I just like to provide the use of my humble laptop to my chosen project.

I'm happy for the notice period to be whatever someone deems it to be.

What would be more useful for me is how to get onboard the nbody number crunching.

I've never been one to chase credits so that side of it is unimportant.

Being a lorry driver my laptop sits on my bunk for up to 15 hours a day so feel it should be able to help out with this nbody thingy.

All I need to know is when and how to switch to it.

Thank you.

Oh, and congratulations on getting what you needed and the submission of all relevant papers.


Your PC is hidden so it's hard to be exact, but something like this should work for you:
<app_config>

<app_version>
<app_name>milkyway_nbody</app_name>

<plan_class>mt</plan_class>
<avg_ncpus>2</avg_ncpus>
<cmdline>--nthreads 2</cmdline>
</app_version>
<project_max_concurrent>3</project_max_concurrent>

</app_config>

Copy that into Notepad and save it as app_config.xml in your project folder, which on Windows is C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway. Be sure it is saved exactly as I wrote it and NOT with a '.txt' on the end of the name.

Then go into the BOINC Manager and click Options, then Read config files. After that you can get the NBody tasks using 2 CPU cores per task, with only 3 tasks running at one time on your PC. If you need to change that, be sure to use Notepad in Windows, as a word processor adds hidden formatting that BOINC can't read.
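
For anyone who would rather generate the file than type it, here is a minimal sketch that writes the same settings with Python's standard library. The function names are illustrative; the element names and values are exactly those in the snippet above. It also makes the arithmetic explicit: 2 threads per task times 3 concurrent tasks keeps up to 6 CPU threads busy, so a machine with fewer cores would want smaller numbers.

```python
import xml.etree.ElementTree as ET


def build_app_config(threads_per_task: int, max_tasks: int) -> str:
    """Return app_config.xml text for the multithreaded N-body app."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = "milkyway_nbody"
    ET.SubElement(version, "plan_class").text = "mt"
    # avg_ncpus (what BOINC budgets) and --nthreads (what the app uses)
    # are generated from the same number so they cannot drift apart.
    ET.SubElement(version, "avg_ncpus").text = str(threads_per_task)
    ET.SubElement(version, "cmdline").text = f"--nthreads {threads_per_task}"
    ET.SubElement(root, "project_max_concurrent").text = str(max_tasks)
    return ET.tostring(root, encoding="unicode")


def threads_in_use(threads_per_task: int, max_tasks: int) -> int:
    """Upper bound on CPU threads this project will occupy at once."""
    return threads_per_task * max_tasks


config_text = build_app_config(threads_per_task=2, max_tasks=3)
print(config_text)
print("CPU threads in use at most:", threads_in_use(2, 3))
```

The printed text can be saved as app_config.xml in the project folder; the output is a single unindented line, which is still valid XML.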

Asking for any more help should be done in the Number Crunching thread so we don't get bogged down here.
ID: 75550 · Rating: 0
Neil Haste
Joined: 8 May 22
Posts: 3
Credit: 4,640,216
RAC: 0
Message 75552 - Posted: 15 Jun 2023, 4:29:45 UTC - in response to Message 75550.  
Last modified: 15 Jun 2023, 4:30:47 UTC



Your PC is hidden so it's hard to be exact, but something like this should work for you:
<app_config>

<app_version>
<app_name>milkyway_nbody</app_name>

<plan_class>mt</plan_class>
<avg_ncpus>2</avg_ncpus>
<cmdline>--nthreads 2</cmdline>
</app_version>
<project_max_concurrent>3</project_max_concurrent>

</app_config>

Copy that into Notepad and save it as app_config.xml in your project folder, which on Windows is C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway. Be sure it is saved exactly as I wrote it and NOT with a '.txt' on the end of the name.

Then go into the BOINC Manager and click Options, then Read config files. After that you can get the NBody tasks using 2 CPU cores per task, with only 3 tasks running at one time on your PC. If you need to change that, be sure to use Notepad in Windows, as a word processor adds hidden formatting that BOINC can't read.

Asking for any more help should be done in the Number Crunching thread so we don't get bogged down here.


Thank you for the reply.

I have a MacBook Air.

I wasn't directly asking for help on this thread; it was more a general question of when and how to switch when the time comes.
ID: 75552 · Rating: 0
Chooka
Joined: 13 Dec 12
Posts: 101
Credit: 1,782,758,310
RAC: 0
Message 75553 - Posted: 15 Jun 2023, 7:23:48 UTC - in response to Message 75514.  
Last modified: 15 Jun 2023, 7:24:35 UTC

??
All the ones you are using, Crashtech :)

ID: 75553 · Rating: 0
Chooka
Joined: 13 Dec 12
Posts: 101
Credit: 1,782,758,310
RAC: 0
Message 75554 - Posted: 15 Jun 2023, 7:30:21 UTC - in response to Message 75544.  
Last modified: 15 Jun 2023, 7:32:54 UTC

https://sech.me/boinc/Amicable/
http://asteroidsathome.net/boinc/
https://einsteinathome.org/
https://www.gpugrid.net/
https://numberfields.asu.edu/NumberFields/
https://www.primegrid.com/
https://srbase.my-firewall.org/sr5/
https://www.worldcommunitygrid.org/
https://foldingathome.org/
http://boincvm.proxyma.ru:30080/test4vm/ invite code is "PrimeGrid"
http://gerasim.boinc.ru/


While those projects do use GPUs, none of them benefit from having high FP64 compute. Which means the P100s would be essentially wasting electricity running those projects when there are far better alternatives that can crunch more work with the same amount of power used.

So unless Separation continues then there are no other projects that benefit from FP64. Which would ultimately mean any old GPU that has good FP64 would most likely not run other projects that well. Such as the AMD 200 series cards, S9000 series cards and the P100s. The Titan Vs still do pretty decent at other projects, but for the price of them newer GPUs would be a better option.


I've been switching over to NGREEDIA GPUs lately anyway. They are now pretty good at Einstein and excellent at PrimeGrid. The 40 series is quite power efficient as well.
A while ago I asked on this forum whether anyone was worried about crunchers dropping off, as newer GPUs don't offer FP64 performance like in the old days. I don't even bother with Milkyway on NVIDIA cards. Not worth it.

ID: 75554 · Rating: 0

©2024 Astroinformatics Group