Welcome to MilkyWay@home

Tried running nbody: CPU temps way too low.

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 75287 - Posted: 6 Apr 2023, 19:37:09 UTC

root@dual-linux:/var/lib/boinc/projects/milkyway.cs.rpi.edu_milkyway# sensors
coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +26.0°C  (high = +80.0°C, crit = +96.0°C)
Core 1:       +29.0°C  (high = +80.0°C, crit = +96.0°C)
Core 2:       +32.0°C  (high = +80.0°C, crit = +96.0°C)
Core 8:       +32.0°C  (high = +80.0°C, crit = +96.0°C)
Core 9:       +28.0°C  (high = +80.0°C, crit = +96.0°C)
Core 10:      +28.0°C  (high = +80.0°C, crit = +96.0°C)

amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:        1.02 V
fan1:        2981 RPM  (min =    0 RPM, max = 3700 RPM)
edge:         +46.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:       79.02 W  (cap =  90.00 W)

coretemp-isa-0001
Adapter: ISA adapter
Core 0:       +23.0°C  (high = +80.0°C, crit = +96.0°C)
Core 1:       +21.0°C  (high = +80.0°C, crit = +96.0°C)
Core 2:       +25.0°C  (high = +80.0°C, crit = +96.0°C)
Core 8:       +24.0°C  (high = +80.0°C, crit = +96.0°C)
Core 9:       +22.0°C  (high = +80.0°C, crit = +96.0°C)
Core 10:      +18.0°C  (high = +80.0°C, crit = +96.0°C)

intel5500-pci-00a3
Adapter: PCI adapter
temp1:        +78.5°C  (high = +100.0°C, hyst = +95.0°C)
                       (crit = +110.0°C)
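The two coretemp blocks above can be reduced to one mean per socket, which makes the comparison easier to read. This is a minimal sketch, assuming the `sensors` text layout shown above (a chip name line such as `coretemp-isa-0000`, then `Core N:` lines, with blank lines between chips). The function name `mean_core_temps` is made up for illustration.

```python
import re
from collections import defaultdict


def mean_core_temps(sensors_text):
    """Return {chip_name: mean core temperature in degrees C} for coretemp chips."""
    temps = defaultdict(list)
    chip = None
    for line in sensors_text.splitlines():
        stripped = line.strip()
        if not stripped:
            chip = None  # a blank line ends the current chip block
            continue
        if chip is None:
            chip = stripped  # the first line of a block is the chip name
            continue
        # Only the first temperature on the line is the current reading;
        # the values in parentheses are the high/crit limits.
        match = re.match(r"Core \d+:\s+\+?(-?\d+(?:\.\d+)?)°C", stripped)
        if match and chip.startswith("coretemp"):
            temps[chip].append(float(match.group(1)))
    return {name: sum(values) / len(values) for name, values in temps.items()}
```

The text could come from `subprocess.run(["sensors"], capture_output=True, text=True).stdout`. Other chips (the GPU, the chipset sensor) are ignored on purpose.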



Dual Xeon on Ubuntu: a total of 12 cores, 24 hyperthreads. It looks like both CPUs are idling. This cannot be right?
Is the command-line option "--threads" documented anywhere? I tried setting it to 8, then leaving it off entirely, then setting it to 32. It made no difference. I assume that "threads" refers to system or kernel threads and not hyperthread allocation. I also assume "avg_ncpus" refers to hyperthreads, of which there are 24 available.
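Temperature is only indirect evidence of load. A more direct check is to sample the aggregate `cpu` line of /proc/stat twice and compute the busy fraction. The following is a minimal sketch, assuming a Linux host with a readable /proc/stat; the helper names are illustrative and not part of BOINC.

```python
import time


def parse_cpu_line(stat_text):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            # Fields: user nice system idle iowait irq softirq steal.
            # The guest fields that follow are already counted in user/nice.
            fields = [int(value) for value in line.split()[1:9]]
            idle = fields[3] + fields[4]  # idle + iowait
            return idle, sum(fields)
    raise ValueError("no aggregate cpu line found")


def busy_fraction(before, after):
    """Fraction of CPU time spent non-idle between two /proc/stat snapshots."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    total_delta = total1 - total0
    if total_delta <= 0:
        return 0.0
    return 1.0 - (idle1 - idle0) / total_delta


def sample_busy_fraction(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return the busy fraction."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        after = handle.read()
    return busy_fraction(before, after)
```

With two tasks budgeted at 8 CPUs each on 24 hardware threads, a value near 16/24 (about 0.67) would be expected only if each task really keeps 8 threads busy and nothing else is running. A much lower value would suggest the tasks are not using their full allocation.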

app_config.xml


<app_config>
 <app>
  <name>milkyway_nbody</name>
  <max_concurrent>2</max_concurrent>
 </app>
 <app_version>
  <app_name>milkyway_nbody</app_name>
  <plan_class>mt</plan_class>
  <avg_ncpus>8</avg_ncpus>
  <cmdline>--nthreads 32</cmdline>
 </app_version>
</app_config>
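In this file the two numbers serve different purposes: `avg_ncpus` is what the BOINC client budgets for the task, while the `cmdline` string is passed to the application. The following is a minimal sketch that pulls both values out so they can be compared side by side. The function name `check_mt_settings` is hypothetical, and whether the N-body application behaves differently when the two disagree is not established in this thread.

```python
import re
import xml.etree.ElementTree as ET


def check_mt_settings(app_config_xml):
    """Return (app_name, avg_ncpus, nthreads) for each <app_version>.

    nthreads is None when the cmdline has no --nthreads flag.
    """
    root = ET.fromstring(app_config_xml)
    rows = []
    for version in root.iter("app_version"):
        name = version.findtext("app_name", default="").strip()
        avg_text = version.findtext("avg_ncpus")
        avg_ncpus = float(avg_text) if avg_text else None
        cmdline = version.findtext("cmdline", default="")
        match = re.search(r"--nthreads\s+(\d+)", cmdline)
        nthreads = int(match.group(1)) if match else None
        rows.append((name, avg_ncpus, nthreads))
    return rows


if __name__ == "__main__":
    # Assumes the script is run from the project directory that holds the file.
    with open("app_config.xml") as handle:
        for name, avg_ncpus, nthreads in check_mt_settings(handle.read()):
            print(name, "avg_ncpus =", avg_ncpus, "--nthreads =", nthreads)
```

For the file above this reports `milkyway_nbody` with `avg_ncpus` 8.0 and `--nthreads` 32, so the client budgets 8 CPUs per task while the application is asked for 32 threads.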
ID: 75287 · Rating: 0
Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 152,257,111
RAC: 26,289
Message 75288 - Posted: 6 Apr 2023, 19:51:45 UTC - in response to Message 75287.  

I took a look at a couple of your hosts. Are you sure you have selected the nbody application?
ID: 75288 · Rating: 0
Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 75289 - Posted: 6 Apr 2023, 20:13:22 UTC - in response to Message 75288.  
Last modified: 6 Apr 2023, 20:22:32 UTC

I took a look at a couple of your hosts. Are you sure you have selected the nbody application?


From BoincTasks:

Milkyway@Home	1.82 Milkyway@home N-Body Simulation (mt)	de_nbody_02_27_2023_v182_pal5__data__3_1680661949_11332_0	02:15:26 (23:29:56)	130.12	71.177	00:54:50	4/18/2023 12:56:38 PM	32.0 °C	Running	8C	0	dual-linux	
Milkyway@Home	1.82 Milkyway@home N-Body Simulation (mt)	de_nbody_02_27_2023_v182_pal5__data__2_1680661949_11309_0	01:18:41 (12:19:55)	117.55	60.215	01:29:01	4/18/2023 12:56:38 PM	32.0 °C	Running	8C	0	dual-linux	


The "0" between the 8C and dual-linux means there is no CPU throttling.

The Intel i9-7900X has a more reasonable temperature of 63.9 °C:

Milkyway@Home	1.82 Milkyway@home N-Body Simulation (mt)	de_nbody_02_27_2023_v182_pal5__data__2_1674667492_1117989_3	01:00:41 (05:37:38)	69.55	98.758	00:00:45	4/18/2023 7:21:59 AM	63.9 °C	Running	8C	0	JYSArea51	
Milkyway@Home	1.82 Milkyway@home N-Body Simulation (mt)	de_nbody_02_27_2023_v182_pal5__data__1_1680661949_7646_0	00:25:56 (02:51:44)	82.77	15.649	02:19:48	4/18/2023 7:21:59 AM	63.9 °C	Running	8C	0	JYSArea51	
ID: 75289 · Rating: 0
Kiska
Joined: 31 Mar 12
Posts: 94
Credit: 152,257,111
RAC: 26,289
Message 75290 - Posted: 6 Apr 2023, 20:40:49 UTC - in response to Message 75289.  

Perhaps you have exotic cooling :D

Also the command switch is --nthreads

I guess it shouldn't cause concern; it does look like the task is progressing?
ID: 75290 · Rating: 0
Keith Myers
Joined: 24 Jan 11
Posts: 709
Credit: 545,187,395
RAC: 64,174
Message 75291 - Posted: 6 Apr 2023, 20:43:56 UTC
Last modified: 6 Apr 2023, 20:44:54 UTC

Simply a matter of your dual Xeon not reporting temps correctly through TThrottle, I assume. It has nothing to do with the application or your app_config.

You know that the 0℃ temp is impossible unless you are using extreme cooling.
ID: 75291 · Rating: 0
Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 75292 - Posted: 6 Apr 2023, 21:24:29 UTC - in response to Message 75291.  

Both are on separate closed-loop liquid coolers. The temperature shown by BoincTasks is 32 °C. The "0" is the throttling percentage, not the temperature.

The app finished and is awaiting verification, so possibly the temps are OK!


https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=959265&offset=0&show_names=0&state=3&appid=2
ID: 75292 · Rating: 0
Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 75293 - Posted: 6 Apr 2023, 21:31:17 UTC - in response to Message 75290.  
Last modified: 6 Apr 2023, 21:31:54 UTC

Perhaps you have exotic cooling :D

Also the command switch is --nthreads

I guess it shouldn't cause concern; it does look like the task is progressing?


That was just a typo; the XML has --nthreads.


The following are all the command-line arguments that I know of. I had not seen --nthreads before.

<cmdline>--non-responsive --verbose --gpu-target-frequency 1 --gpu-polling-mode -1 --gpu-wait-factor 0 --process-priority 4 --gpu-disable-checkpointing</cmdline>
ID: 75293 · Rating: 0
Keith Myers
Joined: 24 Jan 11
Posts: 709
Credit: 545,187,395
RAC: 64,174
Message 75294 - Posted: 7 Apr 2023, 0:09:50 UTC - in response to Message 75293.  

The BOINC "bible" for client/application configuration is here: https://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration

It specifically shows an example for MT application configuration and shows the --nthreads parameter syntax.
ID: 75294 · Rating: 0


©2024 Astroinformatics Group