Welcome to MilkyWay@home

Posts by Keith Myers

21) Questions and Answers : Unix/Linux : MW stopped using my nvidia GPU (Message 69887)
Posted 3 Jun 2020 by Keith Myers
Post:
Also, as an aside, how does one view a specific computer on the website? I can't seem to sort by computer ID, so searching for a particular computer in my set of computers is like searching for a needle in a haystack.

Don't understand this at all. Log in to MW, go to your account's main page, and click the computers link on the page: https://milkyway.cs.rpi.edu/milkyway/hosts_user.php
Voila! All your computers are listed, even with their assigned network names. Easy to figure out which computer is which.

If you are constantly running out of work and the 10 minute backoff bugs you too much, you can always run JStateson's modified client which removes that aggravation.
22) Message boards : Number crunching : Need help with linux and app_info (Message 69881)
Posted 1 Jun 2020 by Keith Myers
Post:
Well, if you want to run 14 total cpu threads out of the 16 and assign two threads to the two gpus, that leaves you with 12 threads to run the cpu milkyway nbody tasks. So you should run this app_config.xml file.

<app_config>
   <app>
      <name>milkyway_nbody</name>
      <max_concurrent>3</max_concurrent>
   </app>
   <app_version>
      <app_name>milkyway_nbody</app_name>
      <plan_class>mt</plan_class>
      <avg_ncpus>4</avg_ncpus>
      <cmdline>--nthreads 4</cmdline>
   </app_version>
   <app>
      <name>milkyway</name>
      <gpu_versions>
         <gpu_usage>1.0</gpu_usage>
         <cpu_usage>1.0</cpu_usage>
      </gpu_versions>
   </app>
</app_config>


This would run 3 concurrent nbody cpu tasks using 4 threads each, plus two gpu tasks (one per card).
23) Message boards : Number crunching : Need help with linux and app_info (Message 69876)
Posted 30 May 2020 by Keith Myers
Post:
First question to answer is how many total cpu threads on the host do you want to commit to BOINC.

Second question is how many concurrent mt tasks do you want to run.

Third question is how many cpu threads per mt task do you want to commit to the task.

Fourth question is how many gpu tasks per card do you want to run. I advise sticking to a single task per Nvidia card unless the cards are very high end, like a 2080 or 2080 Ti.

All of those configurations need to be put into an app_config.xml file for the project to control how you want to run the project on your hardware.
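
As a rough sketch of where each answer ends up in the file (the numbers are placeholder assumptions, not a recommendation; the app names are the ones used elsewhere in this thread). As far as I know, the total thread count from the first question is normally set in the computing preferences, so it does not appear here:

```xml
<app_config>
   <!-- Second question: how many mt tasks run at once (example value) -->
   <app>
      <name>milkyway_nbody</name>
      <max_concurrent>2</max_concurrent>
   </app>
   <!-- Third question: cpu threads per mt task.
        avg_ncpus and the nthreads value in cmdline should match. -->
   <app_version>
      <app_name>milkyway_nbody</app_name>
      <plan_class>mt</plan_class>
      <avg_ncpus>4</avg_ncpus>
      <cmdline>--nthreads 4</cmdline>
   </app_version>
   <!-- Fourth question: gpu tasks per card.
        gpu_usage 1.0 is one task per card, 0.5 would be two. -->
   <app>
      <name>milkyway</name>
      <gpu_versions>
         <gpu_usage>1.0</gpu_usage>
         <cpu_usage>1.0</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
```

With these example values the mt tasks would occupy 2 x 4 = 8 threads, so the total committed to BOINC has to leave room for that plus whatever the gpu tasks need.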
24) Message boards : Number crunching : Need help with linux and app_info (Message 69875)
Posted 30 May 2020 by Keith Myers
Post:
Most people don't understand the use of <cpu_usage> values in the app_config or app_info files. That value can't limit the actual usage of the cpu thread in support of the gpu task.

Only the actual science gpu application itself determines how much cpu support the gpu task needs. Some applications need very little cpu support, the MW ATI application for example. But the MW Nvidia application uses almost the full cpu thread on the exact same tasks. Just the difference in the applications is what determines the actual cpu usage.

The setting of cpu usage is only for BOINC scheduling purposes, to assist BOINC in determining how much resources to allocate for each project and how much work can be run simultaneously.

In the case of your Seti special sauce application, those tasks actually used almost a full cpu core to support the gpu task. All the 0.45 usage value did was free up a cpu thread to do something else, run another cpu task or another gpu task for another project for example.

You would need to run an app_config with max_concurrent statements to control the Separation tasks, and you would most definitely need a --nthreads option in a <cmdline> statement to limit the mt tasks, which would otherwise commandeer all cpu threads and prevent the other applications from running.

Read the BOINC documentation for examples of setting up the proper controls. https://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration
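
To make the scheduling-only point concrete, here is a hedged sketch for the Separation app (all values are illustrative assumptions). gpu_usage 0.5 tells BOINC to run two tasks per card, and cpu_usage 1.0 tells it to budget a full thread for each of them. Lowering cpu_usage would only change BOINC's bookkeeping; the Nvidia application would still use nearly a full thread per task.

```xml
<app_config>
   <app>
      <name>milkyway</name>
      <!-- cap on simultaneous Separation tasks (example value) -->
      <max_concurrent>4</max_concurrent>
      <gpu_versions>
         <!-- 0.5 gpu per task: BOINC runs two tasks per card -->
         <gpu_usage>0.5</gpu_usage>
         <!-- scheduling budget only, not a limit on real cpu use -->
         <cpu_usage>1.0</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
```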
25) Message boards : Number crunching : Need help with linux and app_info (Message 69871)
Posted 29 May 2020 by Keith Myers
Post:
Go to Computing Preferences in the Options menu of the Manager.
In the Computing tab, set "Use at most 50% of cpus" and you will only use 8 threads out of your 16.
Question - you are running 4 concurrent tasks per gpu? Seems unlikely for Nvidia cards.
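
If you would rather not go through the Manager, the same 50% limit can, as far as I know, be set with a global_prefs_override.xml file in the BOINC data directory. A minimal sketch, assuming no other local preferences are being overridden:

```xml
<global_preferences>
   <!-- use at most 50% of the cpus: 8 of 16 threads on this host -->
   <max_ncpus_pct>50.000000</max_ncpus_pct>
</global_preferences>
```

Restart the client or have it re-read the local preferences for the change to take effect.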
26) Message boards : Number crunching : Increased GPU speed? (Message 69843)
Posted 20 May 2020 by Keith Myers
Post:
Not all tasks are the same. You got a bunch with an easy parameter set compared to what you crunched before. So they take less time to crunch.


I know there are at least two types - the bundles of 4 and the bundles of 5. Plus there are some marked test and some not. But I've never seen them run that quick before.

If the condition persists, count your blessings and see if you can figure out if you made any changes to the host to account for this new behavior. Then publish for the rest of us to copy. LOL.
27) Message boards : Number crunching : Increased GPU speed? (Message 69841)
Posted 20 May 2020 by Keith Myers
Post:
Not all tasks are the same. You got a bunch with an easy parameter set compared to what you crunched before. So they take less time to crunch.
28) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69837)
Posted 19 May 2020 by Keith Myers
Post:
is there an Ubuntu version of this I can just copy to the folder and restart boinc? Or do I have to get it compiled first?

Yes, the compiled Linux Ubuntu client binary is downloadable directly from his repository.

https://github.com/JStateson/MilkywayNewWork/blob/master/boinc_ubuntu
Just download it and set execute permissions. Also make sure you download the cc_config.xml to use along with the binary as it is necessary to use his modified client.

https://github.com/JStateson/MilkywayNewWork
For the Ubuntu version, be sure to set permissions to 0751 on the program and 0664 on the xml.
29) Message boards : Number crunching : All tasks end with Computation error or Validation error (Message 69824)
Posted 15 May 2020 by Keith Myers
Post:
The problem is apparent right there in the stderr.txt output.

Failed to move file 'separation_checkpoint_tmp' to 'separation_checkpoint' (317): (null)
Failed to move file 'separation_checkpoint_tmp' to 'separation_checkpoint' (317): (null)
Failed to move file 'separation_checkpoint_tmp' to 'separation_checkpoint' (317): (null)
Failed to move file 'separation_checkpoint_tmp' to 'separation_checkpoint' (317): (null)
Failed to move file 'separation_checkpoint_tmp' to 'separation_checkpoint' (317): (null)
Failed to move file 'separation_checkpoint_tmp' to 'separation_checkpoint' (317): (null)
Failed to move file 'separation_checkpoint_tmp' to 'separation_checkpoint' (317): (null)
Failed to update checkpoint file ('separation_checkpoint_tmp' to 'separation_checkpoint') (0):

Opening checkpoint 'separation_checkpoint_tmp' (13): Permission denied


You have a permission problem. Your antivirus is probably blocking access to BOINC's slot directories. Whitelist the BOINC data directory and all its subdirectories.
30) Questions and Answers : Preferences : MilkyWay steals all my CPUs (Message 69817)
Posted 14 May 2020 by Keith Myers
Post:
I have played with those settings but could not come up with anything that works. I will give this a try. Thank you. It would still be nice if they would add some kind of direct control on the prefs page. More casual users will just give up and delete the project.

This app_config.xml would, for example, limit your host to running one mt nbody task at a time, using 4 threads.

<app_config>
   <app>
       <name>milkyway_nbody</name>
       <max_concurrent>1</max_concurrent>
   </app>
   <app_version>
       <app_name>milkyway_nbody</app_name>
       <plan_class>mt</plan_class>
       <avg_ncpus>4</avg_ncpus>
       <cmdline>--nthreads 4</cmdline>
   </app_version>
</app_config>


I understand that some other projects do in fact give you control over the number of threads used on multi-thread tasks directly in the application preferences. This project does not, however, so you will need to use the app_config for finer control.
31) Questions and Answers : Unix/Linux : Not Switching Between Projects (Message 69816)
Posted 14 May 2020 by Keith Myers
Post:
I'm sure you must have had some exposure to a project or projects that change app_names often, but I have not come across that project yet in 18 years of BOINCing.
Even application version changes do not change the base app_name referenced in the client_state so the app_names are very constant and once set up in an app_config never have to be changed again in my experience. The "friendly name" might change slightly or the application name, but the base name for a project does not and that is the app_name referenced in app_config.xml.
32) Questions and Answers : Preferences : MilkyWay steals all my CPUs (Message 69812)
Posted 13 May 2020 by Keith Myers
Post:
When starting ANY new project, ALWAYS change the REC parameter in cc_config first from the default 10 days to a single day. That allows the project credit debts to balance out much faster and you will have less likelihood of the new project commandeering the host's crunching time and preventing your other projects from crunching. In fact leave REC set at 1 day for the future after editing. The client runs much smoother on multiple project hosts that way.

<rec_half_life_days>1.000000</rec_half_life_days>
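
For anyone unsure where that line goes: in cc_config.xml it sits inside the <options> section. A minimal sketch, assuming you have no other options set (if you already have a cc_config.xml, just add the one line to your existing <options> block):

```xml
<cc_config>
   <options>
      <!-- REC half-life in days; the default is 10 -->
      <rec_half_life_days>1.000000</rec_half_life_days>
   </options>
</cc_config>
```

Save it in the BOINC data directory, then restart the client or use "Read config files" in the Manager.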
33) Questions and Answers : Preferences : MilkyWay steals all my CPUs (Message 69811)
Posted 13 May 2020 by Keith Myers
Post:
I also usually don't allow n_body to run because there is no way to control how many CPUs it wants to use.

Incorrect. You have posted the link to the configuration document that explicitly shows an example for multi-threaded applications. Set --nthreads to the number of cpu cores you allow the mt application to use. Use a max_concurrent statement to restrict the number of mt tasks running at any one time.

...
   [<app_version>
       <app_name>Application_Name</app_name>
       [<plan_class>mt</plan_class>]
       [<avg_ncpus>x</avg_ncpus>]
       [<ngpus>x</ngpus>]
       [<cmdline>--nthreads 7</cmdline>]
   </app_version>]
   ...
   [<project_max_concurrent>N</project_max_concurrent>]
34) Questions and Answers : Unix/Linux : Disk Usage (Message 69810)
Posted 13 May 2020 by Keith Myers
Post:
Projects don't have any controls over where to store their data. That is under the control of BOINC. When you install BOINC select the custom installation which then allows you to choose a different data directory location other than the default of the C: drive.

With Linux and the distro versions of BOINC it is a lot more difficult. You can try the first steps outlined in this link.

https://alpheratz.net/how-to-move-boinc-data-directory-linux/

Probably won't work. Didn't for me with Ubuntu. I had to laboriously strip out all the symlinks in half a dozen directories on my Jetson Nano. I find it easiest to work with a custom version of BOINC that allows you to run it from wherever you unpack the BOINC package. I put it on the Desktop in /home and then have full control over it. You could certainly select the external USB drive for the BOINC installation for the unpacking target.

This is the version made for Seti and the special applications. But you can simply delete the Seti project folder and install/join any project you want. It is the 7.16.5 version of BOINC. It is what we called the All-in-One package of BOINC.

http://www.arkayn.us/lunatics/BOINC.7z
35) Message boards : Number crunching : Increase performance (watts)? (Message 69806)
Posted 12 May 2020 by Keith Myers
Post:
Instead of looking at power consumption, what does card utilization show for 2X or 3X? That would be the limiting factor. Also, are you overclocking the card's P2 power state under compute load to get back to what the P0 power state for the card would be if not penalized by the drivers?

keith@Serenity:~$ nvidia-smi
Mon May 11 18:17:24 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64       Driver Version: 440.64       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2080    On   | 00000000:08:00.0 Off |                  N/A |
|100%   55C    P2   188W / 225W |    315MiB /  7982MiB |     90%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 2080    On   | 00000000:0A:00.0  On |                  N/A |
|100%   41C    P2   158W / 225W |   1107MiB /  7979MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce RTX 2080    On   | 00000000:0B:00.0 Off |                  N/A |
|100%   38C    P2    98W / 225W |    446MiB /  7982MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1502      G   /usr/lib/xorg/Xorg                             6MiB |
|    0      9408      C   ./keepP2                                     111MiB |
|    0     20858      C   acemd3                                       185MiB |
|    1      1502      G   /usr/lib/xorg/Xorg                           120MiB |
|    1      2229      G   /usr/bin/gnome-shell                         102MiB |
|    1      9409      C   ./keepP2                                     111MiB |
|    1     23651      C   ...86_64-pc-linux-gnu__FGRPopenclTV-nvidia   769MiB |
|    2      1502      G   /usr/lib/xorg/Xorg                             6MiB |
|    2      9410      C   ./keepP2                                     111MiB |
|    2     23709      C   ..._x86_64-pc-linux-gnu__opencl_nvidia_101   157MiB |
|    2     23764      C   ..._x86_64-pc-linux-gnu__opencl_nvidia_101   157MiB |
+-----------------------------------------------------------------------------+


GPU 2 is running 2X MW Separation tasks: 100% utilization at 98W.
36) Questions and Answers : Unix/Linux : GPU tasks error out on Linux (Message 69775)
Posted 8 May 2020 by Keith Myers
Post:
Hi,

I'm using the distro-provided Mesa OpenCL software and all Separation tasks immediately error out on my RX 560. The same setup has no problems crunching for Einstein@Home.

here is the issue: https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=799092

Any idea if there's a fix coming?

Get the latest AMD drivers 20.xx something or other.
37) Message boards : Number crunching : Multiple GPU Problem (Message 69769)
Posted 7 May 2020 by Keith Myers
Post:
Do both cards show in the BIOS? Maybe the card is bad or incorrectly installed, not pushed all the way into the slot. Is the correct PCIe power connected to the new card?
You need to get both cards seen by Windows before tackling BOINC.

You can try the Wagnard DDU utility to completely remove the graphics drivers and then reinstall them.

Once you get Windows to recognize both cards, you will have to edit the BOINC cc_config.xml file and change the <use_all_gpus> parameter from 0 to 1, or BOINC will only use the new 1050 card: the cards are dissimilar, and by default BOINC only uses the most powerful card.
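
A minimal sketch of the file with that parameter set, assuming you have no other options in it (otherwise just add the one line inside your existing <options> block):

```xml
<cc_config>
   <options>
      <!-- 1 = use every detected gpu, not just the most capable one -->
      <use_all_gpus>1</use_all_gpus>
   </options>
</cc_config>
```

The file lives in the BOINC data directory. Restarting the client after the edit is the safe bet, since the cards are detected at startup.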
38) Message boards : Number crunching : No new Milkyway gpu WU's (Message 69755)
Posted 23 Apr 2020 by Keith Myers
Post:
Speak the devil's name and he shall appear. I see you have the normal cache of gpu work now and have returned validated tasks.

So the usual forum phantom has remedied your post as usual. Looks like he works the same hours as his compatriot at Seti.
39) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69739)
Posted 18 Apr 2020 by Keith Myers
Post:
What have you done with the other GPUs?

They're still sitting in their hosts gathering dust.
40) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69737)
Posted 18 Apr 2020 by Keith Myers
Post:
I am running a custom client that our GPUUG developers figured out. It sidesteps the issue at Milkyway in an elegant manner with a simple configuration file, something along the lines of what JStateson's client does. I simply delay reporting tasks on a 15 minute schedule, avoiding the 10 minute dry period. I limit the cache to a fixed 600 tasks for Milkyway and a 20/120 cpu/gpu task split for Einstein. I just set the GPUGrid cache limit to what the project would normally send, which is two tasks per gpu. No spoofing anymore on any project. Just fixed task counts that I feel comfortable with.

Yes, total gpus currently running are 7 in two hosts. Down from 17 gpus when I was running Seti on five hosts. Oh, I guess I should also include the Maxwell gpu in the Jetson Nano, which I have running the BRP cpu task on the gpu. So eight gpus in total.



©2020 Astroinformatics Group