Welcome to MilkyWay@home

Need help with linux and app_info



Message boards : Number crunching : Need help with linux and app_info
Cat22
Joined: 26 May 20
Posts: 12
Credit: 74,145,002
RAC: 542,191
50 million credit badge
Message 69857 - Posted: 27 May 2020, 5:52:38 UTC

Hi,
I have two GPUs installed in an i9-9900K system running BOINC 7.16.3:

    CUDA: NVIDIA GPU 0: GeForce GTX 1660 Ti (driver version 440.31, CUDA version 10.2, compute capability 7.5, 4096MB, 3972MB available, 5714 GFLOPS peak)
    CUDA: NVIDIA GPU 1: GeForce GTX 1660 Ti (driver version 440.31, CUDA version 10.2, compute capability 7.5, 4096MB, 3972MB available, 5668 GFLOPS peak)


I would like to use both GPUs and all my CPU cores to process work, but I need a proper app_info.xml to do it.
Does anyone have such a thing? I looked quite a bit on the web but couldn't find anything.
Oh, and please remember this is a Linux box.
TIA

ID: 69857
mikey
Joined: 8 May 09
Posts: 2408
Credit: 450,647,383
RAC: 27,371
300 million credit badge10 year member badgeextraordinary contributions badge
Message 69858 - Posted: 27 May 2020, 11:23:38 UTC - in response to Message 69857.  


It looks like both GPUs are being used, and it's using all the CPU cores it needs to do it.
Since both GPUs are the same and are found by BOINC, you shouldn't need anything else to make them crunch.
Now, if you want to crunch more than one at a time, that's a different story and not what you asked about.
ID: 69858
Cat22
Joined: 26 May 20
Posts: 12
Credit: 74,145,002
RAC: 542,191
50 million credit badge
Message 69860 - Posted: 27 May 2020, 12:43:52 UTC - in response to Message 69858.  

Sorry, I would assume the benefit of having multiple GPUs is to crunch in parallel; otherwise, what's the point?
I want both GPUs fully occupied all the time.
As it is right now, only one GPU is in use at any given time.
Thanks.
ID: 69860
Nick Name

Joined: 27 Jul 14
Posts: 22
Credit: 457,236,770
RAC: 1,328
300 million credit badge6 year member badge
Message 69861 - Posted: 27 May 2020, 17:34:48 UTC

It sounds like you need to enable all GPUs in your cc_config.
<use_all_gpus>0|1</use_all_gpus>
If 1, use all GPUs (otherwise only the most capable ones are used). Requires a client restart.

This file should be found in /var/lib/boinc. Edit it with a standard text editor, set use_all_gpus to 1, and make sure it's saved as an .xml file. Restart BOINC. If cc_config.xml doesn't exist, create it via the Manager: Options -> Event Log Options, then Save.

You don't need app_info for this case; that would normally be used if you compiled your own app. An app_config will work. Create it with a text editor and save it to the project data folder.
This should be in /var/lib/boinc/projects/milkyway.cs.rpi.edu_milkyway.

<app_config>

 <app>
  <name>milkyway</name>
  <gpu_versions>
   <gpu_usage>0.49</gpu_usage>
   <cpu_usage>0.50</cpu_usage>
  </gpu_versions>
 </app>

</app_config>

This will run two tasks at a time on each GPU. Adjust gpu_usage and cpu_usage depending on how many tasks you want to run. Make sure it's saved as app_config.xml. Just re-reading the config via the Manager (Options -> Read Config Files) will start it working.
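
As a rough way to reason about those two settings, here is a small Python sketch of the arithmetic. It assumes (as a simplification, not BOINC's actual scheduler code) that a GPU accepts tasks until their gpu_usage values add up to 1, and that cpu_usage is only what BOINC budgets per GPU task. The function names are made up for illustration.

```python
import math


def tasks_per_gpu(gpu_usage: float) -> int:
    """Rough model: how many tasks fit on one GPU if each claims `gpu_usage` of it."""
    if not 0 < gpu_usage <= 1:
        raise ValueError("gpu_usage is expected to be in (0, 1]")
    # The small epsilon guards against floating-point results such as 1.9999999.
    return math.floor(1.0 / gpu_usage + 1e-9)


def budgeted_cpu_threads(gpu_usage: float, cpu_usage: float, gpus: int) -> float:
    """CPU threads BOINC budgets for all GPU tasks (not what the app really uses)."""
    return tasks_per_gpu(gpu_usage) * gpus * cpu_usage


print(tasks_per_gpu(0.49))
print(budgeted_cpu_threads(0.49, 0.50, gpus=2))
```

With the values from the app_config above, this model gives 2 tasks per GPU and 2.0 budgeted CPU threads across two cards.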

Finally, just to be clear re. running in parallel: If you meant Crossfire or SLI, that doesn't work for any BOINC project. You can use all your GPUs but they will run individually on separate tasks, not together on the same task.
Team USA forum | Team USA page
Always crunching / Always recruiting
ID: 69861
mikey
Joined: 8 May 09
Posts: 2408
Credit: 450,647,383
RAC: 27,371
300 million credit badge10 year member badgeextraordinary contributions badge
Message 69862 - Posted: 27 May 2020, 22:47:51 UTC - in response to Message 69860.  


In BOINC you can use both GPUs at the same time, just not on the same workunit. In fact, if you have the SLI connector connected and do NOT game, it's best to take it off.

To get both GPUs crunching at the same time, each on its own workunit, use a text editor to make a cc_config.xml file like this:

<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
</options>
</cc_config>

Put that into the BOINC directory using your admin password, then stop and restart BOINC. If you don't know how to fully stop BOINC from the command line in Linux, just restart the PC and it should start using both GPUs.
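
Before restarting, it can help to confirm the file is well-formed XML and really sets the option. The following is a minimal Python sketch, not an official BOINC tool; the function name uses_all_gpus is hypothetical, and the config text is embedded as a string so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# Same content as the cc_config.xml shown above.
CC_CONFIG = """<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
</options>
</cc_config>"""


def uses_all_gpus(cc_config_xml: str) -> bool:
    """Return True if the text parses as XML and sets use_all_gpus to 1."""
    root = ET.fromstring(cc_config_xml)  # raises ParseError on malformed XML
    if root.tag != "cc_config":
        return False
    value = root.findtext("options/use_all_gpus", default="0")
    return value.strip() == "1"


print(uses_all_gpus(CC_CONFIG))
```

To check a real file, read its text first and pass it in; the location of cc_config.xml depends on how BOINC was installed.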
ID: 69862
Cat22
Joined: 26 May 20
Posts: 12
Credit: 74,145,002
RAC: 542,191
50 million credit badge
Message 69863 - Posted: 28 May 2020, 2:34:40 UTC - in response to Message 69861.  

Apparently there is still some confusion about GPU usage. I think people are overthinking the issue. Just consider the GPU as if it were another CPU core. It takes a special app, but it consumes one work unit at a time, processes it, spits out the result, and then gets another work unit. If you have two GPUs, e.g. two GTX 1660 Tis like me, then each GPU gets a work unit, and neither GPU knows or cares about the other, so you get two work units processed at exactly the same time. If you had six GPUs you'd be able to process six GPU-type work units in parallel. Now, yes, you can run more than one WU on a GPU simultaneously, but you generally take a performance hit when you do; you need to actually test it to be sure. SLI makes two GPUs look like one and, as far as I have heard, has no performance benefit for the kind of computational work we do.
I tried the app_config you posted, but what I get then is two work units assigned to GPU 0 and none assigned to GPU 1. I think it needs some kind of device line added to it?
ID: 69863
Cat22
Joined: 26 May 20
Posts: 12
Credit: 74,145,002
RAC: 542,191
50 million credit badge
Message 69864 - Posted: 28 May 2020, 2:39:12 UTC - in response to Message 69862.  

On startup BOINC reports:
"[---] Config: use all coprocessors"
so we are set there.
ID: 69864
Nick Name

Joined: 27 Jul 14
Posts: 22
Credit: 457,236,770
RAC: 1,328
300 million credit badge6 year member badge
Message 69865 - Posted: 28 May 2020, 9:01:39 UTC

After reading the thread a bit more closely, the question seems to be why the 2nd GPU is detected but not being used. app_config and app_info are irrelevant in that context. Judging by this:

CUDA: NVIDIA GPU 0: GeForce GTX 1660 Ti
CUDA: NVIDIA GPU 1: GeForce GTX 1660 Ti

Both cards are detected and both should work. This snippet from a job log, "Found 2 CL devices", shows that the MilkyWay app is seeing both cards so I think we can rule out a driver or weird OpenCL problem, or an exclusion in cc_config.

This is just a guess, but your CPU may not have an available thread to support a task on the second GPU. BOINC will typically overcommit the CPU when running GPU work. If you've set BOINC to use all 16 threads, it will run 16 CPU tasks and at least one more GPU task. I don't know how much CPU the Nvidia app schedules, but generally Nvidia OpenCL tasks take a full thread. I suggest reducing the number of threads BOINC can use and seeing if that solves the problem. Your GPU task run times are much longer than the corresponding CPU time, which indicates the CPU is overtaxed, so reducing the load is a good idea just to help with that.

Another possibility is you have a CPU project that's gone into high priority mode. If that's the case it's likely keeping the 2nd GPU from running because BOINC is trying to get that work done before the deadline. If this is what's happening, usually the best thing to do is lower your work cache, i.e. Store at least N days of work / Store additional N days, then give it some time to clear out.

I'd also remove the app_config until you get things working. You can delete or rename it and then re-read the config files.
ID: 69865
Cat22
Joined: 26 May 20
Posts: 12
Credit: 74,145,002
RAC: 542,191
50 million credit badge
Message 69870 - Posted: 29 May 2020, 18:29:41 UTC - in response to Message 69865.  

How do I set the number of CPUs that BOINC will use, as you mentioned? I want to try cutting it back and see if that gets both cards in use. I think we can tell right away if this is the issue if I set BOINC to use, say, only 8 threads out of the 16 (temporarily, as an experiment). That leaves 4 threads per GPU, so if this is the issue I should see both GPUs in use right away.
Sound right?
ID: 69870
Keith Myers
Joined: 24 Jan 11
Posts: 335
Credit: 215,190,522
RAC: 320,490
200 million credit badge9 year member badgeextraordinary contributions badge
Message 69871 - Posted: 29 May 2020, 19:07:55 UTC - in response to Message 69870.  
Last modified: 29 May 2020, 19:09:13 UTC

It's in Computing Preferences, under the Options menu in the Manager.
In the Computing tab, select "Use at most 50% of cpus" and you will only use 8 threads out of your 16.
Question: are you running 4 concurrent tasks per GPU? That seems unlikely for Nvidia cards.
ID: 69871
Cat22
Joined: 26 May 20
Posts: 12
Credit: 74,145,002
RAC: 542,191
50 million credit badge
Message 69873 - Posted: 30 May 2020, 1:02:04 UTC - in response to Message 69871.  

Hi Keith,
I want to run 1 task per GPU. Right now I am running 1 task on 1 GPU and nothing on the 2nd GPU.
ID: 69873
Cat22
Joined: 26 May 20
Posts: 12
Credit: 74,145,002
RAC: 542,191
50 million credit badge
Message 69874 - Posted: 30 May 2020, 1:33:46 UTC - in response to Message 69873.  

Hi again,
I just set both systems to a CPU percentage that leaves 2 cores free, and after a while it did finally start using both GPUs.
When I was running SETI we did this via an app_info.xml (see below). The SETI app_info.xml allocated however much of a CPU you wanted (.45 in this case) to tend to a single GPU's needs. So far I have not found a working Linux app_info.xml for MilkyWay that handles all the apps:
    milkyway_nbody_1.76_x86_64-pc-linux-gnu__mt
    milkyway_1.46_x86_64-pc-linux-gnu__opencl_nvidia_101
    milkyway_1.46_x86_64-pc-linux-gnu


<app_info>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <platform>x86_64-pc-linux-gnu</platform>
    <version_num>801</version_num>
    <plan_class>cuda10.1</plan_class>
    <cmdline>-nobs</cmdline>
    <coproc>
      <type>NVIDIA</type>
      <count>1</count>
    </coproc>
    <avg_ncpus>.45</avg_ncpus>
    <ngpus>1</ngpus>
    <file_ref>
      <file_name>setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>MBv8_8.05r3345_avx_linux64</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>805</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <plan_class>avx</plan_class>
    <cmdline></cmdline>
    <file_ref>
      <file_name>MBv8_8.05r3345_avx_linux64</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
ID: 69874
Keith Myers
Joined: 24 Jan 11
Posts: 335
Credit: 215,190,522
RAC: 320,490
200 million credit badge9 year member badgeextraordinary contributions badge
Message 69875 - Posted: 30 May 2020, 5:43:56 UTC

Most people don't understand the use of <cpu_usage> values in the app_config or app_info files. Those values can't limit the actual usage of the CPU thread in support of the GPU task.

Only the science GPU application itself determines how much CPU support the GPU task needs. Some applications need very little CPU support, the MW ATI application for example. But the MW Nvidia application uses almost a full CPU thread on the exact same tasks. The difference between the applications is what determines the actual CPU usage.

The CPU usage setting is only for BOINC scheduling purposes, to help BOINC determine how many resources to allocate to each project and how much work can run simultaneously.

In the case of your Seti special sauce application, those tasks actually used almost a full CPU core to support the GPU task. All the 0.45 usage value did was free up a CPU thread to do something else, such as running another CPU task or a GPU task for another project.

You would need to run an app_config with max_concurrent statements to control the Separation tasks. You would also most definitely need a --nthreads setting (passed via <cmdline>) to limit the mt tasks, which would otherwise commandeer all CPU threads and prevent the other applications from running.

Read the BOINC document for examples of setting up the proper controls. https://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration
ID: 69875
Keith Myers
Joined: 24 Jan 11
Posts: 335
Credit: 215,190,522
RAC: 320,490
200 million credit badge9 year member badgeextraordinary contributions badge
Message 69876 - Posted: 30 May 2020, 5:54:23 UTC

First question to answer is how many total cpu threads on the host do you want to commit to BOINC.

Second question is how many concurrent mt tasks do you want to run.

Third question is how many cpu threads per mt task do you want to commit to the task.

Fourth question is how many gpu tasks per card do you want to run. I advise to stick to a single task per Nvidia card unless they are very high end like a 2080 or 2080 Ti.

All of those configurations need to be put into an app_config.xml file for the project to control how you want to run the project on your hardware.
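
To show how the four answers map onto a file, here is a hedged Python sketch that turns them into app_config.xml text. The function build_app_config and its parameter names are hypothetical. The app names milkyway and milkyway_nbody, the mt plan class and the --nthreads switch are taken from configs posted in this thread, and the sketch assumes one full CPU thread per GPU task.

```python
import xml.etree.ElementTree as ET


def build_app_config(total_threads, gpus, mt_tasks, threads_per_mt_task,
                     gpu_tasks_per_card):
    """Build app_config.xml text from the four answers (illustrative only)."""
    # Assumption: each GPU task is budgeted one full CPU thread.
    needed = mt_tasks * threads_per_mt_task + gpus * gpu_tasks_per_card
    if needed > total_threads:
        raise ValueError(f"{needed} threads needed, only {total_threads} available")

    root = ET.Element("app_config")

    # Limit how many multithreaded (mt) nbody tasks run at once.
    nbody = ET.SubElement(root, "app")
    ET.SubElement(nbody, "name").text = "milkyway_nbody"
    ET.SubElement(nbody, "max_concurrent").text = str(mt_tasks)

    # Limit how many threads each mt task may use.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = "milkyway_nbody"
    ET.SubElement(version, "plan_class").text = "mt"
    ET.SubElement(version, "avg_ncpus").text = str(threads_per_mt_task)
    ET.SubElement(version, "cmdline").text = f"--nthreads {threads_per_mt_task}"

    # GPU app: the share of a card each task claims, and its CPU budget.
    gpu_app = ET.SubElement(root, "app")
    ET.SubElement(gpu_app, "name").text = "milkyway"
    gpu_versions = ET.SubElement(gpu_app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(1.0 / gpu_tasks_per_card)
    ET.SubElement(gpu_versions, "cpu_usage").text = "1.0"

    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or newer
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(build_app_config(total_threads=16, gpus=2, mt_tasks=3,
                       threads_per_mt_task=4, gpu_tasks_per_card=1))
```

The up-front check is a design choice: it refuses combinations that would budget more threads than the host has, rather than leaving BOINC to sort out an overcommitted plan.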
ID: 69876
Cat22
Joined: 26 May 20
Posts: 12
Credit: 74,145,002
RAC: 542,191
50 million credit badge
Message 69879 - Posted: 31 May 2020, 18:08:44 UTC - in response to Message 69876.  

Hi,
The whole system is dedicated to BOINC 24/7/365, so for the i9-9900K the answer is 16 CPU threads.
I only want to commit whatever number of CPU threads is required by each GPU application.
My goal is to run 2 concurrent GPU tasks, 1 task per card (which it seems to be doing now that I set the global CPU % down to 90%), and have the remaining CPU resources crunching CPU tasks.
So if each Nvidia app actually requires a full CPU thread to keep it fed, then the remaining 14 threads should be crunching CPU tasks.
What bothers me is that by using the global "Use at most xx CPU percentage" option I am affecting other projects, whereas a decent app_info.xml or app_config.xml (whichever I need) would apply only to MilkyWay and leave the other (presently idle) projects alone. I would only be running a single project at a time, not more than that concurrently; e.g. I switched to MW only because SETI isn't handing out work while they manage an overwhelming amount of returned results.
TIA
ID: 69879
Nick Name

Joined: 27 Jul 14
Posts: 22
Credit: 457,236,770
RAC: 1,328
300 million credit badge6 year member badge
Message 69880 - Posted: 31 May 2020, 19:48:35 UTC - in response to Message 69879.  

I would expect setting 90% would use all 16 threads, 14 for CPU and 2 for GPU, if all you're running is MilkyWay. The % might need tweaking if it's not working as expected.

14/16 = 87.5%
Set your % to 88. Generally it's best to round up rather than use a fraction.
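
The same arithmetic as a short Python sketch. The rounding rule in threads_for_percent (fractions of a thread are dropped, with a minimum of 1) is an assumption about how the client applies the percentage, not something confirmed in this thread, and both function names are made up for illustration.

```python
import math


def percent_for_threads(threads_wanted: int, total_threads: int) -> int:
    """Smallest whole percentage that still allows `threads_wanted` threads."""
    return math.ceil(100 * threads_wanted / total_threads)


def threads_for_percent(percent: int, total_threads: int) -> int:
    """Threads allowed at a given whole percentage.

    Assumption: the fractional part is dropped and at least 1 thread is kept.
    """
    return max(1, (total_threads * percent) // 100)


pct = percent_for_threads(14, 16)
print(pct, threads_for_percent(pct, 16))
```

Under that assumption, both 88% and 90% of 16 threads work out to 14 usable threads, and 50% works out to 8.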

You can also set CPU % to 100 and tweak the app_config. The following says to run one task on the GPU with CPU use set to one tenth of a thread. This should get both GPUs working if you have CPU % set to 100, for a total of 18 tasks. As stated above, this doesn't limit what it will actually use, but you can set it to manipulate BOINC scheduling.

<app_config>

 <app>
  <name>milkyway</name>
  <gpu_versions>
   <gpu_usage>1.0</gpu_usage>
   <cpu_usage>0.10</cpu_usage>
  </gpu_versions>
 </app>

</app_config>


Alternatively, set cpu_usage to 1 to keep BOINC from running more than 16 tasks total, and to make sure the GPU has a full thread available for support. You'd have to do some testing on your own to see what works best for you.
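
The two task totals mentioned above (18 and 16) can be reproduced with a simplified Python model. It is only a sketch chosen to match those numbers, not BOINC's real scheduler: it assumes single-threaded CPU tasks, and that BOINC keeps starting CPU tasks while the budgeted total is still below the thread count. The function name expected_tasks is hypothetical.

```python
import math


def expected_tasks(total_threads: int, gpus: int, gpu_usage: float,
                   cpu_usage: float):
    """Return (cpu_tasks, gpu_tasks) under a simplified scheduling model."""
    gpu_tasks = gpus * math.floor(1.0 / gpu_usage + 1e-9)
    budgeted = gpu_tasks * cpu_usage
    # Assumption: a fractional CPU budget for GPU tasks does not cost a whole
    # CPU task, so the remaining thread count is rounded up.
    cpu_tasks = max(0, math.ceil(total_threads - budgeted - 1e-9))
    return cpu_tasks, gpu_tasks


for cpu_usage in (0.10, 1.0):
    cpu_tasks, gpu_tasks = expected_tasks(16, gpus=2, gpu_usage=1.0,
                                          cpu_usage=cpu_usage)
    print(cpu_usage, cpu_tasks, gpu_tasks, cpu_tasks + gpu_tasks)
```

With cpu_usage at 0.10 the model gives 16 CPU tasks plus 2 GPU tasks, and with cpu_usage at 1.0 it gives 14 plus 2.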
ID: 69880
Keith Myers
Joined: 24 Jan 11
Posts: 335
Credit: 215,190,522
RAC: 320,490
200 million credit badge9 year member badgeextraordinary contributions badge
Message 69881 - Posted: 1 Jun 2020, 16:26:17 UTC - in response to Message 69879.  
Last modified: 1 Jun 2020, 16:27:13 UTC

Well, if you want to run 14 total CPU threads out of the 16 and assign two of them to the two GPUs, that leaves you with 12 threads to run the CPU MilkyWay nbody tasks. So you should run this app_config.xml file.

<app_config>
  <app>
    <name>milkyway_nbody</name>
    <max_concurrent>3</max_concurrent>
  </app>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <cmdline>--nthreads 4</cmdline>
  </app_version>

  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>


This would run 3 concurrent nbody CPU tasks using 4 threads each, plus two GPU tasks.
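
A quick way to sanity-check that thread count is to parse the config and add up what it budgets. This Python sketch is illustrative only: budgeted_threads is a hypothetical helper, the config is embedded as a string so the example is self-contained, and it assumes each GPU runs as many tasks as its gpu_usage allows.

```python
import xml.etree.ElementTree as ET

APP_CONFIG = """<app_config>
  <app>
    <name>milkyway_nbody</name>
    <max_concurrent>3</max_concurrent>
  </app>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <cmdline>--nthreads 4</cmdline>
  </app_version>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>"""


def budgeted_threads(app_config_xml: str, gpus: int) -> float:
    """Sum the CPU threads this app_config budgets at full load."""
    root = ET.fromstring(app_config_xml)
    # Threads per task for each app name, taken from the <app_version> blocks.
    ncpus = {
        version.findtext("app_name"): float(version.findtext("avg_ncpus", default="1"))
        for version in root.findall("app_version")
    }
    total = 0.0
    for app in root.findall("app"):
        gpu_versions = app.find("gpu_versions")
        if gpu_versions is not None:
            gpu_usage = float(gpu_versions.findtext("gpu_usage", default="1"))
            cpu_usage = float(gpu_versions.findtext("cpu_usage", default="1"))
            tasks = gpus * int(1.0 / gpu_usage + 1e-9)
            total += tasks * cpu_usage
        else:
            concurrent = int(app.findtext("max_concurrent", default="1"))
            total += concurrent * ncpus.get(app.findtext("name"), 1.0)
    return total


print(budgeted_threads(APP_CONFIG, gpus=2))
```

For the config above with two GPUs this comes to 3 x 4 + 2 x 1 = 14 threads, which leaves 2 of the 16 unbudgeted.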
ID: 69881


©2020 Astroinformatics Group