Welcome to MilkyWay@home

Issues with & proper support of AMD R9 290X GPUs

Message boards : Number crunching : Issues with & proper support of AMD R9 290X GPUs

Michael H.W. Weber
Joined: 22 Jan 08
Posts: 29
Credit: 242,726,778
RAC: 0
Message 65104 - Posted: 5 Sep 2016, 12:51:39 UTC
Last modified: 5 Sep 2016, 13:34:35 UTC

Due to its superior double precision (DP) performance (second best among AMD's consumer cards), AMD's R9 290X GPU is a valuable card that deserves proper support from the Milkyway@home project.

So far, however, this card is not even recognized as a graphics board by this project under Windows 7. Instead, one has to manually set up the following app_info.xml file and copy it to the Milkyway@home project folder:

<app_info>
<app>
<name>milkyway_nbody</name>
<user_friendly_name>Milkyway N-Body Sim.</user_friendly_name>
</app>
<file_info>
<name>milkyway_nbody_1.62_windows_x86_64.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>milkyway_nbody</app_name>
<version_num>162</version_num>
<platform>windows_x86_64</platform>
<file_ref>
<file_name>milkyway_nbody_1.62_windows_x86_64.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>milkyway</name>
<user_friendly_name>Milkyway</user_friendly_name>
</app>
<file_info>
<name>milkyway_1.36_windows_x86_64.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>milkyway</app_name>
<version_num>136</version_num>
<platform>windows_x86_64</platform>
<file_ref>
<file_name>milkyway_1.36_windows_x86_64.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>milkyway</name>
</app>
<file_info>
<name>milkyway_1.36_windows_x86_64__opencl_ati_101.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>milkyway</app_name>
<version_num>136</version_num>
<platform>windows_x86_64</platform>
<avg_ncpus>1</avg_ncpus>
<max_ncpus>1</max_ncpus>
<plan_class>opencl_ati_101</plan_class>
<cmdline></cmdline>
<coproc>
<type>ATI</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>milkyway_1.36_windows_x86_64__opencl_ati_101.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>milkyway_separation__modified_fit</name>
<user_friendly_name>Milkyway Sep. (Mod. Fit)</user_friendly_name>
</app>
<file_info>
<name>milkyway_separation__modified_fit_1.36_windows_x86_64.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>milkyway_separation__modified_fit</app_name>
<version_num>136</version_num>
<platform>windows_x86_64</platform>
<file_ref>
<file_name>milkyway_separation__modified_fit_1.36_windows_x86_64.exe</file_name>
<main_program/>
</file_ref>
</app_version>
<app>
<name>milkyway_separation__modified_fit</name>
</app>
<file_info>
<name>milkyway_separation__modified_fit_1.36_windows_x86_64__opencl_ati_101.exe</name>
<executable/>
</file_info>
<app_version>
<app_name>milkyway_separation__modified_fit</app_name>
<version_num>136</version_num>
<platform>windows_x86_64</platform>
<avg_ncpus>1</avg_ncpus>
<max_ncpus>1</max_ncpus>
<plan_class>opencl_ati_101</plan_class>
<cmdline></cmdline>
<coproc>
<type>ATI</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>milkyway_separation__modified_fit_1.36_windows_x86_64__opencl_ati_101.exe</file_name>
<main_program/>
</file_ref>
</app_version>
</app_info>

...followed by manual download of the corresponding executables from the Milkyway@home website (to be also stored in the project folder):

milkyway_1.36_windows_x86_64.exe
milkyway_1.36_windows_x86_64__opencl_ati_101.exe
milkyway_nbody_1.62_windows_x86_64.exe
milkyway_separation__modified_fit_1.36_windows_x86_64.exe
milkyway_separation__modified_fit_1.36_windows_x86_64__opencl_ati_101.exe
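
Before restarting BOINC, it can help to check that every executable named in app_info.xml is actually present in the project folder. The following is a minimal Python sketch of such a check, not a BOINC or Milkyway@home tool: the function names and the shortened sample file are made up for illustration, and the project folder path is whatever your installation uses.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Shortened, illustrative excerpt; a real app_info.xml lists every executable.
SAMPLE = """
<app_info>
  <app><name>milkyway</name></app>
  <file_info>
    <name>milkyway_1.36_windows_x86_64.exe</name>
    <executable/>
  </file_info>
  <file_info>
    <name>milkyway_1.36_windows_x86_64__opencl_ati_101.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>136</version_num>
    <file_ref>
      <file_name>milkyway_1.36_windows_x86_64__opencl_ati_101.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""


def declared_files(app_info_xml):
    """Return the unique file names declared in <file_info> elements, in order."""
    root = ET.fromstring(app_info_xml)
    names = []
    for info in root.iter("file_info"):
        name = info.findtext("name")
        if name and name not in names:
            names.append(name)
    return names


def undeclared_refs(app_info_xml):
    """Return <file_ref> names that have no matching <file_info> entry."""
    root = ET.fromstring(app_info_xml)
    declared = set(declared_files(app_info_xml))
    refs = [ref.findtext("file_name") for ref in root.iter("file_ref")]
    return [name for name in refs if name not in declared]


def missing_files(app_info_xml, project_dir):
    """Return declared file names that do not exist in project_dir."""
    folder = Path(project_dir)
    return [n for n in declared_files(app_info_xml) if not (folder / n).is_file()]


if __name__ == "__main__":
    print("declared:", declared_files(SAMPLE))
    print("undeclared refs:", undeclared_refs(SAMPLE))
```

To use it on a real setup, read the app_info.xml text from the project folder and pass that folder to `missing_files`; an empty list means all declared executables were found.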

After restarting BOINC, Milkyway@home will finally start to compute tasks. Single tasks. One after the other. Task duration is around 16 seconds!

Long ago, I asked for longer GPU tasks here in this forum, because starting a new task every 16 seconds is a massive waste of compute time and requires a permanent internet connection for constant uploads and downloads, as the number of tasks per machine is severely limited, too. Nothing has happened.

For AMD's R9 280X, which is from the same board family and is the most performant GPU with respect to DP, GPU recognition by Milkyway@home is, in contrast to the 290X, fully automated. This card can process several tasks in parallel, so I thought it should also be possible with the 290X.

No such luck!

With the 280X, the following app_config.xml does the job of running 8 tasks simultaneously, thereby significantly increasing throughput:

<app_config>
<app>
<name>milkyway</name>
<gpu_versions>
<gpu_usage>0.125</gpu_usage>
<cpu_usage>0.1</cpu_usage>
</gpu_versions>
</app>
<app>
<name>milkyway_separation__modified_fit</name>
<gpu_versions>
<gpu_usage>0.125</gpu_usage>
<cpu_usage>0.1</cpu_usage>
</gpu_versions>
</app>
</app_config>
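
For reference, BOINC treats `gpu_usage` as the fraction of one GPU that each task occupies and `cpu_usage` as the fraction of a CPU core reserved per GPU task, so 0.125 allows eight tasks per GPU. The short Python sketch below works through that arithmetic for the file above; the function name `gpu_plan` is made up for illustration, and the sketch only reads the two values and multiplies them out.

```python
import math
import xml.etree.ElementTree as ET

APP_CONFIG = """
<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>0.125</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>milkyway_separation__modified_fit</name>
    <gpu_versions>
      <gpu_usage>0.125</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def gpu_plan(app_config_xml, n_gpus=1):
    """Map app name -> (concurrent GPU tasks, total CPU cores reserved)."""
    root = ET.fromstring(app_config_xml)
    plan = {}
    for app in root.iter("app"):
        versions = app.find("gpu_versions")
        if versions is None:
            continue  # app has no GPU settings in this file
        gpu_usage = float(versions.findtext("gpu_usage"))
        cpu_usage = float(versions.findtext("cpu_usage"))
        # Small epsilon guards against floating-point rounding just below an integer.
        tasks = math.floor(n_gpus / gpu_usage + 1e-9)
        plan[app.findtext("name")] = (tasks, tasks * cpu_usage)
    return plan


if __name__ == "__main__":
    for name, (tasks, cpus) in gpu_plan(APP_CONFIG).items():
        print(f"{name}: {tasks} tasks per GPU, {cpus:.1f} CPU cores reserved")
```

With these settings, eight concurrent tasks reserve 0.8 of a CPU core in total, which fits within a single core kept free for the GPU.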

When I place this file in the project folder for the 290X card, some tasks validate, others do not. The majority do not validate. They are mostly categorized as inconclusive at first and then end up in the "bad box".

What I want to know is why such a large fraction of tasks is invalid. I have tested a second R9 290X, which behaves identically. Both cards work properly with ALL other distributed computing GPU projects I tested: Primegrid, Folding@home, POEM@home, Collatz Conjecture, Einstein@home and SETI@home. Since I use the latest AMD drivers (and have also tested older ones from the outdated Catalyst series), I conclude that neither my hardware nor the driver is the cause of the issue. Hence, something is wrong with Milkyway@home or with my manual configuration as detailed above.

To pin down the problem, I will post three example result files from my 290X card as follows.

A valid task:

Task 1764179469
Name de_modfit_fast_15_3s_136_ModfitConstraints5_4_1471352126_22530160_0
Workunit 1294581971
Created 4 Sep 2016, 20:27:06 UTC
Sent 4 Sep 2016, 20:27:34 UTC
Report deadline 16 Sep 2016, 20:27:34 UTC
Received 4 Sep 2016, 20:35:22 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 611995
Run time 1 min 19 sec
CPU time 5 sec
Validate state Valid
Credit 26.74
Device peak FLOPS 475.20 GFLOPS
Application version MilkyWay@Home
Anonymous platform (ATI GPU)
Peak working set size 91.66 MB
Peak swap size 96.41 MB
Peak disk usage 0.01 MB
Stderr output

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_separation 1.36 Windows x86_64 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File
Using AVX path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 2.0 AMD-APP (2117.9)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'Hawaii' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon R9 200 Series
Driver version: 2117.9 (VM)
Version: OpenCL 2.0 AMD-APP (2117.9)
Compute capability: 0.0
Max compute units: 44
Clock frequency: 1080 Mhz
Global mem size: 4294967296
Local mem size: 32768
Max const buf size: 65536
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
C:\Users\MW\AppData\Local\Temp\\OCL1492T5.cl:176:72: warning: unknown attribute 'max_constant_size' ignored
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^
C:\Users\MW\AppData\Local\Temp\\OCL1492T5.cl:178:62: warning: unknown attribute 'max_constant_size' ignored
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^
C:\Users\MW\AppData\Local\Temp\\OCL1492T5.cl:179:67: warning: unknown attribute 'max_constant_size' ignored
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^
3 warnings generated.

--------------------------------------------------------------------------------
Estimated AMD GPU GFLOP/s: 475 SP GFLOP/s, 95 DP FLOP/s
Warning: Bizarrely low flops (95). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 11264 with 4 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 11
Num chunks: 13
Chunk size: 45056
Added area: 25728
Effective area: 585728
Initial wait: 13 ms
Integration time: 74.696196 s. Average time per iteration = 233.425612 ms
Integral 0 time = 75.041790 s
Running likelihood with 108460 stars
Likelihood time = 2.384113 s
<background_integral> 0.000219895184258 </background_integral>
<stream_integral> 26.279213526558141 282.717596669534940 59.916729387043546 </stream_integral>
<background_likelihood> -3.519474843288314 </background_likelihood>
<stream_only_likelihood> -64.978883633533044 -3.731802400470303 -3.737878862213316 </stream_only_likelihood>
<search_likelihood> -2.974664769155546 </search_likelihood>
22:32:49 (1492): called boinc_finish

</stderr_txt>
]]>
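
The chunking figures in this log are internally consistent and can be reproduced by hand. The Python sketch below does so under two assumptions inferred from the logged numbers rather than from the application source: that the iteration area is `mu_steps * r_steps`, and that the number of chunks is the area divided by the chunk size, rounded up. The logged "Chunk estimate: 11" is not covered by this sketch.

```python
import math

# Values copied from the stderr output of the valid task above.
nu_steps, mu_steps, r_steps = 320, 800, 700
block_size, blocks_per_chunk = 11264, 4
integration_time_s = 74.696196

# Assumption: one iteration covers the full mu x r grid.
iteration_area = mu_steps * r_steps

# One chunk is a fixed number of blocks.
chunk_size = block_size * blocks_per_chunk

# Assumption: the area is padded up to a whole number of chunks.
num_chunks = math.ceil(iteration_area / chunk_size)
effective_area = num_chunks * chunk_size
added_area = effective_area - iteration_area

# Average time per iteration, one iteration per nu step.
avg_iteration_ms = integration_time_s / nu_steps * 1000

print(iteration_area, chunk_size, num_chunks, effective_area, added_area)
print(round(avg_iteration_ms, 6))
```

The results match the logged values: an iteration area of 560000, a chunk size of 45056, 13 chunks, an effective area of 585728, an added area of 25728, and about 233.4256 ms per iteration.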

An inconclusive task:

Task 1764185669
Name de_modfit_fast_15_3s_136_ModfitConstraints5_3_1471352126_22534138_0
Workunit 1294586066
Created 4 Sep 2016, 20:31:47 UTC
Sent 4 Sep 2016, 20:32:02 UTC
Report deadline 16 Sep 2016, 20:32:02 UTC
Received 4 Sep 2016, 20:39:51 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 611995
Run time 1 min 4 sec
CPU time 6 sec
Validate state Checked, but no consensus yet
Credit 0.00
Device peak FLOPS 475.20 GFLOPS
Application version MilkyWay@Home
Anonymous platform (ATI GPU)
Peak working set size 85.95 MB
Peak swap size 90.49 MB
Peak disk usage 0.01 MB
Stderr output

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_separation 1.36 Windows x86_64 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File
Using AVX path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 2.0 AMD-APP (2117.9)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'Hawaii' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon R9 200 Series
Driver version: 2117.9 (VM)
Version: OpenCL 2.0 AMD-APP (2117.9)
Compute capability: 0.0
Max compute units: 44
Clock frequency: 1080 Mhz
Global mem size: 4294967296
Local mem size: 32768
Max const buf size: 65536
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
C:\Users\MW\AppData\Local\Temp\\OCL4924T5.cl:176:72: warning: unknown attribute 'max_constant_size' ignored
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^
C:\Users\MW\AppData\Local\Temp\\OCL4924T5.cl:178:62: warning: unknown attribute 'max_constant_size' ignored
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^
C:\Users\MW\AppData\Local\Temp\\OCL4924T5.cl:179:67: warning: unknown attribute 'max_constant_size' ignored
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^
3 warnings generated.

--------------------------------------------------------------------------------
Estimated AMD GPU GFLOP/s: 475 SP GFLOP/s, 95 DP FLOP/s
Warning: Bizarrely low flops (95). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 11264 with 4 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 11
Num chunks: 13
Chunk size: 45056
Added area: 25728
Effective area: 585728
Initial wait: 13 ms
Integration time: 58.227728 s. Average time per iteration = 181.961650 ms
Integral 0 time = 58.710419 s
Running likelihood with 108460 stars
Likelihood time = 2.392808 s
<background_integral> 0.000256231358565 </background_integral>
<stream_integral> 36.101376679045323 312.027800009935050 101.204115896450660 </stream_integral>
<background_likelihood> -3.399075315784963 </background_likelihood>
<stream_only_likelihood> -4.530232320863708 -3.982049344512974 -3.885673391537581 </stream_only_likelihood>
<search_likelihood> -2.969699537899006 </search_likelihood>
22:37:18 (4924): called boinc_finish

</stderr_txt>
]]>

An invalid task:

Task 1764165427
Name de_modfit_fast_15_3s_136_fixedangles3_3_1471352126_22478237_2
Workunit 1294526646
Created 4 Sep 2016, 20:17:26 UTC
Sent 4 Sep 2016, 20:17:28 UTC
Report deadline 16 Sep 2016, 20:17:28 UTC
Received 4 Sep 2016, 20:25:19 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 611995
Run time 1 min 23 sec
CPU time 6 sec
Validate state Workunit error - check skipped
Credit 0.00
Device peak FLOPS 475.20 GFLOPS
Application version MilkyWay@Home
Anonymous platform (ATI GPU)
Peak working set size 86.03 MB
Peak swap size 90.61 MB
Peak disk usage 0.01 MB
Stderr output

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_separation 1.36 Windows x86_64 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Switching to Parameter File
Using AVX path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 2.0 AMD-APP (2117.9)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'Hawaii' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: AMD Radeon R9 200 Series
Driver version: 2117.9 (VM)
Version: OpenCL 2.0 AMD-APP (2117.9)
Compute capability: 0.0
Max compute units: 44
Clock frequency: 1080 Mhz
Global mem size: 4294967296
Local mem size: 32768
Max const buf size: 65536
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
C:\Users\MW\AppData\Local\Temp\\OCL3612T5.cl:176:72: warning: unknown attribute 'max_constant_size' ignored
__constant real* _ap_consts __attribute__((max_constant_size(18 * sizeof(real)))),
^
C:\Users\MW\AppData\Local\Temp\\OCL3612T5.cl:178:62: warning: unknown attribute 'max_constant_size' ignored
__constant SC* sc __attribute__((max_constant_size(NSTREAM * sizeof(SC)))),
^
C:\Users\MW\AppData\Local\Temp\\OCL3612T5.cl:179:67: warning: unknown attribute 'max_constant_size' ignored
__constant real* sg_dx __attribute__((max_constant_size(256 * sizeof(real)))),
^
3 warnings generated.

--------------------------------------------------------------------------------
Estimated AMD GPU GFLOP/s: 475 SP GFLOP/s, 95 DP FLOP/s
Warning: Bizarrely low flops (95). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 11264 with 4 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 11
Num chunks: 13
Chunk size: 45056
Added area: 25728
Effective area: 585728
Initial wait: 13 ms
Integration time: 78.214496 s. Average time per iteration = 244.420301 ms
Integral 0 time = 78.526644 s
Running likelihood with 108460 stars
Likelihood time = 2.372197 s
<background_integral> 0.000221023641879 </background_integral>
<stream_integral> 56.528977956441722 341.255842918021300 64.662390777401527 </stream_integral>
<background_likelihood> -3.551674800922283 </background_likelihood>
<stream_only_likelihood> -5.829435611967259 -3.728891679824896 -3.829357027346766 </stream_only_likelihood>
<search_likelihood> -2.993528071709122 </search_likelihood>
22:23:12 (3612): called boinc_finish

</stderr_txt>
]]>
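
The three tasks above belong to different workunits, so their numbers cannot be compared with each other; a meaningful check is against another host's result for the same workunit. The Python sketch below pulls the result values out of a stderr text and reports the largest relative difference between two such results. It is only an illustration: the function names are made up, the sample contains just the result lines of the valid task, and nothing here reflects the tolerance the project's validator actually applies.

```python
import re

RESULT_TAGS = (
    "background_integral",
    "stream_integral",
    "background_likelihood",
    "stream_only_likelihood",
    "search_likelihood",
)

# Result lines copied from the valid task's stderr output above.
SAMPLE_STDERR = """
<background_integral> 0.000219895184258 </background_integral>
<stream_integral> 26.279213526558141 282.717596669534940 59.916729387043546 </stream_integral>
<background_likelihood> -3.519474843288314 </background_likelihood>
<stream_only_likelihood> -64.978883633533044 -3.731802400470303 -3.737878862213316 </stream_only_likelihood>
<search_likelihood> -2.974664769155546 </search_likelihood>
"""


def parse_results(stderr_text):
    """Return {tag: [floats]} for every result tag found in the stderr text."""
    results = {}
    for tag in RESULT_TAGS:
        match = re.search(rf"<{tag}>\s*(.*?)\s*</{tag}>", stderr_text, re.S)
        if match:
            results[tag] = [float(value) for value in match.group(1).split()]
    return results


def max_relative_difference(a, b):
    """Largest relative difference over the tags present in both results."""
    worst = 0.0
    for tag in a.keys() & b.keys():
        for x, y in zip(a[tag], b[tag]):
            scale = max(abs(x), abs(y))
            if scale > 0:
                worst = max(worst, abs(x - y) / scale)
    return worst


if __name__ == "__main__":
    mine = parse_results(SAMPLE_STDERR)
    print(mine["search_likelihood"])
    print(max_relative_difference(mine, mine))
```

Feeding it the stderr output of both hosts that processed one workunit shows at a glance whether the results differ in the last few digits or by a large margin.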

Note that these are results generated when running 8 tasks in parallel.

My system is an Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz [Family 6 Model 42 Stepping 7] running Windows 7 Ultimate x64 with an MSI R9 290X Lightning GPU. The machine is equipped with 16 GB of RAM, and one CPU core is kept free to feed the GPU for maximum performance.

It is an excellent card and I am highly disappointed that this project makes so little of it. It almost appears as if you guys get enough compute power for free. If that is the case, just let me know and I won't bother you any further, as there are many projects out there which are in need of resources.

I should also note that the configuration file above replaces an earlier one which I had to update manually. Suddenly the older one did not work anymore, without any notice from the project on its website.

If you expect people to participate in your project in large numbers, then you need to take the utmost care to keep things as simple as possible. A person new to Milkyway@home will most likely never get an R9 290X to run for you, given the manual intervention required to do so.

Michael.
President of Rechenkraft.net e.V. - This planet's first and largest distributed computing organization.

Michael H.W. Weber
Joined: 22 Jan 08
Posts: 29
Credit: 242,726,778
RAC: 0
Message 65114 - Posted: 8 Sep 2016, 7:59:32 UTC

Any comments?

Michael.
President of Rechenkraft.net e.V. - This planet's first and largest distributed computing organization.

Nick Name
Joined: 27 Jul 14
Posts: 23
Credit: 921,261,826
RAC: 0
Message 65119 - Posted: 9 Sep 2016, 4:45:15 UTC - in response to Message 65114.  

The problem of some GPUs not working is being worked on.
http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4001&postid=65060

Regarding the number of invalids: Are you successfully running multiple tasks together on those other projects? I've seen comments about not being able to do that with that card. Attempts to run multiple tasks result in what you see here, many invalids. Some think it's a driver issue. I would not be surprised if that problem is also present here.
Team USA forum | Team USA page
Always crunching / Always recruiting

Michael H.W. Weber
Joined: 22 Jan 08
Posts: 29
Credit: 242,726,778
RAC: 0
Message 65123 - Posted: 9 Sep 2016, 10:45:24 UTC - in response to Message 65119.  

Regarding the number of invalids: Are you successfully running multiple tasks together on those other projects? I've seen comments about not being able to do that with that card.

I will look into that.

Some think it's a driver issue. I would not be surprised if that problem is also present here.

A driver issue can be excluded as detailed above.

Michael.
President of Rechenkraft.net e.V. - This planet's first and largest distributed computing organization.

Nick Name
Joined: 27 Jul 14
Posts: 23
Credit: 921,261,826
RAC: 0
Message 65125 - Posted: 9 Sep 2016, 16:50:05 UTC - in response to Message 65123.  

Some think it's a driver issue. I would not be surprised if that problem is also present here.

A driver issue can be excluded as detailed above.

Michael.

Not as it relates to successfully running multiple tasks at a time, but only AMD knows for sure.
See this discussion over at Einstein.
https://einsteinathome.org/content/problem-r9-390x-when-run-2-or-more-wus-time
Team USA forum | Team USA page
Always crunching / Always recruiting