Too Many WU downloaded when additional buffers =0
Author | Message |
---|---|
Send message Joined: 24 Mar 13 Posts: 11 Credit: 25,297 RAC: 0 |
Greetings, I am new to WU, but have been with BOINC for years. I chose MilkyWay@home to put my GPU to work. I have an 8-core CPU and have set the preferences to run 87.5% (7/8) of the cores. I attached to the MW project and set No allowed tasks until I could set up the 'default' venue preferences. I then waited until all WCG GPU tasks finished; only 4 WCG CPU WUs were running. At first everything was OK. MW downloaded 4 GPU tasks and 3 or 4 CPU tasks. (There was room for 1 GPU-CPU and 2 CPU tasks to run.) Then it got weird. MW downloaded 23 CPU tasks. They knocked the WCG tasks into waiting. Some had around 30 minutes to go until complete. I had cleared the local prefs before this - and I called them back up to see what the preferences had for - Minimum work buffer - Max additional work buffer. The values for both were 0.0. I aborted the 23 so all WU could finish up. I'll check the forum for answers before I fire up MW again. I wanted to post in the MW forum first - before going to the BOINC forum - to see if this is something related to MW. Thanks in advance!! Oh Yes. 
Environment stuff Sun 24 Mar 2013 04:04:11 PM EDT | | Starting BOINC client version 7.0.27 for x86_64-pc-linux-gnu Sun 24 Mar 2013 04:04:11 PM EDT | | Libraries: libcurl/7.29.0 OpenSSL/1.0.1c zlib/1.2.7 libidn/1.25 librtmp/2.3 Sun 24 Mar 2013 04:04:11 PM EDT | | Data directory: /var/lib/boinc-client Sun 24 Mar 2013 04:04:11 PM EDT | | Processor: 8 AuthenticAMD AMD FX(tm)-8150 Eight-Core Processor [Family 21 Model 1 Stepping 2] Sun 24 Mar 2013 04:04:11 PM EDT | | Processor: 2.00 MB cache Sun 24 Mar 2013 04:04:11 PM EDT | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold Sun 24 Mar 2013 04:04:11 PM EDT | | OS: Linux: 3.8.0-13-generic Sun 24 Mar 2013 04:04:11 PM EDT | | Memory: 7.70 GB physical, 8.04 GB virtual Sun 24 Mar 2013 04:04:11 PM EDT | | Disk: 18.33 GB total, 16.28 GB free Sun 24 Mar 2013 04:04:11 PM EDT | | Local time is UTC -4 hours Sun 24 Mar 2013 04:04:11 PM EDT | | ATI GPU 0: Capeverde (CAL version 1.4.1741, 2048MB, 1708MB available, 2048 GFLOPS peak) Sun 24 Mar 2013 04:04:11 PM EDT | | OpenCL: ATI GPU 0: Capeverde (driver version 1084.4 (VM), device version OpenCL 1.2 AMD-APP (1084.4), 2048MB, 1708MB available) Sun 24 Mar 2013 04:04:11 PM EDT | | Config: use all coprocessors Sun 24 Mar 2013 04:04:11 PM EDT | | Config: don't compute while brasero is running Sun 24 Mar 2013 04:04:11 PM EDT | | Config: GUI RPC allowed from: I looked up the difference between BOINC 7.0.27 and 7.0.28 the changelog said the difference was a Windows patch, so I should have the 
latest, stable, Linux release. The OS is the Raring release from Ubuntu that I downloaded on 3/22/2013 it may be overkill, but here is the clinfo... Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 1.2 AMD-APP (1084.4) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices Platform Name: AMD Accelerated Parallel Processing Number of devices: 2 Device Type: CL_DEVICE_TYPE_GPU Device ID: 4098 Board name: AMD Radeon HD 7700 Series Device Topology: PCI[ B#1, D#0, F#0 ] Max compute units: 8 Max work items dimensions: 3 Max work items[0]: 256 Max work items[1]: 256 Max work items[2]: 256 Max work group size: 256 Preferred vector width char: 4 Preferred vector width short: 2 Preferred vector width int: 1 Preferred vector width long: 1 Preferred vector width float: 1 Preferred vector width double: 1 Native vector width char: 4 Native vector width short: 2 Native vector width int: 1 Native vector width long: 1 Native vector width float: 1 Native vector width double: 1 Max clock frequency: 800Mhz Address bits: 32 Max memory allocation: 536870912 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 8 Max image 2D width: 16384 Max image 2D height: 16384 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 1024 Alignment (bits) of base address: 2048 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: No Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: Read/Write Cache line size: 64 Cache size: 16384 Global memory size: 1508900864 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Kernel Preferred work 
group size multiple: 64 Error correction support: 0 Unified memory for Host and Device: 0 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: No Queue properties: Out-of-Order: No Profiling : Yes Platform ID: 0x00007feadac62e40 Name: Capeverde Vendor: Advanced Micro Devices, Inc. Device OpenCL C version: OpenCL C 1.2 Driver version: 1084.4 (VM) Profile: FULL_PROFILE Version: OpenCL 1.2 AMD-APP (1084.4) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_amd_c1x_atomics Device Type: CL_DEVICE_TYPE_CPU Device ID: 4098 Board name: Max compute units: 8 Max work items dimensions: 3 Max work items[0]: 1024 Max work items[1]: 1024 Max work items[2]: 1024 Max work group size: 1024 Preferred vector width char: 16 Preferred vector width short: 8 Preferred vector width int: 4 Preferred vector width long: 2 Preferred vector width float: 8 Preferred vector width double: 4 Native vector width char: 16 Native vector width short: 8 Native vector width int: 4 Native vector width long: 2 Native vector width float: 8 Native vector width double: 4 Max clock frequency: 3600Mhz Address bits: 64 Max memory allocation: 2147483648 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 8 Max image 2D width: 8192 Max image 2D height: 8192 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 4096 Alignment (bits) of base address: 1024 Minimum alignment (bytes) for any 
datatype: 128 Single precision floating point capability Denorms: Yes Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: Read/Write Cache line size: 64 Cache size: 16384 Global memory size: 8272465920 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Global Local memory size: 32768 Kernel Preferred work group size multiple: 1 Error correction support: 0 Unified memory for Host and Device: 1 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: Yes Queue properties: Out-of-Order: No Profiling : Yes Platform ID: 0x00007feadac62e40 Name: AMD FX(tm)-8150 Eight-Core Processor Vendor: AuthenticAMD Device OpenCL C version: OpenCL C 1.2 Driver version: 1084.4 (sse2,avx,fma4) Profile: FULL_PROFILE Version: OpenCL 1.2 AMD-APP (1084.4) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cc_config.xml: <cc_config> <log_flags> <file_xfer>0</file_xfer> <sched_ops>0</sched_ops> <task>0</task> <app_msg_receive>0</app_msg_receive> <app_msg_send>0</app_msg_send> <async_file_debug>0</async_file_debug> <benchmark_debug>0</benchmark_debug> <checkpoint_debug>0</checkpoint_debug> <coproc_debug>0</coproc_debug> <cpu_sched>0</cpu_sched> <cpu_sched_debug>0</cpu_sched_debug> <cpu_sched_status>0</cpu_sched_status> <dcf_debug>0</dcf_debug> <disk_usage_debug>0</disk_usage_debug> <priority_debug>0</priority_debug> <file_xfer_debug>0</file_xfer_debug> <gui_rpc_debug>0</gui_rpc_debug> 
<heartbeat_debug>0</heartbeat_debug> <http_debug>0</http_debug> <http_xfer_debug>0</http_xfer_debug> <mem_usage_debug>0</mem_usage_debug> <network_status_debug>0</network_status_debug> <poll_debug>0</poll_debug> <proxy_debug>0</proxy_debug> <rr_simulation>0</rr_simulation> <rrsim_detail>0</rrsim_detail> <sched_op_debug>0</sched_op_debug> <scrsave_debug>0</scrsave_debug> <slot_debug>0</slot_debug> <state_debug>0</state_debug> <statefile_debug>0</statefile_debug> <suspend_debug>0</suspend_debug> <task_debug>0</task_debug> <time_debug>0</time_debug> <trickle_debug>0</trickle_debug> <unparsed_xml>0</unparsed_xml> <work_fetch_debug>0</work_fetch_debug> <notice_debug>0</notice_debug> </log_flags> <options> <abort_jobs_on_exit>0</abort_jobs_on_exit> <allow_multiple_clients>0</allow_multiple_clients> <allow_remote_gui_rpc>0</allow_remote_gui_rpc> <client_version_check_url>http://boinc.berkeley.edu/download.php?xml=1</client_version_check_url> <client_download_url>http://boinc.berkeley.edu/download.php</client_download_url> <disallow_attach>0</disallow_attach> <dont_check_file_sizes>0</dont_check_file_sizes> <dont_contact_ref_site>0</dont_contact_ref_site> <exclusive_app>brasero</exclusive_app> <exit_after_finish>0</exit_after_finish> <exit_before_start>0</exit_before_start> <exit_when_idle>0</exit_when_idle> <fetch_minimal_work>0</fetch_minimal_work> <force_auth>default</force_auth> <http_1_0>0</http_1_0> <http_transfer_timeout>300</http_transfer_timeout> <http_transfer_timeout_bps>10</http_transfer_timeout_bps> <max_file_xfers>4</max_file_xfers> <max_file_xfers_per_project>2</max_file_xfers_per_project> <max_stderr_file_size>0</max_stderr_file_size> <max_stdout_file_size>0</max_stdout_file_size> <max_tasks_reported>0</max_tasks_reported> <ncpus>-1</ncpus> <network_test_url>http://www.google.com/</network_test_url> <no_alt_platform>0</no_alt_platform> <no_gpus>0</no_gpus> <no_info_fetch>0</no_info_fetch> <no_priority_change>0</no_priority_change> 
<os_random_only>0</os_random_only> <proxy_info> <socks_server_name></socks_server_name> <socks_server_port>80</socks_server_port> <http_server_name></http_server_name> <http_server_port>80</http_server_port> <socks5_user_name></socks5_user_name> <socks5_user_passwd></socks5_user_passwd> <http_user_name></http_user_name> <http_user_passwd></http_user_passwd> <no_proxy></no_proxy> </proxy_info> <rec_half_life_days>10.0</rec_half_life_days> <report_results_immediately>0</report_results_immediately> <run_apps_manually>0</run_apps_manually> <save_stats_days>14</save_stats_days> <skip_cpu_benchmarks>0</skip_cpu_benchmarks> <simple_gui_only>0</simple_gui_only> <start_delay>10</start_delay> <stderr_head>0</stderr_head> <suppress_net_info>0</suppress_net_info> <unsigned_apps_ok>0</unsigned_apps_ok> <use_all_gpus>1</use_all_gpus> <use_certs>0</use_certs> <use_certs_only>0</use_certs_only> </options> </cc_config> and app_config.xml. <app_config> <app> <name>hcc1</name> <max_concurrent>2</max_concurrent>` <gpu_versions> <gpu_usage>0.5</gpu_usage> <cpu_usage>0.5</cpu_usage> </gpu_version> </app> <app> <name>milkyway</name> <max_concurrent>1</max_concurrent> <gpu_versions> <gpu_usage>1.0</gpu_usage> <cpu_usage>1.0</cpu_usage> </gpu_versions> </app> </app_config> Note: I wanted to start slow and easy with the MW GPU - doing just 1 WU on thebefore trying to start 2 wu on one GPU card. The Radeon HD 7750 has 2GB of ram on it (it says.) The projects directories looked OK to me. The slots directory was weird. There were 19 slots - zero though 18. Here is the link to my task list. http://milkyway.cs.rpi.edu/milkyway/results.php?userid=834251 There were *many* results with inconclusive results. I do not know if this is in any way related, but here it is... 
((I'll be glad to make them a separate post...)) http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=427073904 says Stderr output <core_client_version>7.0.27</core_client_version> <![CDATA[ <stderr_txt> <search_application> milkyway_separation 1.02 Linux x86_64 double OpenCL </search_application> Unrecognized XML in project preferences: max_gfx_cpu_pct Skipping: 20 Skipping: /max_gfx_cpu_pct Unrecognized XML in project preferences: nbody_graphics_poll_period Skipping: 30 Skipping: /nbody_graphics_poll_period Unrecognized XML in project preferences: nbody_graphics_float_speed Skipping: 10 Skipping: /nbody_graphics_float_speed Unrecognized XML in project preferences: nbody_graphics_textured_point_size Skipping: 250 Skipping: /nbody_graphics_textured_point_size Unrecognized XML in project preferences: nbody_graphics_point_point_size Skipping: 40 Skipping: /nbody_graphics_point_point_size BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.' Setting process priority to 0 (13): Permission denied Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' Error reading astronomy parameters from file 'astronomy_parameters.txt' Trying old parameters file Using AVX path Found 1 platform Platform 0 information: Name: AMD Accelerated Parallel Processing Version: OpenCL 1.2 AMD-APP (1084.4) Vendor: Advanced Micro Devices, Inc. 
Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices Profile: FULL_PROFILE Using device 0 on platform 0 Found 1 CL device Device 'Capeverde' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU) Driver version: 1084.4 (VM) Version: OpenCL 1.2 AMD-APP (1084.4) Compute capability: 0.0 Max compute units: 8 Clock frequency: 800 Mhz Global mem size: 1696595968 Local mem size: 32768 Max const buf size: 65536 Double extension: cl_khr_fp64 Build log: -------------------------------------------------------------------------------- "/tmp/OCLMvuER2.cl", line 30: warning: OpenCL extension is now part of core #pragma OPENCL EXTENSION cl_khr_fp64 : enable ^ LOOP UNROLL: pragma unroll (line 288) Unrolled as requested! LOOP UNROLL: pragma unroll (line 280) Unrolled as requested! LOOP UNROLL: pragma unroll (line 273) Unrolled as requested! LOOP UNROLL: pragma unroll (line 244) Unrolled as requested! LOOP UNROLL: pragma unroll (line 202) Unrolled as requested! -------------------------------------------------------------------------------- Build log: -------------------------------------------------------------------------------- "/tmp/OCLv06wLL.cl", line 27: warning: OpenCL extension is now part of core #pragma OPENCL EXTENSION cl_khr_fp64 : enable ^ -------------------------------------------------------------------------------- Estimated AMD GPU GFLOP/s: 64 SP GFLOP/s, 13 DP FLOP/s Warning: Bizarrely low flops (12). Defaulting to 100 Using a target frequency of 30.0 Using a block size of 2048 with 47 blocks/chunk Using clWaitForEvents() for polling (mode -1) Range: { nu_steps = 640, mu_steps = 1600, r_steps = 1400 } Iteration area: 2240000 Chunk estimate: 23 Num chunks: 24 Chunk size: 96256 Added area: 70144 Effective area: 2310144 Initial wait: 25 ms Integration time: 787.871171 s. 
Average time per iteration = 1231.048704 ms Integral 0 time = 790.630396 s Running likelihood with 107122 stars Likelihood time = 2.478142 s <background_integral> 0.000035500276475 </background_integral> <stream_integral> 0.594973184597082 1555.204911632331687 211.955616287150974 </stream_integral> <background_likelihood> -3.857876557072216 </background_likelihood> <stream_only_likelihood> -67.838435191593831 -3.537271642180458 -6.646429787745577 </stream_only_likelihood> <search_likelihood> -2.633785974768273 </search_likelihood> 23:09:13 (5035): called boinc_finish </stderr_txt> ]]> THANKS in ADVANCE!!! Need anything else? Please let me know. throwing tomatoes is allowed. Thanks, Jay PS the WCG GPU tasks worked OK. {edit: fix typos.} |
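A side note on the app_config.xml quoted in the post above: the hcc1 entry opens `<gpu_versions>` but closes with `</gpu_version>`. The sketch below is a minimal well-formedness check using Python's standard `xml.etree.ElementTree`. The names `POSTED`, `FIXED` and `check_well_formed` are made up for illustration, and the check says nothing about whether the BOINC client's own parser tolerates the mismatch or whether it contributed to the behaviour described.

```python
import xml.etree.ElementTree as ET

# Excerpt of the hcc1 entry from the post (the stray backtick is omitted here).
# Note the closing tag </gpu_version> does not match the opening <gpu_versions>.
POSTED = """<app_config>
  <app>
    <name>hcc1</name>
    <max_concurrent>2</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_version>
  </app>
</app_config>"""

# Same text with the closing tag spelled to match the opening tag.
FIXED = POSTED.replace("</gpu_version>", "</gpu_versions>")


def check_well_formed(xml_text):
    """Return None if xml_text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


for label, text in (("posted", POSTED), ("fixed", FIXED)):
    problem = check_well_formed(text)
    print(label, "->", "well-formed" if problem is None else problem)
```

Running a check like this on a config file before restarting the client is a cheap way to rule out tag typos as a cause.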
Send message Joined: 18 Jul 09 Posts: 300 Credit: 303,562,776 RAC: 0 |
It would be simplest to just allow the other tasks to finish and then download MW units. |
Send message Joined: 8 May 09 Posts: 3315 Credit: 519,946,411 RAC: 22,500 |
I would think it has to do with your time to run one project before switching to the next project. Add in the fact that MW units take very little time to run, and therefore, if you let a few units come through, your very fast 8-core CPU is just responding to your settings. |
Send message Joined: 24 Mar 13 Posts: 11 Credit: 25,297 RAC: 0 |
Greetings.. Thank you for your responses. I tried to recreate the problem this morning, and it did not repeat. I made one small change: the WU from all other projects completed first. This time, when allowing MW work, the correct number of WU were started (7). I waited for 2 GPU WU to complete and looked at the stderr that was reported. No errors. Previously, I had neglected to report that the additional 27 WU were downloaded within 5 minutes of starting the project. So, I agree with your comments and assume that this probably has to do with BOINC scheduling and nothing to do with MilkyWay. BOINC has a lot of fixes coded into recent releases affecting scheduling priority. There is one more item - but it is not related to this topic. Thanks again!! Jay |
Send message Joined: 24 Mar 13 Posts: 11 Credit: 25,297 RAC: 0 |
I would think it has to do with your time to run one project before switching to the next project. Add in the fact that MW units take very little time to run, and therefore, if you let a few units come through, your very fast 8-core CPU is just responding to your settings. Hi, I pondered over the BOINC scheduling Wiki http://boinc.berkeley.edu/trac/wiki/ClientSched This problem went away as WU were processed. I assume that it just took time for BOINC to set the correlation factor. Now, I'm focusing on the other problem, where I have set the preference for 50% of CPU and MW uses 100% every other time! I posted in this thread because I thought it had to do with the N-Body 1.08 release: http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=3171&sort=6 starting at http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=3171&nowrap=true#57666 Thanks again! Jay |
Send message Joined: 24 Mar 13 Posts: 11 Credit: 25,297 RAC: 0 |
It would be simplest to just allow the other tasks to finish and then download MW units. Yes! I was alarmed that BOINC did not follow the Project Resource Share. I wanted to have all (3) projects enabled. But BOINC looks at past performance as well: http://boinc.berkeley.edu/trac/wiki/ClientSched says: Project scheduling priority Both scheduling policies involve a notion of project scheduling priority, a dynamic quantity that reflects how much processing has been done recently by the project's tasks relative to its resource share. It didn't say how far back it looks for "recently". From experience I have found that if I manually enable one project at a time - then allow all projects - the project that had the least run-time comes back with a vengeance and overshadows the current project-share ratios. Just not intuitively obvious. Thanks for your response!!! Jay |
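The "comes back with a vengeance" effect can be illustrated with a toy model. This is only a sketch, not BOINC's actual code: the priority formula, the project names and the numbers are invented for illustration. The one grounded detail is the `rec_half_life_days` value of 10.0 in the cc_config.xml posted earlier in this thread, which is read here as the half-life over which "recent" work is discounted; that reading is an assumption, since the wiki excerpt does not say.

```python
def decay_factor(elapsed_days, half_life_days=10.0):
    """Fraction of an old contribution still counted after elapsed_days.

    Assumes simple exponential decay; the default half-life mirrors the
    rec_half_life_days value of 10.0 seen in the posted cc_config.xml.
    """
    return 0.5 ** (elapsed_days / half_life_days)


def toy_priorities(resource_share, recent_work):
    """Toy priority: share fraction minus fraction of recent work done.

    A project that has done less recent work than its share entitles it to
    gets a positive value; one that has done more gets a negative value.
    """
    total_share = sum(resource_share.values())
    total_work = sum(recent_work.values())
    priorities = {}
    for name, share in resource_share.items():
        share_frac = share / total_share
        work_frac = recent_work[name] / total_work if total_work > 0 else 0.0
        priorities[name] = share_frac - work_frac
    return priorities


# Hypothetical setup: three projects with equal resource share, where
# project_c was disabled and so has no recent work on record.
shares = {"project_a": 100, "project_b": 100, "project_c": 100}
recent = {"project_a": 50.0, "project_b": 50.0, "project_c": 0.0}

prios = toy_priorities(shares, recent)
favoured = max(prios, key=prios.get)
print(prios)
print("favoured:", favoured)
```

In this toy setup the project with no recent run-time gets the highest priority even though all three shares are equal, which matches the behaviour described above. Under the exponential-decay assumption, work done 10 days ago counts half as much as work done today, so the imbalance fades gradually rather than at a fixed cutoff.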
Send message Joined: 8 May 09 Posts: 3315 Credit: 519,946,411 RAC: 22,500 |
It would be simplest to just allow the other tasks to finish and then download MW units. Read here: http://setiathome.berkeley.edu/forum_thread.php?id=69330 by Richard Haselgrove. Don't get hung up on the fact that it is at Seti; Boinc is Boinc and the same software is used in ALL Boinc Projects. I believe Richard is one of the 'inner circle' of guys who work on Boinc. |
Send message Joined: 4 Sep 12 Posts: 219 Credit: 456,474 RAC: 0 |
Read here: No, I'm not in the 'inner circle', and I have no special access - I'm just a volunteer, same as you. But I do a lot of watching, a lot of reading, and a lot of head-scratching. Hopefully, some of my conclusions are helpful. |
Send message Joined: 24 Mar 13 Posts: 11 Credit: 25,297 RAC: 0 |
Mikey, Richard, Thanks to you both! I really appreciate the info. I have been trying to read other posts before asking the same question again. I am truly amazed how much you both contribute to answering the questions of others! Thanks again, Jay PS: I will have to get an Nvidia card so I can crunch for Seti too. |
Send message Joined: 8 May 09 Posts: 3315 Credit: 519,946,411 RAC: 22,500 |
Read here: Your posts are ALWAYS helpful Richard!!! Thank you!!! |