21)
Message boards :
Number crunching :
Finally getting new tasks only seconds after running out. May not be worth the hassle.
(Message 69357)
Posted 18 Dec 2019 by jpmboy Post: I'm trying... Here's the cc_config file as I have it - added the flags and options as you recommend. This is the config file in place when I posted the earlier eventlog. Check your PMs for a question regarding the other edits you recommended.
<cc_config>
  <log_flags>
    <file_xfer>1</file_xfer>
    <sched_ops>1</sched_ops>
    <task>1</task>
    <app_msg_receive>0</app_msg_receive>
    <app_msg_send>0</app_msg_send>
    <async_file_debug>0</async_file_debug>
    <benchmark_debug>1</benchmark_debug>
    <checkpoint_debug>0</checkpoint_debug>
    <coproc_debug>0</coproc_debug>
    <cpu_sched>0</cpu_sched>
    <cpu_sched_debug>0</cpu_sched_debug>
    <cpu_sched_status>0</cpu_sched_status>
    <dcf_debug>0</dcf_debug>
    <disk_usage_debug>0</disk_usage_debug>
    <file_xfer>1</file_xfer>
    <file_xfer_debug>0</file_xfer_debug>
    <gui_rpc_debug>0</gui_rpc_debug>
    <heartbeat_debug>0</heartbeat_debug>
    <http_debug>0</http_debug>
    <http_xfer_debug>0</http_xfer_debug>
    <idle_detection_debug>1</idle_detection_debug>
    <mem_usage_debug>0</mem_usage_debug>
    <mw_debug>0</mw_debug>
    <network_status_debug>0</network_status_debug>
    <notice_debug>0</notice_debug>
    <poll_debug>0</poll_debug>
    <priority_debug>0</priority_debug>
    <proxy_debug>0</proxy_debug>
    <rr_simulation>0</rr_simulation>
    <rrsim_detail>0</rrsim_detail>
    <sched_ops>1</sched_ops>
    <sched_op_debug>1</sched_op_debug>
    <scrsave_debug>0</scrsave_debug>
    <slot_debug>0</slot_debug>
    <state_debug>0</state_debug>
    <statefile_debug>0</statefile_debug>
    <suspend_debug>0</suspend_debug>
    <task>0</task>
    <task_debug>0</task_debug>
    <time_debug>0</time_debug>
    <trickle_debug>0</trickle_debug>
    <unparsed_xml>0</unparsed_xml>
    <work_fetch_debug>0</work_fetch_debug>
  </log_flags>
  <options>
    <abort_jobs_on_exit>0</abort_jobs_on_exit>
    <allow_multiple_clients>0</allow_multiple_clients>
    <allow_remote_gui_rpc>1</allow_remote_gui_rpc>
    <disallow_attach>0</disallow_attach>
    <dont_check_file_sizes>0</dont_check_file_sizes>
    <dont_contact_ref_site>0</dont_contact_ref_site>
    <lower_client_priority>0</lower_client_priority>
    <dont_suspend_nci>0</dont_suspend_nci>
    <dont_use_vbox>0</dont_use_vbox>
    <dont_use_wsl>0</dont_use_wsl>
    <exit_after_finish>0</exit_after_finish>
    <exit_before_start>0</exit_before_start>
    <exit_when_idle>0</exit_when_idle>
    <fetch_minimal_work>0</fetch_minimal_work>
    <fetch_on_update>0</fetch_on_update>
    <force_auth>default</force_auth>
    <http_1_0>0</http_1_0>
    <http_transfer_timeout>300</http_transfer_timeout>
    <http_transfer_timeout_bps>10</http_transfer_timeout_bps>
    <max_event_log_lines>2000</max_event_log_lines>
    <max_file_xfers>20</max_file_xfers>
    <max_file_xfers_per_project>20</max_file_xfers_per_project>
    <max_stderr_file_size>0</max_stderr_file_size>
    <max_stdout_file_size>0</max_stdout_file_size>
    <max_tasks_reported>0</max_tasks_reported>
    <mw_low_water_pct>1</mw_low_water_pct>
    <mw_high_water_pct>16</mw_high_water_pct>
    <mw_wait_interval>512</mw_wait_interval>
    <ncpus>-1</ncpus>
    <no_alt_platform>0</no_alt_platform>
    <no_gpus>0</no_gpus>
    <no_info_fetch>0</no_info_fetch>
    <no_opencl>0</no_opencl>
    <no_priority_change>0</no_priority_change>
    <os_random_only>0</os_random_only>
    <process_priority>-1</process_priority>
    <process_priority_special>-1</process_priority_special>
    <use_all_gpus>1</use_all_gpus>
    <proxy_info>
      <socks_server_name></socks_server_name>
      <socks_server_port>80</socks_server_port>
      <http_server_name></http_server_name>
      <http_server_port>80</http_server_port>
      <socks5_user_name></socks5_user_name>
      <socks5_user_passwd></socks5_user_passwd>
      <socks5_remote_dns>0</socks5_remote_dns>
      <http_user_name></http_user_name>
      <http_user_passwd></http_user_passwd>
      <no_proxy></no_proxy>
      <no_autodetect>0</no_autodetect>
    </proxy_info>
    <rec_half_life_days>10.000000</rec_half_life_days>
    <report_results_immediately>0</report_results_immediately>
    <run_apps_manually>0</run_apps_manually>
    <save_stats_days>30</save_stats_days>
    <skip_cpu_benchmarks>0</skip_cpu_benchmarks>
    <simple_gui_only>0</simple_gui_only>
    <start_delay>0.000000</start_delay>
    <stderr_head>0</stderr_head>
    <suppress_net_info>0</suppress_net_info>
    <unsigned_apps_ok>0</unsigned_apps_ok>
    <use_all_gpus>0</use_all_gpus>
    <use_certs>0</use_certs>
    <use_certs_only>0</use_certs_only>
    <vbox_window>0</vbox_window>
  </options>
</cc_config> |
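Editor's note: the config posted above lists a few tags twice (`file_xfer`, `sched_ops`, `task`, `use_all_gpus`), and two of them carry different values in the two places. The post does not say which occurrence the client honors, so the sketch below only detects the repeats; it makes no claim about BOINC's behavior. It is a minimal example using Python's standard XML parser on a trimmed excerpt of the posted file, and the helper name `find_repeated_tags` is made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed excerpt of the posted cc_config; only a handful of tags are kept.
CC_CONFIG = """
<cc_config>
  <log_flags>
    <file_xfer>1</file_xfer>
    <sched_ops>1</sched_ops>
    <task>1</task>
    <sched_op_debug>1</sched_op_debug>
    <file_xfer>1</file_xfer>
    <sched_ops>1</sched_ops>
    <task>0</task>
  </log_flags>
  <options>
    <use_all_gpus>1</use_all_gpus>
    <max_file_xfers>20</max_file_xfers>
    <use_all_gpus>0</use_all_gpus>
  </options>
</cc_config>
"""


def find_repeated_tags(xml_text):
    """Return {"section/tag": [values, ...]} for tags listed more than once in a section."""
    root = ET.fromstring(xml_text)
    repeated = {}
    for section in root:
        values = defaultdict(list)
        for child in section:
            values[child.tag].append((child.text or "").strip())
        for tag, vals in values.items():
            if len(vals) > 1:
                repeated[f"{section.tag}/{tag}"] = vals
    return repeated


repeated = find_repeated_tags(CC_CONFIG)
for name, vals in repeated.items():
    # "conflicting" means the repeated tag carries different values.
    status = "conflicting" if len(set(vals)) > 1 else "redundant"
    print(f"{name}: {vals} ({status})")
```

Running the same helper on the full file (read from disk instead of the inline string) would flag the same four tags; removing the repeats so each setting appears once avoids any ambiguity about which value is in effect.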
22)
Message boards :
Number crunching :
Finally getting new tasks only seconds after running out. May not be worth the hassle.
(Message 69355)
Posted 18 Dec 2019 by jpmboy Post: Thanks. "good copy". :) When this last loaded batch finishes I'll make the changes you suggested here and in the reg... and PM you back. |
23)
Message boards :
News :
30 Workunit Limit Per Request - Fix Implemented
(Message 69353)
Posted 18 Dec 2019 by jpmboy Post: I posted back in that thread ... please take a look. :) |
24)
Message boards :
Number crunching :
Finally getting new tasks only seconds after running out. May not be worth the hassle.
(Message 69352)
Posted 18 Dec 2019 by jpmboy Post:
Thank you for posting this. I tried using your mod'd boinc.exe as you described... initially there was no change in the number of tasks downloaded at any one time (always 300-ish with 3 GPUs installed), or in the accumulation of tasks over a day (GPUs remain inactive for roughly 30-40% of the time). So I then added some of the "options" and settings from the cc_config.xml file included in the download and voilà... I actually got 425 tasks... once, then the eventlog shows we're back to 300-ish tasks after waiting several 91 sec cycles, and occasionally, no "fetch" for 10 min. Most times the fetch command is as follows:
12/17/2019 8:00:35 PM | Milkyway@Home | Sending scheduler request: To fetch work.
12/17/2019 8:00:35 PM | Milkyway@Home | Reporting 34 completed tasks
12/17/2019 8:00:35 PM | Milkyway@Home | Requesting new tasks for CPU and NVIDIA GPU
12/17/2019 8:00:35 PM | Milkyway@Home | [sched_op] CPU work request: 16278725.09 seconds; 0.00 devices
12/17/2019 8:00:35 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1553755.01 seconds; 0.00 devices
12/17/2019 8:00:37 PM | Milkyway@Home | Scheduler request completed: got 0 new tasks
12/17/2019 8:00:37 PM | Milkyway@Home | [sched_op] Server version 713
12/17/2019 8:00:37 PM | Milkyway@Home | No tasks sent
12/17/2019 8:00:37 PM | Milkyway@Home | Project requested delay of 91 seconds
Once it receives the allocation of tasks, the GPU count drops to 0.00 and no further tasks are downloaded (but the CPU task list adds one or more tasks, accumulating 2-3 days of work according to BoincTasks):
12/17/2019 7:51:12 PM | Milkyway@Home | [sched_op] Starting scheduler request
12/17/2019 7:51:12 PM | Milkyway@Home | Sending scheduler request: To fetch work.
12/17/2019 7:51:12 PM | Milkyway@Home | Reporting 1 completed tasks
12/17/2019 7:51:12 PM | Milkyway@Home | Requesting new tasks for CPU and NVIDIA GPU
12/17/2019 7:51:12 PM | Milkyway@Home | [sched_op] CPU work request: 16283322.16 seconds; 0.00 devices
12/17/2019 7:51:12 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1555200.00 seconds; 3.00 devices
12/17/2019 7:51:14 PM | Milkyway@Home | Scheduler request completed: got 328 new tasks
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] Server version 713
12/17/2019 7:51:14 PM | Milkyway@Home | Project requested delay of 91 seconds
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] estimated total CPU task duration: 0 seconds
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] estimated total NVIDIA GPU task duration: 19186 seconds
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_80_bundle4_4s_south4s_bgset_2_1574164502_15520017_1
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] Reason: requested by project
12/17/2019 7:52:45 PM | Milkyway@Home | [sched_op] Starting scheduler request
12/17/2019 7:52:45 PM | Milkyway@Home | Sending scheduler request: To fetch work.
12/17/2019 7:52:45 PM | Milkyway@Home | Reporting 19 completed tasks
12/17/2019 7:52:45 PM | Milkyway@Home | Requesting new tasks for CPU and NVIDIA GPU
12/17/2019 7:52:45 PM | Milkyway@Home | [sched_op] CPU work request: 16277298.29 seconds; 0.00 devices
12/17/2019 7:52:45 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1552420.62 seconds; 0.00 devices
12/17/2019 7:52:47 PM | Milkyway@Home | Scheduler request completed: got 2 new tasks
12/17/2019 7:52:47 PM | Milkyway@Home | [sched_op] Server version 713
12/17/2019 7:52:47 PM | Milkyway@Home | Project requested delay of 91 seconds
So... I still can't get 200 tasks per GPU, and the client is certainly not picking up GPU tasks until several "fetch" requests after the last GPU task has uploaded. I'm running Win10, a 7980XE, 3 Titan Vs... running 6 tasks per GPU. So unbelievably, adding the third Titan V, rather than getting more tasks and processing more tasks, is actually resulting in more idle time, as the lot of 300 tasks processes in 2/3 the time. Crazy! |
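Editor's note: pairing each GPU work request with the reply that follows it makes the pattern in the post easier to scan. The sketch below is a minimal example, assuming the event-log line layout shown in the post (`timestamp | project | message`); the log lines are copied from the excerpt, while the regular expressions and the function name `summarize` are illustrative and not part of BOINC.

```python
import re

# Lines copied from the event-log excerpt in the post above.
LOG = """\
12/17/2019 7:51:12 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1555200.00 seconds; 3.00 devices
12/17/2019 7:51:14 PM | Milkyway@Home | Scheduler request completed: got 328 new tasks
12/17/2019 7:52:45 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1552420.62 seconds; 0.00 devices
12/17/2019 7:52:47 PM | Milkyway@Home | Scheduler request completed: got 2 new tasks
12/17/2019 8:00:35 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1553755.01 seconds; 0.00 devices
12/17/2019 8:00:37 PM | Milkyway@Home | Scheduler request completed: got 0 new tasks
"""

GPU_REQUEST = re.compile(r"NVIDIA GPU work request: ([\d.]+) seconds; ([\d.]+) devices")
REPLY = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def summarize(log_text):
    """Return (reply timestamp, devices in the GPU request, tasks received) per request."""
    rows = []
    pending_devices = None
    for line in log_text.splitlines():
        if "|" not in line:
            continue  # skip blank or non-log lines
        stamp, _project, message = [part.strip() for part in line.split("|", 2)]
        request = GPU_REQUEST.search(message)
        if request:
            pending_devices = float(request.group(2))
            continue
        reply = REPLY.search(message)
        if reply and pending_devices is not None:
            rows.append((stamp, pending_devices, int(reply.group(1))))
            pending_devices = None
    return rows


rows = summarize(LOG)
for stamp, devices, tasks in rows:
    print(f"{stamp}: {devices:.2f} devices requested, {tasks} tasks received")
```

On this excerpt the one request that lists 3.00 devices is the one that returns a full batch; that is only what these three requests show, not a statement about how the scheduler decides.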
25)
Message boards :
News :
30 Workunit Limit Per Request - Fix Implemented
(Message 69345)
Posted 16 Dec 2019 by jpmboy Post: I appreciate your sentiment, though I don't (yet) share it. Which thread did you post the windows exec just to fix MW? ... plz. ;) |
26)
Message boards :
News :
30 Workunit Limit Per Request - Fix Implemented
(Message 69344)
Posted 15 Dec 2019 by jpmboy Post: I appreciate your sentiment, though I don't (yet) share it. Okay, extracted, and ... then what? Is there an installer or a readme file in there somewhere? :) |
27)
Message boards :
News :
30 Workunit Limit Per Request - Fix Implemented
(Message 69343)
Posted 15 Dec 2019 by jpmboy Post: Thank you ! I'll give it a try tonight! |
28)
Message boards :
News :
30 Workunit Limit Per Request - Fix Implemented
(Message 69340)
Posted 15 Dec 2019 by jpmboy Post: I appreciate your sentiment, though I don't (yet) share it. Still looking for a FIX to this problem.... |
29)
Message boards :
News :
30 Workunit Limit Per Request - Fix Implemented
(Message 69338)
Posted 15 Dec 2019 by jpmboy Post: Hey Guys, I've read thru this thread trying to find a solution to an issue I've recently experienced... I have been running 2 Titan Vs dedicated to MW which have been able to post 3.5-4M as a RAC over the past few months. I "repurposed" a 3rd TV to this rig expecting to increase the productivity, but that's not what I see. Each TV is set to run 6 tasks and each task completes in about 55-60 sec (so ~18 tasks per minute; each task has 0.5 CPU, which has been sufficient for 2 TVs). Adding a 3rd I expected to see at least 4.5-5.5M RAC, but I still only get about 4M RAC over the past 3 days. I took a look at the Event log and see that the server loads about 300 tasks each time it sends work - these complete in less than 20 min. Then the rig sits idle for 10-20 min, with the relevant event log section copy-pasted below:
12/15/2019 9:38:24 AM | Milkyway@Home | Computation for task de_modfit_14_bundle5_testing_4s3f_2_1574164502_14593798_0 finished
12/15/2019 9:39:52 AM | Milkyway@Home | [sched_op] Starting scheduler request
12/15/2019 9:39:52 AM | Milkyway@Home | Sending scheduler request: To fetch work.
12/15/2019 9:39:52 AM | Milkyway@Home | Reporting 4 completed tasks
12/15/2019 9:39:52 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
12/15/2019 9:39:52 AM | Milkyway@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
12/15/2019 9:39:52 AM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1814400.00 seconds; 3.00 devices
12/15/2019 9:39:53 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] Server version 713
12/15/2019 9:39:53 AM | Milkyway@Home | Project requested delay of 91 seconds
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_82_bundle4_4s_south4s_bgset_2_1574164502_14359817_1
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_86_bundle4_4s_south4s_bgset_2_1574164502_14593783_0
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_14_bundle5_testing_4s3f_3_1574164502_14555903_1
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_14_bundle5_testing_4s3f_2_1574164502_14593798_0
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] Reason: requested by project
12/15/2019 9:52:34 AM | Milkyway@Home | [sched_op] Starting scheduler request
12/15/2019 9:52:34 AM | Milkyway@Home | Sending scheduler request: To fetch work.
12/15/2019 9:52:34 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
12/15/2019 9:52:34 AM | Milkyway@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
12/15/2019 9:52:34 AM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1814400.00 seconds; 3.00 devices
12/15/2019 9:52:36 AM | Milkyway@Home | Scheduler request completed: got 308 new tasks
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] Server version 713
12/15/2019 9:52:36 AM | Milkyway@Home | Project requested delay of 91 seconds
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] estimated total CPU task duration: 0 seconds
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] estimated total NVIDIA GPU task duration: 17919 seconds
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] Reason: requested by project
At 9:39:52 the last 4 tasks from the previous task-basket are reported as completed. The server "sees" "3.00 devices", the request "got 0 new tasks" at 9:39:53, and communication is delayed 91 sec... but the automatic request for more work fails to fetch any work until 9:52:36, when it gets 308 tasks (which complete in less than 20 min). The system returns the work, and sits for at least 10-15 min before more work arrives. So in effect, this rig only works about 50% of the time available to do work. It seems that adding a 3rd Titan V simply processed the task dump faster, leading to a longer idle time? This is a home/personal PC, running Windows 10 (https://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=776231). I'm not a programmer (just an enthusiast) but I would be able to do things like "make a B-tasks Rule" with specific instruction... Hoping the experienced users here can help: Is there any way to get this 3rd Titan V to contribute in a meaningful way?? |
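Editor's note: the "about 50%" estimate in the post can be checked with the numbers it gives. The sketch below is a rough calculation, assuming this single batch and gap are representative, that each task takes the upper figure of 60 s, and that idle time starts at the empty reply (9:39:53). The GPUs had in fact run dry slightly earlier, so the real idle share is a little higher. Variable names are illustrative.

```python
from datetime import datetime

# Figures taken from the post.
TASKS_PER_GPU = 6    # concurrent tasks per Titan V
GPUS = 3
TASK_SECONDS = 60    # post says roughly 55-60 s per task; upper end assumed here
BATCH_SIZE = 308     # "got 308 new tasks"

# Timestamps copied from the log: the empty reply, then the reply that delivered work.
empty_reply = datetime(2019, 12, 15, 9, 39, 53)
work_arrived = datetime(2019, 12, 15, 9, 52, 36)

tasks_per_minute = TASKS_PER_GPU * GPUS * 60 / TASK_SECONDS
busy_minutes = BATCH_SIZE / tasks_per_minute
idle_minutes = (work_arrived - empty_reply).total_seconds() / 60
duty_cycle = busy_minutes / (busy_minutes + idle_minutes)

print(f"throughput while busy: {tasks_per_minute:.0f} tasks/min")
print(f"time to finish one batch: {busy_minutes:.1f} min")
print(f"idle gap before next batch: {idle_minutes:.1f} min")
print(f"share of time doing work: {duty_cycle:.0%}")
```

With these inputs the batch lasts about 17 minutes and the gap about 13 minutes, so the rig is busy a bit under 60% of the time, in line with the rough 50% figure quoted in the post.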
30)
Message boards :
News :
New Separation Runs [UPDATE]
(Message 68930)
Posted 26 Jul 2019 by jpmboy Post: This looks like it is still from the old bugged runs. It is a "...bgset" run, and the new runs will have 26 parameters, not 25. Can you clear your queue and see if that helps? I've cleared my queue several times and still get errors across hosts... then it puts in a 24 hour update delay. Hit "Update" and it downloads a few hundred work units, and the ones that error out number about 50% of those that don't. |
31)
Message boards :
News :
New Separation Runs [UPDATE]
(Message 68925)
Posted 25 Jul 2019 by jpmboy Post: Also seeing ~30% errors on all hosts here: 2 Titan Vs, Radeon VII, 295x2 and Radeon 290. |
32)
Message boards :
News :
30 Workunit Limit Per Request - Fix Implemented
(Message 68469)
Posted 2 Apr 2019 by jpmboy Post: Thank you for the reply. I'll try jerry-rigging something when I get a chance. |
33)
Message boards :
News :
30 Workunit Limit Per Request - Fix Implemented
(Message 68467)
Posted 1 Apr 2019 by jpmboy Post: I'm having this issue, and my two Titan Vs sit idle for most of the day waiting for a task-batch download. Each task completes in less than 1 min, and I run 12 tasks at a time (6 per GPU). 200 tasks complete in ~17 min. We really need a fix for this that accounts for GPUs that are not DP crippled. ;) |
©2024 Astroinformatics Group