Posts by jpmboy

21) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69357)
Posted 18 Dec 2019 by jpmboy
Post:
I'm trying... Here's the cc_config file as I have it, with the flags and options added as you recommended. This is the config file that was in place when I posted the earlier event log. Check your PMs for a question regarding the other edits you recommended.
<cc_config>
<log_flags>
<file_xfer>1</file_xfer>
<sched_ops>1</sched_ops>
<task>1</task>
<app_msg_receive>0</app_msg_receive>
<app_msg_send>0</app_msg_send>
<async_file_debug>0</async_file_debug>
<benchmark_debug>1</benchmark_debug>
<checkpoint_debug>0</checkpoint_debug>
<coproc_debug>0</coproc_debug>
<cpu_sched>0</cpu_sched>
<cpu_sched_debug>0</cpu_sched_debug>
<cpu_sched_status>0</cpu_sched_status>
<dcf_debug>0</dcf_debug>
<disk_usage_debug>0</disk_usage_debug>
<file_xfer>1</file_xfer>
<file_xfer_debug>0</file_xfer_debug>
<gui_rpc_debug>0</gui_rpc_debug>
<heartbeat_debug>0</heartbeat_debug>
<http_debug>0</http_debug>
<http_xfer_debug>0</http_xfer_debug>
<idle_detection_debug>1</idle_detection_debug>
<mem_usage_debug>0</mem_usage_debug>
<mw_debug>0</mw_debug>
<network_status_debug>0</network_status_debug>
<notice_debug>0</notice_debug>
<poll_debug>0</poll_debug>
<priority_debug>0</priority_debug>
<proxy_debug>0</proxy_debug>
<rr_simulation>0</rr_simulation>
<rrsim_detail>0</rrsim_detail>
<sched_ops>1</sched_ops>
<sched_op_debug>1</sched_op_debug>
<scrsave_debug>0</scrsave_debug>
<slot_debug>0</slot_debug>
<state_debug>0</state_debug>
<statefile_debug>0</statefile_debug>
<suspend_debug>0</suspend_debug>
<task>0</task>
<task_debug>0</task_debug>
<time_debug>0</time_debug>
<trickle_debug>0</trickle_debug>
<unparsed_xml>0</unparsed_xml>
<work_fetch_debug>0</work_fetch_debug>
</log_flags>
<options>
<abort_jobs_on_exit>0</abort_jobs_on_exit>
<allow_multiple_clients>0</allow_multiple_clients>
<allow_remote_gui_rpc>1</allow_remote_gui_rpc>
<disallow_attach>0</disallow_attach>
<dont_check_file_sizes>0</dont_check_file_sizes>
<dont_contact_ref_site>0</dont_contact_ref_site>
<lower_client_priority>0</lower_client_priority>
<dont_suspend_nci>0</dont_suspend_nci>
<dont_use_vbox>0</dont_use_vbox>
<dont_use_wsl>0</dont_use_wsl>
<exit_after_finish>0</exit_after_finish>
<exit_before_start>0</exit_before_start>
<exit_when_idle>0</exit_when_idle>
<fetch_minimal_work>0</fetch_minimal_work>
<fetch_on_update>0</fetch_on_update>
<force_auth>default</force_auth>
<http_1_0>0</http_1_0>
<http_transfer_timeout>300</http_transfer_timeout>
<http_transfer_timeout_bps>10</http_transfer_timeout_bps>
<max_event_log_lines>2000</max_event_log_lines>
<max_file_xfers>20</max_file_xfers>
<max_file_xfers_per_project>20</max_file_xfers_per_project>
<max_stderr_file_size>0</max_stderr_file_size>
<max_stdout_file_size>0</max_stdout_file_size>
<max_tasks_reported>0</max_tasks_reported>
<mw_low_water_pct>1</mw_low_water_pct>
<mw_high_water_pct>16</mw_high_water_pct>
<mw_wait_interval>512</mw_wait_interval>
<ncpus>-1</ncpus>
<no_alt_platform>0</no_alt_platform>
<no_gpus>0</no_gpus>
<no_info_fetch>0</no_info_fetch>
<no_opencl>0</no_opencl>
<no_priority_change>0</no_priority_change>
<os_random_only>0</os_random_only>
<process_priority>-1</process_priority>
<process_priority_special>-1</process_priority_special>
<use_all_gpus>1</use_all_gpus>
<proxy_info>
<socks_server_name></socks_server_name>
<socks_server_port>80</socks_server_port>
<http_server_name></http_server_name>
<http_server_port>80</http_server_port>
<socks5_user_name></socks5_user_name>
<socks5_user_passwd></socks5_user_passwd>
<socks5_remote_dns>0</socks5_remote_dns>
<http_user_name></http_user_name>
<http_user_passwd></http_user_passwd>
<no_proxy></no_proxy>
<no_autodetect>0</no_autodetect>
</proxy_info>
<rec_half_life_days>10.000000</rec_half_life_days>
<report_results_immediately>0</report_results_immediately>
<run_apps_manually>0</run_apps_manually>
<save_stats_days>30</save_stats_days>
<skip_cpu_benchmarks>0</skip_cpu_benchmarks>
<simple_gui_only>0</simple_gui_only>
<start_delay>0.000000</start_delay>
<stderr_head>0</stderr_head>
<suppress_net_info>0</suppress_net_info>
<unsigned_apps_ok>0</unsigned_apps_ok>
<use_all_gpus>0</use_all_gpus>
<use_certs>0</use_certs>
<use_certs_only>0</use_certs_only>
<vbox_window>0</vbox_window>
</options>
</cc_config>
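The file as posted lists a few tags twice with different values, for example <task> (1, then 0) in <log_flags> and <use_all_gpus> (1, then 0) in <options>. Which occurrence the client actually honors is not confirmed anywhere in this thread, so it may be worth removing the repeats. A minimal sketch for spotting them, assuming the file parses as well-formed XML; the function name and the shortened sample are made up for illustration:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_repeated_tags(xml_text):
    """Map 'section/tag' to its list of values for tags repeated within a section."""
    root = ET.fromstring(xml_text)
    repeated = {}
    for section in root:                       # e.g. <log_flags>, <options>
        seen = defaultdict(list)
        for child in section:
            seen[child.tag].append((child.text or "").strip())
        for tag, values in seen.items():
            if len(values) > 1:                # same tag listed more than once
                repeated[f"{section.tag}/{tag}"] = values
    return repeated


# Shortened stand-in for the posted file (illustrative only).
sample = """<cc_config>
<log_flags>
<task>1</task>
<sched_ops>1</sched_ops>
<task>0</task>
</log_flags>
<options>
<use_all_gpus>1</use_all_gpus>
<ncpus>-1</ncpus>
<use_all_gpus>0</use_all_gpus>
</options>
</cc_config>"""

for path, values in find_repeated_tags(sample).items():
    print(path, values)
```

To check the real file, read cc_config.xml from the BOINC data directory and pass its text to find_repeated_tags instead of the sample.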
22) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69355)
Posted 18 Dec 2019 by jpmboy
Post:
Thanks. "Good copy". :) When this last loaded batch finishes, I'll make the changes you suggested here and in the reg... and PM you back.
23) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69353)
Posted 18 Dec 2019 by jpmboy
Post:
I posted back in that thread ... please take a look. :)
24) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69352)
Posted 18 Dec 2019 by jpmboy
Post:


The latest version available for download is 7.14.2

Where can I download these new binaries?

http://stateson.net/bthistory/boinc_x64_for_milkyway.zip
The following procedure assumes that your original boinc.exe is at "/Program Files/boinc"
I do not have an installer, so it must be installed manually
Extract the boinc.exe file from the zip archive and save it in /Downloads or wherever is convenient
It can only be executed from the program directory, so trying "boinc.exe --version" from anywhere else will tell you files are missing
You must stop boinc from executing before replacing it.
To stop boinc, first bring up the boinc manager, then exit the boinc manager and choose to stop programs from executing
After stopping boinc you should rename the original program from boinc.exe to old_boinc.exe
Copy the new program into the /Program Files/boinc folder
Starting up the boinc manager should also start up boinc. Check to see if the version is 7.15.0 for the new program. After a few minutes of looking at the event message you should notice a download of a few files. Eventually the number of work units waiting to be processed will rise up and hover near the maximum. The only time it will drop to 0 is when the project goes off-line. On my system the count stays between 850 - 890 all the time.
I have shortcuts for starting and stopping boinc but the normal startup for boinc must be removed from the windows registry or a conflict arises. PM me if you want to do this. They are not needed to get this milkyway version to work.
Let me know if there is a problem and I can put together a better set of instructions.
[EDIT]
for 32 bit systems (I have no way of testing this and no longer have xp, vista, 7 or 8)
http://stateson.net/bthistory/boinc_x32_for_milkyway.zip
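The manual swap described above (rename the original to old_boinc.exe, then copy the new boinc.exe into the program folder) can be sketched as a small script. This is only an illustration under assumptions: the function name is made up, BOINC must already be stopped as the instructions say, and the demo runs against throwaway folders rather than the real "/Program Files/boinc".

```python
import shutil
import tempfile
from pathlib import Path


def swap_boinc_binary(program_dir, new_exe):
    """Back up boinc.exe as old_boinc.exe, then copy the new build into place."""
    program_dir = Path(program_dir)
    current = program_dir / "boinc.exe"
    backup = program_dir / "old_boinc.exe"
    if not current.is_file():
        raise FileNotFoundError(f"no boinc.exe in {program_dir}")
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; not overwriting it")
    current.rename(backup)              # keep the original for rollback
    shutil.copy2(new_exe, current)      # the new build takes the boinc.exe name
    return backup


# Dry run against scratch folders, so nothing real is touched.
with tempfile.TemporaryDirectory() as scratch:
    program_dir = Path(scratch) / "boinc"
    program_dir.mkdir()
    (program_dir / "boinc.exe").write_text("original")
    downloaded = Path(scratch) / "boinc_new.exe"
    downloaded.write_text("replacement")

    backup = swap_boinc_binary(program_dir, downloaded)
    print(backup.name, (program_dir / "boinc.exe").read_text())
```

Refusing to overwrite an existing old_boinc.exe keeps the first backup intact if the script is run twice. On a real install, writing into Program Files normally needs an elevated prompt.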

Thank you for posting this. I tried using your mod'd boinc.exe as you described... initially there was no change in the number of tasks downloaded at any one time (always 300-ish with 3 GPUs installed), or in the accumulation of tasks over a day (the GPUs remain inactive for roughly 30-40% of the time). So I then added some of the "options" and settings from the cc_config.xml file included in the download and voilà... I actually got 425 tasks... once. Then the event log shows we're back to 300-ish tasks after waiting several 91 sec cycles, and occasionally no "fetch" for 10 min.
Most of the time the fetch request looks like this:
12/17/2019 8:00:35 PM | Milkyway@Home | Sending scheduler request: To fetch work.
12/17/2019 8:00:35 PM | Milkyway@Home | Reporting 34 completed tasks
12/17/2019 8:00:35 PM | Milkyway@Home | Requesting new tasks for CPU and NVIDIA GPU
12/17/2019 8:00:35 PM | Milkyway@Home | [sched_op] CPU work request: 16278725.09 seconds; 0.00 devices
12/17/2019 8:00:35 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1553755.01 seconds; 0.00 devices
12/17/2019 8:00:37 PM | Milkyway@Home | Scheduler request completed: got 0 new tasks
12/17/2019 8:00:37 PM | Milkyway@Home | [sched_op] Server version 713
12/17/2019 8:00:37 PM | Milkyway@Home | No tasks sent
12/17/2019 8:00:37 PM | Milkyway@Home | Project requested delay of 91 seconds


Once it receives the allocation of tasks, the GPU device count in the work request drops to 0.00 and no further GPU tasks are downloaded (but the CPU task list adds one or more tasks, accumulating 2-3 days of work according to BoincTasks):

12/17/2019 7:51:12 PM | Milkyway@Home | [sched_op] Starting scheduler request
12/17/2019 7:51:12 PM | Milkyway@Home | Sending scheduler request: To fetch work.
12/17/2019 7:51:12 PM | Milkyway@Home | Reporting 1 completed tasks
12/17/2019 7:51:12 PM | Milkyway@Home | Requesting new tasks for CPU and NVIDIA GPU
12/17/2019 7:51:12 PM | Milkyway@Home | [sched_op] CPU work request: 16283322.16 seconds; 0.00 devices
12/17/2019 7:51:12 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1555200.00 seconds; 3.00 devices
12/17/2019 7:51:14 PM | Milkyway@Home | Scheduler request completed: got 328 new tasks
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] Server version 713
12/17/2019 7:51:14 PM | Milkyway@Home | Project requested delay of 91 seconds
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] estimated total CPU task duration: 0 seconds
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] estimated total NVIDIA GPU task duration: 19186 seconds
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_80_bundle4_4s_south4s_bgset_2_1574164502_15520017_1
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
12/17/2019 7:51:14 PM | Milkyway@Home | [sched_op] Reason: requested by project
12/17/2019 7:52:45 PM | Milkyway@Home | [sched_op] Starting scheduler request
12/17/2019 7:52:45 PM | Milkyway@Home | Sending scheduler request: To fetch work.
12/17/2019 7:52:45 PM | Milkyway@Home | Reporting 19 completed tasks
12/17/2019 7:52:45 PM | Milkyway@Home | Requesting new tasks for CPU and NVIDIA GPU
12/17/2019 7:52:45 PM | Milkyway@Home | [sched_op] CPU work request: 16277298.29 seconds; 0.00 devices
12/17/2019 7:52:45 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1552420.62 seconds; 0.00 devices
12/17/2019 7:52:47 PM | Milkyway@Home | Scheduler request completed: got 2 new tasks
12/17/2019 7:52:47 PM | Milkyway@Home | [sched_op] Server version 713
12/17/2019 7:52:47 PM | Milkyway@Home | Project requested delay of 91 seconds
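To line up the "devices" figure in each GPU work request with the number of tasks the scheduler then returned, excerpts like the ones above can be tabulated. A rough sketch, assuming the pipe-separated event log layout shown here with [sched_op] logging enabled; the function name is made up, and the sample lines are copied from the log above:

```python
import re
from datetime import datetime

GPU_REQ = re.compile(r"NVIDIA GPU work request: ([\d.]+) seconds; ([\d.]+) devices")
GOT = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def summarize_requests(log_text):
    """Return (reply time, GPU devices requested, tasks received) per request."""
    rows = []
    pending_devices = None
    for line in log_text.splitlines():
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue                      # not a "time | project | message" line
        stamp, _project, message = parts
        match = GPU_REQ.search(message)
        if match:
            pending_devices = float(match.group(2))
            continue
        match = GOT.search(message)
        if match and pending_devices is not None:
            when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
            rows.append((when, pending_devices, int(match.group(1))))
            pending_devices = None
    return rows


sample = """\
12/17/2019 7:51:12 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1555200.00 seconds; 3.00 devices
12/17/2019 7:51:14 PM | Milkyway@Home | Scheduler request completed: got 328 new tasks
12/17/2019 7:52:45 PM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1552420.62 seconds; 0.00 devices
12/17/2019 7:52:47 PM | Milkyway@Home | Scheduler request completed: got 2 new tasks
"""

for when, devices, tasks in summarize_requests(sample):
    print(when.strftime("%H:%M:%S"), devices, tasks)
```

In this excerpt the large batch arrives on the request that lists 3.00 devices, while the request listing 0.00 devices brings back only 2 tasks.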


So... I still can't get 200 tasks per GPU, and it is certainly not picking up GPU tasks until several "fetch" requests after the last GPU task has uploaded.

I'm running Win10, a 7980XE, 3 Titan Vs... running 6 tasks per GPU. So, unbelievably, adding the third Titan V, rather than getting and processing more tasks, is actually resulting in more idle time, as the lot of 300 tasks processes in 2/3 the time.
Crazy!
25) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69345)
Posted 16 Dec 2019 by jpmboy
Post:
I appreciate your sentiment, though I don't (yet) share it.

Still looking for a FIX to this problem....



I have a "fix" for the "milkyway" problem and in addition it can be used to get more than the 200 max per GPU if so desired.
https://github.com/JStateson/BoincMasterSlave
I posted windows executable for just the milkyway fix on another thread.
For Linux you will have to build the above as I don't have an install for various Linux systems

Which thread did you post the windows exec just to fix MW? ... plz. ;)
26) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69344)
Posted 15 Dec 2019 by jpmboy
Post:
I appreciate your sentiment, though I don't (yet) share it.

Still looking for a FIX to this problem....



I have a "fix" for the "milkyway" problem and in addition it can be used to get more than the 200 max per GPU if so desired.
https://github.com/JStateson/BoincMasterSlave
I posted windows executable for just the milkyway fix on another thread.
For Linux you will have to build the above as I don't have an install for various Linux systems

Okay, extracted, and ... then what? Is there an installer or a readme file in there somewhere? :)
27) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69343)
Posted 15 Dec 2019 by jpmboy
Post:
Thank you ! I'll give it a try tonight!
28) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69340)
Posted 15 Dec 2019 by jpmboy
Post:
I appreciate your sentiment, though I don't (yet) share it.

Still looking for a FIX to this problem....
29) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69338)
Posted 15 Dec 2019 by jpmboy
Post:
Hey Guys,
I've read through this thread trying to find a solution to an issue I've recently experienced... I have been running 2 Titan Vs dedicated to MW, which have been able to post 3.5-4M as a RAC over the past few months. I "repurposed" a 3rd TV to this rig expecting to increase the productivity, but that's not what I see. Each TV is set to run 6 tasks and each task completes in about 55-60 sec (so ~18 tasks per minute; each task has 0.5 CPU, which has been sufficient for 2 TVs). Adding a 3rd, I expected to see at least 4.5-5.5M RAC, but I still only get about 4M RAC over the past 3 days. I took a look at the event log and see that the server loads about 300 tasks each time it sends work, and these complete in less than 20 min. Then the rig sits idle for 10-20 min, with the relevant event log section copied and pasted below:

12/15/2019 9:38:24 AM | Milkyway@Home | Computation for task de_modfit_14_bundle5_testing_4s3f_2_1574164502_14593798_0 finished
12/15/2019 9:39:52 AM | Milkyway@Home | [sched_op] Starting scheduler request
12/15/2019 9:39:52 AM | Milkyway@Home | Sending scheduler request: To fetch work.
12/15/2019 9:39:52 AM | Milkyway@Home | Reporting 4 completed tasks
12/15/2019 9:39:52 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
12/15/2019 9:39:52 AM | Milkyway@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
12/15/2019 9:39:52 AM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1814400.00 seconds; 3.00 devices
12/15/2019 9:39:53 AM | Milkyway@Home | Scheduler request completed: got 0 new tasks
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] Server version 713
12/15/2019 9:39:53 AM | Milkyway@Home | Project requested delay of 91 seconds
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_82_bundle4_4s_south4s_bgset_2_1574164502_14359817_1
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_86_bundle4_4s_south4s_bgset_2_1574164502_14593783_0
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_14_bundle5_testing_4s3f_3_1574164502_14555903_1
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] handle_scheduler_reply(): got ack for task de_modfit_14_bundle5_testing_4s3f_2_1574164502_14593798_0
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
12/15/2019 9:39:53 AM | Milkyway@Home | [sched_op] Reason: requested by project
12/15/2019 9:52:34 AM | Milkyway@Home | [sched_op] Starting scheduler request
12/15/2019 9:52:34 AM | Milkyway@Home | Sending scheduler request: To fetch work.
12/15/2019 9:52:34 AM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
12/15/2019 9:52:34 AM | Milkyway@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
12/15/2019 9:52:34 AM | Milkyway@Home | [sched_op] NVIDIA GPU work request: 1814400.00 seconds; 3.00 devices
12/15/2019 9:52:36 AM | Milkyway@Home | Scheduler request completed: got 308 new tasks
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] Server version 713
12/15/2019 9:52:36 AM | Milkyway@Home | Project requested delay of 91 seconds
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] estimated total CPU task duration: 0 seconds
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] estimated total NVIDIA GPU task duration: 17919 seconds
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] Deferring communication for 00:01:31
12/15/2019 9:52:36 AM | Milkyway@Home | [sched_op] Reason: requested by project


At 9:39:52 the last 4 tasks from the previous batch are reported as completed. The request lists "3.00 devices" and the reply is "got 0 new tasks" at 9:39:53, followed by a 91 sec delay... but the automatic request for more work fails to fetch any work until 9:52:36, when it gets 308 tasks (which complete in less than 20 min). The system returns the work, and sits for at least 10-15 min before more work arrives. So in effect, this rig only works about 50% of the time available to do work. It seems that adding a 3rd Titan V simply processed the task dump faster, leading to a longer idle time?
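The "about 50%" figure can be checked against the numbers in this post. A back-of-the-envelope sketch, assuming 60 sec per task (the upper end of the 55-60 sec range) and counting idle time from the last "Computation ... finished" line (9:38:24) to the arrival of the next batch (9:52:36); the function name is made up:

```python
from datetime import datetime


def duty_cycle(n_tasks, gpus, tasks_per_gpu, secs_per_task, idle_secs):
    """Return (tasks per minute, busy seconds per batch, fraction of time busy)."""
    tasks_per_min = gpus * tasks_per_gpu * 60.0 / secs_per_task
    busy_secs = n_tasks / tasks_per_min * 60.0
    return tasks_per_min, busy_secs, busy_secs / (busy_secs + idle_secs)


fmt = "%m/%d/%Y %I:%M:%S %p"
last_finished = datetime.strptime("12/15/2019 9:38:24 AM", fmt)
next_batch = datetime.strptime("12/15/2019 9:52:36 AM", fmt)
idle = (next_batch - last_finished).total_seconds()

# 308 tasks in the batch, 3 GPUs, 6 tasks per GPU, 60 sec per task (assumed).
rate, busy, fraction = duty_cycle(308, 3, 6, 60.0, idle)
print(f"{rate:.0f} tasks/min, busy {busy / 60:.1f} min, "
      f"idle {idle / 60:.1f} min, working {fraction:.0%} of the time")
```

With these inputs the batch lasts a little over 17 min against roughly 14 min of waiting, so the rig is busy a bit more than half the time, in line with the estimate above.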
This is a home/personal PC, running windows 10 (https://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=776231 ). I'm not a programmer (just an enthusiast) but I would be able to do things like "make a B-tasks Rule" with specific instruction... Hoping the experienced users here can help:
Is there any way to get this 3rd Titan V to contribute in a meaningful way?
30) Message boards : News : New Separation Runs [UPDATE] (Message 68930)
Posted 26 Jul 2019 by jpmboy
Post:
This looks like it is still from the old bugged runs. It is a "...bgset" run, and the new runs will have 26 parameters, not 25. Can you clear your queue and see if that helps?

Tom

I've cleared my queue several times and still get errors across hosts... then it puts in a 24 hour update delay. Hit "Update" and it downloads a few hundred work units, of which the number that error out is about 50% of the number that don't.
31) Message boards : News : New Separation Runs [UPDATE] (Message 68925)
Posted 25 Jul 2019 by jpmboy
Post:
Also seeing ~30% errors on all hosts here: 2 Titan Vs, Radeon VII, 295x2 and Radeon 290.
32) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68469)
Posted 2 Apr 2019 by jpmboy
Post:
Thank you for the reply. I'll try jerry-rigging something when I get a chance.
33) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68467)
Posted 1 Apr 2019 by jpmboy
Post:
I'm having this issue, and my two Titan Vs sit idle for most of the day waiting for a task-batch download. Each task completes in less than 1 min, I run 12 tasks at a time (6 per GPU). 200 tasks complete in ~ 17 min.

We really need a fix for this which accounts for GPUs that are not DP crippled. ;)

