Welcome to MilkyWay@home

Posts by jdzukley

1) Message boards : MilkyWay@home Science : How many tasks +/- does it take to complete a "run"? (Message 65163)
Posted 17 Sep 2016 by jdzukley
Post:
I would appreciate understanding how many tasks it takes to make a single composite "run". While I suspect this may vary based on the type of analysis being run, and may further evolve as time goes by... I know I "crunch" thousands of tasks, and I see the task IDs increment... but an overall current snapshot, plus or minus, by application would be appreciated.

Sorry if this has already been published, I have searched the forums - unsuccessfully.

Inquiring minds...
2) Message boards : Number crunching : Aborted by user. (Message 65152)
Posted 15 Sep 2016 by jdzukley
Post:
I am getting these errors too! The only thing done to the system was to perform Windows 10 updates. After the last update I have been getting these errors. What am I supposed to do to have OpenCL installed - and to a lesser point, why did Windows uninstall it?

Nvidia says I have the latest drivers installed, so what about the discussion in the thread below?
3) Message boards : Number crunching : Milkyway@Home | [error] Missing coprocessor for task de_modfit_fast_15_3s_136_fixedangles3_2_1471352126_36518394_0; aborting (Message 65146)
Posted 15 Sep 2016 by jdzukley
Post:
FYI, all current downloads are getting this message....

Seems like someone is working on it... I noticed a new note in the log:

9/15/2016 10:02:10 AM | | App version needs OpenCL but GPU doesn't support it
4) Message boards : MilkyWay@home Science : N-Body Plots (Message 64532)
Posted 3 May 2016 by jdzukley
Post:
Please advise how stars with different life spans, sizes, weights, and "burning" affect the calculations. If the plot below represents 4 billion years, I would think there would be some changes due to star life, size, and weight spans. Or, in the scheme of things, are these factors considered too small or insignificant to consider?
5) Message boards : Number crunching : Aborted by User, but not (Message 64112)
Posted 17 Nov 2015 by jdzukley
Post:
Just did a cold boot, and tried again, and it is working. Did not do anything else to the system... go figure. So, end of issue as I know it...
6) Message boards : Number crunching : Aborted by User, but not (Message 64108)
Posted 17 Nov 2015 by jdzukley
Post:
And I just downloaded and reinstalled the current driver. I allowed new tasks from Milkyway, and they continued to bomb. I then released GPU tasks for SETI and Asteroids, and both are crunching normally.
7) Message boards : Number crunching : Aborted by User, but not (Message 64107)
Posted 17 Nov 2015 by jdzukley
Post:
Nvidia - I always use the Nvidia GeForce Experience icon (prior to that, direct from the Nvidia web site).

I never let Windows update the Nvidia drivers, and GeForce Experience alerts me when new updates are available, which is ahead of the Windows updates!
8) Message boards : Number crunching : Aborted by User, but not (Message 64103)
Posted 16 Nov 2015 by jdzukley
Post:
Current Status, still need help...

The Nvidia driver was and is current: ver. 358.91, dated 9 Nov 2015.

System was rebooted.

Boinc runs the GPUs successfully on SETI, POEM, Asteroids, and GPUGrid.

All other system drivers were checked and found to be current by a non-Windows driver update program.

I have unattached and reattached to Milkyway.

And I just allowed Milkyway to get new tasks, and they still error out with 201 (0xc9) EXIT_MISSING_COPROC.

This computer was running Milkyway successfully a number of days ago... The only major change was to remove and reinstall Microsoft SQL Server 2012, which I am still having issues with on the reinstall. Why would this affect only Milkyway?

So, I am ready for new ideas to try.
9) Message boards : Number crunching : Aborted by User, but not (Message 64100)
Posted 15 Nov 2015 by jdzukley
Post:
FYI, I have been getting "aborted by user" errors the last few days on computer #410714 with two GTX 570s. These have historically done a lot of work for the site... If anyone has any feedback, please advise. I have searched the message boards for this error and nothing turned up. Other projects - SETI, GPUGrid - are crunching well and have no issues.
10) Message boards : News : nbody fpops calculation updated (Message 59194)
Posted 2 Jul 2013 by jdzukley
Post:
Thanks for the update to the estimated-time-to-complete calculations. Per my observations, they are much improved. The only real variance is that some tasks take 5-6 times longer than estimated. This is more or less insignificant, as the original estimate may be 2.5 minutes, with actual run times of 15 minutes.

I have a question based on observing total installed CPU processor efficiency. For this statement, I am basing my comments on a computer that has 12 cores. Most MT tasks run at an average of about 80% total efficiency as observed in the Windows Resource Monitor. Logic therefore suggests it would be better to run 12 separate nbody tasks at close to 100% rather than a single MT task at 80%? Consider that I also have 2 GPU cards installed, and with those cards running at about 95% GPU load (as observed with GPU-Z), it is rare that total machine CPU load gets above 85%.

I look forward to any discussion about MT tasks versus single-run tasks.

I suppose the answer is to compare run-time averages versus CPU seconds used. Is there any current review of these results?
11) Message boards : Number crunching : MT and single-CPU point differences (Message 59157)
Posted 28 Jun 2013 by jdzukley
Post:
I appreciate that there are bigger fish to fry, but I would also appreciate it if someone took just a bit of time to adjust the point calculations for MT tasks... and I know I do not do this for the points (or so I tell myself)...

12 cores for 59 minutes of run time = 1.55 points... 24k, 35k, and 39k CPU seconds for the 3 different computers that crunched this task...

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=382923304

Like I keep telling myself, you're not in this for the points...
12) Message boards : News : N-Body 1.18 (Message 58867)
Posted 14 Jun 2013 by jdzukley
Post:
To help avoid High Priority MT tasks and the slew of problems presented below, consider a quick temporary fix on the task-generator server: update the estimated run time for the task to be no greater than the required completion date and time minus (now() + say 24 hours). This way the task should most often not become High Priority.

estimated task hours = min(system-generated estimated task hours, required completion date - (now() + 24 hours))
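The proposed cap can be sketched as a small function (illustrative only; the names are mine, not actual BOINC scheduler fields):

```python
from datetime import datetime, timedelta

def capped_estimate_hours(system_estimate_hours: float,
                          deadline: datetime,
                          now: datetime,
                          slack_hours: float = 24.0) -> float:
    """Cap the estimated run time so the client should not flag the task
    High Priority: min(system estimate, deadline - (now + slack))."""
    hours_to_deadline = (deadline - now).total_seconds() / 3600.0
    return min(system_estimate_hours, hours_to_deadline - slack_hours)
```

For example, with a deadline 72 hours away, a wildly inflated 100-hour system estimate would be capped at 48 hours, while a realistic 2.5-hour estimate would pass through unchanged.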

Also, I do not understand why, for a task that starts out as High Priority, no GPU task can run after it becomes Normal Priority until that specific MT task has completed... Any other MT task will allow GPUs to come and go...

Alternately, MT tasks could be set/sent to use (number of processors - 1) cores, and then this whole discussion thread is moot...

The cc_config file for BOINC probably needs an optional setting for how many cores can be made available for MT processing.
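For what it's worth, later BOINC clients grew a per-project app_config.xml that can do roughly this. The fragment below is a sketch, not a tested recipe: the app name milkyway_nbody matches the logs later in this thread, but the --nthreads command-line option and exact element support should be verified against your BOINC and app versions.

```xml
<!-- app_config.xml, placed in the MilkyWay project directory.
     Caps the N-Body MT app at 11 of 12 cores (sketch; verify the
     app_name and cmdline option against your installed versions). -->
<app_config>
    <app_version>
        <app_name>milkyway_nbody</app_name>
        <plan_class>mt</plan_class>
        <avg_ncpus>11</avg_ncpus>
        <cmdline>--nthreads 11</cmdline>
    </app_version>
</app_config>
```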

My third long-run High Priority MT task has just started (consecutively; no other short runs this afternoon after the first long one started). I will be setting No New Tasks on Milkyway, as I have better things to do on a Friday night than babysit long-run MT tasks that suspend most work on my system...
13) Message boards : News : N-Body 1.18 (Message 58866)
Posted 14 Jun 2013 by jdzukley
Post:
A snapshot of the debug logs; first, here is the job in execution:

6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] scheduling de_separation_21_2s_sscon_1_1371083750_657599_0 (coprocessor job, FIFO) (prio -1.000000)
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] scheduling de_separation_79_DR8_rev_3_1371083750_657610_0 (coprocessor job, FIFO) (prio -1.020216)
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] scheduling ps_nbody_06_11_dark_4_1371083750_62813_0 (CPU job, priority order) (prio -1.040431)
6/14/2013 3:44:39 PM | | [cpu_sched_debug] enforce_schedule(): start
6/14/2013 3:44:39 PM | | [cpu_sched_debug] preliminary job list:
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] 0: de_separation_21_2s_sscon_1_1371083750_657599_0 (MD: no; UTS: yes)
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] 1: de_separation_79_DR8_rev_3_1371083750_657610_0 (MD: no; UTS: no)
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] 2: ps_nbody_06_11_dark_4_1371083750_62813_0 (MD: no; UTS: no)
6/14/2013 3:44:39 PM | | [cpu_sched_debug] final job list:
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] 0: de_separation_21_2s_sscon_1_1371083750_657599_0 (MD: no; UTS: yes)
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] 1: de_separation_79_DR8_rev_3_1371083750_657610_0 (MD: no; UTS: no)
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] 2: ps_nbody_06_11_dark_4_1371083750_62813_0 (MD: no; UTS: no)
6/14/2013 3:44:39 PM | Milkyway@Home | [coproc] NVIDIA instance 1: confirming for de_separation_21_2s_sscon_1_1371083750_657599_0
6/14/2013 3:44:39 PM | Milkyway@Home | [coproc] Assigning NVIDIA instance 0 to de_separation_79_DR8_rev_3_1371083750_657610_0
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] scheduling de_separation_21_2s_sscon_1_1371083750_657599_0
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] scheduling de_separation_79_DR8_rev_3_1371083750_657610_0
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] avoid MT overcommit: skipping ps_nbody_06_11_dark_4_1371083750_62813_0
6/14/2013 3:44:39 PM | | [cpu_sched_debug] using 0.79 out of 12 CPUs
6/14/2013 3:44:39 PM | climateprediction.net | [cpu_sched_debug] hadcm3n_o3e8_1940_40_008382078_1 sched state 1 next 1 task state 9
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] ps_nbody_06_11_dark_4_1371083750_62813_0 sched state 1 next 1 task state 9
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] de_separation_21_2s_sscon_1_1371083750_657599_0 sched state 2 next 2 task state 1
6/14/2013 3:44:39 PM | Milkyway@Home | [cpu_sched_debug] de_separation_79_DR8_rev_3_1371083750_657610_0 sched state 0 next 2 task state 0
6/14/2013 3:44:39 PM | Milkyway@Home | Starting task de_separation_79_DR8_rev_3_1371083750_657610_0 using milkyway version 102 (opencl_nvidia) in slot 0
6/14/2013 3:44:39 PM | | [cpu_sched_debug] enforce_schedule: end



Next is the snapshot where the priority changed and the job was suspended:
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] scheduling de_separation_21_2s_sscon_1_1371083750_657599_0 (coprocessor job, FIFO) (prio -1.000000)
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] scheduling de_separation_79_DR8_rev_3_1371083750_657610_0 (coprocessor job, FIFO) (prio -1.020216)
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] scheduling ps_nbody_06_11_dark_4_1371083750_62813_0 (CPU job, priority order) (prio -1.040431)
6/14/2013 3:44:55 PM | | [cpu_sched_debug] enforce_schedule(): start
6/14/2013 3:44:55 PM | | [cpu_sched_debug] preliminary job list:
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] 0: de_separation_21_2s_sscon_1_1371083750_657599_0 (MD: no; UTS: yes)
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] 1: de_separation_79_DR8_rev_3_1371083750_657610_0 (MD: no; UTS: no)
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] 2: ps_nbody_06_11_dark_4_1371083750_62813_0 (MD: no; UTS: no)
6/14/2013 3:44:55 PM | | [cpu_sched_debug] final job list:
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] 0: de_separation_21_2s_sscon_1_1371083750_657599_0 (MD: no; UTS: yes)
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] 1: de_separation_79_DR8_rev_3_1371083750_657610_0 (MD: no; UTS: yes)
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] 2: ps_nbody_06_11_dark_4_1371083750_62813_0 (MD: no; UTS: no)
6/14/2013 3:44:55 PM | Milkyway@Home | [coproc] NVIDIA instance 1: confirming for de_separation_21_2s_sscon_1_1371083750_657599_0
6/14/2013 3:44:55 PM | Milkyway@Home | [coproc] NVIDIA instance 0: confirming for de_separation_79_DR8_rev_3_1371083750_657610_0
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] scheduling de_separation_21_2s_sscon_1_1371083750_657599_0
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] scheduling de_separation_79_DR8_rev_3_1371083750_657610_0
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] avoid MT overcommit: skipping ps_nbody_06_11_dark_4_1371083750_62813_0
6/14/2013 3:44:55 PM | | [cpu_sched_debug] using 0.79 out of 12 CPUs
6/14/2013 3:44:55 PM | climateprediction.net | [cpu_sched_debug] hadcm3n_o3e8_1940_40_008382078_1 sched state 1 next 1 task state 9
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] ps_nbody_06_11_dark_4_1371083750_62813_0 sched state 1 next 1 task state 9
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] de_separation_21_2s_sscon_1_1371083750_657599_0 sched state 2 next 2 task state 1
6/14/2013 3:44:55 PM | Milkyway@Home | [cpu_sched_debug] de_separation_79_DR8_rev_3_1371083750_657610_0 sched state 2 next 2 task state 1
6/14/2013 3:44:55 PM | | [cpu_sched_debug] enforce_schedule: end


Unfortunately for this log capture, I did not have other CPU tasks ready for execution (the one job was suspended).


Next, I allowed the 2 GPU tasks in execution to finish and suspended all remaining GPU tasks in the queue. This is the log right after the last GPU task finished, showing the start of the MT task:

6/14/2013 3:51:49 PM | Milkyway@Home | [cpu_sched_debug] scheduling ps_nbody_06_11_dark_4_1371083750_62813_0 (CPU job, priority order) (prio -1.000000)
6/14/2013 3:51:49 PM | | [cpu_sched_debug] enforce_schedule(): start
6/14/2013 3:51:49 PM | | [cpu_sched_debug] preliminary job list:
6/14/2013 3:51:49 PM | Milkyway@Home | [cpu_sched_debug] 0: ps_nbody_06_11_dark_4_1371083750_62813_0 (MD: no; UTS: no)
6/14/2013 3:51:49 PM | | [cpu_sched_debug] final job list:
6/14/2013 3:51:49 PM | Milkyway@Home | [cpu_sched_debug] 0: ps_nbody_06_11_dark_4_1371083750_62813_0 (MD: no; UTS: no)
6/14/2013 3:51:49 PM | Milkyway@Home | [cpu_sched_debug] scheduling ps_nbody_06_11_dark_4_1371083750_62813_0
6/14/2013 3:51:49 PM | climateprediction.net | [cpu_sched_debug] hadcm3n_o3e8_1940_40_008382078_1 sched state 1 next 1 task state 9
6/14/2013 3:51:49 PM | Milkyway@Home | [cpu_sched_debug] ps_nbody_06_11_dark_4_1371083750_62813_0 sched state 1 next 2 task state 9
6/14/2013 3:51:49 PM | Milkyway@Home | [cpu_sched] Resuming ps_nbody_06_11_dark_4_1371083750_62813_0
6/14/2013 3:51:49 PM | Milkyway@Home | Resuming task ps_nbody_06_11_dark_4_1371083750_62813_0 using milkyway_nbody version 118 (mt) in slot 3
6/14/2013 3:51:49 PM | | [cpu_sched_debug] enforce_schedule: end


Advise if you folks want a copy of the full log file for this time frame.
14) Message boards : News : N-Body 1.18 (Message 58865)
Posted 14 Jun 2013 by jdzukley
Post:
Are these the log flags you wanted?

<log_flags>
<coproc_debug>1</coproc_debug>
<cpu_sched>1</cpu_sched>
<cpu_sched_debug>1</cpu_sched_debug>
<priority_debug>1</priority_debug>
</log_flags>


I have these running at the moment; advise if I should add others...
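For anyone following along: flags like these live inside cc_config.xml in the BOINC data directory, wrapped in the cc_config element. A minimal sketch (BOINC also accepts an options section, which can be empty):

```xml
<!-- cc_config.xml in the BOINC data directory; have the client
     re-read it via "Read config files" in the BOINC Manager. -->
<cc_config>
    <log_flags>
        <coproc_debug>1</coproc_debug>
        <cpu_sched>1</cpu_sched>
        <cpu_sched_debug>1</cpu_sched_debug>
        <priority_debug>1</priority_debug>
    </log_flags>
    <options>
    </options>
</cc_config>
```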
15) Message boards : News : N-Body 1.18 (Message 58863)
Posted 14 Jun 2013 by jdzukley
Post:
FYI, I currently have the conditions executing such that in the next 30 minutes or so I believe I will be in the "hung" position being discussed... Advise if there is any way I can assist...
16) Message boards : News : N-Body 1.18 (Message 58862)
Posted 14 Jun 2013 by jdzukley
Post:
So, serious question, what happens when this occurs on your system? Does it just run the GPU task, and leave the other 3.96 CPUs completely idle?

On my system, all CPU cores other than those that support GPUs go idle, EVEN if there are other CPU tasks available to run.
17) Message boards : News : N-Body 1.18 (Message 58861)
Posted 14 Jun 2013 by jdzukley
Post:
Richard, great information. Thank you. Now what happens?

Do we have to wait for the next version of BOINC? Would a temporary band-aid fix be to modify the MT task to use (number of available core processors - 1) threads (in my case, 12 - 1 = 11)?
18) Message boards : News : N-Body 1.18 (Message 58855)
Posted 14 Jun 2013 by jdzukley
Post:
My observation is that the more CPU cores you have, the more "OK" it is to overcommit. For reference, I have 12 cores, and MT tasks can run alongside 2 GPU tasks quite well, with total machine CPU load at about 85%. The scheduler does allow GPU tasks to load and execute while MT tasks are running, with the apparent exception of when a High Priority MT task changes to Normal priority. WHEN THIS HAPPENS, the SCHEDULER stops the MT task, puts it in "wait" status, and will not allow any other CPU-based task to execute until ALL other work - INCLUDING ALL GPU tasks - is complete, including GPU tasks in the queue. This can mean that if the BOINC scheduler keeps obtaining more GPU tasks, CPU tasks will never commence. The scheduler will not allow any other CPU work to start until the MT task is complete. Remember, it is in the "wait" state, waiting for 100% CPU availability; essentially all 12 cores are available and are only supporting GPU work.

Bottom line: my opinion is that the more cores you have, the better it is to allow overcommitting. This lets cores that are underperforming - because they have to wait - contribute to GPU activity. I also agree that the sum of the GPU tasks' CPU core requirements needs to be <= 1 CPU core, OR perhaps <= 0.1 * number of cores (0.1 * 12 cores = 1.2, provided that no single GPU task requires a whole core).

I would be glad to record or snapshot total CPU use on this computer with 12 cores. Please advise...

There are a number of issues here, the most serious being that when the MT task changes status from High Priority to Normal, it goes into "wait" and holds all remaining CPU tasks hostage.
19) Message boards : News : N-Body 1.18 (Message 58772)
Posted 12 Jun 2013 by jdzukley
Post:
I am moving on to other projects, as I received 3 different MT tasks tonight which halted all work on my 12-core CPU when the MT task reached 9x% complete. I suspended all GPU tasks to allow the MT task to complete, and then released all GPU tasks. BOINC returned to normal operations at this point, including downloading more tasks. All worked well for many MT task cycles. The stalled MT task, in every case over the last few days, always had hundreds of estimated hours to go... In all cases, task work continued without incident concerning GPU work. I'll look forward to when this condition gets fixed, and I will be back for more.
20) Message boards : News : N-Body 1.18 (Message 58717)
Posted 11 Jun 2013 by jdzukley
Post:
If you have a graphics card, you must manually suspend all graphics jobs and wait a few moments for the MT task to complete. As soon as the MT task completes, restart all graphics jobs. Note that you must hold ALL graphics-card jobs, not just the jobs currently running. The error was noted below and turned in as a problem. For the few MT tasks that did this, they all had something OTHER than "dark" or "nodark" in the task name, all had estimated run times in the thousands of hours (xxxx hours), and all arrived at 98% after about 1 hour of run time.



©2019 Astroinformatics Group