Welcome to MilkyWay@home

Posts by wb8ili

1) Message boards : Number crunching : MilkyWay takes a backseat to Einstein ??? (Message 68974)
Posted 14 Aug 2019 by wb8ili
Post:
Aurum -

This problem started near the end of March 2019 after some kind of meltdown of the Milkyway system. In the "News" section look for the topic "30 Workunit Limit Per Request - Fix Implemented" for a discussion.

The issue has never been fixed.

As far as I can tell, it is impossible to run both the Einstein and Milkyway projects using the same resources (CPU or GPU or both) on the same computer because of the issue you identified.

On the systems where I run Milkyway only, I just let it run out of tasks, because Milkyway won't refill the cache until it has been empty for a couple of minutes. Then it sends me 300 tasks.
2) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68447)
Posted 28 Mar 2019 by wb8ili
Post:
Jake -

It is a timing thing. Somewhere between 4.5 minutes and 14 minutes (6 minutes?), there is a "timer" in Milkyway that stops new tasks from being downloaded unless more than the "timer" interval has passed since the last request. All the other projects I am familiar with have this feature. In CPDN it is one hour. And all the projects I use reset the timer to the "max" if you request tasks before the "timer" has expired.

I have 6 machines running Milkyway GPU tasks. Only my fastest has this issue (running out of work).

Edit: I just checked Einstein and it looks like the "timer" is about 1 minute.
3) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68442)
Posted 28 Mar 2019 by wb8ili
Post:
Here is the rest of my story -

28 Mar 2019 11:06:29 I finished my last task and was out of work.

28 Mar 2019 11:13:08 The BOINC Manager (I assume) sent a request for new tasks. Got 200. I am good for the next 15 hours.

I was out of work for 6.5 minutes with no user intervention.

Jake - something has changed in the recent past (new server?) such that MY queue isn't being maintained. I still think it has something to do with requesting tasks too frequently. And I still think around 6 minutes is the cutoff. However, I don't get a message like "Too recent since last request" or similar.

Edit: I checked the log on another of my computers, one that takes 14 minutes to complete a task, and the log shows that it gets a new task pretty much every time it finishes one.
4) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68440)
Posted 28 Mar 2019 by wb8ili
Post:
Jake -

27 Mar 2019 20:44:37 Received 51 tasks via a User Requested Update. Now have approx. 200 tasks on hand.

28 Mar 2019 10:26:00 Since the previous time I have completed 190 tasks, automatically reported each completed task, and asked for new tasks each time. Received none. Now have 10 tasks on hand. Will probably run out of work in less than an hour. Will report back what happens then.
5) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68435)
Posted 28 Mar 2019 by wb8ili
Post:
Jake -

I just did a "user request for tasks" and received 51 new ones.

As I have written before, every 4 minutes or so I complete a task, report it, request new tasks, and get nothing. I have about 200 tasks now from my user requests, so I will have to wait until tomorrow and see if my stockpile bleeds down.
6) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68429)
Posted 27 Mar 2019 by wb8ili
Post:
Jake -

Just tried a "user update" and got another 30 tasks.

And still, every time I complete a task and try to "replace it", no tasks are downloaded.
7) Message boards : News : New Server Update (Message 68426)
Posted 27 Mar 2019 by wb8ili
Post:
Jake -

I get 30 tasks when I manually request an update. Always 30.

I can manually request tasks every 90 seconds. Less than 90 seconds gets a "last request too recent" message.

Every 90+ seconds I can get 30 new tasks.

My theory (below) has to be modified to indicate that "user requested" requests for work give different results than reporting/requests.

Shown below is a typical sequence (I added the -->).

Task ends.
Request for work.
No tasks downloaded.
And then two messages which I think might be important.



--> 3/27/2019 2:38:58 PM | Milkyway@Home | Computation for task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1553630102_244511_0 finished
3/27/2019 2:38:59 PM | Milkyway@Home | Starting task de_modfit_82_bundle5_3s_NoContraintsWithDisk200_6_1553630102_244495_0
3/27/2019 2:39:00 PM | | [work_fetch] ------- start work fetch state -------
3/27/2019 2:39:00 PM | | [work_fetch] target work buffer: 432000.00 + 0.00 sec
3/27/2019 2:39:00 PM | | [work_fetch] --- project states ---
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] REC 1130.804 prio -0.034 can request work
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] REC 105434.987 prio -99.943 can request work
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] REC 29324.189 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] REC 65.675 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:00 PM | | [work_fetch] --- state for CPU ---
3/27/2019 2:39:00 PM | | [work_fetch] shortfall 0.00 nidle 0.00 saturated 764204.67 busy 0.00
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] share 1.000
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | | [work_fetch] --- state for NVIDIA GPU ---
3/27/2019 2:39:00 PM | | [work_fetch] shortfall 395988.98 nidle 0.00 saturated 36011.02 busy 0.00
3/27/2019 2:39:00 PM | climateprediction.net | [work_fetch] share 0.000 no applications
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] share 1.000
3/27/2019 2:39:00 PM | SETI@home | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:00 PM | | [work_fetch] ------- end work fetch state -------
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] set_request() for NVIDIA GPU: ninst 1 nused_total 139.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 395988.98
3/27/2019 2:39:00 PM | Milkyway@Home | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (395988.98 sec, 0.00 inst)
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Sending scheduler request: To fetch work.
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Reporting 1 completed tasks
--> 3/27/2019 2:39:00 PM | Milkyway@Home | Requesting new tasks for NVIDIA GPU
--> 3/27/2019 2:39:01 PM | Milkyway@Home | Scheduler request completed: got 0 new tasks
3/27/2019 2:39:01 PM | Milkyway@Home | [work_fetch] backing off NVIDIA GPU 873 sec
3/27/2019 2:39:01 PM | | [work_fetch] Request work fetch: RPC complete
3/27/2019 2:39:06 PM | | [work_fetch] ------- start work fetch state -------
3/27/2019 2:39:06 PM | | [work_fetch] target work buffer: 432000.00 + 0.00 sec
3/27/2019 2:39:06 PM | | [work_fetch] --- project states ---
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] REC 1130.804 prio -1.023 can request work
--> 3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] REC 105434.987 prio -3331.879 can't request work: scheduler RPC backoff (85.82 sec)
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] REC 29324.189 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] REC 65.675 prio -0.000 can't request work: "no new tasks" requested via Manager
3/27/2019 2:39:06 PM | | [work_fetch] --- state for CPU ---
3/27/2019 2:39:06 PM | | [work_fetch] shortfall 0.00 nidle 0.00 saturated 764197.27 busy 0.00
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] share 1.000
3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] share 0.000 blocked by project preferences
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | | [work_fetch] --- state for NVIDIA GPU ---
3/27/2019 2:39:06 PM | | [work_fetch] shortfall 395994.02 nidle 0.00 saturated 36005.98 busy 0.00
3/27/2019 2:39:06 PM | climateprediction.net | [work_fetch] share 0.000 no applications
--> 3/27/2019 2:39:06 PM | Milkyway@Home | [work_fetch] share 0.000 project is backed off (resource backoff: 867.71, inc 600.00)
3/27/2019 2:39:06 PM | SETI@home | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | World Community Grid | [work_fetch] share 0.000
3/27/2019 2:39:06 PM | | [work_fetch] ------- end work fetch state -------
3/27/2019 2:39:06 PM | | [work_fetch] No project chosen for work fetch
8) Message boards : News : New Server Update (Message 68422)
Posted 27 Mar 2019 by wb8ili
Post:
Vortac - Do you have a fast GPU?

I have a theory on what is happening.

I finish a GPU task every 4 minutes.
Milkyway reports it and requests new tasks.
There is a "timer" in Milkyway that only lets you download new tasks every 600 seconds (I think). It is called "backoff".
If a new download request is made before the 600 seconds are up, no tasks get downloaded and the timer is reset to 600 seconds.
Since I am finishing a task every 240 seconds, I eventually run out of work.

Now the requests for new work every 240 seconds stop.
Eventually, the BOINC Manager will request new work. If no work is downloaded, the BOINC Manager keeps increasing the time between requests until the 600-second threshold is exceeded.
Then new work starts flowing again.
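
To make the theory concrete, here is a rough Python sketch of the behavior I think I am seeing. The 600-second window and the reset-on-refusal are my guesses, not confirmed Milkyway server internals:

    BACKOFF = 600     # my guess at the server's window between sends (seconds)
    TASK_TIME = 240   # my GPU finishes a task about every 4 minutes

    last_send = 0.0   # pretend the last successful send happened at t = 0

    def request_work(now):
        """Grant a batch only if the window has elapsed; otherwise refuse
        and (per my theory) restart the window."""
        global last_send
        if now - last_send < BACKOFF:
            last_send = now   # the too-early request resets the timer
            return 0
        last_send = now
        return 30             # server sends a batch of tasks

    # Reporting a finished task every 240 s means every request lands inside
    # the window, so the cache only bleeds down:
    for n in range(1, 6):
        print(n * TASK_TIME, request_work(n * TASK_TIME))   # got 0, every time

    # Once the host is dry and the client backs off past 600 s, work flows again:
    print(request_work(5 * TASK_TIME + 700))                # got 30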
9) Message boards : News : New Server Update (Message 68419)
Posted 27 Mar 2019 by wb8ili
Post:
Jake -

I have the same issue as Vortac. For whatever reason I run out of tasks (GPU). The BOINC Manager keeps requesting new tasks but it just says "Got 0 tasks". If I manually ask for a project update, I get 30 new tasks. Then a couple of minutes later 30 more. Then 30 more.

I don't think the "30" is the issue.

Maybe there is a "debug" parameter that might indicate WHY it is not downloading new tasks.
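
One that does exist is BOINC's work_fetch_debug log flag. Putting the lines below in cc_config.xml in the BOINC data directory and then doing Options, Read config files turns on the [work_fetch] lines in the Event Log that show why each fetch decision was made (the kind of output shown in Message 68426 above):

    <cc_config>
        <log_flags>
            <work_fetch_debug>1</work_fetch_debug>
        </log_flags>
    </cc_config>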
10) Message boards : Number crunching : Benchmark thread 1-2019 on - GPU & CPU times wanted for new WUs, old & new hardware! (Message 68190)
Posted 27 Feb 2019 by wb8ili
Post:
Assimilator1 -

Have you looked at the Milkyway main page, Community, Statistics, and then either CPU Models or GPU Models? That shows the relative rating of all of the CPU and GPU models.

What are you trying to do that is not shown there?
11) Message boards : Number crunching : Did I take the wrong off ramp? (Message 68189)
Posted 27 Feb 2019 by wb8ili
Post:
Surfelvis -

On the BOINC Manager main page, click on the Projects tab.

You should see Milkyway listed.

Make sure Milkyway does NOT show "No New Tasks".

Highlight Milkyway and click on "Update" on the left.

Click on the Tools menu, then Event Log.

Highlight the last 10 or 12 lines (all that show Milkyway and the current time).

Paste them into a message in this thread.
12) Message boards : News : Database Maintenance 9-4-2014 (Message 67777)
Posted 5 Sep 2018 by wb8ili
Post:
Jake wrote -

"We have to walk a fine line with the number of workunits we allow users to download and their deadlines. We have both CPUs and GPUs that we have to balance with vastly different work times. I think what we have now is a reasonable compromise, but I would be open to hearing your suggestions."

I understand you have to limit the number of workunits a user can download. But the number should be based on the capabilities of the user's computer, not some arbitrary number (80) that implies one number fits all, whether it refers to CPU or GPU workunits.

Your scheduling (workunit dispersal) algorithm knows everything about a user's computer (average computational time, number of invalid returns, up-time, etc.).

It shouldn't be that hard for someone at a prestigious university like RPI to figure out a more equitable way of dispersing workunits. Fast computers get more, slow computers get less, "bad actors" get few.

If my computer is returning valid results, and each workunit (GPU) takes 3 minutes, what is the problem with giving me 480 units (1 day), or 960 units (2 days), or more?

The algorithm, if properly done, should work for CPU and GPU workunits.
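
A minimal sketch, in Python, of the kind of sizing I mean; the numbers and names are illustrative, not Milkyway's actual scheduler:

    def daily_quota(avg_task_sec, valid_fraction, days=1.0, floor=10, ceiling=960):
        """Size a host's allotment from its measured speed and reliability
        instead of one fixed cap for everybody (illustrative only)."""
        if avg_task_sec <= 0:
            return floor
        per_day = 86400.0 / avg_task_sec           # tasks this host finishes per day
        quota = per_day * days * valid_fraction    # "bad actors" get few
        return int(min(ceiling, max(floor, quota)))

    # A GPU host at 3 minutes per valid task:
    print(daily_quota(180, 1.0))            # 480 -- one day of work
    print(daily_quota(180, 1.0, days=2.0))  # 960 -- two days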
13) Message boards : Number crunching : Maximum number of task (Message 67658)
Posted 6 Jul 2018 by wb8ili
Post:
Create a file app_config.xml in the milkyway.cs.rpi.edu_milkyway directory (inside the projects folder of the BOINC data directory).

Place the following 3 lines of text in the file and save it.

<app_config>
    <project_max_concurrent>4</project_max_concurrent>
</app_config>


On the BOINC Manager screen, under Options, click on Read config files.



Alternatively, on the BOINC Manager screen, under Options, Computing Preferences, Computing, set "Use at most 50% of the CPUs".
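
If the cap should apply to just one of Milkyway's applications rather than the whole project, app_config.xml also takes a per-application form. The short name below is a guess; verify it in client_state.xml:

    <app_config>
        <!-- "milkyway" is an assumed short app name; check client_state.xml -->
        <app>
            <name>milkyway</name>
            <max_concurrent>4</max_concurrent>
        </app>
    </app_config>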
14) Message boards : Number crunching : Does anyone else see GPU behavior like this? (Message 66804)
Posted 22 Nov 2017 by wb8ili
Post:
I haven't seen this, but what popped into my mind was you may be suspending computation because of settings.

Go to Options, Computing Preferences, and under the "When to Suspend" area, make sure EVERYTHING is UNCHECKED.

If this isn't it, I can't help.
15) Message boards : Number crunching : Suddenly lot of validate errors on Nvidia? (Message 66707)
Posted 19 Oct 2017 by wb8ili
Post:
hsdecalc -

You have upwards of 200 that aren't validated yet because they have to be compared to other users' results. They are pending, not "can't be validated".

Of the few tasks that you had that were "invalid" (approx. 10), you will notice that every other user shows them "invalid" too. This suggests there was something wrong with the data, not your computer.

You also have about 4 that were "error". Sometimes that happens.
16) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 66597)
Posted 8 Sep 2017 by wb8ili
Post:
GTX970 (4GB) 227 sec.
GTX650Ti (2GB) 1100 sec.
17) Message boards : Number crunching : New Benchmark Thread - times wanted for any hardware, CPU or GPU, old or new! (Message 66594)
Posted 7 Sep 2017 by wb8ili
Post:
GTX660 (2GB) 900 sec.
GTX1060 (6GB) 275 sec.
GTX560 (1GB) 415 sec.
GTX650-Ti (1GB) 1115 sec.
GT720 (1GB) 3000 sec.
18) Message boards : News : GPU Issues Mega Thread (Message 66089)
Posted 6 Jan 2017 by wb8ili
Post:
captianjack and [AF>EDLS]GuL -

For captianjack in particular -

I finally got around to installing BOINC from the Ubuntu Synaptic package manager. YOU WERE CORRECT! That fixed my "invalid work unit" problem on one computer. And the "unknown driver" message disappeared. Two more computers to go as soon as the work runs out on those.

When I installed BOINC, there must have been 20 additional libraries installed. I bet one of those was the reason the CL program didn't compile.

Thanks for the help.
19) Message boards : News : GPU Issues Mega Thread (Message 66083)
Posted 4 Jan 2017 by wb8ili
Post:
[AF>EDLS]GuL

Thanks for the tips.

The first link, in the first line, clearly indicates that the article applies to AMD Processing Cores. Since I have an NVIDIA card, I don't think that applies to me.

The second link looked more promising. However, I have all of the "ICD" and "OpenCL" packages installed (at least I think I do).

I wonder if there are any "debug" switches that might give more verbose output indicating why my CL programs don't compile.

In the meantime, I am racking up lots of credit on Einstein.
20) Message boards : News : GPU Issues Mega Thread (Message 66078)
Posted 3 Jan 2017 by wb8ili
Post:
As I stated before, I thought it was an Ubuntu (64-bit?) and NVIDIA card issue. But that was proven incorrect.

Now I will propose another factor -

If I looked correctly, both captianjack and [AF>EDLS]GuL have all Intel CPUs. mmonninn has his computers hidden, so I can't tell.

The 3 computers that I can't run Milkyway GPU on all have AMD processors.

Could the manufacturer of the CPU make a difference in whether a CL program compiles or not?


