Welcome to MilkyWay@home

Delay in getting new work units until all work units have cleared

Message boards : Number crunching : Delay in getting new work units until all work units have cleared

AT Hiker
Joined: 14 Aug 12
Posts: 10
Credit: 10,052,995
RAC: 0
Message 69057 - Posted: 18 Sep 2019, 22:18:29 UTC

Is there some reason for this problem to exist without the Administrator making some change to fix it?

Given that volunteers are offering free computing to the project, which costs each volunteer some amount of money, I simply can't understand why this situation is allowed to exist.

Given that I arrived only recently to the discussion I must be missing something.

Keith Myers
Joined: 24 Jan 11
Posts: 715
Credit: 554,886,953
RAC: 36,413
Message 69060 - Posted: 19 Sep 2019, 1:33:51 UTC

The system administrators just recently updated the server software to version 1.04 during the summer and it took them a month to reconfigure everything to somewhat working condition again from the previous server code. Give them some time to figure things out again.

They will need to update again since the BOINC server code branch 1.20 has been released.

mmonnin
Joined: 2 Oct 16
Posts: 167
Credit: 1,008,060,949
RAC: 9,927
Message 69064 - Posted: 19 Sep 2019, 10:55:39 UTC


AT Hiker
Joined: 14 Aug 12
Posts: 10
Credit: 10,052,995
RAC: 0
Message 69068 - Posted: 19 Sep 2019, 14:29:30 UTC - in response to Message 69064.  

I read the pages associated with the link. Interesting but I am not geek enough to understand all that.

Keith Myers actually answered the question.

Keith Myers
Joined: 24 Jan 11
Posts: 715
Credit: 554,886,953
RAC: 36,413
Message 69072 - Posted: 19 Sep 2019, 18:18:21 UTC - in response to Message 69064.  

https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4424

Already discussed.

Can you try the latest 7.15.0 client artifact if you are running Windows? It has the most recent work_fetch module.

https://ci.appveyor.com/api/buildjobs/4bvvgoug1ej0x5mh/artifacts/deploy%2Fwin-client%2Fwin-client_master_2019-09-18_15ffc98a.7z

Chooka
Joined: 13 Dec 12
Posts: 101
Credit: 1,782,758,310
RAC: 0
Message 69097 - Posted: 20 Sep 2019, 23:05:19 UTC
Last modified: 20 Sep 2019, 23:19:49 UTC

I think this topic is why I came looking on the forum. I've noticed I keep running out of WU's and more aren't being pushed out? (or my pc isn't fetching more...however it works)
So quite often I have one or more PC's just sitting idle until I manually hit refresh.....then it fetches more work. All my PC's are running 7.14.2.

It's a bit annoying.

Edit - I see people know about the issue. Oh well....I'll just keep truckin.
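
For anyone who would rather script the manual refresh than click the button, here is a minimal sketch. It assumes the `boinccmd` command-line tool that ships with the BOINC client is on the PATH, and that the project URL below matches the one your client is attached with (check the Projects tab; the URL here is only a guess taken from the forum address). The helper names are made up for illustration.

```python
import subprocess
from typing import List

# Assumption: replace with the exact master URL shown by your own client.
MILKYWAY_URL = "https://milkyway.cs.rpi.edu/milkyway/"


def build_update_command(project_url: str, boinccmd: str = "boinccmd") -> List[str]:
    """Build the argv for a manual project update (same as pressing Update)."""
    return [boinccmd, "--project", project_url, "update"]


def request_update(project_url: str) -> int:
    """Run the update request and return boinccmd's exit code.

    Raises FileNotFoundError if boinccmd is not installed or not on the PATH.
    """
    return subprocess.run(build_update_command(project_url), check=False).returncode


# Example (not run here): request_update(MILKYWAY_URL)
```

Running this from a scheduled task every few minutes is still only a workaround for the idle periods. It does not change how the client or the server decides when to send work.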

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 69104 - Posted: 21 Sep 2019, 23:01:24 UTC - in response to Message 69097.  

I think this topic is why I came looking on the forum. I've noticed I keep running out of WU's and more aren't being pushed out? (or my pc isn't fetching more...however it works)
So quite often I have one or more PC's just sitting idle until I manually hit refresh.....then it fetches more work. All my PC's are running 7.14.2.

It's a bit annoying.

Edit - I see people know about the issue. Oh well....I'll just keep truckin.


Set up a 2nd GPU project with a zero resource share so that when MW runs out it will get tasks and your GPU won't be idle while waiting for MW to refill the cache. PrimeGrid and Collatz are two projects that almost always have tasks. At PrimeGrid you can pick WUs that run very quickly like the MW WUs do, and at Collatz, if you use the optimization codes, the WUs will run much faster as well.

gambatesa
Joined: 23 Feb 18
Posts: 26
Credit: 4,744,416,145
RAC: 0
Message 69105 - Posted: 22 Sep 2019, 11:34:26 UTC - in response to Message 69104.  

Set up a 2nd GPU project with a zero resource share so that when MW runs out it will get tasks and your GPU won't be idle while waiting for MW to refill the cache. PrimeGrid and Collatz are two projects that almost always have tasks. At PrimeGrid you can pick WUs that run very quickly like the MW WUs do, and at Collatz, if you use the optimization codes, the WUs will run much faster as well.


This is just a workaround, not a solution. The developers must fix it.

This is for sure a server-side misconfiguration. In 15 years of BOINC it has never happened with any other project.
Want your Kids stay off from Drugs? Get them building Crunching PC's and they'll never have enough money for drugs

gambatesa
Joined: 23 Feb 18
Posts: 26
Credit: 4,744,416,145
RAC: 0
Message 69106 - Posted: 22 Sep 2019, 11:34:31 UTC - in response to Message 69104.  
Last modified: 22 Sep 2019, 11:38:13 UTC

Double post
Want your Kids stay off from Drugs? Get them building Crunching PC's and they'll never have enough money for drugs

mmonnin
Joined: 2 Oct 16
Posts: 167
Credit: 1,008,060,949
RAC: 9,927
Message 69109 - Posted: 23 Sep 2019, 1:02:51 UTC - in response to Message 69105.  

Set up a 2nd GPU project with a zero resource share so that when MW runs out it will get tasks and your GPU won't be idle while waiting for MW to refill the cache. PrimeGrid and Collatz are two projects that almost always have tasks. At PrimeGrid you can pick WUs that run very quickly like the MW WUs do, and at Collatz, if you use the optimization codes, the WUs will run much faster as well.


This is just a workaround, not a solution. The developers must fix it.

This is for sure a server-side misconfiguration. In 15 years of BOINC it has never happened with any other project.


I do this for every client, and it's always a good idea no matter what your main project is.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 69117 - Posted: 24 Sep 2019, 10:27:12 UTC - in response to Message 69105.  

Set up a 2nd GPU project with a zero resource share so that when MW runs out it will get tasks and your GPU won't be idle while waiting for MW to refill the cache. PrimeGrid and Collatz are two projects that almost always have tasks. At PrimeGrid you can pick WUs that run very quickly like the MW WUs do, and at Collatz, if you use the optimization codes, the WUs will run much faster as well.


This is just a workaround, not a solution. The developers must fix it.

This is for sure a server-side misconfiguration. In 15 years of BOINC it has never happened with any other project.


Of course it is, but your post said your GPU was idle while waiting for MW workunits; my solution solves the idle-GPU problem. MW must fix their own project.

Chooka
Joined: 13 Dec 12
Posts: 101
Credit: 1,782,758,310
RAC: 0
Message 69121 - Posted: 26 Sep 2019, 12:11:59 UTC - in response to Message 69104.  

Hi Mikey,

So can you set 0 resource share for a specific project?
At the moment I have my BOINC settings to store 10 days' work + another day's work (probably excessive), but it's all I could think of to try and download as many WU's as possible. The Radeon VII just eats them for breakfast. Are you suggesting that I set this to 0? If I did this, would Milkyway keep running out of work, or choose Primegrid WU's over Milkyway?

I understand what you are suggesting and I'd probably pick PPS Sieve WU's as they take no time at all but I'm not sure how to set it up correctly. I've checked the Primegrid settings and can't see that I can set 0 resource share there.

TIA

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 69122 - Posted: 26 Sep 2019, 14:52:52 UTC - in response to Message 69121.  

I've checked the Primegrid settings and can't see that I can set 0 resource share there.


At www.primegrid.com log in to your account and click on "project preferences". At the top under "Primary" you should see

Resource share 100

It can be changed by scrolling down and looking for "Edit PrimeGrid Preferences"

Change the 100 to 0 and scroll to the very bottom and click on "update project preferences"

Chooka
Joined: 13 Dec 12
Posts: 101
Credit: 1,782,758,310
RAC: 0
Message 69124 - Posted: 26 Sep 2019, 22:54:01 UTC - in response to Message 69122.  

LOL...right at the top of the page! I completely overlooked it.
Thank you !

Chooka
Joined: 13 Dec 12
Posts: 101
Credit: 1,782,758,310
RAC: 0
Message 69125 - Posted: 27 Sep 2019, 2:14:26 UTC - in response to Message 69124.  

Hmm.... with resource share set to 0, I'm finding on 2 of my PC's that BOINC is pausing the Milkyway@Home wu's and running the Primegrid work instead. Should that happen?
It's not what we want is it.

TIA

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 69126 - Posted: 27 Sep 2019, 5:12:41 UTC - in response to Message 69125.  
Last modified: 27 Sep 2019, 5:27:32 UTC

Hmm.... with resource share set to 0, I'm finding on 2 of my PC's that BOINC is pausing the Milkyway@Home wu's and running the Primegrid work instead. Should that happen?
It's not what we want is it.

TIA


I assume you are not using an account manager.

When the resource share is set to 0 for a project, only one WU is downloaded, and only if the primary project is out of work. If the primary project has suspended tasks, there is no download either.

That work unit (Prime), once started, will run to completion unless it takes more than 60 minutes, in which case it will be suspended.
Once suspended, there is a possibility it will never complete in time, as it will be serviced only when the primary project is out of work.
If it finishes, only an upload will occur unless the primary project is still out of work.
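
The fetch rule described above can be written down as a tiny toy model. This is only a sketch of the behaviour as I described it, not the actual logic in the BOINC client's work_fetch module; the function name and its parameters are made up for illustration.

```python
def backup_fetch_allowed(primary_runnable: int,
                         primary_suspended: int,
                         backup_queued: int) -> bool:
    """Toy model of when a zero-resource-share (backup) project gets one task.

    Assumptions, taken from the forum description rather than BOINC source:
      - the backup project is asked only when the primary project has no work;
      - suspended primary tasks also block the download;
      - only one backup task is held at a time (per GPU).
    """
    primary_is_empty = primary_runnable == 0 and primary_suspended == 0
    backup_is_empty = backup_queued == 0
    return primary_is_empty and backup_is_empty


# A few situations a host can be in:
print(backup_fetch_allowed(primary_runnable=5, primary_suspended=0, backup_queued=0))
print(backup_fetch_allowed(primary_runnable=0, primary_suspended=2, backup_queued=0))
print(backup_fetch_allowed(primary_runnable=0, primary_suspended=0, backup_queued=0))
```

Only the last case, where the primary project has nothing runnable or suspended and no backup task is already on the host, lets the backup project send a task.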

You should have only one Prime work unit total (for each GPU). If you have any queued up to run, that is another problem. I assume you have only one PrimeGrid task running per GPU and none queued up to run.

PrimeGrid takes an entire GPU. When I used to run Prime I had to force the nVidia fans to 100% with the side of the case off and a box or tornado fan blowing air in. I assume nothing has changed since then other than Prime opting out of Gridcoin mining. I suggest you run Einstein as the secondary project rather than Prime. Unlike Prime, it will not need a box fan or an open mining rig for cooling, and it is also eligible for Gridcoin rewards. My understanding is the Gridcoin people are allowing miners to keep their membership in local clubs. I had to drop my Texas A&M membership to mine while crunching.

Note that once Milkyway downloads new work units, it will NOT stop the secondary task from running.

You mentioned that a few MW tasks were suspended. That can happen when attaching for the first time, or when changing venue where the new venue has a resource share of 0. I have found it requires two "project updates" the first time a connection is made (new project). The first connection gets tasks but does not set the resource share to 0. The second "project update" gets "resource = 0". Unfortunately, you will be stuck with an unwanted task or two. Using an account manager like BAM! can complicate things, as BAM! must be told to set the resource share to 0 using an account manager sync.

Once a few unwanted Prime tasks arrive they will get serviced because they are there already but eventually they will finish or get suspended and MW tasks will take over.

Keith Myers
Joined: 24 Jan 11
Posts: 715
Credit: 554,886,953
RAC: 36,413
Message 69127 - Posted: 27 Sep 2019, 16:23:01 UTC - in response to Message 69126.  

My understanding is the Gridcoin people are allowing miners to keep their membership in local clubs

Not fully implemented yet. Still in beta testing for a few outside teams. Going well. Probably won't have the new general release open team clients available till the first of the new year.

JHMarshall
Joined: 24 Jul 12
Posts: 40
Credit: 7,123,301,054
RAC: 0
Message 69128 - Posted: 27 Sep 2019, 18:39:31 UTC

Has anyone seen any response from a Milkyway administrator/developer/moderator on this issue since Jake left the project?

Joe

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 69131 - Posted: 27 Sep 2019, 20:56:29 UTC

We have been monitoring the situation, and it seems like the community has found fixes to some of the problems you are experiencing.

Jake said that the problem appeared to be some obscure BOINC setting somewhere, and had asked BOINC forums about it. It looks like this issue disappears in the new beta of a BOINC client, so they must have patched whatever was causing problems. When that is released, hopefully the problem will be resolved.

- Tom

Keith Myers
Joined: 24 Jan 11
Posts: 715
Credit: 554,886,953
RAC: 36,413
Message 69133 - Posted: 27 Sep 2019, 23:22:20 UTC - in response to Message 69131.  

JStateson tried the Windows AppVeyor artifact for the 7.15.0 development client at my suggestion and ran into the same issue of work unit starvation. The work_fetch.cpp module (4/20/2019) that controls work fetch has not changed in the past 5 months between the 7.15.0 development branch and the latest 7.16.2 development branch, which is scheduled to go mainline, hopefully soon, once the translations are finished. Other modules in the client that interact with work fetch may have been patched or changed in the current development branch in a way that fixes the issue, but I have my doubts.

I believe there are still server side misconfigurations that the new expected client will not fix.

©2024 Astroinformatics Group