Welcome to MilkyWay@home

Collatz and MW together


Message boards : Number crunching : Collatz and MW together

BarryAZ
Message 36331 - Posted: 8 Feb 2010, 17:45:36 UTC

Is there some additional setting I need to put in place to allow Collatz GPU and MW GPU to play well together?

Running XP and a 4850, I can only get one GPU Workunit from MW -- by suspending Collatz, I can get that work unit to complete properly, but cannot download any additional work. To get another single workunit, I need to detach and then attach -- rather tedious and I am certain there is some setting I need to put in place to handle this properly.

The Gas Giant
Message 36332 - Posted: 8 Feb 2010, 18:17:15 UTC

Can you tell us your cache setting preferences and resource shares and what the estimated to completion times are?

BarryAZ
Message 36334 - Posted: 8 Feb 2010, 18:42:16 UTC - in response to Message 36332.  
Last modified: 8 Feb 2010, 18:43:10 UTC

Resource share Collatz 8000, MW 10000 (I also have CPU only projects on this AMD quad -- including Aqua, Spinhenge and Poem - resource shares there are 4000, 8000, 8000).

I've set Collatz to no new work -- so it is working through its cached work units -- the due dates for the remaining work there are 2/20/10 to 2/22/10. Each work unit takes 20 minutes. When I suspend Collatz, MW does not go looking for new work (I get 'report, not requesting new work').

The single MW work units take less than 4 minutes to complete and currently have 2/16 due dates.

My cache is set to 5 days

The Gas Giant
Message 36340 - Posted: 9 Feb 2010, 3:16:24 UTC - in response to Message 36334.  
Last modified: 9 Feb 2010, 3:51:25 UTC


My cache is set to 5 days

There's your problem. With MW, I think that even with the new extended WU deadlines you shouldn't cache more than 2 to 3 days' worth. Try 1 day and work up.
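
To put the 5-day cache in perspective, here is a rough count using the figures given earlier in the thread (a little under 4 minutes per MW task on the 4850). It assumes, purely for illustration, that the GPU runs nothing but MW and that the project sets no per-host task limit:

```latex
\[
N_{\text{tasks}} \approx
\frac{5\ \text{days} \times 24\ \tfrac{\text{h}}{\text{day}} \times 60\ \tfrac{\text{min}}{\text{h}}}
     {4\ \tfrac{\text{min}}{\text{task}}}
= \frac{7200}{4} = 1800
\qquad\text{versus}\qquad
\frac{1 \times 24 \times 60}{4} = 360 \ \text{for a 1-day cache.}
\]
```

Resource shares and any server-side quota would reduce these numbers, so treat them as an upper bound rather than what the client would actually request.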

Paul D. Buck
Message 36403 - Posted: 10 Feb 2010, 22:45:56 UTC - in response to Message 36334.  

My cache is set to 5 days

With the lunatic Strict FIFO rule in place you will do 5 days of MW followed by 5 days of Collatz unless one project or the other is down.

I have 1.5 days set, have had this setting for some time, and have watched this pattern for all GPU work for months now. Strict FIFO was put in to prevent tasks from being suspended partly done (which was itself caused by a multitude of other bugs); this "fix" was tried to ease that problem... even though it did not really cure the problem, UCB for whatever reason is in love with this rule and won't remove it ...

The really bad thing is that for GPUs it means that your caching ability is shot because, as you note, the "off" project simply will not store up work under most conditions...

BarryAZ
Message 36405 - Posted: 11 Feb 2010, 0:28:02 UTC - in response to Message 36403.  
Last modified: 11 Feb 2010, 0:28:46 UTC

Interesting discussion regarding cache -- but it doesn't seem to help in my case. As a test, I set my cache to 1 day. Then I set Collatz to no new work. Then I *reset* Collatz -- clearing its entire work queue.

Then I detached MW, and reattached. Same thing: with the reattach I could get *one* work unit only. And when that work unit was completed it would sit and not report until I did an update, at which time it reported the one work unit and did NOT request new work.

So notwithstanding this interesting discussion of work cache size, I believe that is NOT the factor involved here. So, given that work cache size is not the key variable, is there some other variable I (we) am (are) missing here?

Note, the short cache rule doesn't have any effect on the CPU work units -- I set a longer cache there and MW plays reasonably nicely with other CPU projects. Oh, I've already configured this workstation as 'school' and GPU only for MW. My default settings allow CPU for MW since that's what I've run (and run) on the rest of my MW workstations. I run Collatz mostly (along with GPUGrid on my 9800GT workstations) for GPU.

So I suspect there is another setting that needs to be set to let MW pick up work, rather than have me *manually* report one work unit, wait one minute, detach, and rejoin - as THAT is not going to happen -- in that set up, Collatz gets this 4850 for all it needs.

BarryAZ
Message 36444 - Posted: 12 Feb 2010, 15:57:08 UTC - in response to Message 36405.  

OK - so I take from the lack of follow-up response that there is no solution to having both projects run cooperatively on the same workstation. Fair enough.

John Clark
Message 36445 - Posted: 12 Feb 2010, 16:28:43 UTC

Barry

As you mentioned, the 2 projects will co-operate on CPUs. But, I have never been able to get them to run and behave on a GPU.

My solution is to run them by themselves and use separate GPUs. I use the CPUs for other projects.
Go away, I was asleep


Paul D. Buck
Message 36451 - Posted: 12 Feb 2010, 20:36:39 UTC - in response to Message 36444.  

OK - so I take from the lack of follow-up response that there is no solution to having both projects run cooperatively on the same workstation. Fair enough.

Um, that is not what I said at all ...

On my Windows ATI workstations (2 each, 3 GPUs) I run MW and Collatz and have for months.

On my Windows Nvidia workstations (2 each, 3 GPUs - 4 GPU cores) I run MW, Collatz, GPUGrid and PrimeGrid.

On OS X I run Collatz on my GPU.

All run well ...

So, I don't know why you are having issues or even understand what exactly they are ... but, because of the current rules for GPUs you can see odd behavior and the projects do NOT run like tasks on the CPU ...

If you mix GPU and CPU tasks on the same systems from these two projects, that is another whole can of worms... so ... for me, I don't understand the problem and it works for me ...

So, if you are willing to start at the beginning and don't mind my slowness ... :)

BarryAZ
Message 36467 - Posted: 13 Feb 2010, 3:48:54 UTC - in response to Message 36451.  

Fair enough -- here's what I have

Two systems where running MW GPU tasks is possible.

One -- Vista 32 bit. ATI 4770 GPU.

This one has both Collatz and MW for GPU projects AND Climate, POEM, Spinhenge, Einstein and SETI as CPU projects.

The second -- XP 32 bit. ATI 4850

Collatz and MW GPU, AND Spinhenge and POEM CPU.

In both systems, Collatz works fine with the CPU projects in the mix and gets workunits easily. MW is a different story. If I detach MW and then rejoin, it will get ONE, and ONE only, work unit. It will complete that unit but not report it until I do a manual update. Then it reports one work unit but won't request any work. The ONLY way I get another MW work unit is to detach and reattach.

This occurs if I have a 1 day cache, a 2 day cache or a 5 day cache setting.

It EVEN occurred when I detached Collatz on the XP workstation.

Needless to say, with an under 5 minute workunit cycle, this is a tedious way to get MW work processed.

My assumption is that there is some setting (MW specific) that is wrong -- so I'm seeking help to resolve that.

Thanks for jumping in.






BarryAZ
Message 36469 - Posted: 13 Feb 2010, 4:12:30 UTC

I did a further test on the XP system.

I detached all projects, then uninstalled BOINC. I then deleted the 'leftover' program and project data folders.

I then reinstalled 6.10.18.

I then attached to MW and ONLY MW.

The same problem. One unit downloads, completes and SITS. I manually update and 'no new work requested'.

Not wanting to leave that workstation fallow for what really appears to be a MW-specific conundrum, I reattached to the other three projects and let the downloads proceed normally as they have.

I suppose I could create a new user name and start from absolute scratch -- but that strikes me as a LOT of effort to troubleshoot MW.

Note, I have NOT had problems with MW CPU projects -- all my completed work into this month for MW has been CPU work.

Paul D. Buck
Message 36501 - Posted: 13 Feb 2010, 20:34:47 UTC
Last modified: 13 Feb 2010, 20:37:38 UTC

OK, the first thing to try is to make a cc_config.xml file with the report_results_immediately option set and see if that helps some... there is something else going on though.

The other critical difference I think is that I ONLY run GPU versions of MW and Collatz and I run them with the stock downloads (no anon platform) and I DO NOT see this issue.

So, other things to try: set prefs to only run GPU tasks and see if we are getting a collision on resource/DCF (duration correction factor) screw-ups because of CPU and GPU use on the same system. (I have not tested this theory, but UCB's reluctance to fix DCF so that it is application-specific rather than project-specific is long-standing, even though this has already been demonstrated to be a problem for other things.)

<cc_config>
   <log_flags>
   </log_flags>
   <options>
       <report_results_immediately>1</report_results_immediately>
       <use_all_gpus>1</use_all_gpus>
   </options>
</cc_config>
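
If the client still reports results without asking for work, the same file can also turn on extra logging so that the Messages tab shows why no request is made. A sketch only: `work_fetch_debug` and `sched_op_debug` are standard BOINC client log flags, but their availability in 6.10.18 specifically is assumed here rather than confirmed.

```xml
<cc_config>
   <log_flags>
       <!-- log the client's work-fetch decisions for each project -->
       <work_fetch_debug>1</work_fetch_debug>
       <!-- log details of scheduler requests and replies -->
       <sched_op_debug>1</sched_op_debug>
   </log_flags>
   <options>
       <report_results_immediately>1</report_results_immediately>
       <use_all_gpus>1</use_all_gpus>
   </options>
</cc_config>
```

The file goes in the BOINC data directory as cc_config.xml; the client has to be restarted, or told to re-read its config file, before the flags take effect.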


{edit}
Can you link the two troublesome systems? With 33 to choose from I will likely guess wrong ... :)

BarryAZ
Message 36517 - Posted: 13 Feb 2010, 23:42:23 UTC - in response to Message 36501.  

OK - I will try that -- but that last test I did with ONLY MW GPU running -- that is, a full prior clear out, detaching all projects, then uninstalling BOINC, then clearing out the leftover folders (Program and data), then reinstalling BOINC 6.10.18, then ONLY installing MW and running GPU only, didn't change things at all.

In that scenario there were no CPU projects in the mix and Collatz wasn't in the mix either.

It is as if the project was running ONLY as 'No New Tasks'

Interestingly enough, I have one configuration (Work) set for CPU and GPU with MW -- when I ran that way, I did get multiple CPU tasks, but only the one GPU task.

Sort of frustrating here.

The Gas Giant
Message 36524 - Posted: 14 Feb 2010, 5:24:29 UTC

What is the estimated time to completion for each freshly downloaded wu?

BarryAZ
Message 36546 - Posted: 14 Feb 2010, 21:05:30 UTC - in response to Message 36524.  

Between three and four minutes -- a bit longer on the 4770 versus the 4850.

Again, it seems as though some flag is set for 'no new tasks' -- which forces the manual report.

Further to this, if I set the workstation to be part of the group which allows both GPU and CPU tasks, it WILL get additional CPU tasks.

Since it isn't workstation specific (it happens on both workstations), since it doesn't 'not work' (that is, the single work units process properly and get credit), since even a 'start totally clean' (clearing off the existing client install entirely and reinstalling with ONLY MW as the project) doesn't behave differently, and since I've NOT had this problem with CPU workunits, I'm a bit at a loss as to what further to try.

Thanks for the reply.


j2satx
Message 36561 - Posted: 15 Feb 2010, 15:59:14 UTC


I have Intel quad and ATI 4850
W7 RC 64 with 9-11 video driver
BOINC 6.10.18

PrimeGrid share 50 (project selected runs on CPU only)
Collatz share 5 (GPU only selected on project)
MW share 5 (GPU only selected on project)

cache 2.25 (usually 0, but participating in PG challenge)

When MW completes WU, a new WU downloads if cache is deficient.

The cache has WUs from all three projects...400 PG, 23 MW and 129 Collatz (PG is limited by project to 400).

I did not start from scratch.....I was running PG and MW, then added Collatz to see how it reacted after reading this thread. Collatz downloaded WUs and it all seems to be working automagically.

What else can we compare to help make sense of the issue?




Paul D. Buck
Message 36564 - Posted: 15 Feb 2010, 19:18:38 UTC - in response to Message 36517.  

OK - I will try that -- but that last test I did with ONLY MW GPU running -- that is, a full prior clear out, detaching all projects, then uninstalling BOINC, then clearing out the leftover folders (Program and data), then reinstalling BOINC 6.10.18, then ONLY installing MW and running GPU only, didn't change things at all.

It looks like 6.10.18 is your "standard" GPU version of BOINC, so that does not look like the cause either if MW and Collatz are working fine on multiple other systems with otherwise similar configurations.

So I am at a loss also ...

Oh, wait, I do have a question... are these dual GPU systems? And are the GPUs matched? Or different by some significant amount?

BarryAZ
Message 36565 - Posted: 15 Feb 2010, 20:56:01 UTC - in response to Message 36564.  

No -- single GPU plus quad CPU.

The thing is, I did what I would have thought would be a variable killing test by clearing everything out including a full uninstall of BOINC and a delete of the leftover program and project folders and a clean new install with ONLY MW GPU configured as GPU only. I really don't know what more I can do there to set up a fresh start.

Of course I didn't run that way for long -- a half hour or so -- simply because I didn't see the point of 'punishing' the CPU projects or the 'good citizen' Collatz project.

Note, I have some CUDA 9800 cards, and Collatz and GPUGrid coexist just fine on those systems. For that matter, SETI GPU lives with them as well.

The only 'bad boy' is MW -- specifically MW GPU as MW CPU plays just fine with other CPU projects.

I can't see that I've not checked the 'be a good citizen' check box -- couldn't find it.






j2satx
Message 36568 - Posted: 16 Feb 2010, 0:05:15 UTC - in response to Message 36565.  


Just for the "halibut", try reducing your share for Collatz and MW to about 1/10th of your CPU projects.

You can use smaller numbers, rather than thousands to get the same percentages.
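
To illustrate with the shares quoted earlier in the thread (Collatz 8000, MW 10000, Aqua 4000, Spinhenge 8000, Poem 8000): since only the ratio of each share to the total matters, dividing every share by 1000 leaves MW's fraction unchanged.

```latex
\[
\frac{10000}{8000 + 10000 + 4000 + 8000 + 8000}
= \frac{10000}{38000}
= \frac{10}{8 + 10 + 4 + 8 + 8}
= \frac{10}{38}
\approx 26.3\%
\]
```

The same holds for any common factor, so shares of 8, 10, 4, 8 and 8 would behave the same as the original values.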

Paul D. Buck
Message 36581 - Posted: 16 Feb 2010, 15:29:32 UTC - in response to Message 36565.  

The thing is, I did what I would have thought would be a variable killing test by clearing everything out including a full uninstall of BOINC and a delete of the leftover program and project folders and a clean new install with ONLY MW GPU configured as GPU only. I really don't know what more I can do there to set up a fresh start.

That should do it on the client side. The only question I have is if the computer trashed so many tasks that its daily quota is down in the dirt. That is the last thing I can think of. But my recollection is that you say that the tasks are completed and validated so that does not make much sense either.

My last suggestion is to try 6.10.28 to see if anything changes... sadly, I have not been impressed with the work fetch and Resource Scheduler in the post-GPU era, because UCB did not really consider the impact of so many design choices on other issues and has been very reluctant to address issues or even to acknowledge that they exist.