Conflict MW (ATI) & Aqua CPU

Lord Tedric
Joined: 9 Nov 07
Posts: 151
Credit: 8,391,608
RAC: 0

Message 29024 - Posted: 8 Aug 2009, 8:18:48 UTC

I am attempting to run these two projects simultaneously but am running into difficulties: Aqua uses all 4 of my processors and then goes into High Priority, thus not allowing MW to crunch.

This is just a single Aqua CPU WU with a three week deadline, so there is no problem finishing on time.
Anyone have any suggestions?

I've done a quick search of the boards and initially found no other comparison.
____________

Stefan Ledwina
Joined: 28 Aug 07
Posts: 16
Credit: 27,436,750
RAC: 16

Message 29025 - Posted: 8 Aug 2009, 8:30:19 UTC - in response to Message 29024.

I see you are using BOINC 6.6.20...
Maybe try 6.5.0 or one of the later 6.4.x clients. 6.5.0 works pretty well for me with AQUA on the CPU and Milkyway on the GPU.
____________


pixelicious.at - My little photoblog

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 148,905,504
RAC: 0

Message 29028 - Posted: 8 Aug 2009, 11:13:57 UTC

I don't know how it would work on a quad to change to 5 BOINC CPUs, but for my W3520 (=i7) I change BOINC CPUs to 9 with <ncpus>9</ncpus> in a cc_config.xml file. This allows MilkyWay ATI tasks to continue processing with AQUA on the other 8 "cores". If I don't do this, when AQUA is running MilkyWay ATI stops after it completes the currently running tasks. I am using BOINC version 6.6.31.

When I manually switch to another CPU project that is not multi-threaded I change back to 8 CPUs and process 7 tasks of that CPU project. I prefer to leave a core free for MilkyWay ATI because otherwise it slows down a great deal if I leave ncpus at 9 and process 8 intensive tasks such as Einstein. You don't need to stop BOINC to do this, just click "Read config file" in the Advanced section of BOINC Manager after you have made a change. The only small delay is that BOINC runs a benchmark every time the number of cores changes. I suppose I should try starting BOINC with the --skip_cpu_benchmarks option but I think that may only work in Linux.

Recently I have been swapping between 8 and 9 cores a fair bit as I switch between AQUA and WCG, so I have a shortcut to cc_config.xml on my desktop to allow a quick right click, open in Notepad and change number of cores.
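
A minimal sketch of scripting that edit, not part of the original post. It assumes Python 3 is available and that cc_config.xml has the <cc_config><options><ncpus> layout shown in this thread; the function name set_ncpus is made up for illustration. It creates the file if it is missing and otherwise only touches the ncpus value.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def set_ncpus(config_path, ncpus):
    """Create or update cc_config.xml so that <options><ncpus> holds ncpus."""
    path = Path(config_path)
    if path.exists():
        tree = ET.parse(str(path))
        root = tree.getroot()
    else:
        # No file yet: start from an empty <cc_config> document.
        root = ET.Element("cc_config")
        tree = ET.ElementTree(root)

    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    node = options.find("ncpus")
    if node is None:
        node = ET.SubElement(options, "ncpus")
    node.text = str(ncpus)

    # Other options already in the file are written back unchanged.
    tree.write(str(path), encoding="utf-8", xml_declaration=True)
    return path
```

Calling set_ncpus with the path to your own cc_config.xml and the value 9 (or 8) would replace the Notepad step; the client still has to be told to re-read the file, for example with "Read config file" as described above.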

Conan
Joined: 2 Jan 08
Posts: 105
Credit: 65,396,973
RAC: 0

Message 29029 - Posted: 8 Aug 2009, 11:39:29 UTC

Well, Milkyway and Aqua both run in High Priority on my machines.
They both still seem to work, switching when required.

The Linux machine I have left 'as is' and it runs CPU Milkyway WUs OK as well as AQUA, Climate Prediction, Docking and QMC.

The Windows machine runs Milkyway on an ATI HD4870 with AQUA, Docking and Spinhenge on the CPUs.

I modified the cc_config.xml file on the Windows machine to show 5 CPUs (I have 4) and use the 6.4.7 Boinc Client with 8.12 ATI drivers on a Win XP SP3 install.
Milkyway runs 3 jobs at once as per default settings for the card.

With AQUA using up to 4 CPUs it still leaves room for Milkyway to run on the ATI GPU. Has been working a treat so far.

The High Priority appears to have begun with the last couple of batches of AQUA work units, but for me is not causing a problem.

Conan.
____________

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0

Message 29031 - Posted: 8 Aug 2009, 12:41:25 UTC - in response to Message 29029.
Last modified: 8 Aug 2009, 12:48:35 UTC

I modified the cc_config.xml file on the Windows machine to show 5 CPUs (I have 4) and use the 6.4.7 Boinc Client with 8.12 ATI drivers on a Win XP SP3 install.
Milkyway runs 3 jobs at once as per default settings for the card.

With AQUA using up to 4 CPUs it still leaves room for Milkyway to run on the ATI GPU. Has been working a treat so far.
Conan.


Your modification to the cc_config.xml file - I am a little mystified as to where this file is located. Can you enlighten me, please?

When I go to the projects folder and open the MW one, I can only see the app_info file which has the following CPU related statements -

<app_version>
<app_name>milkyway</app_name>
<version_num>19</version_num>
<flops>1.0e11</flops>
<avg_ncpus>0.1</avg_ncpus>
<max_ncpus>1</max_ncpus>
<cmdline>n2 f20 w0.80</cmdline>

<file_ref>
<file_name>astronomy_0.19_ATI_SSE2f.exe</file_name>
<main_program/>

If this is the file you refer to, then which of the CPU references do you alter?

<max_ncpus>1</max_ncpus> from 1 to 5

or

<cmdline>n2 f20 w0.80</cmdline> nX from, in my case, 2 to 5?
____________
Go away, I was asleep


kashi
Joined: 30 Dec 07
Posts: 311
Credit: 148,905,504
RAC: 0

Message 29033 - Posted: 8 Aug 2009, 13:16:09 UTC

It's not the app_info.xml file in C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway
It's the cc_config.xml file in C:\ProgramData\BOINC

If you haven't created one it may not exist. Just create one in Notepad and save it as cc_config.xml
Make sure the file is saved with the .xml extension, not with .txt on the end.

Here's the current contents of mine for example:

<cc_config>
  <options>
    <zero_debts>1</zero_debts>
    <ncpus>9</ncpus>
  </options>
</cc_config>

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0

Message 29039 - Posted: 8 Aug 2009, 15:49:45 UTC
Last modified: 8 Aug 2009, 15:54:10 UTC

Thanks Tombei for the clarification.

I looked in /C/programme files/BOINC/ and saw I have no cc_config.xml file (as you had surmised).

The contents of your version, modified to fit a Penryn quad and a dual Prestonia Xeon with HT activated (acts like a quad), should be OK: just change the 9 to 5 in ncpus.

I will report back on the effect on a quad, and whether it is equally applicable to a dual HTed P4 machine.
____________
Go away, I was asleep


John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0

Message 29044 - Posted: 8 Aug 2009, 16:59:30 UTC

It looks like, in my specific setup (2 ATI GPUs), the cc_config.xml files stop Milkyway running.

All the MW work in the BOINC caches is marked ready to run, including the tasks that were crunching.

After 50 minutes the crunching has not yet restarted, so I have shut down BOINC, removed the cc_config files and restarted BOINC.

Unfortunately the work is still waiting to crunch, and I hope it will do so in time.
____________
Go away, I was asleep


kashi
Joined: 30 Dec 07
Posts: 311
Credit: 148,905,504
RAC: 0

Message 29047 - Posted: 8 Aug 2009, 17:15:39 UTC
Last modified: 8 Aug 2009, 17:19:01 UTC

Hope you can get it to work. I didn't think of it but I probably shouldn't have included the zero debts part. I just copied and pasted my cc_config.xml as an example. Using the zero debts option may not be compatible with how you have BOINC configured to enable scheduling that works.

At least there are tasks available again now for when it starts and you now know how to create a cc_config.xml file if you need one in the future.

Bill
Joined: 3 Oct 07
Posts: 21
Credit: 49,862
RAC: 0

Message 29052 - Posted: 8 Aug 2009, 19:06:21 UTC - in response to Message 29044.

so I have shut down BOINC, removed the cc_config files and restarted BOINC.

You have to shut down Boinc before modifying the file so it can read it on startup.

John Clark
Joined: 4 Oct 08
Posts: 1734
Credit: 64,228,409
RAC: 0

Message 29057 - Posted: 8 Aug 2009, 21:29:32 UTC
Last modified: 8 Aug 2009, 21:46:35 UTC

Did that before including the cc_config file, and again on removal.

The AGP HD3850 on the dual HT Xeon is working OK, but the HD4850 on the quaddy is still frozen with work to crunch after 2 hours.

When FreeHAL digs out a new bunch of work I will reboot the system.

That reboot of the PC sorted things out. Work is flying through the GPU, which suggests it is the short WUs being talked about.

Yup!

WUs credited 27.80 and 37.18 are currently flying through. They are taking 31 seconds of CPU time as against 81-84 seconds for the 74.24 credit WUs.

I think I will let the RAC build for a day or so, then try to copy the cc_config file again. This time I will take out the zero debts option, as suggested by Tombei.
____________
Go away, I was asleep


KSMarksPsych
Joined: 9 Sep 07
Posts: 22
Credit: 320,035
RAC: 0

Message 29139 - Posted: 10 Aug 2009, 8:55:29 UTC

Check your DCF (duration correction factor) for Aqua. If the DCF > 90 the task will run in high priority. There's a thread on the Aqua forums about it.
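
One rough way to read the DCF values, sketched here and not part of the original post. It assumes the client keeps a <duration_correction_factor> element inside each <project> block of client_state.xml in the BOINC data directory, so check your own file before relying on it. The function names are invented for illustration, and the limit of 90 is simply the figure quoted above, not a documented BOINC constant.

```python
import xml.etree.ElementTree as ET


def project_dcfs(client_state_xml):
    """Map project name -> duration correction factor from client_state.xml text."""
    root = ET.fromstring(client_state_xml)
    dcfs = {}
    for project in root.iter("project"):
        # Fall back to the URL if the project block carries no name.
        name = project.findtext("project_name") or project.findtext("master_url") or "?"
        dcf = project.findtext("duration_correction_factor")
        if dcf is not None:
            dcfs[name.strip()] = float(dcf)
    return dcfs


def over_threshold(dcfs, limit=90.0):
    """Return the projects whose DCF exceeds limit (90 is the figure from this thread)."""
    return sorted(name for name, dcf in dcfs.items() if dcf > limit)
```

Pass the text of client_state.xml to project_dcfs and the result to over_threshold to see which projects sit above the quoted limit.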
____________
Kathryn :o)
The BOINC FAQ Service
The Unofficial BOINC Wiki
The Trac System
More BOINC information than you can shake a stick of RAM at.

Vid Vidmar*
Joined: 29 Aug 07
Posts: 81
Credit: 60,360,858
RAC: 0

Message 29140 - Posted: 10 Aug 2009, 9:01:17 UTC - in response to Message 29139.

As we all know, boinc devs are so into facebook and other socialnetworkingbuzzcrap that they have totally forgotten what boinc is about: using available resources for distributed computing. I don't know what exactly NVIDIA did to get into boinc before facebook, but IMO boinc's CUDA support is a mere coincidence, considering its poor implementation and scheduling (which are only now beginning to work). So, ATI... Well, in the next millennium we might get something resembling ATI support, and I am afraid that the human race will never see boinc supporting OpenCL; however, I am not willing to wait that long.

Therefore here is what I plan on trying:

1st: I'll modify the boinc client to allow more instances to run, and compile it.
2nd: I'll divide projects by resources (CUDA, ATI, CPU, multi-threaded) and run a separate instance of the boinc client for each of the resources, with an appropriate project set and resource limitations (ncpus = 1 for GPU instances, etc.). One thing I have not resolved yet is how to deal with multi-threaded apps. Suggestions are more than welcome.

This way I hope to maximize resource utilization, which ATM the latest and most socialnetworkedfacebookedrepublic boinc client just doesn't know how to do. Also, it wouldn't hurt if anyone from the boinc_dev "team" stumbled upon my idea and considered it as a way of scheduling resource utilization in upcoming clients.

Suggestions, comments?

BR,
____________

YoDude.SETI.USA [TopGun]
Joined: 29 May 09
Posts: 37
Credit: 34,016,951
RAC: 0

Message 29280 - Posted: 13 Aug 2009, 2:02:58 UTC
Last modified: 13 Aug 2009, 2:12:07 UTC

Well guys, I may have come up with a simple "work around" for the MW/Aqua problem.

It seems there are a few other folks who are having a lot of trouble keeping MW WUs running alongside Aqua. I too suffer from this problem.

After reading this thread (and others) and trying MANY different configuration setups, I seem to have finally come up with something that actually (sorta) works. Please keep in mind, this is not a "fix all" for this problem, but so far it seems to allow MW and Aqua to run together in seeming harmony.

I had two Aqua WUs: one running in "high priority", the other "waiting to run". Additionally, I had several MW WUs running. When the MW WUs finish, in this situation, the manager does not fetch new work (usually) and even if it does, they just sit there "waiting to run" forever (if they are newly fetched WUs) or they sit there "ready to report" forever. We all hate this and yes, it can be overcome manually if you plan to be a boinc manager babysitter. Screw that!

Here's what I've done:

1. In the cc_config.xml file, add or edit the line with ncpus to read:
<ncpus>5</ncpus>
(Thanks go to whoever suggested this idea.)

I'm running a quad core, so this reflects one more CPU than actually exists. Doing just this alone made the manager start running a second Aqua in parallel with the first (that was already running), both running in "high priority". Of course, at this point, there's no hope at all for any MW WUs to ever get done AND the second Aqua WU runs HORRIBLY slowly; it almost seems to be stalled, but it does progress. Next........

2. Suspend one of the two running Aquas. Since I already had one of the Aqua WUs at about 25% completion, I suspended the newly started WU.
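
The two steps above can also be driven from the command line. The following is a sketch under assumptions, not something the poster ran: it assumes the boinccmd tool that ships with the client is on the PATH and accepts --read_cc_config and --task URL NAME suspend|resume|abort. The project URL and task name are placeholders you would copy from BOINC Manager, and the helper names are invented for illustration.

```python
import subprocess


def read_cc_config_cmd():
    """Command asking the running client to re-read cc_config.xml."""
    return ["boinccmd", "--read_cc_config"]


def task_cmd(project_url, task_name, op):
    """Command applying op to one task; URL and name are copied from the Manager."""
    if op not in ("suspend", "resume", "abort"):
        raise ValueError("unsupported task operation: " + op)
    return ["boinccmd", "--task", project_url, task_name, op]


def run(cmd, dry_run=True):
    """Print the command by default; execute it only when dry_run is False."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode
```

With dry_run left at True the helpers only print what would be executed, which is a safe way to check the URL and task name before suspending anything.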

At this point, I let MW continue to run thinking, "yeah.....right". To my surprise after all the MW WUs completed, uploaded and were reported, the manager actually went and got more MW WUs and they actually STARTED on their own!

Currently, I have the one Aqua WU running in "high priority", the second "suspended", a full cache of MW WUs and three MW WUs running. I only have a single ATI 4870 in the system ATM; my settings are to run 3 WUs on the single card, and I can only cache 16 WUs at a time.

Eleven MW WUs (normally twelve) are being reported as "running" while only three (which is normal for my setup) are actually making progress.
WUs are being reported, and new WUs are being downloaded and set as "Ready to start". When the currently running WUs complete, "Running" WUs that were not making progress begin to show that they are (only three at a time, but that is still normal for my setup). "Ready to start" WUs switch to "Running" and everything looks to be quite normal and running very smoothly.

On a final note, I don't claim to be a "guru" or "know it all" and someone else may have already suggested this idea, I don't know.

What I can tell you is this: for the last hour, as I've watched the progression of WUs through my system, MW WUs and Aquas have been running together without complication.

If something falters, I'll let y'all know. Meanwhile, happy crunching!

Yo-


On edit, here's my app_info.xml file:

<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <name>astronomy_0.19_ATI_x64f.exe</name>
    <executable />
  </file_info>
  <file_info>
    <name>brook.dll</name>
    <executable />
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>19</version_num>
    <flops>1.0e11</flops>
    <avg_ncpus>0.1</avg_ncpus>
    <max_ncpus>1</max_ncpus>
    <cmdline>n3 w1.1</cmdline>
    <file_ref>
      <file_name>astronomy_0.19_ATI_x64f.exe</file_name>
      <main_program />
    </file_ref>
    <file_ref>
      <file_name>brook.dll</file_name>
    </file_ref>
  </app_version>
</app_info>

and my cc_config.xml file:

<cc_config>
  <options>
    <exclusive_app>ehshell.exe</exclusive_app>
    <exclusive_app>Crysis64.exe</exclusive_app>
    <no_gpus>0</no_gpus>
    <ncpus>5</ncpus>
  </options>
</cc_config>

Bill592
Joined: 19 May 09
Posts: 30
Credit: 1,062,540
RAC: 0

Message 29283 - Posted: 13 Aug 2009, 6:35:06 UTC - in response to Message 29140.

YoDude.SETI.USA [TopGun]
Joined: 29 May 09
Posts: 37
Credit: 34,016,951
RAC: 0

Message 29292 - Posted: 13 Aug 2009, 10:17:54 UTC
Last modified: 13 Aug 2009, 10:26:26 UTC

As a follow up on my previous post on this thread, I have the following to report.

After several hours of flawless crunching of MW WUs with Aqua running, I have noted that upon completion of the Aqua WU, the Boinc manager would not report the completed task on its own, even though the upload was successful.

I had hoped that the manager would simply report the completed task and fetch a new Aqua but it just didn't happen. At this point I removed the remaining Aqua from suspension thinking that that may do the trick but alas, that didn't work either. The Aqua WU started running at this point, but the manager still wouldn't report the other completed WU.

This left only one choice. I had to suspend MW for a brief period, click the "Update" button to force the manager to report the Aqua WU and fetch another WU. Which is exactly what happened. Once the new Aqua was downloaded, I suspended it and then removed MW from suspension. Everything went back to running as I had made note of in my previous post.
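
The manual sequence just described could be scripted roughly as follows. This is a sketch under assumptions, not the poster's method: it assumes boinccmd accepts --project URL suspend|resume|update, and both project URLs are placeholders to be copied from BOINC Manager. The helper names are invented. Suspending the freshly downloaded Aqua task is left as a manual step because its name is not known in advance.

```python
def project_cmd(project_url, op):
    """Build a boinccmd project-level command for one of the supported operations."""
    if op not in ("suspend", "resume", "update"):
        raise ValueError("unsupported project operation: " + op)
    return ["boinccmd", "--project", project_url, op]


def report_sequence(mw_url, aqua_url):
    """Commands, in order: pause MW, force an Aqua update, let MW run again."""
    return [
        project_cmd(mw_url, "suspend"),
        project_cmd(aqua_url, "update"),
        project_cmd(mw_url, "resume"),
    ]
```

Each returned list can be handed to subprocess.run once the URLs have been checked; in practice a pause between the update and the resume is probably needed so the scheduler request can finish.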

Having to do this "trickery" every eight to ten hours is much more manageable than having to do it every twenty to forty minutes by manipulating MW to do the same thing.

I noticed that after doing this, some of the MW WUs were sitting there "waiting to run" which had me a little worried, but eventually, everything came back into play and now I can go back to bed.

Yo-

Nightlord
Joined: 29 Jul 08
Posts: 12
Credit: 60,445,018
RAC: 0

Message 29295 - Posted: 13 Aug 2009, 11:42:00 UTC

It is also possible to overcome many of these problems by running a virtual machine with VMWare, Virtual Box or even Virtual PC. Then you have two independent Boinc installations that know nothing about each other and will happily crunch one project on each.

MW on the native OS with ATI, second project under VM. The result is two or more projects from the single physical host reported of course as two hosts in your accounts and stats.

You need to accept a small overhead loss due to the VM, but personally I've found this better than having to micromanage Boinc.
____________


Symington weather report and video feed

Vid Vidmar*
Joined: 29 Aug 07
Posts: 81
Credit: 60,360,858
RAC: 0

Message 29302 - Posted: 13 Aug 2009, 13:54:25 UTC - in response to Message 29295.

It is also possible to overcome many of these problems by running a virtual machine with VMWare, Virtual Box or even Virtual PC. Then you have two independent Boinc installations that know nothing about each other and will happily crunch one project on each.

MW on the native OS with ATI, second project under VM. The result is two or more projects from the single physical host reported of course as two hosts in your accounts and stats.

You need to accept a small overhead loss due to the VM, but personally I've found this better than having to micromanage Boinc.


Hey.
I already tried that, but linux running BOINC in VirtualBox would crash every time I assigned more than one CPU to it. Now, I wouldn't like to run a VM for each processor. (If anyone is interested, I tried it on my q9450@3.2GHz, 2G RAM, ATI 4870 1G, VBox 3, WinXp 64 host OS, Ubuntu 9.04 guest OS.)

I am now in the process of acquainting myself with the BOINC code; however, progress is slow, as I have a lot of work at my job and this week I am also dog/house-sitting for a friend. Anyway, there seems to be no work available here again, so I guess there is no need for me to rush.
BR,

____________

YoDude.SETI.USA [TopGun]
Joined: 29 May 09
Posts: 37
Credit: 34,016,951
RAC: 0

Message 29304 - Posted: 13 Aug 2009, 14:21:43 UTC - in response to Message 29295.

Personally, I hate to have to micromanage the Boinc manager. Apparently, this really seems to be a manager "conflict of interest" or some such thing. Some would call it a bug. Others tend to think it's the WUs of different projects causing the problems, but I am convinced it's strictly a problem to do only with the manager itself and how it handles the projects.

Might there be an alternative manager that we can run these projects on that doesn't exhibit these problems? Or, is this something that simply just needs to be fixed?

In any case, we shouldn't have to deal with these headaches. In fact the Boinc people and the project people that come up with stuff should be paying us for doing their work for them. I have to pay dearly for the electricity to run these projects and I do this out of the kindness of my heart and the belief that these projects will one day help the world as a whole, if not already.

The least they could do is give us a manager that actually runs the projects the way we'd like it to. Half the settings don't seem to do anything, the rest seem to be broken and those that actually DO do something seem to do very little.

Rant - disabled

Yo- :)

Vid Vidmar*
Joined: 29 Aug 07
Posts: 81
Credit: 60,360,858
RAC: 0

Message 29318 - Posted: 13 Aug 2009, 20:06:19 UTC - in response to Message 29302.
Last modified: 13 Aug 2009, 20:09:02 UTC

Look what I just found in checkin notes:

David May 12 2008
- client: add <allow_multiple_clients> cc_config.xml option
- client: remove stress_shmem code
...
David May 12 2008
- client: change --allow_multiple_clients to a command line option
(it can't go in the config file)


Now, on to try it out. [edit]checkin quote[/edit]
BR,
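
For anyone wanting to try the same thing, here is a sketch of how one client per resource might be launched, not part of the original post. It assumes the client executable is called boinc and accepts --allow_multiple_clients (per the checkin note above), --dir and --gui_rpc_port. The directory layout, the resource names and the starting port of 31417 (one above the client's default GUI RPC port, 31416) are illustrative choices only.

```python
def client_commands(resources, base_dir, first_port=31417, exe="boinc"):
    """Build one client command line per resource, each with its own data dir and port."""
    commands = {}
    for offset, name in enumerate(resources):
        commands[name] = [
            exe,
            "--allow_multiple_clients",
            "--dir", base_dir + "/" + name,             # separate data directory per instance
            "--gui_rpc_port", str(first_port + offset)  # separate control port per instance
        ]
    return commands
```

Each data directory would then carry its own cc_config.xml with the ncpus limit for that instance, along the lines of the plan described earlier in the thread.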
____________


Copyright © 2018 AstroInformatics Group