 
    
        Finally getting new tasks only seconds after running out.  May not be worth the hassle.
    
| Author | Message | 
|---|---|
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 I did disable the N-body sims and aborted any tasks I had "Ready to run". Same issue. Running separation and GPU tasks, the thread count never exceeds 32. There were a few instances where two 16C N-body tasks would start up if there were no GPU tasks running (which can be for 10-20 min before new tasks DL after the last is uploaded).
 As mentioned in another thread, my program will not calculate the delay time correctly if you are running CPU tasks from Milkyway. I just check for the word "milkway" in the name of the task and assume they are GPU. Do not download, run, or suspend any CPU tasks from Milkyway. Get rid of them all and don't let them show up. N-Body tasks are not the only CPU ones from Milkyway. Lemme know if that fixes the problem. | 
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 
 I agree. However, putting it nicely, I suspect the project does not know how to fix it and probably does not have the staff. GPUgrid also has a problem: their resume from suspend fails if the graphics boards are not the same class (long discussion over at Boinc / Project). At least every resume from suspension (from checkpoint, rather) at Milkyway works on any board as long as the manufacturer is the same, AFAICT. The Milkyway client code is on GitHub but their server code is not. I asked to look at the server code and Richard also offered to look at it, but no one responded. We just mentioned it in a thread here and did not attempt to actually contact a principal. As far as homemade versions ... I have my code sources on GitHub under my name, unlike the folks at SETI who put together a secret private version for their club to bypass the limit on the number of GPUs per system. [edit] Actually, I did send a message to one of the principals here and asked them to comment on the thread where we were wondering what the source code looked like and whether we could fix it, but no one from here posted on that thread. | 
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 OK, got a PM that it seems to be working. Going to post some info here and maybe someone can comment on it and add their own results.
 I ran a test a year ago: first 1 WU at a time, then 2 WUs concurrently, then up to 10 WUs, and carefully recorded the increase in performance. This was on an S9100, which has 12 GB of RAM compared to only 6 GB on my five S9000s. The S9000 is simply an HD7950 without a fan but with a lot of ECC memory. Things improved up to about 5 concurrent work units. Starting at 6 I noticed I was starting to get invalid results. Running 10 concurrently, the invalid results were so bad I was better off just running 1 or 2 tasks at a time. I think the problem was that OpenCL has a 4 GB address space and probably only about 3 GB is actually available to the project, so none of the extra memory above 4 GB is even usable.
 You mentioned bumping up to 7 work units per board. Be sure to check your invalid task count and watch the heat on the boards if running 7 at a time. I just checked your system and see only 1 invalid, which is a really good sign.
 If you want to get some statistics you can use my performance tool at https://stateson.net/hostprojectstats I used its "create query" to put the following URL together. If you click on it and then click "calculate", it will download the last 20 results of your fastest system and do a credit prediction: https://stateson.net/HostProjectStats/default.aspx?url=https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=776231&offset=0&show_names=0&state=4&appid=;nCon=1;nDev=1;Wu=20;lw=0;iw=0;
 It shows your system can generate about 14,500 credits in an hour with 1 GPU and 1 concurrent task. Change the values to 3 GPUs, change the number of concurrent tasks to the actual number that corresponds to the data, then select "calculate". Note that idle time between WUs is not accounted for, so the value is the theoretical maximum you can get. My Boinctasks history reader can calculate actual throughput.
 Suggest you run 1 WU at a time and make a note of the average elapsed time, say "50 seconds". After an hour or so, do 2 at a time. If you then get "90" seconds, the average is really 45 per task, which is an improvement. Then try 4 at a time. If you don't see an improvement, go back to the previous setting, else you are overheating the GPU to no advantage. Come back and post your results. If you have a watt meter you can enter the Idle and Load values and get a feel for kWh expenses. I have several plots of performance and power consumption for a number of projects and co-processors, if I can find where I saved them. | 
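
The comparison suggested above (50 seconds running 1 at a time versus 90 seconds running 2 at a time) can be worked through in a few lines of Python. This is only a sketch: the function names are made up for illustration, the two rows are the example figures from the post rather than measurements, and it ignores idle time between work units and invalid results, so it shows a best case.

```python
def effective_seconds_per_task(elapsed_seconds, concurrent_tasks):
    """Average wall-clock seconds per finished task when several run at once."""
    return elapsed_seconds / concurrent_tasks


def best_concurrency(measurements):
    """Return the concurrency with the lowest effective seconds per task.

    measurements maps concurrent task count -> average elapsed seconds per task.
    """
    return min(
        measurements,
        key=lambda n: effective_seconds_per_task(measurements[n], n),
    )


# The two example figures from the post; replace them with your own averages.
measurements = {1: 50.0, 2: 90.0}

for n, elapsed in sorted(measurements.items()):
    per_task = effective_seconds_per_task(elapsed, n)
    print(f"{n} at a time: {elapsed:.0f}s elapsed -> {per_task:.1f}s per task, "
          f"{3600 / per_task:.0f} tasks/hour")

print("best:", best_concurrency(measurements))
```

Add a row for each concurrency level you try. If a higher level does not lower the per-task figure, or the invalid count climbs, drop back to the previous setting.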
|  mikey  Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0       | 
 
 I did NOT mean "homemade version" to indicate you weren't on the up and up or were a dishonorable person, what I meant was that unless it comes from Boinc the vast majority of crunchers won't use it. I use your other stuff BoincTasks and it works great for me!! | 
|  Send message Joined: 29 Apr 17 Posts: 33 Credit: 7,041,502,264 RAC: 0     | 
 mikey, if you can deal with not running any CPU tasks, JStateson's fix for the delayed GPU task-fetching issue does work! I have implemented it on two machines now and both are fetching GPU tasks while completed tasks are "Ready to report". The GPUs have not sat idle at all since. It's pretty easy: it just requires edits to the cc file, a simple regedit, and replacing boinc.exe with his compiled version (v7.15). I renamed the boinc.exe, boincmgr and boinctray exe's to " .origexe" (and did not have to delete the reg strings for tray and mgr) only so I could switch back to boinc v7.14 easily if needed. | 
|  Send message Joined: 29 Apr 17 Posts: 33 Credit: 7,041,502,264 RAC: 0     | 
 3 Titan Vs running in P2 state at 1466MHz 0.75V. 7 tasks per GPU. I did not disable the P2 undervolt when cuda is detected in the stack, which would allow for P1 cuda calculations. Each card would then pull ~ 250W.     | 
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 I am glad it is working! Will have to look at your Koolance and System Information Viewer as I am not familiar with those. I am using Alienware's "ThermalControl". Probably a subject for another thread.
 I have a Linux version of this program and built it under Ubuntu 18.04. I have no idea if it works on any other version of Linux and I do not know how to get it into a PPA for easy install. I call it 7.14.666 and I am using it on SETI, but it also handles Milkyway. I also put together a temperature reporting app for Ubuntu, as described here. Anyone who uses Boinctasks and has Linux crunchers can use it to display temperatures, similar to how TThrottle routes temperatures from Windows to Boinctasks. It is to be posted over at the 3rd party forum of Boinctasks after Fred reviews it.
 I do not think I can get a combined CPU + GPU version to work for Milkyway as I don't have a good system to test it on. My system with 6 GPUs has a fully loaded 4-core CPU, so it would be difficult to get enough CPU tasks to debug your problem. I do have a 24-thread dual Xeon, but it has only 3 working GPU slots and is dedicated to WCG. I put in an RX-570 and tested my Linux 7.14.666 to make sure it worked under Milkyway, but otherwise it has no GPU. | 
|  Send message Joined: 29 Apr 17 Posts: 33 Credit: 7,041,502,264 RAC: 0     | 
 I'm amazed that you can keep the S9100s from catching fire! Must have some serious air flow on them. Wow!
 Yeah, the Koolance software works with their internal fan/pump controller (it's discontinued, unfortunately; it really worked better than the new bay-mount controller). I have a couple of Aquaeros here, and even one of Aquacomputer's 720XTs and one of their GiGant 1680 rad "things". The Aquaeros are the best controllers available, expensive though. If I used the stock air coolers on the gear here, my wife would not tolerate this vice, tho I would be able to hear her complain about it. :)
 SIV64's author really did a deep dive on the SIO and other signal busses... to the point where the software scares off a lot of potential users. That said, once you learn it, I have found nothing comparable (AIDA64, HWi, etc. are all good, just not as good as SIV64 IMO).
 Here's a link to some rig pics on G-drive just for grins: https://drive.google.com/open?id=19nHZg1I-PAoCmL56VgnEFLsxYKlszldO
 Thanks again for sorting this GPU idle time thing out. Outstanding work! | 
| Send message Joined: 19 Apr 18 Posts: 5 Credit: 1,536,603,202 RAC: 0     | 
 I had this working a few months ago, but ended up having to re-format the windows in my computer and now when i try to add this custom version of boinc my boinc manager will not connect to the client even after restart.  what can i do to fix this?  (if i re-run the boinc install and "repair" things work again but i am no longer running the custom version at that point) | 
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 Quote: I had this working a few months ago, but ended up having to re-format the windows in my computer and now when i try to add this custom version of boinc my boinc manager will not connect to the client even after restart. what can i do to fix this? (if i re-run the boinc install and "repair" things work again but i am no longer running the custom version at that point)
 I do not have an installer for my "mod". You have to install 7.14.2 from Berkeley and, after it is running OK:
 1. Stop boinc by exiting the manager.
 2. Rename boinc.exe to old_boinc.exe at "Program Files\boinc".
 3. Rename my mod to boinc.exe. | 
| Send message Joined: 19 Apr 18 Posts: 5 Credit: 1,536,603,202 RAC: 0     | 
 I did all of this and when i try to run my boinc manager after replacing the original Boinc executable the boinc manager launches but can not connect to the client. As i mentioned i have had this working previously a few weeks ago so i am sure i am following all the steps correctly and yes my boinc was already installed 64bit version 7.14.2 and working properly running several projects including milkyway. | 
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 Quote: I did all of this and when i try to run my boinc manager after replacing the original Boinc executable the boinc manager launches but can not connect to the client.
 Try the following:
 (1) Check the versions. Bring up boincmgr, click on Help and then About, and make a note of the version number. You should have the renamed original (orig_boinc.exe here) and also boinccmd.exe. Run the original using
 orig_boinc --version
 then run
 boinccmd --version
 All should be 7.14.2 except the mod, which is 7.15.0.
 (2) Make sure you are not running two copies of the boinc client. Bring up the Windows task manager and look for boinc. If you see more than one, terminate them all. I assume you are not manually executing the boinc program, as that would be a problem. Make sure that boincmgr is not running: bring up the task manager and look for boinc. It should not show up. Then run boincmgr. It should automatically start boinc and you should see both boinc and the manager. I am not an expert on firewalls. Conceivably, a firewall product might keep the manager from seeing the client, but that seems unlikely when both are on the same system.
 (3) Put the original version of the boinc client back at C:\Program Files\boinc and see if it works. I assume you have the 64-bit version. If it does not work, I suggest you download and use the free Revo Uninstaller. Dell support uses this product for remote repairs. During the uninstall Revo will ask if you want to remove all traces of boinc from the registry and the disk, so be sure to select the advanced scan and then click first "select" and then "Delete". I suggest you put in 7.16.3, but my program works with both.
 If you cannot get my "mod" to work, then just use the batch file that Peter posted here with the original 7.14.2 or 7.16.3: https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4424&postid=69200
 [edit] Actually, Peter did not create that script. I don't remember who did, but it works fine, though it is a PITA as it needs to be started after every reboot. | 
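
Step (2) above, checking that no more than one boinc client is running, can also be done from a script instead of eyeballing the task manager. The sketch below assumes the process list comes from the Windows `tasklist /FO CSV /NH` command; the helper name and the sample rows are hypothetical and only illustrate the shape of that output.

```python
import csv
import io


def count_processes(tasklist_csv, image_name):
    """Count rows of `tasklist /FO CSV /NH` output whose image name matches."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    wanted = image_name.lower()
    return sum(1 for row in rows if row and row[0].lower() == wanted)


# Made-up sample rows in the CSV layout that tasklist prints.
sample = (
    '"boinc.exe","4120","Services","0","21,504 K"\n'
    '"boincmgr.exe","5332","Console","1","38,912 K"\n'
    '"boinc.exe","6044","Console","1","19,772 K"\n'
)

copies = count_processes(sample, "boinc.exe")
print(f"boinc.exe running {copies} time(s)")
if copies > 1:
    print("More than one client: terminate them all before starting boincmgr")
```

On a real Windows system the text could be captured with `subprocess.run(["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True).stdout` and passed in place of `sample`.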
|  Send message Joined: 23 Feb 18 Posts: 26 Credit: 4,744,416,145 RAC: 0     | 
 I installed the first clients. It's too early to evaluate the improvements, but over the next days I'll keep monitoring the logs and RAC.
 A quick follow-up: since I installed the modded 7.15 from Joseph Stateson my RAC has increased consistently by 8-10%. No more idling, all GPUs back to work 24/7. Thanks Joseph for your work. Now we still hope for an official fix (but the wait will be easier).
 Want your Kids stay off from Drugs? Get them building Crunching PC's and they'll never have enough money for drugs | 
|  Wij31s6SD19aNbLrtAydbva42tL Send message Joined: 1 Jul 19 Posts: 6 Credit: 5,436,903,016 RAC: 143,933       | 
 Quote: I did all of this and when i try to run my boinc manager after replacing the original Boinc executable the boinc manager launches but can not connect to the client.
 As your little fix sounded nice I tried to apply it. Unfortunately it didn't work on my two computers. What I did:
 1. Downloaded the boinc.exe via the link you provided earlier
 2. Shut down boinc and boinc mgr
 3. Renamed boinc.exe to boinc_old.exe in the original boinc folder
 4. Copied your boinc.exe from step 1 to the exact same folder
 5. Started boinc manager
 Result: the exact same outcome as some posts above: boinc manager was not able to connect to the client. So I made sure no other copies of boinc were running. I even deactivated boinc in the autostart menu and restarted the PC, so boinc was not running at all. I repeated steps 3 to 5. Result: no connection to client.
 When I change back to the original boinc.exe everything works fine (except the issue with getting new tasks instantly). So is there something that I obviously did wrong? Did I take an old version of boinc.exe, or do I additionally have to edit some of the config files in order to get your fix working? Thank you in advance for any answers! | 
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 
 By AutoStart I assume you mean the Windows task manager "Startup" tab. I assume the following:
 1. Boinc is not running, as you rebooted with it disabled in that task manager startup tab.
 2. At "c:\Program Files\BOINC" you have the following: boinc.exe (the 7.15.0 version), boinc_orig.exe (the 7.14.x or later), boinccmd.exe (the one that matches the 7.14).
 3. At "c:\ProgramData\boinc" you have all the boinc project stuff.
 4. You have brought up the task manager and neither boinc nor boincmgr is running.
 We need to verify the program will run. This requires the administrator command prompt: type "cmd" into the Windows search panel and select "run as administrator". Check out each program as follows. Be sure to use the ".\", else Windows might run a program in another folder instead of the current folder.
 C:\WINDOWS\system32>cd "\Program Files\BOINC"
 C:\Program Files\BOINC>.\boinc.exe --version
 7.15.0 windows_x86_64
 C:\Program Files\BOINC>.\boinccmd --version
 boinccmd, built from BOINC 7.14.2
 c:\Program Files\BOINC>.\boinc_original.exe --version
 7.14.2 windows_x86_64
 Now verify that boinc is not running by asking for a few things:
 D:\ProgramFiles\Boinc>boinccmd --quit
 can't connect to local host
 D:\ProgramFiles\Boinc>boinccmd --status
 can't connect to local host
 So far so good: boinc is not running, but it can be executed without getting an error message such as a missing DLL or an incompatible older version.
 Now try to run the new program as follows:
 C:\Program Files\BOINC>.\boinc.exe --dir c:\ProgramData\boinc
 14-Feb-2020 06:38:21 [---] Starting BOINC client version 7.15.0 for windows_x86_64
 14-Feb-2020 06:38:21 [---] This a development version of BOINC and may not function properly
 14-Feb-2020 06:38:21 [---] log flags: file_xfer, sched_ops, task, milkyway_debug
 14-Feb-2020 06:38:21 [---] Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
 14-Feb-2020 06:38:21 [---] Data directory: C:\ProgramData\BOINC
 14-Feb-2020 06:38:21 [---] Running under account jstateson
 14-Feb-2020 06:38:22 [---] OpenCL: AMD/ATI GPU 0: Radeon RX 570 Series (driver version 2906.10, device version OpenCL 2.0 AMD-APP (2906.10), 4096MB, 4096MB available, 5095 GFLOPS peak)
 14-Feb-2020 06:38:22 [---] OpenCL: AMD/ATI GPU 1: Radeon RX 570 Series (driver version 2906.10, device version OpenCL 2.0 AMD-APP (2906.10), 4096MB, 4096MB available, 5095 GFLOPS peak)
 14-Feb-2020 06:38:22 [---] app version refers to missing GPU type NVIDIA
 14-Feb-2020 06:38:22 [Milkyway@Home] Application uses missing NVIDIA GPU
 14-Feb-2020 06:38:22 [---] All projects have zero resource share; setting to 100
 14-Feb-2020 06:38:22 [---] Host name: newxps
 14-Feb-2020 06:38:22 [---] Processor: 12 GenuineIntel Intel(R) Xeon(R) CPU X5675 @ 3.07GHz [Family 6 Model 44 Stepping 2]
 14-Feb-2020 06:38:22 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx smx tm2 dca pbe
 14-Feb-2020 06:38:22 [---] OS: Microsoft Windows 10: Professional x64 Edition, (10.00.18363.00)
 14-Feb-2020 06:38:22 [---] Memory: 23.99 GB physical, 47.99 GB virtual
 14-Feb-2020 06:38:22 [---] Disk: 465.21 GB total, 353.50 GB free
 14-Feb-2020 06:38:22 [---] Local time is UTC -6 hours
 14-Feb-2020 06:38:22 [---] No WSL found.
 14-Feb-2020 06:38:22 [Milkyway@Home] Found app_config.xml
 14-Feb-2020 06:38:22 [---] Config: use all coprocessors
 14-Feb-2020 06:38:22 [World Community Grid] General prefs: from World Community Grid (last modified 07-Feb-2020 21:58:20)
 14-Feb-2020 06:38:22 [World Community Grid] Host location: none
 14-Feb-2020 06:38:22 [World Community Grid] General prefs: using your defaults
 14-Feb-2020 06:38:22 [---] Reading preferences override file
 ...
 You should see something like the above. Lemme know if it worked. | 
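
The three `--version` checks above can be compared mechanically once their output has been copied down. The following is a rough sketch: the helper name is made up, and the strings are the outputs quoted in the post, so substitute whatever your own programs print.

```python
import re


def parse_version(text):
    """Pull the first x.y.z version number out of a --version output line."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.groups())


# Output lines as quoted in the post; replace with what your system prints.
outputs = {
    "boinc.exe": "7.15.0 windows_x86_64",
    "boinccmd.exe": "boinccmd, built from BOINC 7.14.2",
    "boinc_original.exe": "7.14.2 windows_x86_64",
}

# Per the post: everything is 7.14.2 except the mod, which is 7.15.0.
expected = {
    "boinc.exe": (7, 15, 0),
    "boinccmd.exe": (7, 14, 2),
    "boinc_original.exe": (7, 14, 2),
}

for name, line in outputs.items():
    found = parse_version(line)
    status = "ok" if found == expected[name] else "MISMATCH"
    print(f"{name}: {'.'.join(map(str, found))} {status}")
```

A program that fails to start at all (for example because of a missing DLL) prints no version line, and that failure is itself the useful finding.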
| Send message Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0     | 
 Just implemented your modded boinc.exe and I have it working well on 3 machines right now (thank you for this!). They all also have the coproc_info.xml hack to show multi-GPU setups, so I should pick up some more PPD. I'm using this in my cc_config.xml for all of them:
<cc_config>
  <options>
    <start_delay>30</start_delay>
    <report_results_immediately>1</report_results_immediately>
    <max_file_xfers>20</max_file_xfers>
    <max_file_xfers_per_project>20</max_file_xfers_per_project>
    <use_all_gpus>1</use_all_gpus>
    <allow_multiple_clients>1</allow_multiple_clients>
    <mw_low_water_pct>1</mw_low_water_pct>
    <mw_high_water_pct>16</mw_high_water_pct>
    <mw_wait_interval>256</mw_wait_interval>
  </options>
</cc_config>
 Staying full with ~900 WUs per machine consistently now :) Now to get my S9100 up and running next week too! | 
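
Before restarting the client it can be worth checking that an edited cc_config.xml is still well-formed XML. The sketch below only parses the file posted above and lists its options. The `read_options` helper is hypothetical, and the thread does not document what the three `mw_` settings do, so nothing here interprets them.

```python
import xml.etree.ElementTree as ET

# The configuration posted above, copied as-is.
CC_CONFIG = """<cc_config>
  <options>
    <start_delay>30</start_delay>
    <report_results_immediately>1</report_results_immediately>
    <max_file_xfers>20</max_file_xfers>
    <max_file_xfers_per_project>20</max_file_xfers_per_project>
    <use_all_gpus>1</use_all_gpus>
    <allow_multiple_clients>1</allow_multiple_clients>
    <mw_low_water_pct>1</mw_low_water_pct>
    <mw_high_water_pct>16</mw_high_water_pct>
    <mw_wait_interval>256</mw_wait_interval>
  </options>
</cc_config>"""


def read_options(xml_text):
    """Return the <options> children as a tag -> text dictionary.

    ET.fromstring raises ParseError if the XML is not well-formed.
    """
    root = ET.fromstring(xml_text)
    options = root.find("options")
    if options is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in options}


options = read_options(CC_CONFIG)
for tag, value in options.items():
    marker = "(mod-specific)" if tag.startswith("mw_") else ""
    print(f"{tag} = {value} {marker}".rstrip())
```

To check a file on disk, read it with `open(path, encoding="utf-8").read()` and pass the text to `read_options`.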
|  Wij31s6SD19aNbLrtAydbva42tL Send message Joined: 1 Jul 19 Posts: 6 Credit: 5,436,903,016 RAC: 143,933       | 
 
 Thank you for your long answer :-) I followed your instructions. Here are my results :-)
 Your assumptions 1 to 4 are right (exception: German system, so c:\program files\boinc is c:\programme\boinc on my system, but the relative location matches your assumptions). So with BOINC not running (double checked it in the task manager) I went to the command prompt as administrator.
 c:\programme\BOINC>.\boinc.exe --version --> I got a German error message stating "execution of the code can not be continued because MSVCR120.dll was not found. Problem might be solved by reinstalling the program". So obviously your boinc.exe needs MSVCR120.dll. My system only has MSVCR100.dll (installation was in Dec 2019).
 c:\programme\BOINC>.\boinccmd.exe --version --> built from BOINC 7.14.2 (like you already stated)
 c:\programme\BOINC>.\boinc_orig.exe --version --> 7.14.2
 c:\programme\Boinc>boinccmd --quit --> can't connect to local host (like you already stated)
 c:\programme\Boinc>boinccmd --status --> can't connect to local host (like you already stated)
 So there is a problem with MSVCR120.dll, which is MSVCR100.dll on my system. | 
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 I built that "mod" with Visual Studio 2013. I am not sure what version the official release was built with. That "100" DLL might be the 2010 version. Berkeley has a problem maintaining compatibility with Linux, and after 2013 Microsoft made so many changes that Berkeley gave up on using "the latest and greatest". Just my opinion, worth 2c. But I know for a fact that VS2013 was the last version of Visual Studio compatible with BOINC.
 The VS 2013 runtimes are here: https://www.techspot.com/downloads/6776-visual-c-redistributable-package.html Supposedly newer releases are compatible with older ones, so you might want to put in the latest version (listed as an option on that page).
 One of my kids is in Germany. He keeps ordering stuff from Amazon, it arrives here in Texas and I have to mail it through customs to him. Is the VAT so high that it is cheaper to order from USA amazon.com than get it direct from amazon.de? The customs form warns against sending coffee or dried beef. Unfortunately that means problems getting Keurig K-Cups and Oma's beef jerky, which is usually what he wants.
 [edit] Forgot to mention: running boinc manually was just to be able to spot warning messages. | 
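
For anyone hitting the same missing-DLL message: the number in the runtime DLL name identifies the Visual C++ release it ships with, which is why a system with only MSVCR100.dll cannot run a program built with Visual Studio 2013. A small lookup sketch follows; the function name is made up, and the table lists only the three releases relevant to this thread.

```python
import re

# Visual C++ runtime DLL number -> redistributable that provides it.
RUNTIME_BY_NUMBER = {
    "100": "Visual C++ 2010",
    "110": "Visual C++ 2012",
    "120": "Visual C++ 2013",
}


def runtime_for_dll(dll_name):
    """Name the redistributable for an msvcrNNN.dll / msvcpNNN.dll, or None."""
    match = re.fullmatch(r"msvc[rp](\d+)\.dll", dll_name.lower())
    if match is None:
        return None
    return RUNTIME_BY_NUMBER.get(match.group(1))


for dll in ("MSVCR100.dll", "MSVCR120.dll"):
    print(f"{dll}: {runtime_for_dll(dll)}")
```

So the error in the previous post points at the Visual C++ 2013 redistributable, matching the Visual Studio 2013 build mentioned above.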
| Send message Joined: 29 Jul 14 Posts: 19 Credit: 3,452,814,498 RAC: 0     | 
 Hi I see you guys mentioning editing "coproc" or something like that to download 900 WUs at a time instead of 150 - 300? I was wondering how you go about doing that, because I've been searching around for a bit and I can't find any instructions. Also how do you install the ubuntu version? I'm a complete novice with Linux and you mention you need to "use 0751 on program and 0664 on the xml" but I have no idea what that means. | 
|  Joseph Stateson  Send message Joined: 18 Nov 08 Posts: 291 Credit: 2,463,985,753 RAC: 25     | 
 Quote: Hi I see you guys mentioning editing "coproc" or something like that to download 900 WUs at a time instead of 150 - 300? I was wondering how you go about doing that, because I've been searching around for a bit and I can't find any instructions.
 Milkyway allows 300 downloads per GPU with a max of 900. There is no need to get 900 or 9000. The problem is that when you run out of data there is a 10 minute wait before any more arrives, which is a bummer.
 I am not sure if the boinc_linux program will work on your version of Linux. I had forgotten I had even made a Linux version. Before doing anything, make a backup of the boinc stuff you have. I assume you are using 7.14.2. Maybe 7.16.3 will work.
 Download all the Linux files to a download directory such as /home/username/Download. The executable goes to /usr/bin and the config goes to /etc/boinc-client. Do something like the following in a terminal window, adding "sudo" in front of commands that complain, fixing any typos, and adding anything needed that I might have left out:
 sudo su
 /etc/init.d/boinc-client stop
 mkdir Download
 cd Download
 mkdir mw_fix
 cd mw_fix
 wget https://github.com/JStateson/MilkywayNewWork/archive/master.zip
 unzip ./master.zip
 chmod 0755 ./boinc_ubuntu
 chown root:root ./boinc_ubuntu
 chmod 0644 ./cc_config.xml
 mv /usr/bin/boinc /usr/bin/boinc_original
 mv /etc/boinc-client/cc_config.xml /etc/boinc-client/cc_config.xml.bu
 cp ./boinc_ubuntu /usr/bin/boinc
 cp ./cc_config.xml /etc/boinc-client/cc_config.xml
 /etc/init.d/boinc-client start
 Within 5-8 minutes you should see Milkyway download some additional work, and the number of "waiting" work units should hover just below the maximum (300, 600 or 900). It should never go to 0 again unless the project goes off-line.
 For debugging purposes, edit the file cc_config.xml to add <mw_debug>1</mw_debug> and then look at the event viewer for messages about the bug fix. | 
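
On the question of what numbers like 0755 and 0644 mean: they are octal permission modes, one digit each for the owner, the group and everyone else, where 4 is read, 2 is write and 1 is execute. The sketch below prints the modes mentioned in this thread the way `ls -l` would show them. It is an illustration only, and `describe_mode` is a made-up helper name.

```python
import stat


def describe_mode(mode):
    """Render an octal permission mode as `ls -l` shows it for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


modes = [
    ("program, as in the commands above", 0o755),
    ("xml, as in the commands above", 0o644),
    ("program, as quoted in the question", 0o751),
    ("xml, as quoted in the question", 0o664),
]

for label, mode in modes:
    print(f"{mode:04o}  {describe_mode(mode)}  {label}")
```

For example, 0755 is `-rwxr-xr-x`: the owner can read, write and execute, and everyone else can read and execute, which is the usual setting for a program in /usr/bin.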
 
        
        ©2025 Astroinformatics Group